Introduction to Hypothesis Testing

**QUESTION**

INTRODUCTION TO HYPOTHESIS TESTING

Assignment Overview

Suppose that the 2012 National Health Interview Survey gives the number of adults in the United States (reported in thousands), classified by age group and by whether or not they have ever been tested for HIV. Here are the data:

| Age Group | Tested (thousands) | Never Tested (thousands) |
|---|---|---|
| 18–44 years | 50,080 | 56,405 |
| 45–64 years | 23,768 | 48,537 |
| 65–74 years | 2,694 | 15,162 |
| 75 years and older | 1,247 | 14,663 |
| Total | 77,789 | 134,767 |

Discuss probability. What is its history? What is the theory of probability? How is it calculated? What are the advantages and disadvantages of using this technique?

1. Identify and discuss the two major categories of probability interpretations, whose adherents possess conflicting views about the fundamental nature of probability.

2. Based on this survey, what is the probability that a randomly selected American adult has never been tested? Show your work. Hint: using the two totals in the Total row, this would be calculated as p(NT) / (p(NT) + p(T)), where p is probability.

3. What proportion of 18- to 44-year-old Americans have never been tested for HIV? Hint: using the values in the 18–44 cells, this would be calculated as p(NT) / (p(NT) + p(T)), where p is probability. Show your work.

Submit your paper (2-3 pages) by the end of this module.

Module Overview

In this module, we shift gears from descriptive statistics to inferential statistics. Inferential statistics are used to determine the probability that a conclusion based on analysis of data from a sample is true (Norman & Streiner, 2008). As statisticians, we keep in mind that when gathering data on a sample of people there is a possibility of random error. In other words, measurements drawn at random from a population of interest will differ from one another by some amount as a result of random processes.

We start by formulating a null hypothesis. A null hypothesis is an assumption that there is no significant difference between a sample mean and a population mean. We then formulate an alternative hypothesis that is mutually exclusive with the null hypothesis.

The primary goal of a statistical test is to determine whether an observed data set is sufficiently different from what we would expect under the null hypothesis that we should reject the null hypothesis.

A health scientist may carry out an experiment to test a particular null hypothesis, which is not rejected unless the evidence against it is sufficiently strong.

For example,

H0: there is no difference in likelihood of heart attack between patients who took Medication A compared to those who took Medication B

H1: there is a difference in likelihood of heart attack between patients who took Medication A compared to those who took Medication B

One of the most important concepts to grasp in this course is the term “significance”. Significant (in the statistical sense) means that a particular result is unlikely to be due to chance alone.

In the example above, we estimate the probability of getting the observed data assuming that the null hypothesis is true. One useful statistic commonly used across disciplines is the p-value.

The p-value may be defined as the probability of getting the observed result, or one more extreme, given that the null hypothesis is true. Researchers commonly choose in advance (i.e., a priori) a probability of 5% as their threshold for statistical significance. In other words, a test result reported as p < .05 means that the probability of obtaining a result at least that extreme, if the null hypothesis is true, is less than .05.

Remember, we assume that the null hypothesis is true and then perform a statistical test as the basis for our decision about whether to reject it. There are one-sided tests and two-sided tests.

H0: μ1 = μ2
HA: μ1 ≠ μ2 (two-sided test)

H0: μ1 = μ2
HA: μ1 > μ2 (one-sided test)
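
To make the difference between the two kinds of test concrete, here is a minimal sketch in Python. It assumes the test statistic is a z-score that follows a standard normal distribution when the null hypothesis is true. The function names and the example value z = 1.96 are illustrative assumptions, not part of the module.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_value(z: float, alternative: str = "two-sided") -> float:
    """p-value for an observed z statistic under the null hypothesis.

    alternative:
      "two-sided" -> HA: mu1 != mu2
      "greater"   -> HA: mu1 >  mu2
      "less"      -> HA: mu1 <  mu2
    """
    if alternative == "two-sided":
        # Results at least as extreme in either direction count.
        return 2.0 * upper_tail(abs(z))
    if alternative == "greater":
        return upper_tail(z)
    if alternative == "less":
        return upper_tail(-z)
    raise ValueError(f"unknown alternative: {alternative!r}")


# Hypothetical observed statistic, chosen only for illustration.
z_observed = 1.96
p_two_sided = p_value(z_observed, "two-sided")
p_one_sided = p_value(z_observed, "greater")

print(f"two-sided p = {p_two_sided:.4f}")  # about 0.05
print(f"one-sided p = {p_one_sided:.4f}")  # about 0.025
```

For the same observed statistic, the one-sided p-value is half the two-sided one, which is why the direction of the alternative hypothesis should be fixed before looking at the data.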

In the next module we will actually perform inferential statistical tests to compare sample means to population means.

Sources:

Norman, G., & Streiner, D. (2008). Biostatistics: The bare essentials (3rd ed.). Shelton, CT: BC Decker Inc./PMPH USA, Ltd. eISBN: 9781607950585; pISBN: 9781550093476.

Required Reading

Module 3 – Background

Required Reading and Resources

Cook, A., Netuveli, G., & Sheikh, A. (2006). Chapter 4: Statistical inference. In Basic skills in statistics: A guide for healthcare professionals (pp. 40-52). London, GBR: Class Publishing. eISBN: 9781859591291.

Norman, G. R., & Streiner, D. L. (2014). Section the first: The nature of data and statistics: Chapter 6: Elements of statistical inference. In Biostatistics: The bare essentials [4th ed., e-Book]. Shelton, Connecticut: PMPH-USA, Ltd. eISBN-13: 978-1-60795-279-4.

Stattrek. (2020). What is Hypothesis testing? Teach yourself statistics. https://stattrek.com/hypothesis-test/hypothesis-testing.aspx

Additional Reading and Resources (Optional)

Castagnoli, E., Cigola, M., & Peccati, L. (2017). Probability: A brief introduction. Chicago: Bocconi University Press. Available via EBSCOHOST.

Khan Academy. (2020). P-values and significance tests. [Video file]. Retrieved from https://www.khanacademy.org/math/ap-statistics/tests-significance-ap/idea-significance-tests/v/p-values-and-significance-tests

Mathtutor. (2014, August 20). Null and Alternate Hypothesis – Statistical Hypothesis Testing – Statistics Course. [Video file]. Retrieved from https://www.youtube.com/watch?v=_Qlxt0HmuOo

McDonald, J. H. (2014). Basic concepts of hypothesis testing. Retrieved from http://www.biostathandbook.com/hypothesistesting.html

Stensson, E. (2012, April). Basic statistics tutorial 45 hypothesis testing (one-sided), sample and population mean (z). Retrieved from http://www.youtube.com/watch?v=IKxyXs6kRTo

Additional Resources

Purdue Online Writing Lab. (2019, October). General format. Retrieved from https://owl.purdue.edu/owl/research_and_citation/apa_style/apa_formatting_and_style_guide/general_format.html

Purdue Online Writing Lab. (2019, October). In-text citations: The basics. Retrieved from https://owl.purdue.edu/owl/research_and_citation/apa_style/apa_formatting_and_style_guide/in_text_citations_the_basics.html

Purdue Online Writing Lab. (2019, October). Reference list: Basic rules. Retrieved from https://owl.purdue.edu/owl/research_and_citation/apa_style/apa_formatting_and_style_guide/reference_list_basic_rules.html

**ANSWER**

Introduction to Hypothesis Testing

Student’s Name

Professor’s name

Institution Affiliation

Course Name: Course Code

Date

Probability is the chance or likelihood that an event will occur. In the 1650s, gambling was fashionable in French society. The theory of probability originated in 1654 from a gambler's dispute between two players over how the stakes should be divided after their game was interrupted before it could end. A well-known gambler, the Chevalier de Méré, posed the problem to eminent mathematicians, among them Blaise Pascal, who exchanged his thoughts and ideas with Pierre de Fermat (Todhunter, 2016).

The correspondence between these two prominent mathematicians produced the modern concepts of probability. The method they developed is now known as the classical approach to computing probabilities. It states that if a game has n equally likely outcomes, of which m correspond to winning, then the probability of winning is m divided by n, that is, m/n (Todhunter, 2016).
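
The classical rule can be illustrated with a short Python sketch. The function name and the die example are illustrative assumptions and do not come from the cited source.

```python
from fractions import Fraction


def classical_probability(favorable: int, total: int) -> Fraction:
    """Classical probability m/n for n equally likely outcomes,
    m of which are favorable."""
    if total <= 0:
        raise ValueError("total number of outcomes must be positive")
    if not 0 <= favorable <= total:
        raise ValueError("favorable outcomes must be between 0 and total")
    return Fraction(favorable, total)


# Illustration: rolling an even number with a fair six-sided die.
# Three of the six equally likely faces (2, 4, 6) are favorable.
p_even = classical_probability(3, 6)
print(p_even)  # 1/2
```

Using exact fractions rather than floating-point numbers keeps the result in the same m/n form as the rule itself.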

Probability theory is a branch of mathematics concerned with the analysis of random phenomena. Given the known probabilities of certain random events, it allows one to find the probabilities of other random events that are connected to them. The theory treats the concept rigorously, expressing it through a set of axioms.

These axioms formalize probability in terms of a probability space. The probability space assigns a measure, known as the probability measure, which takes values from 0 to 1, to sets of outcomes drawn from a collection known as the sample space. Any particular subset of those outcomes is referred to as an event (Venkatesh, 2013).
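
As a rough sketch of these ideas, the following Python code builds a small finite probability space and checks the basic properties on it. The fair-die sample space and the helper name `probability` are assumptions chosen for illustration only.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (illustrative).
# Each outcome is assigned the same probability, 1/6.
sample_space = {face: Fraction(1, 6) for face in range(1, 7)}


def probability(event: set) -> Fraction:
    """Probability measure: sum the probabilities of the outcomes in the event."""
    if not event.issubset(sample_space):
        raise ValueError("an event must be a subset of the sample space")
    return sum((sample_space[outcome] for outcome in event), Fraction(0))


even = {2, 4, 6}   # event: the roll is even
low_odd = {1, 3}   # event: the roll is 1 or 3 (disjoint from `even`)

# Every event has a probability between 0 and 1.
assert 0 <= probability(even) <= 1
# The whole sample space has probability 1.
assert probability(set(sample_space)) == 1
# For disjoint events, the probability of the union is the sum.
assert probability(even | low_odd) == probability(even) + probability(low_odd)

print(probability(even))            # 1/2
print(probability(even | low_odd))  # 5/6
```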

Calculating the probability of an outcome requires only a simple formula based on counting and division. The steps are as follows: first, define the single event of interest; next, identify the total number of outcomes that can occur; finally, divide the number of outcomes favorable to the event by the total number of possible outcomes (Shiryaev, 2019).

The advantage of this technique is that it is simple to use. However, it requires that the possible outcomes be broken down into outcomes that have an equal chance of occurring. Unfortunately, this is not always possible, because it is not always clear which outcomes are equally likely. Another disadvantage is that this method does not by itself give the probability when more than one event is involved (Shiryaev, 2019).

There are two main categories of probability interpretations, and their adherents hold conflicting views about the fundamental nature of probability. The first category is the epistemological interpretations. Under these interpretations, probability is mainly related to human belief or knowledge. The concept is meant to measure, objectively, how strongly the available evidence supports a claim. For instance, given the current data on daily COVID-19 cases after the lifting of some COVID-19 restrictions such as mandatory quarantine, the United States of America will probably experience a second wave of coronavirus cases (Khrennikov, 2020).

The second category is the objective interpretations. Under these interpretations, probability is a feature of reality that does not depend on human belief or knowledge. In some accounts, that reality is the physical world; in others, it involves a kind of Platonic realm of logical and mathematical entities. Probability in this sense is a physical concept that applies to various systems in the world and is independent of what anyone thinks. For instance, “a specific radium atom could probably be decayed within 10,000 years of its existence” (Khrennikov, 2020).

Based on this survey, the probability that a randomly selected American adult has never been tested is calculated from the two totals in the Total row of the following table (counts in thousands):

| Age Group | Tested (thousands) | Never Tested (thousands) |
|---|---|---|
| 18–44 years | 50,080 | 56,405 |
| 45–64 years | 23,768 | 48,537 |
| 65–74 years | 2,694 | 15,162 |
| 75 years and older | 1,247 | 14,663 |
| Total | 77,789 | 134,767 |

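The formula to use in this case is P(NT) = n(NT) / (n(NT) + n(T)), where P is the probability and n(NT) and n(T) are the total numbers of adults never tested and tested. Therefore, P(never tested) = 134,767 / (134,767 + 77,789) = 134,767 / 212,556 ≈ 0.6340, or about 63.4%.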

The proportion of 18- to 44-year-old Americans who have never been tested for HIV is calculated from the values in the 18–44 cells. The formula is the same, restricted to that age group: P(NT | 18–44 years) = n(NT, 18–44 years) / (n(NT, 18–44 years) + n(T, 18–44 years)).

Therefore, P(never tested | 18–44 years) = 56,405 / (56,405 + 50,080) = 56,405 / 106,485 ≈ 0.5297, or about 53.0%.
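
The two calculations can be checked with a short Python sketch. The counts are taken from the survey table (in thousands); the variable and function names are illustrative assumptions.

```python
# (tested, never tested) counts in thousands, from the survey table.
counts = {
    "18-44 years": (50_080, 56_405),
    "45-64 years": (23_768, 48_537),
    "65-74 years": (2_694, 15_162),
    "75 years and older": (1_247, 14_663),
}


def proportion_never_tested(tested: int, never_tested: int) -> float:
    """Share of a group that has never been tested: NT / (NT + T)."""
    return never_tested / (never_tested + tested)


# Column totals, which should match the Total row of the table.
total_tested = sum(tested for tested, _ in counts.values())
total_never = sum(never for _, never in counts.values())

overall = proportion_never_tested(total_tested, total_never)
young = proportion_never_tested(*counts["18-44 years"])

print(f"All adults:  {overall:.4f}")  # 0.6340
print(f"18-44 years: {young:.4f}")    # 0.5297
```

Summing the age-group rows also confirms that the Total row of the table (77,789 tested and 134,767 never tested) is consistent with the individual cells.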

References

Khrennikov, A. (2020). Interpretations of Probability.

Shiryaev, A. N. (2019). Probability. Springer.

Todhunter, I. (2016). History of the mathematical theory of probability from the time of Pascal to that of Laplace. Hansebooks.