Hypothesis Testing
Null, Alternative, p-Value & Decision
Hypothesis testing is a statistical method for using sample data to evaluate a claim about a population. It helps scientists, doctors, engineers, and social researchers decide whether an observed effect is likely real or could have happened by random chance alone. The process gives a structured way to compare evidence against a default assumption. This makes conclusions more consistent and transparent.
In a hypothesis test, you begin with a null hypothesis $H_0$ that represents no effect or no difference, and an alternative hypothesis $H_a$ that represents the claim you want to investigate. After choosing a significance level $\alpha$, you calculate a test statistic from the sample and use it to find a $p$-value. The $p$-value measures how unusual the sample result would be if the null hypothesis were true. If the $p$-value is small enough, you reject the null hypothesis; otherwise, you fail to reject it.
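The steps in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not a full testing library: it assumes a two-sided $z$ test with a known population standard deviation, the function name `z_test_two_sided` is hypothetical, and the input numbers are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z test for a mean, assuming sigma is known."""
    # Test statistic: how many standard errors the sample mean is from mu0.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Two-sided p-value: probability of |Z| >= |z| when H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Decision rule: reject H0 when the p-value is at most alpha.
    reject = p_value <= alpha
    return z, p_value, reject


# Made-up numbers, for illustration only.
z, p_value, reject = z_test_two_sided(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "fail to reject H0")
```

With these inputs the sample mean sits two standard errors above the hypothesized mean, so the two-sided $p$-value falls just under 0.05 and the sketch reports a rejection.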
Key Facts
- Null hypothesis $H_0$ usually states no difference, no effect, or a specific population value.
- Alternative hypothesis $H_a$ states the competing claim, such as $\mu \neq \mu_0$, $\mu > \mu_0$, or $\mu < \mu_0$.
- Decision rule: reject $H_0$ if $p$-value $\le \alpha$; otherwise fail to reject $H_0$.
- A common significance level is $\alpha = 0.05$; $\alpha$ is the probability of a Type I error, that is, of rejecting $H_0$ when it is true.
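- For a $z$ test of a mean, $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$.
- For a one-sample $t$ test, $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$, with $n - 1$ degrees of freedom.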
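As a rough sketch of the one-sample $t$ statistic named in the last item, the snippet below computes it from raw data and compares it with a critical value. The helper name `one_sample_t` and the toy data are assumptions made for illustration. The critical value 2.776 is the standard $t$-table entry for a two-sided test at $\alpha = 0.05$ with 4 degrees of freedom, and it would need to be looked up again for any other sample size or significance level.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t test of H0: mu = mu0."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, n - 1


sample = [1, 2, 3, 4, 5]  # toy data, not from a real study
mu0 = 2                   # hypothesized mean (assumed for the example)
t_stat, df = one_sample_t(sample, mu0)

# Two-sided critical value from a t table: alpha = 0.05, df = 4.
T_CRIT = 2.776
decision = "reject H0" if abs(t_stat) >= T_CRIT else "fail to reject H0"
print(f"t = {t_stat:.3f}, df = {df}, decision: {decision}")
```

Here the statistic is about 1.414, well inside the critical value, so the toy sample does not give enough evidence to reject $H_0$.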
Vocabulary
- Null hypothesis: The starting claim that there is no effect, no difference, or no change in the population.
- Alternative hypothesis: The competing claim that says there is an effect, a difference, or a change.
- p-value: The probability of getting a result at least as extreme as the sample result if the null hypothesis is true.
- Significance level: The cutoff probability used to decide when evidence is strong enough to reject the null hypothesis.
- Test statistic: A standardized number computed from sample data that measures how far the sample result is from what the null hypothesis predicts.
Common Mistakes to Avoid
- Saying the p-value is the probability that $H_0$ is true, which is wrong because the p-value assumes $H_0$ is true and measures the probability of the observed data or more extreme data.
- Writing reject $H_0$ when p-value $> \alpha$, which is wrong because p-values larger than $\alpha$ do not provide enough evidence against the null hypothesis.
- Confusing fail to reject $H_0$ with proving $H_0$ true, which is wrong because the test may simply lack enough evidence or sample size to detect a real effect.
- Choosing a one-tailed test after looking at the data, which is wrong because the direction of the test must be set before analysis to avoid biased conclusions.
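To see why the last mistake matters, the sketch below compares the one-tailed and two-tailed p-values for the same $z$ statistic. The value $z = 1.8$ and the cutoff 0.05 are hypothetical, chosen only to show that the decision can flip depending on which tail convention is picked.

```python
from statistics import NormalDist

z = 1.8       # hypothetical test statistic
alpha = 0.05  # assumed significance level

std_normal = NormalDist()
# Upper-tail p-value: P(Z >= z) under H0.
p_one_sided = 1 - std_normal.cdf(z)
# Two-sided p-value: P(|Z| >= |z|) under H0.
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))

print(f"one-tailed p = {p_one_sided:.4f}, reject: {p_one_sided <= alpha}")
print(f"two-tailed p = {p_two_sided:.4f}, reject: {p_two_sided <= alpha}")
```

The one-tailed p-value is half the two-tailed one, so here it falls below 0.05 while the two-tailed value does not. Picking the tail after seeing which way the data lean therefore makes false rejections more likely than the stated $\alpha$.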
Practice Questions
- 1. A factory claims the mean battery life is 20 hours. A sample of 36 batteries has mean 18.8 hours. Assume the population standard deviation $\sigma$ (in hours) is known and test $H_0: \mu = 20$ versus $H_a$ at significance level $\alpha$. Compute the $z$ statistic and state the decision using the critical value method or $p$-value method.
- 2. A school tests whether a new tutoring program changes average test scores. For 25 students, the sample mean is 78, the hypothesized mean is 74, and the sample standard deviation is 10. Test $H_0: \mu = 74$ versus $H_a: \mu \neq 74$ at significance level $\alpha$ using a one-sample $t$ test. Compute the $t$ statistic and state whether to reject $H_0$.
- 3. A study reports $p$-value = 0.08 for a test conducted at significance level $\alpha$. Explain what decision should be made and why this does not mean the null hypothesis has been proven true.