P-Values Explained
Hypothesis Testing and Statistical Significance
A p-value is a way to measure how surprising your data would be if a null hypothesis were true. It is used in hypothesis testing to help decide whether an observed result is unusual enough to challenge a default assumption. P-values matter because they connect data, probability, and decision making in science, medicine, engineering, and social research. A small p-value suggests that the observed result would be rare under the null hypothesis, but it does not prove that the alternative hypothesis is true.
In many tests, the p-value is shown as a shaded tail area under a probability distribution, such as a bell curve. The test statistic tells you where your result falls on that distribution, and the p-value is the probability of getting a result at least as extreme as the one observed if the null hypothesis is correct. Researchers often compare the p-value to a significance level, such as alpha = 0.05, to decide whether to reject the null hypothesis. That decision should be read alongside study design, effect size, and context, because statistical significance is not the same as practical importance.
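The tail-area idea can be checked by simulation. The following is a minimal Python sketch, assuming the test statistic follows a standard normal distribution when the null hypothesis is true. The function name `simulated_two_tailed_p`, the number of draws, and the seed are illustrative choices, not part of any standard procedure.

```python
import random


def simulated_two_tailed_p(z_observed, n_draws=50_000, seed=0):
    """Estimate a two-tailed p-value by simulating test statistics under the null.

    Assumption: the test statistic is standard normal when the null is true.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    threshold = abs(z_observed)
    # Count simulated statistics at least as extreme as the observed one.
    extreme = sum(
        1 for _ in range(n_draws) if abs(rng.gauss(0.0, 1.0)) >= threshold
    )
    return extreme / n_draws


estimate = simulated_two_tailed_p(1.96)
print(f"Simulated two-tailed p-value for z = 1.96: {estimate:.3f}")
```

With enough draws, the estimate should land close to 0.05, which is the combined area in both tails beyond |z| = 1.96.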
Key Facts
- A p-value is P(data at least as extreme as observed | null hypothesis is true).
- If p <= alpha, reject the null hypothesis; if p > alpha, fail to reject the null hypothesis.
- A common significance level is alpha = 0.05, but alpha should be chosen before looking at the data.
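- For a z-test, z = (x̄ - μ0) / (σ / sqrt(n)), where x̄ is the sample mean, μ0 is the mean stated by the null hypothesis, σ is the known population standard deviation, and n is the sample size.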
- In a two-tailed z-test, p-value = 2P(Z >= |z|).
- A smaller p-value means the data are less compatible with the null hypothesis, not that the null hypothesis has a small probability of being true.
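The z-test formula and the two-tailed p-value rule from the list above can be sketched in a few lines of Python using only the standard library. The sample values (mean 103, null mean 100, σ = 15, n = 100) and the function names are hypothetical and chosen only for illustration.

```python
import math


def z_statistic(sample_mean, mu0, sigma, n):
    """z = (x̄ - μ0) / (σ / sqrt(n)) for a one-sample z-test."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))


def upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_tailed_p(z):
    """Two-tailed p-value: 2 * P(Z >= |z|)."""
    return 2 * upper_tail(abs(z))


# Hypothetical example values, not taken from any real study.
alpha = 0.05  # chosen before looking at the data
z = z_statistic(sample_mean=103.0, mu0=100.0, sigma=15.0, n=100)
p = two_tailed_p(z)
decision = "reject H0" if p <= alpha else "fail to reject H0"

print(f"z = {z:.2f}, p = {p:.4f}, decision at alpha = {alpha}: {decision}")
```

For these made-up numbers, z works out to 2.0 and the two-tailed p-value is about 0.0455, so the null hypothesis would be rejected at alpha = 0.05 but not at alpha = 0.01.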
Vocabulary
- P-value: The probability of observing data at least as extreme as the sample result, assuming the null hypothesis is true.
- Null hypothesis: The default claim being tested, often stating that there is no effect, no difference, or no relationship.
- Alternative hypothesis: The claim that competes with the null hypothesis, often stating that an effect, difference, or relationship exists.
- Significance level: The cutoff probability, called alpha, used to decide whether a p-value is small enough to reject the null hypothesis.
- Test statistic: A standardized number, such as z or t, that shows how far the sample result is from the null hypothesis value.
Common Mistakes to Avoid
- Saying the p-value is the probability that the null hypothesis is true. This is wrong because the p-value assumes the null hypothesis is true and then measures how unusual the data are under that assumption.
- Treating p = 0.049 and p = 0.051 as completely different results. This is wrong because the cutoff alpha is a decision rule, while the evidence changes gradually.
- Thinking a small p-value means a large or important effect. This is wrong because very large samples can make tiny effects statistically significant.
- Choosing alpha after seeing the p-value. This is wrong because a cutoff picked after observing the data can always be set to give the conclusion you want, which raises the false-positive rate above the level you report.
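The point about large samples can be made concrete with a short Python sketch. It holds a small observed difference fixed and increases only the sample size. The difference of 0.5, the standard deviation of 10, and the sample sizes are hypothetical values picked for illustration, and a two-tailed z-test with known σ is assumed.

```python
import math


def two_tailed_p(z):
    """Two-tailed p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical values: the observed difference stays the same in every row.
difference = 0.5  # x̄ - μ0
sigma = 10.0      # assumed known population standard deviation

results = {}
for n in (100, 1_000, 10_000):
    z = difference / (sigma / math.sqrt(n))
    results[n] = two_tailed_p(z)
    print(f"n = {n:>6}: z = {z:.2f}, p = {results[n]:.4g}")
```

The effect is identical in every row, yet the p-value falls from well above 0.05 at n = 100 to far below it at n = 10,000. Only the sample size changed, which is why a p-value alone says nothing about how large or important an effect is.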
Practice Questions
1. A one-tailed z-test gives z = 1.96. Using P(Z >= 1.96) = 0.025, what is the p-value, and would you reject H0 at alpha = 0.05?
2. A two-tailed z-test gives z = -2.40. Using P(Z >= 2.40) = 0.0082, find the p-value and decide whether the result is significant at alpha = 0.01.
3. A study reports p = 0.03 for a new teaching method, but the average test score improved by only 0.5 points. Explain why this result may be statistically significant but not practically important.