The Central Limit Theorem explains one of the most important patterns in statistics: averages from many kinds of populations tend to behave in a predictable way. Even if the original data are skewed, flat, or irregular, the distribution of sample means becomes more normal as sample size increases. This matters because many statistical methods rely on normality when estimating uncertainty and making decisions from data. It gives us a bridge from messy real-world data to clean mathematical models.

The theorem applies when we repeatedly take random samples of size $n$ from a population with a finite mean $\mu$ and finite standard deviation $\sigma$. If we compute the sample mean for each sample, the distribution of those means has mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$, often called the standard error. As $n$ grows, that sampling distribution approaches a normal distribution whatever the shape of the population, although strongly skewed populations need a larger $n$ before the approximation is good. This is why larger samples usually produce more stable averages and more reliable inference.
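The behavior described above can be checked with a small simulation. The sketch below is illustrative only: it assumes an Exponential(1) population (right-skewed, with mean 1 and standard deviation 1), and the function name, number of repetitions, and seed are arbitrary choices rather than part of the theorem.

```python
import random
import statistics


def simulate_sample_means(n, num_samples=5000, seed=0):
    """Draw num_samples random samples of size n from an Exponential(1)
    population (mean 1, standard deviation 1) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


for n in (4, 25, 100):
    means = simulate_sample_means(n)
    print(
        f"n={n:3d}  "
        f"mean of sample means={statistics.fmean(means):.3f}  "
        f"sd of sample means={statistics.stdev(means):.3f}  "
        f"sigma/sqrt(n)={1 / n ** 0.5:.3f}"
    )
```

In each row, the mean of the simulated sample means should sit close to the population mean of 1, and their standard deviation should sit close to $\frac{\sigma}{\sqrt{n}}$, shrinking as $n$ grows.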

Key Facts

  • If $X$ has population mean $\mu$ and standard deviation $\sigma$, then the sample mean $\bar{x}$ has mean $\mu_{\bar{x}} = \mu$.
  • The standard deviation of the sample mean is $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
  • For large $n$, the standardized sample mean is approximately normal: $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.
  • The theorem describes the distribution of sample means, not the distribution of individual observations.
  • As sample size $n$ increases, the sampling distribution of $\bar{x}$ becomes more nearly normal and more narrowly spread.
  • A larger population standard deviation $\sigma$ makes sample means more variable, while a larger sample size $n$ reduces that variability.
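As a rough illustration of the facts above, the following sketch computes the standard error for two sample sizes and then uses the normal approximation to estimate a probability about a sample mean. The population values ($\mu = 100$, $\sigma = 10$) and the cutoff of 103 are hypothetical numbers chosen only for the example.

```python
import math
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# Hypothetical population values chosen only for illustration.
mu, sigma = 100.0, 10.0

# Quadrupling the sample size halves the standard error.
se_25 = standard_error(sigma, 25)    # 10 / 5  = 2.0
se_100 = standard_error(sigma, 100)  # 10 / 10 = 1.0
print(se_25, se_100)

# Normal approximation to the sampling distribution of the mean for n = 25.
sampling_dist = NormalDist(mu=mu, sigma=se_25)
p_above_103 = 1 - sampling_dist.cdf(103)  # P(x_bar > 103), i.e. Z > 1.5
print(round(p_above_103, 4))
```

Note that the probability refers to a sample mean exceeding 103, not to an individual observation doing so; for individuals the spread would be $\sigma$, not $\frac{\sigma}{\sqrt{n}}$.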

Vocabulary

Population distribution
The population distribution is the pattern of all individual values in the full group being studied.
Sample mean
The sample mean is the average of the values in one sample, usually written as $\bar{x}$.
Sampling distribution
A sampling distribution is the distribution of a statistic, such as $\bar{x}$, across many repeated random samples.
Standard error
The standard error is the standard deviation of a sampling distribution, and for the sample mean it is $\frac{\sigma}{\sqrt{n}}$.
Normal distribution
A normal distribution is a symmetric bell-shaped distribution described by its mean and standard deviation.

Common Mistakes to Avoid

  • Thinking the theorem says the original population becomes normal, which is wrong because the theorem is about the distribution of sample means, not the raw data.
  • Using $\sigma/\sqrt{n}$ as the spread of individual observations, which is wrong because $\sigma/\sqrt{n}$ is the standard error of $\bar{x}$, not the standard deviation of the population.
  • Assuming any tiny sample size guarantees a normal sampling distribution, which is wrong because strongly skewed or unusual populations may need a larger $n$ for the approximation to work well.
  • Ignoring the need for random and independent sampling, which is wrong because dependence between observations can break the conditions needed for the theorem.
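The mistake about tiny sample sizes can be made concrete with a simulation. This is a minimal sketch, assuming an Exponential(1) population as a stand-in for a strongly right-skewed one; the helper names, repetition count, and seed are illustrative choices. It measures the skewness of the simulated sample means, which would be 0 for a perfectly symmetric distribution.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def sample_mean_skewness(n, num_samples=20000, seed=1):
    """Skewness of the means of num_samples samples of size n
    drawn from an Exponential(1) population."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]
    return skewness(means)


for n in (2, 10, 50):
    print(f"n={n:2d}  skewness of sample means={sample_mean_skewness(n):.2f}")
```

For this particular population the skewness of the sample mean is $2/\sqrt{n}$ in theory, so the printed values should fall toward 0 as $n$ grows while remaining clearly positive for very small samples.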

Practice Questions

  1. A population has mean $\mu = 50$ and standard deviation $\sigma = 12$. If samples of size $n = 36$ are taken, what are the mean and standard error of the sample mean distribution?
  2. A population has $\mu = 80$ and $\sigma = 20$. For samples of size $n = 25$, what is the $z$-score for a sample mean of $\bar{x} = 86$ using $Z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$?
  3. A population is strongly right-skewed. Explain why the distribution of sample means can still be approximately normal when the sample size is large, and state what happens to its spread as $n$ increases.
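After working questions 1 and 2 by hand, one way to check the arithmetic is a short script like the one below. It is only a sketch: the function and variable names are made up for this example, and the numbers are taken directly from the two questions.

```python
import math


def z_score_for_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error


# Question 1: standard error for sigma = 12 and n = 36
# (the mean of the sample mean distribution equals mu = 50).
q1_standard_error = 12 / math.sqrt(36)

# Question 2: z-score for x_bar = 86 with mu = 80, sigma = 20, n = 25.
q2_z = z_score_for_mean(x_bar=86, mu=80, sigma=20, n=25)

print(f"Q1 standard error: {q1_standard_error}")
print(f"Q2 z-score: {q2_z}")
```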