Sampling methods and survey design help students understand how data is collected before any statistics are calculated. This cheat sheet explains how to choose samples fairly, write better survey questions, and recognize sources of bias. It is useful because poor sampling can make even accurate calculations misleading. Students in statistics need these tools to judge whether conclusions from data are trustworthy. The core ideas include identifying the population, choosing a sample, and using probability-based methods when possible. Important sampling methods include simple random sampling, stratified sampling, cluster sampling, and systematic sampling. Survey design focuses on avoiding biased wording, undercoverage, nonresponse, and voluntary response bias. Useful formulas include response rate, sampling fraction, and approximate margin of error for proportions.

Key Facts

  • In a simple random sample of size $n$ from a population of size $N$, every possible group of $n$ individuals has the same chance of being selected.
  • The sampling fraction is $\frac{n}{N}$, where $n$ is the sample size and $N$ is the population size.
  • In stratified random sampling, split the population into similar groups called strata, then take a random sample from each stratum.
  • In cluster sampling, split the population into mixed groups called clusters, randomly choose some clusters, and survey everyone (or a random sample of people) inside those clusters.
  • In systematic sampling, choose a random starting point and then select every $k$th person, where $k \approx \frac{N}{n}$.
  • The response rate is $\frac{\text{number of responses}}{\text{number contacted}} \times 100\%$.
  • For a sample proportion $\hat{p}$, an approximate margin of error is $2\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ for a rough $95\%$ confidence estimate.
  • Larger random samples usually reduce sampling variability because the standard error for a proportion is $\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$, which shrinks as $n$ grows.
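
The four sampling methods listed above can be sketched in Python with the standard `random` module. This is a minimal illustration, not a definitive implementation: the school of 600 students, the field names (`id`, `grade`, `homeroom`), the sample sizes, and the helper function names are all hypothetical and chosen only for the example.

```python
import random


def simple_random_sample(population, n, rng):
    """Draw n individuals so that every group of size n is equally likely."""
    return rng.sample(population, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Take a simple random sample of n_per_stratum from every stratum."""
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and survey everyone inside them."""
    picked = rng.sample(sorted(clusters), n_clusters)
    return [person for name in picked for person in clusters[name]]


def systematic_sample(population, n, rng):
    """Pick a random start among the first k positions, then every k-th person."""
    k = len(population) // n  # k is approximately N / n
    start = rng.randrange(k)
    return population[start::k][:n]


# Hypothetical school: 600 students, 150 per grade, 24 homerooms of 25.
students = [
    {"id": i, "grade": 9 + i % 4, "homeroom": i // 25}
    for i in range(600)
]

# Strata group similar students (same grade); clusters are mixed homerooms.
strata, clusters = {}, {}
for s in students:
    strata.setdefault(s["grade"], []).append(s)
    clusters.setdefault(s["homeroom"], []).append(s)

rng = random.Random(42)  # fixed seed so the sketch is repeatable

srs = simple_random_sample(students, 60, rng)
strat = stratified_sample(strata, 15, rng)   # 15 from each of 4 grades
clus = cluster_sample(clusters, 2, rng)      # 2 whole homerooms
syst = systematic_sample(students, 60, rng)  # every 10th student

print(len(srs), len(strat), len(clus), len(syst))
```

Note that the stratified sample guarantees every grade is represented, while the cluster sample's size depends on how many people are in the chosen clusters.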

Vocabulary

Population
The entire group of individuals or items that a study wants to learn about.
Sample
A smaller group selected from the population to provide data for a study.
Simple Random Sample
A sample chosen so that every possible group of size $n$ has an equal chance of being selected.
Stratum
A subgroup of the population whose members share an important characteristic, such as grade level or age group.
Bias
A systematic problem in data collection that makes results consistently favor some outcomes over others.
Margin of Error
An estimate of how far a sample statistic, such as $\hat{p}$, may be from the true population value.
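
The margin of error defined above, together with the sampling fraction and response rate formulas from Key Facts, can be computed with a few lines of Python. The function names and all the numbers below are hypothetical, made up for illustration; the margin of error uses the rough multiplier of 2 from Key Facts and assumes the data come from a random sample.

```python
import math


def sampling_fraction(n, N):
    """Sample size divided by population size."""
    return n / N


def response_rate(responses, contacted):
    """Percentage of contacted people who responded."""
    return responses / contacted * 100


def standard_error(p_hat, n):
    """Standard error of a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def margin_of_error(p_hat, n):
    """Rough 95% margin of error: 2 standard errors."""
    return 2 * standard_error(p_hat, n)


# Hypothetical survey: 800 people sampled from a population of 10,000.
fraction = sampling_fraction(800, 10_000)

# Hypothetical outcome: 400 of the 800 people contacted respond.
rate = response_rate(400, 800)

# Hypothetical result: 60% of the 400 respondents answer "yes".
moe = margin_of_error(0.6, 400)

print(fraction)       # 0.08
print(rate)           # 50.0
print(round(moe, 3))  # 0.049
```

In this made-up case the estimate would be reported as roughly 60% plus or minus 5 percentage points. The formula only describes sampling variability; it does not account for bias from nonresponse or poor wording.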

Common Mistakes to Avoid

  • Confusing random sampling with convenience sampling is wrong because choosing people who are easy to reach does not give every member of the population a fair chance.
  • Using a large biased sample is wrong because increasing $n$ does not fix undercoverage, leading questions, or voluntary response bias.
  • Treating a sample statistic as the exact population value is wrong because values such as $\hat{p}$ vary from sample to sample.
  • Forgetting to define the population is wrong because the sample can only support conclusions about the group it was chosen to represent.
  • Using systematic sampling without checking for patterns is wrong because selecting every $k$th person can be biased if the list has a repeating order.
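
The point that a large biased sample does not beat a small random one can be seen in a short simulation. This is a sketch under invented assumptions: the population of 10,000, the split into a reachable group and an uncovered group, and the "yes" rates in each group are all hypothetical numbers chosen so that the true proportion is exactly 0.50.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population (1 = "yes", 0 = "no").
# 6,000 people are reachable by the survey and 70% of them say yes.
# 4,000 people are missed by the sampling frame and 20% of them say yes.
reachable = [1] * 4200 + [0] * 1800
unreachable = [1] * 800 + [0] * 3200
population = reachable + unreachable
true_p = sum(population) / len(population)

# Large sample with undercoverage: 3,000 people, all from the reachable group.
biased = rng.sample(reachable, 3000)
p_hat_biased = sum(biased) / len(biased)

# Small simple random sample: 200 people from the whole population.
srs = rng.sample(population, 200)
p_hat_srs = sum(srs) / len(srs)

print("true proportion:        ", true_p)
print("large biased sample:    ", round(p_hat_biased, 3))
print("small random sample:    ", round(p_hat_srs, 3))
```

The biased estimate lands near 0.70 no matter how large the sample gets, because it is estimating the reachable group rather than the population. The small random sample varies more from run to run but is centered on the true value.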

Practice Questions

  1. A school has 1,200 students and wants a sample of 150 students. What is the sampling fraction $\frac{n}{N}$?
  2. A survey contacts 500 adults, and 320 respond. Find the response rate using $\frac{\text{responses}}{\text{contacted}} \times 100\%$.
  3. A population has 2,400 people, and a researcher wants a systematic sample of 120 people. Estimate the interval $k \approx \frac{N}{n}$.
  4. A website poll asks visitors whether homework should be optional. Explain why this survey may suffer from voluntary response bias and what population it may fail to represent.