Statistics Grade advanced Answer Key

Statistics: Sampling (Advanced)

Designing samples, identifying bias, and analyzing sampling distributions

Answer Key
Name:
Date:
Score: / 15

Instructions: Read each problem carefully. Show your reasoning, calculations, and conclusions in the space provided.
  1.

    A university has 18,000 students: 9,000 undergraduates, 6,000 master's students, and 3,000 doctoral students. A researcher wants a stratified random sample of 600 students proportional to enrollment level. How many students should be sampled from each stratum?

    Use each stratum's share of the total population.

    The proportional allocations are 300 undergraduates, 200 master's students, and 100 doctoral students. These come from multiplying 600 by 9000/18000, 6000/18000, and 3000/18000.
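The allocation in problem 1 can be checked with a short sketch. The function name `proportional_allocation` and the dictionary keys are illustrative choices, not part of the worksheet.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size."""
    total = sum(stratum_sizes.values())
    return {
        name: sample_size * size / total
        for name, size in stratum_sizes.items()
    }


# Enrollment counts from problem 1.
enrollment = {"undergraduate": 9000, "masters": 6000, "doctoral": 3000}
allocation = proportional_allocation(enrollment, 600)
print(allocation)
```

The shares here divide evenly. With other populations the results may be fractional and would need a rounding rule that keeps the total at the planned sample size.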
  2.

    A city planner surveys every 25th household from a randomly chosen starting point on a complete list of households. Identify the sampling method and state one condition needed for it to produce an approximately unbiased sample.

    This is systematic sampling. It can produce an approximately unbiased sample if the household list has no periodic pattern related to the variables being studied.
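A minimal sketch of the procedure in problem 2, assuming the household list is held in memory as a Python list. The frame of 1,000 numbered households and the name `systematic_sample` are hypothetical, chosen only for illustration.

```python
import random


def systematic_sample(frame, k, rng=random):
    """Pick a random start among the first k units, then take every k-th unit."""
    start = rng.randrange(k)  # random starting index from 0 to k - 1
    return frame[start::k]


# Hypothetical frame: households numbered 1 to 1000.
households = list(range(1, 1001))
sample = systematic_sample(households, 25, random.Random(0))
print(len(sample), sample[:3])
```

Because every unit in the sample is exactly 25 positions from the next, a list that cycles with the same period (for example, every 25th household being a corner lot) would line up with the sample and distort the estimate.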
  3.

    A health agency randomly selects 20 schools from a state and surveys every student in those selected schools. Identify the sampling method. Explain one advantage and one possible disadvantage of this method.

    In cluster sampling, whole groups are selected and everyone in those groups may be measured.

    This is cluster sampling. An advantage is that it can reduce cost and travel time because data are collected only in the selected schools. A disadvantage is that students within the same school tend to be similar, so the sample carries less independent information and can have greater sampling variability than a simple random sample of the same number of students.
  4.

    A pollster uses random digit dialing to estimate voter support for a policy. Only 12 percent of contacted people complete the survey. Name the main type of potential bias and explain why it matters.

    The main potential bias is nonresponse bias. It matters because the people who choose to respond may differ systematically from those who do not respond, so the sample estimate may not represent the population of voters.
  5.

    A simple random sample of 400 adults finds that 228 support a new transportation plan. Compute the sample proportion and an approximate standard error for the sample proportion, assuming the population is large.

    For a sample proportion, use sqrt(p-hat times (1 minus p-hat) divided by n).

    The sample proportion is 228/400 = 0.57. The approximate standard error is sqrt(0.57(0.43)/400), which is about 0.0248.
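The calculation in problem 5 can be reproduced with the sketch below. The function name `proportion_se` is an illustrative choice, and the formula is the large-population approximation given in the hint.

```python
import math


def proportion_se(successes, n):
    """Return the sample proportion and its approximate standard error."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se


# Counts from problem 5: 228 supporters out of 400 adults.
p_hat, se = proportion_se(228, 400)
print(round(p_hat, 2), round(se, 4))
```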
  6.

    A researcher samples 100 people without replacement from a population of 500 and measures a quantitative variable with population standard deviation 12. Compute the standard error of the sample mean using the finite population correction.

    Use sigma divided by sqrt(n), then multiply by sqrt((N - n)/(N - 1)).

    The uncorrected standard error is 12/sqrt(100) = 1.2. The finite population correction is sqrt((500 - 100)/(500 - 1)) = sqrt(400/499), which is about 0.895. The corrected standard error is about 1.2 times 0.895 = 1.07.
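The steps in problem 6 can be sketched as follows. The function name `corrected_se_mean` is illustrative, and the code assumes the population standard deviation is known, as stated in the problem.

```python
import math


def corrected_se_mean(sigma, n, N):
    """Standard error of the sample mean with the finite population correction."""
    uncorrected = sigma / math.sqrt(n)
    fpc = math.sqrt((N - n) / (N - 1))
    return uncorrected, fpc, uncorrected * fpc


# Values from problem 6: sigma = 12, n = 100, N = 500.
uncorrected, fpc, corrected = corrected_se_mean(12, 100, 500)
print(uncorrected, fpc, corrected)
```

Printing the unrounded values makes it easy to see how rounding the correction factor before multiplying can shift the last digit of the final answer.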
  7.

    A company wants to estimate average employee satisfaction. It samples 50 employees from each department, even though department sizes are very different. Is this proportional stratified sampling, equal allocation stratified sampling, or cluster sampling? Explain.

    This is equal allocation stratified sampling because the company samples the same number of employees from each department. It is not proportional stratified sampling because the sample sizes do not match department sizes. It is not cluster sampling because every department is included and employees are sampled within each one, whereas cluster sampling would randomly select only some departments.
  8.

    A sample mean from a simple random sample has expected value equal to the population mean. What property does this describe? Explain what it does and does not guarantee.

    Think about the long-run average behavior of the estimator.

    This describes unbiasedness of the sample mean as an estimator of the population mean. It means that over many repeated samples, the average of the sample means equals the population mean. It does not guarantee that any one sample mean will be close to the population mean.
  9.

    A stratified sample is planned for a population with two strata. Stratum A has 1,000 units and standard deviation 20. Stratum B has 4,000 units and standard deviation 5. Explain why Neyman allocation might sample more heavily from Stratum A than proportional allocation would.

    Neyman allocation considers both stratum size and within-stratum standard deviation.

    Neyman allocation assigns more sample to strata with greater variability as well as larger size. Even though Stratum A is smaller, its standard deviation is much larger, so sampling it more heavily can reduce the overall variance of the stratified estimator.
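To make the comparison in problem 9 concrete, the sketch below computes each stratum's share of the sample under both rules. It assumes the standard Neyman rule, in which a stratum's share is proportional to its size times its standard deviation, with equal sampling costs across strata. The function name `allocation_shares` is illustrative.

```python
def allocation_shares(strata):
    """strata maps name -> (size, standard deviation).

    Returns two dicts of sample shares: proportional and Neyman.
    """
    total_size = sum(size for size, _ in strata.values())
    total_weight = sum(size * sd for size, sd in strata.values())
    proportional = {
        name: size / total_size for name, (size, _) in strata.items()
    }
    neyman = {
        name: size * sd / total_weight for name, (size, sd) in strata.items()
    }
    return proportional, neyman


# Values from problem 9.
strata = {"A": (1000, 20), "B": (4000, 5)}
proportional, neyman = allocation_shares(strata)
print(proportional)
print(neyman)
```

With these numbers, size times standard deviation is 20,000 for both strata, so Neyman allocation gives Stratum A half of the sample, compared with 20 percent under proportional allocation.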
  10.

    A journalist posts an online poll asking readers whether they support a tax increase. Readers choose whether to participate. Identify the sampling problem and explain the likely consequence.

    The sampling problem is voluntary response sampling. The likely consequence is biased results because people with strong opinions are more likely to participate, and the poll may not represent the broader population.
  11.

    A finite population contains 10,000 households. A simple random sample of 1,000 households is selected without replacement. Should the finite population correction be considered? Justify your answer using the sampling fraction.

    Compare n/N with 0.05.

    Yes, the finite population correction should be considered because the sampling fraction is 1000/10000 = 0.10, or 10 percent. A common rule is that the correction becomes important when the sampling fraction is more than about 5 percent.
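The check in problem 11 can be sketched as below. The 0.05 threshold is the rule of thumb from the hint, not a strict cutoff, and the function names are illustrative.

```python
import math


def sampling_fraction(n, N):
    """Share of the population that ends up in the sample."""
    return n / N


def fpc_factor(n, N):
    """Multiplier applied to the standard error when sampling without replacement."""
    return math.sqrt((N - n) / (N - 1))


# Values from problem 11: n = 1000 households out of N = 10000.
fraction = sampling_fraction(1000, 10000)
needs_fpc = fraction > 0.05  # rule-of-thumb threshold from the hint
factor = fpc_factor(1000, 10000)
print(fraction, needs_fpc, round(factor, 3))
```

The factor shows how much the correction would shrink the standard error, which gives a sense of what is lost by ignoring it.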
  12.

    A diagram shows a population divided into four income groups. A fixed percentage is randomly sampled from each income group. What sampling design is shown, and why might it improve precision compared with a simple random sample of the same size?

    The design is proportional stratified random sampling. It might improve precision because each income group is guaranteed representation, which can reduce variability when income groups differ on the variable being measured.
  13.

    A researcher estimates the average commute time using a sample that overrepresents people who live near train stations. Explain how this coverage error could affect the estimate.

    This coverage error could make the estimate unrepresentative because people near train stations may have different commute patterns than the rest of the target population. For example, if these residents tend to have shorter commutes because transit is easy to reach, the estimated average commute time would be too low.
  14.

    A sampling distribution of a sample proportion is approximately normal with mean 0.40 and standard error 0.03. About what interval would contain roughly 95 percent of sample proportions under repeated sampling? State the rule you used.

    For an approximately normal distribution, use mean plus or minus about two standard errors.

    A rough 95 percent interval is 0.40 plus or minus 2(0.03), which is from 0.34 to 0.46. This uses the empirical rule that about 95 percent of values in a normal distribution lie within about two standard deviations of the mean. For a sampling distribution, that standard deviation is the standard error.
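The interval in problem 14 can be computed with a small sketch. The function name `two_se_interval` is illustrative, and the multiplier of 2 is the rough empirical-rule value rather than the more precise 1.96.

```python
def two_se_interval(mean, se, multiplier=2):
    """Interval of mean plus or minus multiplier standard errors."""
    return mean - multiplier * se, mean + multiplier * se


# Values from problem 14: mean 0.40, standard error 0.03.
low, high = two_se_interval(0.40, 0.03)
print(round(low, 2), round(high, 2))
```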
  15.

    A survey has a frame of 50,000 registered voters, but the target population is all voting-age residents. Explain the difference between the sampling frame and the target population, and name one group likely to be missed.

    The target population is all voting-age residents, while the sampling frame is the list of 50,000 registered voters used to select the sample. Voting-age residents who are not registered to vote are likely to be missed, which can create undercoverage bias.
LivePhysics™.com Statistics - Grade advanced - Answer Key