AP Statistics Formula Sheet Cheat Sheet

This AP Statistics formula sheet summarizes the major tools students use to describe data, model chance, and make conclusions from samples. It is useful for homework, unit review, and exam preparation because AP Statistics requires choosing the right formula for each situation. The sheet connects one-variable statistics, probability rules, inference procedures, and regression in one organized reference.

The most important ideas include using center and spread to describe distributions, applying probability rules to random events, and using sampling distributions to justify inference. Confidence intervals estimate unknown population values with a statistic plus or minus a margin of error. Hypothesis tests compare observed results to a null model using a test statistic and a $P$-value.

Regression formulas summarize linear relationships and support prediction when the conditions are appropriate.

Key Facts

The sample mean is $\bar{x}=\frac{\sum x_i}{n}$, and it measures the average value in a sample.
The sample standard deviation is $s=\sqrt{\frac{\sum (x_i-\bar{x})^2}{n-1}}$, and it measures typical distance from the sample mean.
For independent events, $P(A \cap B)=P(A)P(B)$, and for conditional probability, $P(A\mid B)=\frac{P(A \cap B)}{P(B)}$.
For a binomial random variable, $\mu_X=np$ and $\sigma_X=\sqrt{np(1-p)}$ when there are $n$ independent trials with success probability $p$.
For a sampling distribution of a sample proportion, $\mu_{\hat{p}}=p$ and $\sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}$ when the independence and large counts conditions are met.
A one-sample confidence interval has the general form $\text{statistic} \pm \text{critical value}\times \text{standard error}$.
A common one-sample $z$ interval for a proportion is $\hat{p}\pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.
The least-squares regression line is $\hat{y}=a+bx$, where $b=r\frac{s_y}{s_x}$ and $a=\bar{y}-b\bar{x}$.
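
Several of the Key Facts formulas translate directly into a few lines of Python. The sketch below is only an illustration: the function names and the sample data are hypothetical and are not part of the formula sheet, and $z^*=1.96$ is assumed as the usual critical value for 95% confidence.

```python
import math
import statistics

def binomial_mean_sd(n, p):
    """Mean and standard deviation of a binomial count with n trials."""
    return n * p, math.sqrt(n * p * (1 - p))

def one_prop_z_interval(p_hat, n, z_star=1.96):
    """statistic +/- critical value * standard error, for one proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se

# Hypothetical sample, made up for illustration only.
data = [2, 4, 4, 5, 10]
x_bar = statistics.mean(data)   # sum of the values divided by n
s = statistics.stdev(data)      # divides by n - 1, matching the formula for s

mu_x, sigma_x = binomial_mean_sd(n=100, p=0.5)
low, high = one_prop_z_interval(p_hat=0.5, n=100)

print(x_bar, s, mu_x, sigma_x, round(low, 3), round(high, 3))
```

Note that `statistics.stdev` uses the $n-1$ divisor, while `statistics.pstdev` divides by $n$ and would correspond to a population standard deviation.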

Vocabulary

Parameter: A parameter is a number that describes an entire population, such as $\mu$, $p$, or $\sigma$.
Statistic: A statistic is a number calculated from sample data, such as $\bar{x}$, $\hat{p}$, or $s$.
Sampling Distribution: A sampling distribution is the distribution of a statistic over many random samples of the same size.
Standard Error: A standard error is an estimated standard deviation of a statistic, often used in confidence intervals and hypothesis tests.
P-value: A $P$-value is the probability, assuming the null hypothesis is true, of getting a result at least as extreme as the observed result.
Correlation: Correlation $r$ measures the strength and direction of a linear relationship between two quantitative variables.

Common Mistakes to Avoid

Using $\sigma$ when only $s$ is known is wrong because $\sigma$ is a population standard deviation and $s$ is the sample estimate used in most real AP Statistics problems.
Forgetting to check conditions before inference is wrong because formulas such as $\hat{p}\pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ require random sampling, independence, and large counts.
Interpreting a confidence level as a probability about one fixed interval is wrong because a $95\%$ confidence level describes the long-run success rate of the method.
Using correlation to prove causation is wrong because a strong value of $r$ only describes linear association and does not show that one variable causes the other.
Mixing up $P(A\mid B)$ and $P(B\mid A)$ is wrong because the condition after the vertical bar defines the group whose probability is being measured.
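
A small two-way table can make the difference between $P(A\mid B)$ and $P(B\mid A)$ concrete. The counts in this sketch are hypothetical, chosen only so the two conditional probabilities come out clearly different.

```python
import math

# Hypothetical counts for 100 students (made up for illustration).
# A = plays a sport, B = is a senior.
sport_and_senior = 15
sport_total = 60
senior_total = 20
total = 100

p_a_and_b = sport_and_senior / total
p_a = sport_total / total
p_b = senior_total / total

p_a_given_b = p_a_and_b / p_b   # among seniors, the fraction who play a sport
p_b_given_a = p_a_and_b / p_a   # among athletes, the fraction who are seniors

# A and B are independent only if P(A and B) equals P(A) * P(B).
independent = math.isclose(p_a_and_b, p_a * p_b)

print(p_a_given_b, p_b_given_a, independent)
```

The same joint count is divided by a different group total each time, which is why swapping the condition changes the answer.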

Practice Questions

1. A sample has values $4$, $7$, $9$, $10$, and $15$. Find $\bar{x}$ and identify whether $\bar{x}$ is a statistic or a parameter.
2. In a sample of $n=200$ students, a proportion $\hat{p}=0.62$ support a new schedule. Find the standard error $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.
3. A regression analysis gives $r=0.80$, $s_x=5$, $s_y=12$, $\bar{x}=30$, and $\bar{y}=70$. Find the slope $b=r\frac{s_y}{s_x}$ and intercept $a=\bar{y}-b\bar{x}$.
4. A study finds that students who spend more hours studying tend to earn higher test scores. Explain why this association alone does not prove that studying time caused the higher scores.
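
After working questions 1 through 3 by hand, one way to check the arithmetic is a short script. This sketch uses only the numbers stated in the questions; the variable names are illustrative.

```python
import math

# Question 1: sample mean of the five listed values.
values = [4, 7, 9, 10, 15]
x_bar = sum(values) / len(values)

# Question 2: standard error of the sample proportion.
n, p_hat = 200, 0.62
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Question 3: slope and intercept of the least-squares regression line.
r, s_x, s_y, mean_x, mean_y = 0.80, 5, 12, 30, 70
b = r * s_y / s_x
a = mean_y - b * mean_x

print(f"x_bar = {x_bar}, SE = {se:.4f}, b = {b:.2f}, a = {a:.2f}")
```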

Understanding the AP Statistics Formula Sheet

A formula is only useful after you identify the type of variable and the source of the data. Start by deciding whether the variable is categorical or quantitative. Categorical data sorts people or objects into groups, such as preferred lunch option. Quantitative data records numbers, such as commute time. Then identify whether the question concerns one group, two groups, paired measurements, or a relationship between two quantitative variables. This decision directs nearly every later choice.

A study that surveys every tenth name on a school list has a different structure from an experiment that randomly assigns students to two study methods. Random assignment supports a cause-and-effect conclusion. Random sampling supports a conclusion about a larger population. Neither feature can replace the other.
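
The two kinds of randomness can be sketched in code. The school list here is hypothetical, and the random starting point for the every-tenth-name sample is an added assumption, since the text above does not say how the first name is chosen.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical school list of 200 names.
school_list = [f"student_{i:03d}" for i in range(200)]

# Sampling: pick a random starting point, then take every tenth name.
start = rng.randrange(10)
sample = school_list[start::10]

# Assignment: shuffle the sampled students, then split them between two study methods.
shuffled = sample[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
method_a, method_b = shuffled[:half], shuffled[half:]

print(len(sample), len(method_a), len(method_b))
```

The sampling step decides who is in the study, which affects how far the results generalize. The assignment step decides who receives which treatment, which affects whether differences can be attributed to the treatment.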

Conditions matter because formulas describe ideal random behavior, not every collection of numbers. Before using a procedure for proportions, check that observations are independent enough and that the counts of successes and failures are large enough (the usual AP benchmark is at least 10 of each). For means, examine a graph for strong skewness or unusual outliers, especially when the sample is small. A dotplot, histogram, boxplot, or residual plot can reveal problems that a calculator output hides. The ten percent condition applies when sampling without replacement. It means the sample should be no more than one tenth of the population. This helps keep observations close to independent. Students often lose credit by giving an answer from a formula without stating whether the needed conditions were met.
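
The numerical conditions for a one-proportion procedure can be checked with a small helper. This is a minimal sketch: the function name, the population size of 5000, and the default threshold of 10 are assumptions for illustration, and the code cannot verify that the data came from a random sample.

```python
def check_proportion_conditions(n, p_hat, population_size, min_count=10):
    """Report whether the ten percent and large counts conditions hold.

    min_count=10 is an assumed benchmark; population_size is assumed known.
    """
    successes = n * p_hat
    failures = n * (1 - p_hat)
    return {
        # Sample is at most one tenth of the population.
        "ten_percent": 10 * n <= population_size,
        # Both successes and failures reach the minimum count.
        "large_counts": successes >= min_count and failures >= min_count,
    }

result = check_proportion_conditions(n=200, p_hat=0.62, population_size=5000)
print(result)
```

A written solution should still state each condition in context rather than only reporting that a check passed.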

Confidence intervals and hypothesis tests use similar ingredients but do different jobs. An interval gives a range of plausible values for a population parameter based on sample data. Its confidence level describes the long-run success rate of the method, not the probability that the fixed population value lies inside one particular calculated interval.
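
A simulation can illustrate what a long-run success rate means. In this sketch the true proportion, sample size, number of trials, and seed are all hypothetical choices, and $z^*=1.96$ is assumed for 95% confidence.

```python
import math
import random

def coverage_rate(p_true, n, trials, z_star=1.96, seed=0):
    """Fraction of simulated one-proportion z intervals that capture p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - z_star * se <= p_true <= p_hat + z_star * se:
            hits += 1
    return hits / trials

rate = coverage_rate(p_true=0.6, n=200, trials=2000)
print(rate)
```

Each simulated interval either captures the true proportion or misses it. The confidence level describes roughly how often the method succeeds across many samples, so the printed rate should land near 0.95 rather than exactly on it.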

A hypothesis test begins with a null hypothesis, a claim used as a reference point. The test asks whether the sample result would be unusual if that claim were true. A small $P$-value means the observed data, or data more extreme, would be rare under the null model. It does not give the probability that the null hypothesis is true. In real studies, a statistically significant result can still be too small to matter in practice. Always compare the size of an effect with the real decision being made.
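
As one illustration of a test statistic and $P$-value, the sketch below runs a two-sided one-proportion $z$ test. The test statistic $z=\frac{\hat{p}-p_0}{\sqrt{p_0(1-p_0)/n}}$ is the standard form for this test but is not listed in the Key Facts above, and the null value $p_0=0.50$ and the sample numbers are hypothetical.

```python
import math
from statistics import NormalDist

def one_prop_z_test(p_hat, n, p0):
    """Two-sided z test of H0: p = p0, using p0 in the standard deviation."""
    sd_null = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd_null
    # Probability of a result at least this extreme in either direction, assuming H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_prop_z_test(p_hat=0.62, n=200, p0=0.50)
print(round(z, 2), p_value)
```

The $P$-value here is computed under the assumption that the null hypothesis is true, which is why it cannot be read as the probability that the null hypothesis is true.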

Regression needs careful interpretation because a line can look convincing while telling an incomplete story. The slope describes the predicted change in the response for a one-unit increase in the explanatory variable. Its units are response units per explanatory unit. Predictions are most reliable within the range of observed $x$ values. Using the line far outside that range is extrapolation, which can fail badly. A high correlation does not prove that one variable causes the other. A lurking variable can influence both, such as temperature affecting both ice cream sales and beach attendance. Check a residual plot for curves, changing spread, or isolated points. These patterns suggest that a straight line may not be an appropriate model.
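
The regression formulas $b=r\frac{s_y}{s_x}$ and $a=\bar{y}-b\bar{x}$ can be applied to a small data set to see the slope, residuals, and extrapolation in action. The data and the `predict` helper below are hypothetical, made up for illustration.

```python
import statistics

# Hypothetical data: hours studied (x) and test score (y).
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]

x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)

# Correlation r: sum of products of deviations, divided by (n - 1) * s_x * s_y.
n = len(x)
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x        # slope, in score points per hour
a = y_bar - b * x_bar    # intercept

# Residual = observed y minus predicted y.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

def predict(x_new):
    """Return the predicted y and whether x_new lies outside the observed x range."""
    is_extrapolation = not (min(x) <= x_new <= max(x))
    return a + b * x_new, is_extrapolation

print(b, a, residuals, predict(10))
```

Plotting the residuals against $x$ would be the next step in checking whether a line is an appropriate model. The `predict` helper only flags extrapolation and does not prevent it.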

When studying formulas, practice writing a conclusion in context. A correct numerical result needs a clear statement about the people, objects, or measurements in the study.
