Bayesian Statistics Basics Cheat Sheet

Bayesian statistics is a way to update probabilities when new evidence is observed. This cheat sheet helps students connect conditional probability to real statistical reasoning. It is especially useful for interpreting test results, predictions, and uncertain claims using evidence.

Students need it to see how prior beliefs and data combine in a clear mathematical process.

The central formula is Bayes’ theorem, $P(A\mid B)=\frac{P(B\mid A)P(A)}{P(B)}$. In Bayesian thinking, the prior probability represents what is believed before seeing the data, the likelihood measures how compatible the data are with a hypothesis, and the posterior probability is the updated belief. The denominator, often called the evidence, normalizes the posterior probabilities so that they add to $1$ across all hypotheses.

Bayesian updating can be repeated as new data arrive.

Key Facts

Bayes’ theorem is $P(A\mid B)=\frac{P(B\mid A)P(A)}{P(B)}$, defined when $P(B)>0$, where $P(A\mid B)$ is the updated probability of event $A$ after observing event $B$.
The prior probability is $P(H)$, which represents the probability of a hypothesis $H$ before observing new data.
The likelihood is $P(D\mid H)$, which represents the probability of observing data $D$ if hypothesis $H$ is true.
The posterior probability is $P(H\mid D)=\frac{P(D\mid H)P(H)}{P(D)}$, which represents the updated probability after observing data $D$.
The evidence can be found by total probability: $P(D)=P(D\mid H)P(H)+P(D\mid H^c)P(H^c)$ for a hypothesis $H$ and its complement $H^c$.
Posterior odds are prior odds multiplied by the likelihood ratio: $\frac{P(H\mid D)}{P(H^c\mid D)}=\frac{P(H)}{P(H^c)}\cdot\frac{P(D\mid H)}{P(D\mid H^c)}$.
If events $A$ and $B$ are independent and $P(B)>0$, then $P(A\mid B)=P(A)$, so observing $B$ does not change the probability of $A$.
Bayesian updating can be repeated because today’s posterior probability can become tomorrow’s prior probability.
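
The facts above can be checked numerically. The following is a minimal Python sketch, not a definitive implementation: the function names `posterior` and `posterior_from_odds` and the input values are illustrative assumptions, not part of the cheat sheet. It computes $P(H\mid D)$ once with the evidence from total probability and once with the odds form, so the two routes can be compared.

```python
def posterior(prior, lik_h, lik_not_h):
    """Return P(H | D) for a hypothesis H and its complement H^c."""
    # Evidence by total probability: each likelihood is weighted by its prior.
    evidence = lik_h * prior + lik_not_h * (1 - prior)
    return lik_h * prior / evidence


def posterior_from_odds(prior, lik_h, lik_not_h):
    """Return P(H | D) via prior odds times the likelihood ratio."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = lik_h / lik_not_h
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1 + posterior_odds)


# Illustrative inputs (assumed for this sketch, not taken from the text).
p_direct = posterior(prior=0.5, lik_h=0.9, lik_not_h=0.3)
p_odds = posterior_from_odds(prior=0.5, lik_h=0.9, lik_not_h=0.3)
print(round(p_direct, 4), round(p_odds, 4))
```

Both routes should agree up to floating-point rounding, because the odds form is an algebraic rearrangement of Bayes’ theorem.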

Vocabulary

Prior probability: The probability assigned to a hypothesis before considering the new data.
Likelihood: The probability of observing the data assuming a particular hypothesis is true.
Posterior probability: The updated probability of a hypothesis after using Bayes’ theorem with the observed data.
Evidence: The overall probability of the observed data across all possible hypotheses.
Conditional probability: The probability that one event occurs given that another event is known to have occurred.
Likelihood ratio: A comparison of how likely the data are under one hypothesis versus another hypothesis.

Common Mistakes to Avoid

Confusing $P(A\mid B)$ with $P(B\mid A)$ is wrong because the condition changes the group being considered, and the two probabilities are usually not equal.
Ignoring the prior $P(H)$ is wrong because Bayesian statistics always combines prior information with the likelihood from new data.
Forgetting to divide by the evidence $P(D)$ is wrong because the posterior probabilities must be normalized so they add to $1$ across all hypotheses.
Treating a high test accuracy as the same as a high posterior probability is wrong because the base rate, or prior probability, can strongly affect $P(H\mid D)$.
Using $P(D)=P(D\mid H)+P(D\mid H^c)$ is wrong because each likelihood must be weighted by its prior, giving $P(D)=P(D\mid H)P(H)+P(D\mid H^c)P(H^c)$.

Practice Questions

1. A disease affects $2\%$ of a population. A test is positive $95\%$ of the time for people with the disease and $10\%$ of the time for people without it. Find $P(\text{disease}\mid \text{positive})$.
2. A factory machine makes $30\%$ of all parts, and $4\%$ of its parts are defective. The other machine makes $70\%$ of all parts, and $1\%$ of its parts are defective. Find the probability a defective part came from the first machine.
3. A hypothesis has prior probability $P(H)=0.25$. The data have likelihoods $P(D\mid H)=0.8$ and $P(D\mid H^c)=0.2$. Find $P(H\mid D)$.
4. Explain why a rare condition can still have a low posterior probability after a positive test result, even if the test is fairly accurate.
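
After working a problem by hand, one way to check the answer is a short script. The sketch below is an assumption-laden illustration rather than an official solution: the function name `posteriors` and the dictionary keys are hypothetical. It normalizes prior times likelihood over several mutually exclusive hypotheses, set up here with the numbers from question 2.

```python
def posteriors(priors, likelihoods):
    """Normalize prior * likelihood over mutually exclusive hypotheses."""
    # Joint probability of each hypothesis together with the observed data.
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    # Evidence: total probability of the data across all hypotheses.
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


# Question 2: share of production (prior) and defect rate (likelihood).
priors = {"machine_1": 0.30, "machine_2": 0.70}
likelihoods = {"machine_1": 0.04, "machine_2": 0.01}

result = posteriors(priors, likelihoods)
print(result)
```

The same function handles questions 1 and 3 by using a hypothesis and its complement as the two keys.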

Understanding Bayesian Statistics Basics

A useful way to understand Bayesian reasoning is to separate two directions of probability. A medical test can be very likely to give a positive result when a disease is present. That does not mean a person with a positive result is very likely to have the disease.

The missing factor is how common the disease was before testing. When a condition is rare, there may be many more healthy people than affected people.

Even a small false positive rate can then create a noticeable number of positive results among healthy people. This is called the base rate effect.

Consider a condition that affects one person in every one thousand. Suppose a screening test finds ninety-nine out of every one hundred affected people, but gives a positive result for one out of every one hundred healthy people. In a group of one hundred thousand people, about one hundred have the condition.

The test correctly gives about ninety-nine positive results in that group. Among the remaining people, about nine hundred ninety-nine healthy people receive positive results by mistake. Only about ninety-nine of the roughly one thousand ninety-eight positive results, or about nine percent, come from people who have the condition, so a positive result is far from certain proof.

It gives important evidence, yet follow-up testing is needed. This pattern appears in airport screening, plagiarism detectors, fraud alerts, and spam filters.
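
The counts in this example can be tallied directly. The following Python sketch uses the numbers from the paragraphs above; the variable names are illustrative assumptions, and integer arithmetic is used so the counts come out exact.

```python
# Natural-frequency tally for the screening example above.
population = 100_000

affected = population // 1_000         # 1 in every 1,000 has the condition
healthy = population - affected

true_positives = affected * 99 // 100  # test finds 99 of every 100 affected
false_positives = healthy // 100       # 1 of every 100 healthy tests positive

# Share of all positive results that come from affected people.
share_affected = true_positives / (true_positives + false_positives)
print(true_positives, false_positives, round(share_affected, 3))
```

Working with counts like this makes the role of the base rate visible: the false positives come from a much larger group, so they outnumber the true positives.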

The evidence term has an important job because it counts every reasonable way the observed data could occur. For a positive test, the result may come from a person who has the condition or from a person who does not. Dividing by this overall chance turns the calculation into a fair comparison between the competing explanations.

A likelihood by itself is not the probability that a hypothesis is true. It only describes how expected the data would be if that hypothesis were true.

Students often reverse these two ideas. Keeping the phrase "probability of the data given the hypothesis," $P(D\mid H)$, distinct from "probability of the hypothesis given the data," $P(H\mid D)$, prevents this common error.

Bayesian methods depend on the quality of the starting assumptions and the data model. A prior should come from relevant past information, not from a guess chosen to force a preferred answer. The likelihood should reflect real test accuracy or a defensible model of measurement error.

If either part is poor, the final result can be misleading even when the arithmetic is correct. Independent evidence is especially valuable. Repeating the same weak measurement does not provide as much new information as using a separate source that has different errors.
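
The value of independent evidence can be illustrated with repeated updating. This is a rough sketch under stated assumptions, not a model of any real test: it reuses the screening numbers from the earlier example, the function name `update` is hypothetical, and the two scenarios are deliberately extreme. In the first, the second test's errors are assumed independent of the first test's errors given the person's true status. In the second, the repeat is assumed to reproduce the first result exactly, so it carries no new information.

```python
def update(prior, lik_h, lik_not_h):
    """One Bayesian update for a hypothesis and its complement."""
    evidence = lik_h * prior + lik_not_h * (1 - prior)
    return lik_h * prior / evidence


prior = 0.001                # 1 in 1,000 has the condition
sensitivity = 0.99           # P(positive | condition)
false_positive_rate = 0.01   # P(positive | no condition)

# First positive result: the prior becomes a posterior.
after_first = update(prior, sensitivity, false_positive_rate)

# ASSUMPTION: a second positive from a separate test with independent errors.
# The first posterior serves as the new prior.
after_second_independent = update(after_first, sensitivity, false_positive_rate)

# ASSUMPTION: an exact repeat that always reproduces the first result.
# Its result is certain under both hypotheses, so the likelihood ratio is 1.
after_second_repeat = update(after_first, 1.0, 1.0)

print(round(after_first, 3),
      round(after_second_independent, 3),
      round(after_second_repeat, 3))
```

Real repeated measurements usually fall between these two extremes, which is why the independence assumption should be stated and checked before multiplying likelihoods.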

In class problems, list the possible groups, use actual counts when possible, and check whether the final probability fits the situation. A result near one hundred percent after a weak test for a rare event deserves careful checking.
