Bayesian statistics is a way to update probabilities when new evidence is observed. This cheat sheet helps students connect conditional probability to real statistical reasoning. It is especially useful for interpreting test results, predictions, and uncertain claims using evidence. Students need it to see how prior beliefs and data combine in a clear mathematical process. The central formula is Bayes’ theorem, $P(A\mid B)=\frac{P(B\mid A)P(A)}{P(B)}$. In Bayesian thinking, the prior probability represents what is believed before seeing the data, the likelihood measures how compatible the data are with a hypothesis, and the posterior probability is the updated belief. The denominator, often called evidence, makes the probabilities add to 1. Bayesian updating can be repeated as new data arrive.
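To make the role of the denominator concrete, here is a minimal Python sketch of one update over several competing hypotheses. The function name `update`, the hypothesis labels, and all numbers are hypothetical and chosen only for illustration. The sketch assumes the hypotheses are mutually exclusive and together cover every possibility.

```python
def update(priors, likelihoods):
    """Return posterior probabilities for mutually exclusive hypotheses.

    priors:      dict mapping hypothesis name -> P(H)
    likelihoods: dict mapping hypothesis name -> P(D | H)
    """
    # Unnormalized posterior: likelihood times prior for each hypothesis.
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # Evidence P(D): the sum of those products over all hypotheses.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the data have zero probability under every hypothesis")
    # Dividing by the evidence makes the posteriors add to 1.
    return {h: joint[h] / evidence for h in joint}


# Hypothetical numbers: three candidate explanations for one observation.
priors = {"H1": 0.5, "H2": 0.3, "H3": 0.2}
likelihoods = {"H1": 0.1, "H2": 0.4, "H3": 0.7}

posteriors = update(priors, likelihoods)
for name, value in posteriors.items():
    print(f"P({name} | D) = {value:.4f}")
print("sum of posteriors:", sum(posteriors.values()))
```

In this made-up example the hypothesis with the smallest prior ends up with the largest posterior, because the data are much more compatible with it than with the others.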

Key Facts

  • Bayes’ theorem is $P(A\mid B)=\frac{P(B\mid A)P(A)}{P(B)}$, where $P(A\mid B)$ is the updated probability of event $A$ after observing event $B$.
  • The prior probability is $P(H)$, which represents the probability of a hypothesis $H$ before observing new data.
  • The likelihood is $P(D\mid H)$, which represents the probability of observing data $D$ if hypothesis $H$ is true.
  • The posterior probability is $P(H\mid D)=\frac{P(D\mid H)P(H)}{P(D)}$, which represents the updated probability after observing data $D$.
  • The evidence can be found by total probability: $P(D)=P(D\mid H)P(H)+P(D\mid H^c)P(H^c)$ for a hypothesis $H$ and its complement $H^c$.
  • Posterior odds are prior odds multiplied by the likelihood ratio: $\frac{P(H\mid D)}{P(H^c\mid D)}=\frac{P(H)}{P(H^c)}\cdot\frac{P(D\mid H)}{P(D\mid H^c)}$.
  • If events $A$ and $B$ are independent, then $P(A\mid B)=P(A)$, so observing $B$ does not change the probability of $A$.
  • Bayesian updating can be repeated because today’s posterior probability can become tomorrow’s prior probability.
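
The facts above can be checked numerically with a short Python sketch. The function names and all input values are hypothetical. The repeated update also assumes that the two observations are conditionally independent given the hypothesis, which is an added assumption and not something stated above.

```python
def posterior(prior, lik_h, lik_not_h):
    """P(H | D) for a hypothesis H and its complement."""
    # Evidence by total probability: P(D|H)P(H) + P(D|Hc)P(Hc).
    evidence = lik_h * prior + lik_not_h * (1 - prior)
    return lik_h * prior / evidence


def posterior_from_odds(prior, lik_h, lik_not_h):
    """Same quantity via posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = lik_h / lik_not_h
    post_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return post_odds / (1 + post_odds)


# Hypothetical values chosen only for illustration.
prior, lik_h, lik_not_h = 0.1, 0.9, 0.2

p_direct = posterior(prior, lik_h, lik_not_h)
p_odds = posterior_from_odds(prior, lik_h, lik_not_h)
print("direct form:", p_direct)
print("odds form:  ", p_odds)

# If the data are equally likely under H and its complement,
# the data are independent of H and the posterior equals the prior.
p_uninformative = posterior(prior, 0.5, 0.5)
print("uninformative data:", p_uninformative)

# Repeated updating: each posterior becomes the prior for the next observation.
# Assumes both observations have the same likelihoods and are
# conditionally independent given H.
belief = prior
for _ in range(2):
    belief = posterior(belief, lik_h, lik_not_h)
print("after two observations:", belief)
```

The direct form and the odds form should agree up to floating-point rounding, since the odds form is an algebraic rearrangement of Bayes’ theorem.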

Vocabulary

Prior probability
The probability assigned to a hypothesis before considering the new data.
Likelihood
The probability of observing the data assuming a particular hypothesis is true.
Posterior probability
The updated probability of a hypothesis after using Bayes’ theorem with the observed data.
Evidence
The overall probability of the observed data across all possible hypotheses.
Conditional probability
The probability that one event occurs given that another event is known to have occurred.
Likelihood ratio
A comparison of how likely the data are under one hypothesis versus another hypothesis.

Common Mistakes to Avoid

  • Confusing $P(A\mid B)$ with $P(B\mid A)$ is wrong because the condition changes the group being considered, and the two probabilities are usually not equal.
  • Ignoring the prior $P(H)$ is wrong because Bayesian statistics always combines prior information with the likelihood from new data.
  • Forgetting to divide by the evidence $P(D)$ is wrong because the posterior probabilities must be normalized so they add to 1 across all hypotheses.
  • Treating a high test accuracy as the same as a high posterior probability is wrong because the base rate, or prior probability, can strongly affect $P(H\mid D)$.
  • Using $P(D)=P(D\mid H)+P(D\mid H^c)$ is wrong because each likelihood must be weighted by its prior, giving $P(D)=P(D\mid H)P(H)+P(D\mid H^c)P(H^c)$.
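
Two of these mistakes are easy to see with numbers. The Python sketch below uses a hypothetical test (positive for 95% of people with the condition and 5% of people without it) and made-up base rates. It first shows how the posterior changes with the prior, then shows that the unweighted "evidence" gives values that do not add to 1.

```python
def posterior(prior, lik_h, lik_not_h):
    """P(H | D) with the evidence computed by total probability."""
    evidence = lik_h * prior + lik_not_h * (1 - prior)
    return lik_h * prior / evidence


# Hypothetical test characteristics, for illustration only.
sensitivity = 0.95           # P(positive | H)
false_positive_rate = 0.05   # P(positive | Hc)

# Same test, different base rates: the posterior depends heavily on the prior.
for prior in (0.001, 0.01, 0.1, 0.5):
    p = posterior(prior, sensitivity, false_positive_rate)
    print(f"prior = {prior:<5}  P(H | positive) = {p:.3f}")

# The unweighted-evidence mistake: adding the likelihoods without their priors.
prior = 0.01
wrong_evidence = sensitivity + false_positive_rate
wrong_h = sensitivity * prior / wrong_evidence
wrong_not_h = false_positive_rate * (1 - prior) / wrong_evidence
print("wrong 'posteriors' sum to:", wrong_h + wrong_not_h)

# With the correctly weighted evidence the two posteriors add to 1.
right_h = posterior(prior, sensitivity, false_positive_rate)
right_not_h = posterior(1 - prior, false_positive_rate, sensitivity)
print("correct posteriors sum to:", right_h + right_not_h)
```

The last two lines swap the roles of the hypothesis and its complement, so `right_not_h` is the posterior probability of the complement given the same positive result.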

Practice Questions

  1. A disease affects $2\%$ of a population. A test is positive $95\%$ of the time for people with the disease and $10\%$ of the time for people without it. Find $P(\text{disease}\mid \text{positive})$.
  2. A factory machine makes $30\%$ of all parts, and $4\%$ of its parts are defective. The other machine makes $70\%$ of all parts, and $1\%$ of its parts are defective. Find the probability a defective part came from the first machine.
  3. A hypothesis has prior probability $P(H)=0.25$. The data have likelihoods $P(D\mid H)=0.8$ and $P(D\mid H^c)=0.2$. Find $P(H\mid D)$.
  4. Explain why a rare condition can still have a low posterior probability after a positive test result, even if the test is fairly accurate.
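
After working questions 1 to 3 by hand, one way to check the answers is a short Python sketch like the one below. The function name `posterior` is a hypothetical helper. For question 2, the hypothesis is taken to be "the part came from the first machine" and the data to be "the part is defective".

```python
def posterior(prior, lik_h, lik_not_h):
    """P(H | D) for a hypothesis H and its complement."""
    # Evidence by total probability, with each likelihood weighted by its prior.
    evidence = lik_h * prior + lik_not_h * (1 - prior)
    return lik_h * prior / evidence


# Question 1: H = has the disease, D = positive test.
q1 = posterior(prior=0.02, lik_h=0.95, lik_not_h=0.10)

# Question 2: H = made by the first machine, D = part is defective.
q2 = posterior(prior=0.30, lik_h=0.04, lik_not_h=0.01)

# Question 3: values taken directly from the question.
q3 = posterior(prior=0.25, lik_h=0.8, lik_not_h=0.2)

for label, value in (("Q1", q1), ("Q2", q2), ("Q3", q3)):
    print(f"{label}: {value:.4f}")
```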