Moneyball: The Math of Sports Statistics
OBP, sabermetrics, WAR, and regression to the mean
Moneyball is the story of how baseball teams used statistics to find value that traditional scouting often missed. Instead of judging players mainly by appearance, reputation, or batting average, analysts focused on numbers that better predicted scoring and winning. This shift mattered because teams with smaller budgets could compete by identifying undervalued skills. It also showed how statistical thinking can change decisions in sports, business, and everyday life.
Sabermetrics studies baseball using data, probability, and models to estimate how much each action helps a team win. Metrics like on-base percentage, slugging percentage, and wins above replacement connect individual performance to run production and team success. Modern analysts also account for randomness, sample size, park effects, and regression toward the mean. The result is a data-driven approach to recruiting, contracts, lineups, and in-game strategy.
Key Facts
- Batting average = hits / at-bats
- On-base percentage = (hits + walks + hit by pitch) / (at-bats + walks + hit by pitch + sacrifice flies)
- Slugging percentage = total bases / at-bats
- OPS = on-base percentage + slugging percentage
- Expected value = sum of each outcome probability times its value
- Regression toward the mean means extreme performance is likely to move closer to a player's true average over time
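The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring implementation: the function names and the season line are hypothetical, and total bases is computed with the standard weighting of 1 for a single, 2 for a double, 3 for a triple, and 4 for a home run.

```python
def batting_average(hits, at_bats):
    """Hits divided by at-bats."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Times on base divided by the opportunities that count toward OBP."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    return times_on_base / opportunities


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def expected_value(outcomes):
    """Sum of probability * value over (probability, value) pairs."""
    return sum(probability * value for probability, value in outcomes)


# Hypothetical season line, made up for illustration only.
singles, doubles, triples, home_runs = 90, 30, 4, 16
hits = singles + doubles + triples + home_runs  # 140 hits
at_bats, walks, hit_by_pitch, sacrifice_flies = 480, 50, 4, 6

avg = batting_average(hits, at_bats)
obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies)
slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats)
ops = obp + slg

print(f"AVG {avg:.3f}  OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops:.3f}")
# prints: AVG 0.292  OBP 0.359  SLG 0.471  OPS 0.830

# Hypothetical outcomes: 50% chance of 0 runs, 30% of 1 run, 20% of 2 runs.
runs = expected_value([(0.5, 0), (0.3, 1), (0.2, 2)])
print(f"Expected runs: {runs:.2f}")
# prints: Expected runs: 0.70
```

Note that the denominators differ: walks, hit by pitches, and sacrifice flies count toward on-base percentage but not toward batting average or slugging percentage.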
Vocabulary
- Sabermetrics: the statistical study of baseball performance and strategy.
- On-base percentage: a measure of how often a player reaches base by hit, walk, or being hit by a pitch.
- WAR: wins above replacement, an estimate of how many wins a player adds compared with a readily available replacement-level player.
- Regression toward the mean: the tendency for unusually high or low results to be followed by results closer to the long-term average.
- Sample size: the number of observations used to calculate a statistic, such as plate appearances or innings pitched.
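Regression toward the mean is easier to see in a simulation. The sketch below is a toy model under stated assumptions, not real player data: the league on-base percentage, the spread of true talent, and the number of plate appearances are all made-up values, and every plate appearance is treated as an independent trial.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# All four constants are assumptions chosen for illustration.
LEAGUE_OBP = 0.320       # league-average on-base percentage
TALENT_SPREAD = 0.020    # standard deviation of true talent between players
PLATE_APPEARANCES = 250  # plate appearances per half-season
NUM_PLAYERS = 1000


def observed_obp(true_obp, plate_appearances):
    """Simulate independent plate appearances and return the observed rate."""
    times_on_base = sum(rng.random() < true_obp for _ in range(plate_appearances))
    return times_on_base / plate_appearances


players = []
for _ in range(NUM_PLAYERS):
    true_obp = rng.gauss(LEAGUE_OBP, TALENT_SPREAD)
    first_half = observed_obp(true_obp, PLATE_APPEARANCES)
    second_half = observed_obp(true_obp, PLATE_APPEARANCES)
    players.append((first_half, second_half))

# Pick the 100 players with the best first-half results.
top = sorted(players, key=lambda p: p[0], reverse=True)[:100]
top_first = sum(p[0] for p in top) / len(top)
top_second = sum(p[1] for p in top) / len(top)

print(f"Top group, first half:  {top_first:.3f}")
print(f"Top group, second half: {top_second:.3f}")
```

The top group was selected partly for skill and partly for luck. Each simulated player's true talent is the same in both halves, so the drop in the second-half average comes from the luck not repeating.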
Common Mistakes to Avoid
- Using batting average as the only hitting measure is misleading because it ignores walks and the value of extra-base hits.
- Trusting a small sample size is risky because a hot week or cold week may reflect random variation rather than true skill.
- Assuming correlation proves causation is wrong because two statistics can move together without one directly causing the other.
- Ignoring context such as ballpark, league, defense, and role can distort player value because the same raw stat can mean different things in different conditions.
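The small-sample warning can be made concrete with the standard error of a proportion, sqrt(p(1 - p) / n). The sketch below assumes each at-bat is an independent trial with the same chance of a hit, which is a simplification. The .300 hitter and the at-bat counts are hypothetical.

```python
import math


def standard_error(rate, trials):
    """Standard error of an observed rate over independent trials."""
    return math.sqrt(rate * (1 - rate) / trials)


TRUE_AVERAGE = 0.300  # assumed true batting average

for label, at_bats in [("one week", 25), ("full season", 600)]:
    se = standard_error(TRUE_AVERAGE, at_bats)
    print(f"{label}: {at_bats} at-bats, typical swing of about {se:.3f}")
# prints:
# one week: 25 at-bats, typical swing of about 0.092
# full season: 600 at-bats, typical swing of about 0.019
```

Under these assumptions, a true .300 hitter can easily bat near .210 or .390 over one week by chance alone, while a full-season average usually lands within about 20 points of .300.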
Practice Questions
- 1. A player has 150 hits in 500 at-bats. What is the player's batting average?
- 2. A player has 120 hits, 60 walks, 5 times hit by pitch, 500 at-bats, and 5 sacrifice flies. What is the player's on-base percentage?
- 3. Player A has a .310 batting average and a .330 on-base percentage. Player B has a .260 batting average and a .380 on-base percentage. Explain which player a Moneyball-style analyst might value more and why.