Hypothesis Testing Calculator

Select a test type, enter your summary statistics, and get the test statistic, p-value, critical values, and a clear reject or fail-to-reject conclusion. All calculations run in your browser.

Reference Guide

What is Hypothesis Testing

Hypothesis testing is a statistical method for making decisions about a population based on sample data. You start with a null hypothesis ($H_0$) that represents the status quo, and an alternative hypothesis ($H_a$) that represents the claim you are looking for evidence to support.

The logic: Assume $H_0$ is true. If the observed data would be very unlikely under $H_0$, reject it in favor of $H_a$.
Type I error: Rejecting $H_0$ when it is actually true. The probability is $\alpha$ (the significance level).
Type II error: Failing to reject $H_0$ when it is actually false. The probability is $\beta$.
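
The significance level $\alpha$ is chosen in advance, but $\beta$ depends on how far the true mean lies from the hypothesized one. The following is a minimal sketch of that relationship for a right-tailed z-test with known $\sigma$. The function name `type_ii_error` and the numbers in the example are illustrative assumptions, not part of the calculator.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """Beta for a right-tailed z-test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    # H0 is rejected when the z statistic exceeds this critical value.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the true mean, the z statistic is normal with this mean and variance 1.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # Beta is the probability that z stays at or below the critical value.
    return std_normal.cdf(z_crit - shift)


# Illustrative (made-up) numbers: H0 says mu = 50, the true mean is 53.
beta = type_ii_error(mu0=50, mu_true=53, sigma=8, n=64)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

When `mu_true` equals `mu0`, the function returns $1 - \alpha$: with $H_0$ true, failing to reject is the correct outcome and happens with probability $1 - \alpha$.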

Z-Test vs T-Test

Both tests compare a sample mean to a hypothesized value. The difference is in what you know about the population.

Z-test: Use when the population standard deviation $\sigma$ is known (rare in practice).
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
T-test: Use when you estimate the spread from the sample standard deviation $s$. The t-distribution has heavier tails, which accounts for the extra uncertainty.
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \quad df = n - 1$$
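
As a rough illustration, the two formulas translate directly into code. The function names and the summary statistics below are made up for the example.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n)), with sigma the known population SD."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def t_statistic(sample_mean, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), returned with df = n - 1."""
    return (sample_mean - mu0) / (s / sqrt(n)), n - 1


# Illustrative summary statistics: x_bar = 52, mu0 = 50, spread 8, n = 64.
z = z_statistic(52.0, 50.0, 8.0, 64)
t, df = t_statistic(52.0, 50.0, 8.0, 64)
print(z, t, df)
```

With the same inputs the two statistics are numerically equal. The tests differ in the reference distribution used to turn the statistic into a p-value.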

As the sample size grows, the t-distribution approaches the standard normal, so the two tests give nearly identical results for large $n$.
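
One way to see this convergence is to compare two-tailed p-values for the same statistic at increasing degrees of freedom. The sketch below uses only the Python standard library, so it integrates the t density numerically with Simpson's rule instead of calling a statistics package. The helper names are assumptions for illustration, and this is not necessarily how the calculator computes its p-values.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (log-gamma avoids overflow)."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 - 2 * (area under the density on [0, |t|])."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


statistic = 2.0
normal_p = 2 * (1 - NormalDist().cdf(statistic))
for df in (5, 30, 1000):
    print(f"df = {df:>4}: p = {t_two_tailed_p(statistic, df):.4f}")
print(f"normal  : p = {normal_p:.4f}")
```

The heavier tails at small $df$ produce larger p-values for the same statistic, and the gap to the normal p-value shrinks as $df$ increases.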

P-Values and Critical Values

P-value: The probability of observing a test statistic as extreme as (or more extreme than) the one computed, assuming $H_0$ is true. A small p-value means the data is unlikely under $H_0$.
Decision rule: If the p-value $< \alpha$, reject $H_0$. Otherwise, fail to reject.
Critical value approach: Find the boundary value(s) that mark the rejection region. If the test statistic falls in the rejection region, reject $H_0$.

For a given test and significance level, both approaches always give the same conclusion. The p-value approach is more informative because it shows how strong the evidence against $H_0$ is, rather than only whether the statistic crosses a fixed threshold.
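
The two decision rules can be placed side by side for a z-test. This is a minimal sketch: the function name `z_test_decision` and the tail labels `"two"`, `"right"`, and `"left"` are hypothetical choices for this example, not the calculator's actual interface.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_test_decision(z, alpha=0.05, tail="two"):
    """Return the p-value, the critical value, and the decision under each approach."""
    if tail == "two":
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
        critical = STD_NORMAL.inv_cdf(1 - alpha / 2)  # reject when |z| > critical
        in_rejection_region = abs(z) > critical
    elif tail == "right":
        p_value = 1 - STD_NORMAL.cdf(z)
        critical = STD_NORMAL.inv_cdf(1 - alpha)      # reject when z > critical
        in_rejection_region = z > critical
    elif tail == "left":
        p_value = STD_NORMAL.cdf(z)
        critical = STD_NORMAL.inv_cdf(alpha)          # reject when z < critical
        in_rejection_region = z < critical
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return {
        "p_value": p_value,
        "critical": critical,
        "reject_by_p": p_value < alpha,
        "reject_by_critical": in_rejection_region,
    }


result = z_test_decision(2.1, alpha=0.05, tail="two")
print(result)
```

For any statistic that is not exactly on the boundary, `reject_by_p` and `reject_by_critical` agree, since both compare the same tail area against $\alpha$.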

Chi-Square Goodness of Fit

Tests whether observed frequencies match an expected distribution. For example, testing if a die is fair or if survey responses match expected proportions.

Test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
Always right-tailed: Large values of $\chi^2$ indicate a poor fit, so the rejection region is always in the right tail.
Rule of thumb: Each expected count should be at least 5 for the chi-square approximation to be reliable.
Degrees of freedom: $df = k - 1$, where $k$ is the number of categories.
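
The pieces above can be combined into a short goodness-of-fit sketch for the fair-die example. The observed counts are made up for illustration, and the helper names are assumptions. To stay within the Python standard library, the right-tail probability is computed from a series expansion of the regularized incomplete gamma function, which is adequate for moderate values of $\chi^2$ but loses precision far out in the tail.

```python
from math import exp, lgamma, log


def chi2_sf(x, df):
    """Right-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2, x / 2
    # Series for the lower incomplete gamma: sum of half_x^n / (a (a+1) ... (a+n)).
    term = 1 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * exp(a * log(half_x) - half_x - lgamma(a))
    return max(0.0, 1 - lower)


def goodness_of_fit(observed, expected):
    """Return (chi-square statistic, df, p-value, whether all expected counts are >= 5)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    rule_of_thumb_ok = all(e >= 5 for e in expected)
    return statistic, df, chi2_sf(statistic, df), rule_of_thumb_ok


# Made-up counts from 60 rolls of a die; a fair die expects 10 per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
stat, df, p, ok = goodness_of_fit(observed, expected)
print(f"chi2 = {stat:.3f}, df = {df}, p = {p:.4f}, expected counts ok: {ok}")
```

Here the p-value is well above 0.05, so these counts give no reason to reject the hypothesis that the die is fair.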