Hypothesis Testing Calculator

Select a test type, enter your summary statistics, and get the test statistic, p-value, critical values, and a clear reject or fail-to-reject conclusion. All calculations run in your browser.

Test Type

Tail Direction

Significance Level (α)

Input Data

Sample mean (x̄)

Population mean (μ₀)

Population std dev (σ)

Sample size (n)

Reference Guide

What is Hypothesis Testing

Hypothesis testing is a statistical method for making decisions about a population based on sample data. You start with a null hypothesis ($H_0$) that represents the status quo, and an alternative hypothesis ($H_a$) that represents the effect or difference you are looking for evidence of.

The logic: Assume $H_0$ is true. If the observed data would be very unlikely under $H_0$, reject it in favor of $H_a$.

Type I error: Rejecting $H_0$ when it is actually true. The probability is $\alpha$ (the significance level).

Type II error: Failing to reject $H_0$ when it is actually false. The probability is $\beta$.
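To see what "the probability of a Type I error is $\alpha$" means in practice, here is a minimal simulation sketch. It repeatedly draws samples from a population where $H_0$ really is true and counts how often a two-tailed z-test rejects anyway. The function name, population values, sample size, and trial count are all illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    mu0, sigma = 100.0, 15.0       # hypothetical population; H0 is true here
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1        # a Type I error: H0 is true but was rejected
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed rejection rate under a true H0: {rate:.3f}")
```

The printed rate should land close to the chosen $\alpha$, with some run-to-run variation if you change the seed.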

Z-Test vs T-Test

Both tests compare a sample mean to a hypothesized value. The difference is in what you know about the population.

Z-test: Use when the population standard deviation $\sigma$ is known (rare in practice).

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

T-test: Use when you estimate the spread from the sample standard deviation $s$. The t-distribution has heavier tails, which accounts for the extra uncertainty.

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \quad df = n - 1$$

As the sample size grows, the t-distribution approaches the standard normal, so the two tests give nearly identical results for large $n$.
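The two statistics can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the calculator's own code: the function names, the `tail` labels, and the sample numbers are made up for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_test(sample_mean, mu0, sigma, n, tail="two"):
    """One-sample z-test from summary statistics; returns (z, p_value).

    tail is "two", "left", or "right" (labels chosen for this sketch).
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "left":
        p_value = std_normal.cdf(z)
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    else:
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value


def t_statistic(sample, mu0):
    """One-sample t statistic from raw data; returns (t, df)."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t = (mean(sample) - mu0) / (s / sqrt(n))
    return t, n - 1


# Made-up summary statistics: x-bar = 52, mu0 = 50, sigma = 6, n = 36
z, p = z_test(52, 50, 6, 36)
print(f"z = {z:.3f}, two-tailed p = {p:.4f}")

# Made-up raw sample tested against mu0 = 4
t, df = t_statistic([2, 4, 4, 4, 5, 5, 7, 9], 4)
print(f"t = {t:.3f}, df = {df}")
```

The sketch stops at the t statistic because Python's standard library has no t-distribution CDF. To turn $t$ into a p-value you would need one from elsewhere, for example `scipy.stats.t` if SciPy is available.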

P-Values and Critical Values

P-value: The probability of observing a test statistic as extreme as (or more extreme than) the one computed, assuming $H_0$ is true. A small p-value means the data is unlikely under $H_0$.

Decision rule: If p-value $< \alpha$, reject $H_0$. Otherwise, fail to reject.

Critical value approach: Find the boundary value(s) that mark the rejection region. If the test statistic falls in the rejection region, reject $H_0$.

For the same test, tail direction, and $\alpha$, both approaches give the same conclusion. The p-value approach is more informative because it shows how strong the evidence against $H_0$ is, not just whether the statistic crosses the threshold.
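As a rough sketch of how the two decision routes line up, the following Python snippet applies both to a two-tailed z-test. The function name and the example z values are assumptions made for illustration.

```python
from statistics import NormalDist


def decide_two_tailed(z, alpha=0.05):
    """Apply both decision rules to a two-tailed z-test.

    Returns (p_value, critical_value, reject_by_p, reject_by_critical).
    """
    std_normal = NormalDist()
    # p-value approach: probability of a statistic at least as extreme as |z|
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # critical value approach: boundary that leaves alpha/2 in each tail
    critical_value = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_p = p_value < alpha
    reject_by_critical = abs(z) > critical_value
    return p_value, critical_value, reject_by_p, reject_by_critical


for z in (2.0, 1.5):  # made-up test statistics
    p_value, crit, by_p, by_crit = decide_two_tailed(z)
    print(f"z={z}: p={p_value:.4f}, critical=±{crit:.3f}, "
          f"reject by p-value={by_p}, reject by critical value={by_crit}")
```

With $\alpha = 0.05$ the critical values are about ±1.96, so the first statistic falls in the rejection region and the second does not. The two flags agree in each case.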

Chi-Square Goodness of Fit

The chi-square goodness-of-fit test checks whether observed frequencies match an expected distribution, for example whether a die is fair or whether survey responses match expected proportions.

Test statistic:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Here $O_i$ and $E_i$ are the observed and expected counts in category $i$.

Always right-tailed: Large values of $\chi^2$ indicate a poor fit, so the rejection region is always in the right tail.

Rule of thumb: Each expected count should be at least 5 for the chi-square approximation to be reliable.

Degrees of freedom: $df = k - 1$, where $k$ is the number of categories.
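The fair-die example can be sketched in Python as follows. The roll counts are made up, the function name is an assumption, and the critical value 11.070 is the standard chi-square table entry for $df = 5$ at $\alpha = 0.05$, hard-coded here because Python's standard library has no chi-square distribution.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit; returns (statistic, df, counts_ok)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                       # k - 1 categories
    counts_ok = all(e >= 5 for e in expected)    # rule-of-thumb check
    return statistic, df, counts_ok


# Made-up counts from 60 rolls of a die; a fair die expects 10 per face
observed = [8, 12, 9, 11, 10, 10]
expected = [sum(observed) / len(observed)] * len(observed)

statistic, df, counts_ok = chi_square_gof(observed, expected)

# Table value for df = 5, alpha = 0.05 (right tail)
CRITICAL_DF5_ALPHA_05 = 11.070
reject = statistic > CRITICAL_DF5_ALPHA_05

print(f"chi-square = {statistic:.3f}, df = {df}, "
      f"expected counts OK = {counts_ok}, reject H0 = {reject}")
```

The statistic for these counts is far below the critical value, so the test fails to reject the hypothesis that the die is fair.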