Statistical and practical significance

The difference between a sample statistic and a hypothesized value is statistically significant if a hypothesis test indicates that a difference this large would be too unlikely to occur by chance alone if the null hypothesis were true. To assess statistical significance, examine the test's p-value. If the p-value is less than a specified significance level (α), usually 0.10, 0.05, or 0.01, you can declare the difference to be statistically significant and reject the test's null hypothesis.

For example, suppose you want to determine whether the mean thickness of car windshields is greater than 4 mm, as required by safety rules. You take a sample of windshields and conduct a 1-sample t-test with an α of 0.05 and the following hypotheses:
  • H0: μ = 4
  • H1: μ > 4
If the test produces a p-value of 0.001, you declare statistical significance and reject the null hypothesis because the p-value is less than α. You conclude in favor of the alternative hypothesis: that the mean windshield thickness is greater than 4 mm.

But if the p-value equals 0.50, you cannot claim statistical significance because the p-value is greater than α. You do not have enough evidence to claim that the mean windshield thickness is greater than 4 mm.
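The decision rule in the windshield example can be sketched in a few lines of Python. This is a minimal illustration, not a full t-test: the function name assess_significance and its return strings are hypothetical, and the p-values are taken as given from the example rather than computed from data.

```python
def assess_significance(p_value, alpha=0.05):
    """Compare a p-value to the significance level and return the decision.

    Hypothetical helper for illustration; the p-value itself must come
    from a hypothesis test such as the 1-sample t-test in the example.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# The two p-values discussed in the windshield example, with alpha = 0.05.
for p in (0.001, 0.50):
    print(f"p-value = {p}: {assess_significance(p)}")
```

Note that "fail to reject H0" is not the same as accepting H0: it only means that the sample does not provide enough evidence for the alternative hypothesis.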

A statistically significant result might not be practically significant

Statistical significance alone does not imply that your results have practical consequences. If you use a test with very high power, you might conclude that a small difference from the hypothesized value is statistically significant. However, that small difference might be meaningless in your situation. Use your specialized knowledge to determine whether the difference is practically significant.

For example, suppose you are testing whether the population mean (μ) for hours worked at a manufacturing plant is equal to 8. If μ is not equal to 8, the power of your test approaches 1 as the sample size increases, and the p-value approaches 0.

With enough observations, even trivial differences between hypothesized and actual parameter values are likely to become statistically significant. For example, suppose the actual value of μ is 7 hours, 59 minutes, and 59 seconds. With a large enough sample, you will most likely reject the null hypothesis that μ is equal to 8 hours, even though the difference is of no practical importance.
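To see how sample size drives this effect, the following sketch computes the power of a test to detect the one-second difference at several sample sizes. It rests on assumptions that are not part of the example above: it uses a two-sided z-test with a known standard deviation in place of a t-test, and the standard deviation of 0.5 hours, the sample sizes, and the function name z_test_power are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu_true, mu_null, sigma, n, alpha=0.05):
    """Power of a two-sided z-test of H0: mu = mu_null when the mean is mu_true.

    Assumes a known population standard deviation sigma (a simplification).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Mean of the test statistic, in standard-error units, under mu_true.
    shift = (mu_true - mu_null) * sqrt(n) / sigma
    # Probability that the statistic lands in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


mu_null = 8.0              # hypothesized mean, in hours
mu_true = 8.0 - 1 / 3600   # 7 hours, 59 minutes, 59 seconds
sigma = 0.5                # assumed standard deviation in hours (hypothetical)

sample_sizes = [100, 10**4, 10**6, 10**7, 10**8, 10**9]
powers = {n: z_test_power(mu_true, mu_null, sigma, n) for n in sample_sizes}

for n, power in powers.items():
    print(f"n = {n:>13,}: power = {power:.4f}")
```

Under these assumptions, the power stays close to α for small samples and climbs toward 1 only when the sample reaches many millions of observations, at which point the one-second difference is almost certain to be declared statistically significant.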

Confidence intervals (if applicable) are often more useful than hypothesis tests because they provide a way to assess practical significance in addition to statistical significance. They help you estimate what a parameter value is likely to be, instead of only what it is not.
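One way to use a confidence interval for both purposes is to compare it with the hypothesized value and with a margin that you consider practically important. The sketch below is illustrative only: it uses a large-sample normal approximation rather than a t-interval, and the summary statistics, the one-minute margin, and the names mean_confidence_interval and margin are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Large-sample (normal approximation) confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sample_sd / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


mu_null = 8.0    # hypothesized mean hours worked
margin = 1 / 60  # smallest difference that matters in practice (hypothetical)

# Hypothetical summary statistics from a very large sample.
low, high = mean_confidence_interval(
    sample_mean=8.0 - 1 / 3600, sample_sd=0.5, n=10**8
)

# Statistically significant: the interval excludes the hypothesized value.
statistically_significant = not (low <= mu_null <= high)
# Possibly important in practice: the interval reaches beyond the margin.
may_matter_in_practice = low < mu_null - margin or high > mu_null + margin

print(f"95% CI: ({low:.6f}, {high:.6f}) hours")
print("statistically significant:", statistically_significant)
print("may matter in practice:", may_matter_in_practice)
```

With these hypothetical numbers, the interval excludes 8 hours, so the difference is statistically significant, but the whole interval lies within one minute of 8 hours, so the difference is unlikely to matter in practice.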