Why use an equivalence test?

You can use an equivalence test to determine whether the means for product measurements or process measurements are close enough to be considered equivalent. Equivalence tests differ from standard t-tests in two important ways.

The burden of proof is placed on proving equivalence

In a standard t-test of the means, the null hypothesis assumes that the population mean is the same as a target value or another population mean. Thus, the burden of proof falls on showing that the mean differs from a target or another population mean. In equivalence testing, the null hypothesis is that the population mean differs from a target value or other population mean by at least an amount that you specify. Thus, the burden of proof is placed on showing that the mean is equivalent to a target or another population mean.

For example, consider the difference between a 2-sample t-test and a 2-sample equivalence test. You use a 2-sample t-test to test whether the means of two populations are different. The hypotheses for the test are as follows:

Null hypothesis (H₀): The means of the two populations are the same.
Alternative hypothesis (H₁): The means of the two populations are different.

If the p-value for the test is less than alpha (α), then you reject the null hypothesis and conclude that the means are different.
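
The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular software package: it assumes a large-sample normal approximation in place of the t-distribution, and the function name and measurements are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_p_value(a, b):
    """Two-sided p-value for H0: the two population means are the same.

    Assumption: uses a normal approximation to the unequal-variance
    test statistic, so it is only reasonable for fairly large samples.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical measurements, invented for illustration only.
line_a = [200.1, 199.8, 200.4, 200.0, 199.9, 200.3, 200.2, 199.7]
line_b = [201.0, 200.7, 201.3, 200.9, 200.8, 201.2, 201.1, 200.6]

alpha = 0.05
p = two_sample_p_value(line_a, line_b)
reject_h0 = p < alpha  # True means: conclude that the means are different
```

Note that a large p-value here would only mean that a difference was not detected. It would not demonstrate that the means are equivalent.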

In contrast, you use a 2-sample equivalence test to test whether the means of two populations are equivalent. Equivalence for the test is defined by a range of values that you specify (also called the equivalence interval). The hypotheses for the test are as follows:

Null hypothesis (H₀): The difference between the means is outside your equivalence interval. The means are not equivalent.
Alternative hypothesis (H₁): The difference between the means is inside your equivalence interval. The means are equivalent.

If the p-value for the test is less than α, then you reject the null hypothesis and conclude that the means are equivalent.
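
One common way to carry out such a test is the two one-sided tests (TOST) approach; the text above does not say which procedure is used, so treat the following as a sketch under that assumption. It also assumes a large-sample normal approximation, and the function name, data, and limits are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def equivalence_p_value(a, b, lower, upper):
    """p-value for H0: the difference in means lies outside (lower, upper).

    Assumption: TOST with a normal approximation to the test statistics.
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    norm = NormalDist()
    # One-sided test 1: H0 is diff <= lower, H1 is diff > lower.
    p_lower = 1 - norm.cdf((diff - lower) / se)
    # One-sided test 2: H0 is diff >= upper, H1 is diff < upper.
    p_upper = norm.cdf((diff - upper) / se)
    # Both one-sided nulls must be rejected, so report the larger p-value.
    return max(p_lower, p_upper)


# Hypothetical measurements, invented for illustration only.
reference = [200.1, 199.8, 200.4, 200.0, 199.9, 200.3, 200.2, 199.7]
candidate = [200.2, 199.9, 200.5, 200.1, 200.0, 200.4, 200.3, 199.8]

alpha = 0.05
p = equivalence_p_value(reference, candidate, lower=-1.0, upper=1.0)
equivalent = p < alpha  # True means: conclude that the means are equivalent
```

With much narrower limits, for example `lower=-0.05, upper=0.05`, the same data no longer support equivalence, which shows how strongly the conclusion depends on the equivalence interval you specify.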

The user defines a range of acceptable values for the difference

Small differences between products are not always functionally or practically important. For example, a difference of 1 mg in a 200 mg dose of a drug is unlikely to have any practical effect. When you use an equivalence test, you must enter equivalence limits that indicate how large the difference must be to be considered important. Smaller differences, which are within your equivalence limits, are considered unimportant. In this way, an equivalence test evaluates both the practical significance and the statistical significance of a difference between two population means, or between a population mean and a target value.
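
To illustrate the distinction using the dose example, the sketch below compares the two kinds of test on the same summary statistics. Every number in it (the standard error and the equivalence limits of ±5 mg) is hypothetical, and it again assumes a normal approximation and the two one-sided tests approach.

```python
from statistics import NormalDist

norm = NormalDist()
alpha = 0.05

# All values are hypothetical, chosen only to illustrate the idea.
diff = 1.0                 # observed difference in mean dose, mg
se = 0.2                   # standard error of that difference, mg
lower, upper = -5.0, 5.0   # equivalence limits, mg

# Standard two-sided test of H0: the means are the same.
p_difference = 2 * (1 - norm.cdf(abs(diff) / se))

# Equivalence test of H0: the difference lies outside (lower, upper).
p_equivalence = max(1 - norm.cdf((diff - lower) / se),
                    norm.cdf((diff - upper) / se))

statistically_different = p_difference < alpha
practically_equivalent = p_equivalence < alpha
```

In this hypothetical case both flags are true: the 1 mg difference is statistically detectable, yet it is well inside the limits that define an important difference.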

To choose between an equivalence test and a standard t-test, consider what you hope to demonstrate. If you want to show that two means are equivalent, or that a mean is equivalent to a target value, and if you can define exactly what size difference is important in your field, you may want to use an equivalence test instead of a standard t-test.