Multiple comparisons of means allow you to examine which means are different and to estimate by how much they are different. You can assess the statistical significance of differences between means using a set of confidence intervals, a set of hypothesis tests or both. The confidence intervals allow you to assess the practical significance of differences among means, in addition to statistical significance. As usual, the null hypothesis of no difference between means is rejected if and only if zero is not contained in the confidence interval.
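The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `assess_interval`, the `practical_threshold` parameter, and the interval values are hypothetical and do not come from any software output. The sketch also assumes that a difference counts as practically significant only when the entire interval lies beyond the threshold, which is one reasonable convention among several.

```python
def assess_interval(lower: float, upper: float, practical_threshold: float) -> dict:
    """Assess one confidence interval for a difference between two means."""
    # Statistical significance: reject "no difference" if and only if
    # zero is not contained in the interval.
    statistically_significant = lower > 0 or upper < 0
    # Practical significance (assumed convention): the whole interval lies
    # at least `practical_threshold` away from zero.
    practically_significant = (
        lower >= practical_threshold or upper <= -practical_threshold
    )
    return {
        "statistically_significant": statistically_significant,
        "practically_significant": practically_significant,
    }


# Hypothetical confidence intervals for differences between level means.
intervals = {
    ("B", "A"): (0.4, 3.1),
    ("C", "A"): (-1.2, 1.5),
    ("C", "B"): (-4.0, -2.2),
}

results = {
    pair: assess_interval(lo, hi, practical_threshold=2.0)
    for pair, (lo, hi) in intervals.items()
}

for pair, outcome in results.items():
    print(pair, outcome)
```

In this hypothetical data, the B - A interval excludes zero but extends below the threshold, so the difference is statistically significant without being clearly large enough to matter in practice.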
The selection of the appropriate multiple comparison method depends on the inference that you want. It is inefficient to use the Tukey all-pairwise approach when Dunnett or MCB is suitable, because the Tukey confidence intervals will be wider and the hypothesis tests less powerful for a particular family error rate. For the same reasons, MCB is superior to Dunnett if you want to eliminate factor levels that are not the best and to identify those that are best or close to the best. The choice of Tukey versus Fisher's LSD methods depends on which error rate, family or individual, you want to specify.
| Method | Normal Data | Strength | Comparison with a Control | Pairwise Comparison | 
|---|---|---|---|---|
| Tukey | Yes | Most powerful test when doing all pairwise comparisons. | No | Yes | 
| Dunnett | Yes | Most powerful test when comparing to a control. | Yes | No | 
| Hsu's MCB method | Yes | The most powerful test when you compare the group with the highest or lowest mean to the other groups. | No | Yes | 
| Games-Howell | Yes | Used when you do not assume equal variances. | No | Yes | 
One-Way ANOVA also offers Fisher's LSD method for individual confidence intervals. Fisher's LSD is not a multiple comparison method; instead, it constructs individual confidence intervals for the pairwise differences between means using an individual error rate. Because it does not adjust for the number of comparisons, Fisher's LSD method inflates the family error rate, which is displayed in the output.
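To get a feel for how quickly the family error rate grows under Fisher's LSD, the following sketch uses a rough approximation that treats the pairwise comparisons as independent. That is an assumption made only for illustration: pairwise comparisons share group means, so they are not independent, and the family error rate shown in software output is computed differently and will generally not match these numbers. The function names are hypothetical.

```python
def number_of_pairs(groups: int) -> int:
    """Number of pairwise comparisons among `groups` level means."""
    return groups * (groups - 1) // 2


def approx_family_error_rate(individual_alpha: float, comparisons: int) -> float:
    """Approximate family error rate, ASSUMING independent comparisons."""
    return 1.0 - (1.0 - individual_alpha) ** comparisons


def bonferroni_upper_bound(individual_alpha: float, comparisons: int) -> float:
    """Upper bound on the family error rate (Bonferroni inequality)."""
    return min(1.0, individual_alpha * comparisons)


groups = 4
individual_alpha = 0.05

m = number_of_pairs(groups)
approx_rate = approx_family_error_rate(individual_alpha, m)
upper_bound = bonferroni_upper_bound(individual_alpha, m)

print(f"{m} comparisons at individual alpha = {individual_alpha}")
print(f"approximate family error rate (independence assumed): {approx_rate:.3f}")
print(f"Bonferroni upper bound: {upper_bound:.3f}")
```

Even with only four groups, an individual error rate of 0.05 corresponds to a family error rate several times larger, which is why the individual and family rates should not be confused.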
Choose Pairwise in the Options sub-dialog box when you do not have a control level and you want to compare all combinations of means.
Choose With a Control to compare the level means to the mean of a control group. When this method is suitable, it is inefficient to use pairwise comparisons because pairwise confidence intervals are wider and the hypothesis tests are less powerful for a given confidence level.
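One way to see why comparisons with a control are more efficient is to count the comparisons in each family. The sketch below only enumerates the comparisons; it does not compute Tukey or Dunnett critical values. The level names and helper functions are hypothetical.

```python
from itertools import combinations


def pairwise_comparisons(levels):
    """All unordered pairs of levels: k * (k - 1) / 2 comparisons."""
    return list(combinations(levels, 2))


def control_comparisons(levels, control):
    """Each non-control level against the control: k - 1 comparisons."""
    if control not in levels:
        raise ValueError(f"control level {control!r} is not among the levels")
    return [(level, control) for level in levels if level != control]


# Hypothetical factor levels, one of which is the control.
levels = ["Control", "A", "B", "C", "D"]

pairs = pairwise_comparisons(levels)
vs_control = control_comparisons(levels, "Control")

print(len(pairs), "pairwise comparisons")        # 10
print(len(vs_control), "comparisons to control")  # 4
```

With five levels, the pairwise family contains 10 comparisons while the control family contains only 4. A smaller family needs less adjustment to hold the same simultaneous confidence level, so its intervals can be narrower.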
Choose the comparison procedure based on the group means that you want to compare, the type of confidence level that you want to specify, and how conservative you want the results to be. "Conservative" in this context indicates that the true confidence level is likely to be greater than the confidence level that is displayed.
Except for Fisher's method, the multiple comparison methods have built-in protection against false positives. Because of this protection, their intervals are wider than unprotected intervals would be.
Some characteristics of the multiple comparison methods are summarized below:
| Comparison method | Properties | Confidence level that you specify | 
|---|---|---|
| Tukey | All pairwise comparisons only, not conservative | Simultaneous | 
| Fisher | No protection against false positives due to multiple comparisons | Individual | 
| Dunnett | Comparison to a control only, not conservative | Simultaneous | 
| Bonferroni | Most conservative | Simultaneous | 
| Sidak | Conservative, but slightly less than Bonferroni | Simultaneous | 
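The relationship between the Bonferroni and Sidak rows can be illustrated with the standard textbook adjustments of the per-comparison error rate. This is a minimal sketch of those formulas, not a description of how any particular software implements them; the function names and the choice of 6 comparisons are hypothetical.

```python
def bonferroni_individual_alpha(family_alpha: float, comparisons: int) -> float:
    """Bonferroni: split the family error rate evenly across comparisons."""
    return family_alpha / comparisons


def sidak_individual_alpha(family_alpha: float, comparisons: int) -> float:
    """Sidak: 1 - (1 - family_alpha) ** (1 / comparisons)."""
    return 1.0 - (1.0 - family_alpha) ** (1.0 / comparisons)


family_alpha = 0.05
comparisons = 6  # for example, all pairs among 4 groups

bonferroni = bonferroni_individual_alpha(family_alpha, comparisons)
sidak = sidak_individual_alpha(family_alpha, comparisons)

print(f"Bonferroni individual alpha: {bonferroni:.5f}")
print(f"Sidak individual alpha:      {sidak:.5f}")
```

The Sidak per-comparison error rate is slightly larger than the Bonferroni rate, so Sidak intervals are slightly narrower. This is the sense in which Sidak is conservative, but slightly less so than Bonferroni.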
The p-value in the ANOVA table and the multiple comparison results are based on different methodologies and can occasionally produce contradictory results. For example, the ANOVA p-value can indicate that there are no differences between the means while the multiple comparisons output indicates that some means are different. In this case, you can generally trust the multiple comparisons output.
You do not need to rely on a significant p-value in the ANOVA table to reduce the chance of detecting a difference that doesn't exist. This protection is already incorporated in the Tukey, Dunnett, and MCB tests (and Fisher's test when the means are equal).