Which test should I base my conclusion on?

By default, Minitab's 2 Variances displays results for Levene's method and Bonett's method. For most continuous distributions, both methods give you a type I error rate that is close to your specified significance level (also known as alpha or α). Bonett's method is usually more powerful, so you should base your conclusions on the results for Bonett's method, unless both of the following are true:
  • Your samples have fewer than 20 observations each.
  • The distribution for one or more of the populations is extremely skewed or has heavy tails. Compared to the normal distribution, a distribution with heavy tails has more data at its lower and upper ends.

When you have small samples from very skewed or heavy-tailed distributions, the type I error rate for Bonett's method can be higher than α. Under these conditions, if Levene's method gives you a narrower confidence interval than Bonett's method, then you should base your conclusions on Levene's method. Otherwise, you can base your conclusions on Bonett's method, but remember that your type I error rate is likely to be greater than α.
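The guidance above can be written out as a small decision rule. The following Python sketch is illustrative only and is not part of Minitab: the function name `choose_method` and its parameters are hypothetical, the judgment of whether a distribution is extremely skewed or heavy-tailed is left to the caller as a true/false input, and a "smaller" confidence interval is assumed to mean a narrower one.

```python
def choose_method(n1, n2, extreme_shape, levene_ci_width, bonett_ci_width):
    """Suggest which method's results to report (hypothetical helper).

    n1, n2          -- sample sizes
    extreme_shape   -- True if at least one population is judged to be
                       extremely skewed or heavy-tailed (caller's judgment)
    *_ci_width      -- width (upper limit minus lower limit) of each
                       method's confidence interval

    Returns (method, alpha_warning), where alpha_warning is True when the
    type I error rate is likely to exceed the chosen significance level.
    """
    # Assumption: both samples must have fewer than 20 observations.
    small_samples = n1 < 20 and n2 < 20

    # Default case: Bonett's method is usually the more powerful choice.
    if not (small_samples and extreme_shape):
        return ("Bonett", False)

    # Small samples from an extreme distribution: prefer Levene's method
    # when it gives the narrower interval.
    if levene_ci_width < bonett_ci_width:
        return ("Levene", False)

    # Otherwise use Bonett's method, keeping the inflated error rate in mind.
    return ("Bonett", True)
```

For example, `choose_method(10, 12, True, 3.0, 2.0)` returns `("Bonett", True)`: Bonett's method is still used, but with the caution that the type I error rate may be greater than α.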

Calculations for Bonett's method and Levene's method

The computational method for Levene's test is based on the Brown and Forsythe modification of Levene's procedure. This method considers the distances of the observations from their sample median rather than their sample mean. Using the sample median rather than the sample mean makes the test more robust for smaller samples.
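To make the idea of distances from the sample median concrete, here is a minimal Python sketch of the textbook Brown and Forsythe statistic. It is an illustration under stated assumptions, not Minitab's implementation: the function name `brown_forsythe_statistic` is hypothetical, only the test statistic is computed, and the p-value (which would come from an F distribution with k − 1 and N − k degrees of freedom, for k samples and N total observations) is left out to keep the example within the standard library.

```python
from statistics import fmean, median


def brown_forsythe_statistic(*samples):
    """Textbook Brown-Forsythe statistic for two or more samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)

    # Absolute distance of each observation from its own sample median.
    z = []
    for s in samples:
        m = median(s)
        z.append([abs(x - m) for x in s])

    group_means = [fmean(zi) for zi in z]
    grand_mean = fmean([v for zi in z for v in zi])

    # One-way ANOVA on the distances: between-group vs. within-group spread.
    between = sum(len(zi) * (gm - grand_mean) ** 2
                  for zi, gm in zip(z, group_means))
    within = sum((v - gm) ** 2
                 for zi, gm in zip(z, group_means) for v in zi)

    # Note: raises ZeroDivisionError if every distance within each group
    # is identical (no within-group spread).
    return (n_total - k) / (k - 1) * between / within
```

A larger value indicates stronger evidence that the spreads differ. In practice, a statistics library is usually preferable; for instance, SciPy's `scipy.stats.levene` with `center="median"` computes this statistic together with its p-value.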

The computational method for the Bonett confidence intervals is based on Bonett (2006). The confidence intervals proposed in that article, however, are not correct because they are based on a pooled estimate of the kurtosis, which is inconsistent when the standard deviations of the populations are unequal. Minitab uses an alternative computational algorithm to correct this error. The Bonett p-value is calculated by inverting the corrected confidence intervals.

Bonett, D. G. (2006). Robust Confidence Interval for a Ratio of Standard Deviations. Applied Psychological Measurement, 30, 432–439.

The F-test

Instead of Bonett's method and Levene's method, you can choose to display results for the test based on the normal distribution, also called the F-test. Minitab also displays results for the F-test if you enter summary data for the size and variance (or standard deviation) for each sample.

The F-test is accurate only for normally distributed data. Any minor departure from normality can cause this test to yield inaccurate results. But if the data conform to the normal distribution, then the F-test is typically more powerful than either Bonett's method or Levene's method. However, the F-test is usually not practically useful because data are rarely perfectly normally distributed.
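For reference, the F-test needs only the size and variance of each sample, which is why it can be run on summary data. The Python sketch below shows the standard textbook form of the statistic; it is not taken from Minitab, and the function name `f_test_statistic` is hypothetical. The p-value is omitted because the standard library has no F distribution; a library such as SciPy (`scipy.stats.f`) could supply it.

```python
def f_test_statistic(n1, var1, n2, var2):
    """Return (F, df1, df2) for the two-sample variance ratio test.

    n1, n2     -- sample sizes (each must be at least 2)
    var1, var2 -- sample variances (var2 must be greater than 0)
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least 2 observations")
    if var2 <= 0:
        raise ValueError("var2 must be positive")

    # Ratio of the sample variances. Under normality and equal population
    # variances, it follows an F distribution with (n1 - 1, n2 - 1)
    # degrees of freedom.
    f_ratio = var1 / var2
    return f_ratio, n1 - 1, n2 - 1
```

If you have standard deviations instead of variances, square them first. For example, `f_test_statistic(10, 4.0, 12, 2.0)` returns `(2.0, 9, 11)`.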