Data considerations for Analysis of Means

To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results.

The response data should follow a normal, binomial, or Poisson distribution
  • Normally distributed data are typically measurement data, such as weight. With normally distributed data, Minitab compares the mean of each group to the overall mean.
  • Binomial data classify each observation into one of two categories, such as pass/fail. With binomial data, Minitab compares the proportion of each sample to the overall proportion.
  • Poisson data contain counts, such as the number of defects per unit or sample. With Poisson data, Minitab compares the rate of occurrence for each sample to the overall rate.
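
As a rough illustration of the comparison described for normally distributed data, the following Python sketch computes how far each group mean lies from the overall mean. It is not Minitab's implementation and it does not compute decision limits. The function name and the choice to take the overall mean across all observations are assumptions made for this example.

```python
from statistics import mean

def deviations_from_overall_mean(groups):
    """Return each group's mean minus the overall mean.

    `groups` maps a group label to a list of measurements.
    Assumption: the overall mean is the mean of all observations pooled.
    """
    all_values = [value for values in groups.values() for value in values]
    overall = mean(all_values)
    return {label: mean(values) - overall for label, values in groups.items()}
```

A group whose deviation is large relative to the others is a candidate for a closer look, but only the decision limits in the Analysis of Means output indicate whether the difference is statistically significant.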
If your data follow a normal distribution, the data should include one or two categorical factors
  • Analysis of means designs that have two factors must be balanced. A balanced design has an equal number of observations for all possible combinations of factor levels. If your design has two categorical factors and is unbalanced, use Fit General Linear Model if you have all fixed factors or Fit Mixed Effects Model if you have random factors.
  • If your design has more than two categorical factors or includes covariates, use Fit General Linear Model if you have all fixed factors or Fit Mixed Effects Model if you have random factors.

For more information on factors and balanced designs, go to Factors and factor levels and Balanced and unbalanced designs in ANOVA models.
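
The balance requirement for two-factor designs can be checked before you run the analysis. The following Python sketch counts the observations in every combination of factor levels and reports whether the counts are equal. The function name and the input layout (one pair of factor levels per observation) are assumptions made for this example and are not part of Minitab.

```python
from collections import Counter
from itertools import product

def is_balanced(observations):
    """Return True if every combination of factor levels has the same,
    nonzero number of observations.

    `observations` is a list of (factor_a_level, factor_b_level) pairs,
    one pair per observation.
    """
    counts = Counter(observations)
    levels_a = {a for a, _ in counts}
    levels_b = {b for _, b in counts}
    # Combinations that never occur get a count of 0, so they break balance.
    cell_counts = [counts.get((a, b), 0) for a, b in product(levels_a, levels_b)]
    return len(set(cell_counts)) == 1 and cell_counts[0] > 0
```

For example, a design with two levels of each factor and three observations in each of the four combinations is balanced. Removing a single observation from any combination makes it unbalanced.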

If you have binomial data, the sample size must be constant and sufficiently large
  • All samples must be the same size to ensure that the comparison of the proportion for each sample to the overall proportion is valid.
  • The sample size must be large enough to ensure that the normal distribution adequately approximates the binomial distribution because the decision limits are based on the normal distribution. The normal distribution is adequate when np > 5 and n(1 − p) > 5, where n is the sample size and p is the proportion of events.

If your samples do not meet these criteria, your results might not be valid.
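
The two binomial criteria above can be written as simple checks. The following Python sketch is one way to express them. The function names are hypothetical, and p is assumed to be the overall proportion of events across all samples.

```python
def same_sample_size(sample_sizes):
    """Return True if all samples have the same size."""
    return len(set(sample_sizes)) == 1

def normal_approximation_adequate(n, p):
    """Return True if np > 5 and n(1 - p) > 5.

    n is the (constant) sample size and p is the proportion of events.
    """
    return n * p > 5 and n * (1 - p) > 5
```

For example, samples of size 100 with an event proportion of 0.1 give np = 10 and n(1 − p) = 90, so both conditions hold. Samples of size 20 with the same proportion give np = 2, so the normal approximation is not adequate.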

If you have Poisson data, the sample size must be constant and sufficiently large
  • All samples must be the same size so that the rate per sample is valid.
  • The sample size must be large enough to ensure that the normal distribution adequately approximates the Poisson distribution because the decision limits are based on the normal distribution. The normal distribution is adequate when the mean is at least 5.

If your samples do not meet these criteria, your results might not be valid.
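
The Poisson criterion on the mean can be checked in the same way. The following Python sketch assumes that "the mean" refers to the average count per sample. The function name is hypothetical.

```python
from statistics import mean

def poisson_normal_approximation_adequate(counts, minimum_mean=5):
    """Return True if the average count per sample is at least `minimum_mean`.

    `counts` holds one count (for example, number of defects) per sample.
    Assumption: all samples are the same size, as required above.
    """
    return mean(counts) >= minimum_mean
```

For example, defect counts of 6, 4, 7, and 5 have a mean of 5.5 and meet the criterion, whereas counts of 1, 2, and 3 have a mean of 2 and do not.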

Each observation should be independent from all other observations
If your observations are dependent, your results might not be valid. Consider the following points to determine whether your observations are independent:
  • If an observation provides no information about the value of another observation, the observations are independent.
  • If an observation provides information about another observation, the observations are dependent.
The sample data should be selected randomly

Random samples are used to make generalizations, or inferences, about a population. If your data were not collected randomly, your results might not represent the population.

Collect data using best practices
To ensure that your results are valid, consider the following guidelines:
  • Make certain that the data represent the population of interest.
  • Collect enough data to provide the necessary precision.
  • Measure variables as accurately and precisely as possible.
  • Record the data in the order that they are collected.