To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results.
- The data should include at least one categorical factor
The categorical factors can be crossed or nested, and they can be fixed or random.
- For a model with random factors, you usually use Fit Mixed Effects Model so that you can use the restricted maximum likelihood (REML) estimation method.
- If you have one categorical factor and no continuous predictors, you can also use One-Way ANOVA.
- If you have primarily continuous predictor variables, you can get similar model results with Fit Regression Model.
- If you have one or two categorical factors and want to compare the level means to the overall mean for data that follow the normal, binomial, or Poisson distributions, use Analysis of Means.
- If you want to test the equality of the standard deviations between groups, use Test for Equal Variances.
For more information on factors, go to "Factors and factor levels", "What are factors, crossed factors, and nested factors?", and "What is the difference between fixed and random factors?"
- The response variable should be continuous
- If the response variable is categorical, your model is less likely to meet the assumptions of the analysis, to accurately describe your data, or to make useful predictions.
- If you have multiple response variables that are correlated and a common set of factors, use General MANOVA, which has more power and can detect multivariate response patterns.
- If your response variable has two categories, such as pass and fail, use Fit Binary Logistic Model.
- If your response variable contains three or more categories that have a natural order, such as strongly disagree, disagree, neutral, agree, and strongly agree, use Ordinal Logistic Regression.
- If your response variable contains three or more categories that do not have a natural order, such as scratch, dent, and tear, use Nominal Logistic Regression.
- If your response variable counts occurrences, such as the number of defects, use Fit Poisson Model.
- Each observation should be independent from all other observations
If your observations are dependent, your results might not be valid. Consider the following points to determine whether your observations are independent:
- If an observation provides no information about the value of another observation, the observations are independent.
- If an observation provides information about another observation, the observations are dependent.
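One informal way to screen for dependence, when the data were recorded in collection order, is to check whether each value helps predict the next one. The sketch below is not part of the procedure described here. It is a minimal illustration in plain Python that computes the lag-1 autocorrelation of a sequence. The function name `lag1_autocorrelation` and both data series are hypothetical.

```python
def lag1_autocorrelation(values):
    """Correlation between each value and the value recorded just after it."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denominator = sum(d * d for d in deviations)
    numerator = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    return numerator / denominator


# Hypothetical measurements listed in the order they were collected.
trending = [1.0, 1.2, 1.5, 1.9, 2.2, 2.6, 3.0, 3.3]      # steady drift upward
alternating = [2.1, 1.0, 3.0, 1.5, 2.6, 1.2, 3.3, 1.9]   # high/low pattern

r_trending = lag1_autocorrelation(trending)
r_alternating = lag1_autocorrelation(alternating)
print(f"trending:    {r_trending:.2f}")
print(f"alternating: {r_alternating:.2f}")
```

A value far from zero in either direction means that one observation carries information about the next, so the observations are likely dependent. A value near zero is consistent with independence but does not prove it.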
- The sample data should be selected randomly
Random samples are used to make generalizations, or inferences, about a population. If your data were not collected randomly, your results might not represent the population.
- Collect data using best practices
To ensure that your results are valid, consider the following guidelines:
- Make certain that the data represent the population of interest.
- Collect enough data to provide the necessary precision.
- Measure variables as accurately and precisely as possible.
- Record the data in the order in which they are collected.
- The correlation among the predictors, also known as multicollinearity, should not be severe
If multicollinearity is severe, you might not be able to determine which predictors to include in the model. To determine the severity of the multicollinearity, use the variance inflation factors (VIF) in the Coefficients table of the output.
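To make the VIF concrete, the following sketch computes it outside the software for a small set of continuous predictors. It uses the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is equivalent to 1 / (1 - R²) from regressing that predictor on the others. The helper names and the data are hypothetical, and the code is plain Python rather than a description of how the software computes the value.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[x - m for x in c] for c, m in zip(columns, means)]
    norms = [sum(x * x for x in c) ** 0.5 for c in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Variance inflation factor for each predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical predictors: x2 is almost exactly 2 * x1, x3 is unrelated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.1, 8.0, 9.9, 12.1]
x3 = [5.0, 3.0, 6.0, 2.0, 7.0, 4.0]

vifs = vif([x1, x2, x3])
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(f"{name}: VIF = {value:.1f}")
```

In this made-up data, x1 and x2 receive very large VIF values because each nearly determines the other, while x3 stays close to 1. A common rule of thumb treats values above roughly 5 to 10 as a sign of severe multicollinearity.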
- The model should provide a good fit to the data
If the model does not fit the data, the results can be misleading. In the output, use the residual plots, the diagnostic statistics for unusual observations, and the model summary statistics to determine how well the model fits the data.
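As a rough illustration of what the residuals and model summary statistics measure, the sketch below fits the simplest case, one categorical factor and no other terms, where each fitted value is the mean of its factor level. It then computes the residuals, the standard error of the regression (S), and R-squared. The function name `fit_one_factor` and the data are hypothetical. This is a minimal plain-Python sketch, not the software's implementation.

```python
from collections import defaultdict


def fit_one_factor(levels, response):
    """Fit a one-factor model: each fitted value is its level's mean."""
    groups = defaultdict(list)
    for level, y in zip(levels, response):
        groups[level].append(y)
    level_means = {level: sum(ys) / len(ys) for level, ys in groups.items()}

    fitted = [level_means[level] for level in levels]
    residuals = [y - f for y, f in zip(response, fitted)]

    grand_mean = sum(response) / len(response)
    ss_total = sum((y - grand_mean) ** 2 for y in response)
    ss_error = sum(r * r for r in residuals)
    df_error = len(response) - len(groups)

    return {
        "residuals": residuals,
        "S": (ss_error / df_error) ** 0.5,    # typical size of a residual
        "R-sq": 1.0 - ss_error / ss_total,    # share of variation explained
    }


# Hypothetical data: three factor levels with three observations each.
levels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
response = [10.1, 9.8, 10.3, 12.2, 11.9, 12.4, 14.0, 14.3, 13.8]

fit = fit_one_factor(levels, response)
print(f"S = {fit['S']:.3f}, R-sq = {fit['R-sq']:.1%}")
```

A small S relative to the scale of the response and a high R-squared suggest that the model tracks the data closely. Neither statistic replaces the residual plots, which can reveal patterns such as curvature or unequal spread that a single summary number hides.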