Data considerations for Chi-Square Goodness-of-Fit Test

To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results.

The sample should be selected randomly: Random samples are used to make generalizations, or inferences, about a population. If your data are not collected randomly, your results may not be valid.
The variable should be categorical: Categorical variables contain a finite, countable number of categories or distinct groups. The categories might not have a logical order. For example, categorical variables include gender, material type, and payment method.
The expected counts for each category must not be too small: The sample should be large enough so that there is a reasonable chance of observing outcomes in every category. If the expected counts are too low, the p-value for the test may not be accurate. Minitab indicates, in your results, whether the expected counts are too low. If the expected count for a category is too low, you may be able to combine that category with adjacent categories to achieve the minimum expected count. Combine categories only when necessary, because you lose information when you combine categories.
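
The expected-count check and the combining of adjacent categories can be sketched outside Minitab as follows. This is a minimal Python illustration, not Minitab's procedure: the function names are hypothetical, the data are made up for the example, and the minimum expected count of 5 is a commonly cited rule of thumb assumed here, not necessarily the criterion Minitab applies.

```python
def expected_counts(n, proportions):
    """Expected count per category: sample size times hypothesized proportion."""
    return [n * p for p in proportions]


def combine_adjacent(observed, expected, min_expected=5.0):
    """Merge neighboring categories until every group meets min_expected.

    Single left-to-right pass: categories are accumulated into a group until
    the group's expected count reaches min_expected. Any leftover tail is
    folded into the last completed group. The threshold is an assumed
    rule of thumb, and merging only makes sense for adjacent categories.
    """
    obs_out, exp_out = [], []
    obs_acc, exp_acc = 0, 0.0
    for o, e in zip(observed, expected):
        obs_acc += o
        exp_acc += e
        if exp_acc >= min_expected:
            obs_out.append(obs_acc)
            exp_out.append(exp_acc)
            obs_acc, exp_acc = 0, 0.0
    if obs_acc or exp_acc:  # leftover categories that never reached the minimum
        if exp_out:
            obs_out[-1] += obs_acc
            exp_out[-1] += exp_acc
        else:
            obs_out.append(obs_acc)
            exp_out.append(exp_acc)
    return obs_out, exp_out


# Illustrative data (made up): 100 observations across five ordered categories.
observed = [1, 3, 46, 32, 18]
proportions = [0.02, 0.04, 0.44, 0.30, 0.20]

expected = expected_counts(sum(observed), proportions)
too_low = [i for i, e in enumerate(expected) if e < 5.0]

combined_observed, combined_expected = combine_adjacent(observed, expected)
```

Here the first two categories have expected counts below the assumed minimum, so they are merged into one group and the test would then run on four categories instead of five. The merged group no longer distinguishes the two original categories, which is the loss of information the guideline warns about.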