Data considerations for Chi-Square Goodness-of-Fit Test

To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results.

The sample should be selected randomly

In statistics, random samples are used to make generalizations, or inferences, about a population. If your data are not collected randomly, your results may not be valid.

The variable should be categorical

Categorical variables contain a finite, countable number of categories or distinct groups. Categorical data might not have a logical order. For example, categorical variables include gender, material type, and payment method.

The expected counts for each category must not be too small

The sample should be large enough that there is a reasonable chance of observing outcomes in every category. If the expected counts are too low, the p-value for the test may not be accurate, because the test relies on a chi-square approximation that holds only when the expected counts are sufficiently large. In your results, Minitab indicates whether the expected counts are too low.
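As a rough illustration of this check outside Minitab, the following Python sketch computes the expected count for each category (total sample size multiplied by the hypothesized proportion) and flags any category that falls below a threshold. The data, the function names, and the threshold of 5 are assumptions made for the example. A minimum of 5 is a common rule of thumb and is not necessarily the exact criterion that Minitab applies.

```python
def expected_counts(observed, proportions):
    """Expected count per category = total sample size * hypothesized proportion."""
    total = sum(observed)
    return [total * p for p in proportions]


def low_expected_categories(expected, min_expected=5):
    """Return the indexes of categories whose expected count is below the threshold.

    min_expected=5 is an assumed rule-of-thumb value, not a Minitab setting.
    """
    return [i for i, e in enumerate(expected) if e < min_expected]


# Hypothetical data: observed counts for four categories and the
# proportions stated by the null hypothesis (they must sum to 1).
observed = [18, 12, 7, 3]
proportions = [0.50, 0.30, 0.15, 0.05]

expected = expected_counts(observed, proportions)
low = low_expected_categories(expected)

print("Expected counts:", expected)
print("Categories with low expected counts (0-based index):", low)
```

Note that the check uses the expected counts, not the observed counts. A category can have a small observed count and still be acceptable if its expected count is large enough.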

If the expected count for a category is too low, you may be able to combine that category with adjacent categories to achieve the minimum expected count. Combine categories only when necessary, because you lose information when you combine categories.
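One possible way to combine adjacent categories can be sketched in Python as follows. This is a minimal illustration, not Minitab's procedure: the function name, the data, and the minimum expected count of 5 are all assumptions. The sketch scans the categories in order, accumulates neighbors until the combined expected count reaches the minimum, and folds any leftover tail into the last group.

```python
def combine_adjacent(observed, expected, min_expected=5):
    """Greedily merge neighboring categories until each group's expected
    count is at least min_expected (an assumed rule-of-thumb value)."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            # The running group is large enough; close it and start a new one.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0.0
    if acc_obs or acc_exp:
        if merged_exp:
            # A trailing group that is still too small joins the previous group.
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            # Every category together is still below the minimum.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


# Hypothetical data: the last category has an expected count of only 2.
observed = [18, 12, 7, 3]
expected = [20.0, 12.0, 6.0, 2.0]

new_observed, new_expected = combine_adjacent(observed, expected)
print(new_observed, new_expected)
```

With this data, the last two categories are merged and three categories remain. Merging in list order is meaningful only when neighboring categories are logically related, and the test on the combined data has fewer categories and therefore fewer degrees of freedom.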
