Goodness-of-fit tests for Analyze Binary Response for Factorial Design

Deviance Goodness-of-Fit Test

The deviance goodness-of-fit test assesses the discrepancy between the current model and the full model.

Interpretation

Use the goodness-of-fit tests to determine whether the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. If the p-value for the goodness-of-fit test is lower than your chosen significance level, the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. This list provides common reasons for the deviation:

Incorrect link function
Omitted higher-order term for variables in the model
Omitted predictor that is not in the model
Overdispersion

If the deviation is statistically significant, you can try a different link function or change the terms in the model.

Many of the goodness-of-fit statistics are affected by how the data are arranged in the worksheet and whether there is one trial per row or multiple trials per row. The p-value for the deviance test tends to be lower for data that have a single trial per row than for data that have multiple trials per row, and generally decreases as the number of trials per row decreases.
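
The effect of the data arrangement can be seen in a small calculation. The following Python sketch is illustrative only: the counts and fitted probabilities are made-up values, not output from a fitted model, and the function names are hypothetical. It assumes the fitted probabilities are the same in both arrangements and uses the standard binomial deviance formula, which may differ in detail from what the software reports.

```python
import math


def binomial_deviance(rows):
    """Deviance for rows of (events, trials, fitted_probability).

    Uses 2 * sum(observed * ln(observed / expected)) over events and
    nonevents, with the convention that a zero count contributes 0.
    """
    def term(observed, expected):
        return 0.0 if observed == 0 else observed * math.log(observed / expected)

    total = 0.0
    for events, trials, p in rows:
        total += term(events, trials * p)
        total += term(trials - events, trials * (1.0 - p))
    return 2.0 * total


def expand_to_single_trials(rows):
    """Rewrite multiple-trials-per-row data as one trial per row."""
    single = []
    for events, trials, p in rows:
        single.extend([(1, 1, p)] * events)
        single.extend([(0, 1, p)] * (trials - events))
    return single


# Hypothetical data: (events, trials, fitted probability) for three rows.
grouped = [(12, 40, 0.25), (18, 40, 0.40), (20, 40, 0.55)]
single = expand_to_single_trials(grouped)

d_grouped = binomial_deviance(grouped)
d_single = binomial_deviance(single)

print("rows:", len(grouped), "deviance:", round(d_grouped, 3))
print("rows:", len(single), "deviance:", round(d_single, 3))
```

In this sketch the deviance and the number of rows both change with the arrangement even though the fitted probabilities are identical. The sketch does not compute p-values.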

The Hosmer-Lemeshow test does not depend on the format of the data. When the data have few trials per row, the Hosmer-Lemeshow test is a more trustworthy indicator of how well the model fits the data. For more information, go to How data formats affect goodness-of-fit in binary logistic regression.

Pearson Goodness-of-Fit Test

The Pearson goodness-of-fit test assesses the discrepancy between the current model and the full model.

Interpretation

Use the goodness-of-fit tests to determine whether the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. If the p-value for the goodness-of-fit test is lower than your chosen significance level, the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. This list provides common reasons for the deviation:

Incorrect link function
Omitted higher-order term for variables in the model
Omitted predictor that is not in the model
Overdispersion

If the deviation is statistically significant, you can try a different link function or change the terms in the model.

Many of the goodness-of-fit statistics are affected by how the data are arranged in the worksheet and whether there is one trial per row or multiple trials per row. The approximation to the chi-square distribution that the Pearson test uses is inaccurate when the expected number of events per row in the data is small. Thus, the Pearson goodness-of-fit test is inaccurate when the data are in the single trial per row format.
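
To illustrate why the single trial per row format is a problem for the Pearson test, the following Python sketch computes the usual Pearson chi-square statistic and the smallest expected number of events per row for both arrangements. The data and fitted probabilities are made-up values, and the function names are hypothetical. The formula is the standard textbook form and is not taken from the software's documentation.

```python
def pearson_statistic(rows):
    """Pearson chi-square for rows of (events, trials, fitted_probability)."""
    total = 0.0
    for events, trials, p in rows:
        expected = trials * p
        variance = trials * p * (1.0 - p)
        total += (events - expected) ** 2 / variance
    return total


def smallest_expected_events(rows):
    """Smallest expected number of events across all rows."""
    return min(trials * p for _, trials, p in rows)


def expand_to_single_trials(rows):
    """Rewrite multiple-trials-per-row data as one trial per row."""
    single = []
    for events, trials, p in rows:
        single.extend([(1, 1, p)] * events)
        single.extend([(0, 1, p)] * (trials - events))
    return single


# Hypothetical data: (events, trials, fitted probability) for three rows.
grouped = [(12, 40, 0.25), (18, 40, 0.40), (20, 40, 0.55)]
single = expand_to_single_trials(grouped)

x2_grouped = pearson_statistic(grouped)
x2_single = pearson_statistic(single)

print("grouped: X2 =", round(x2_grouped, 3),
      "smallest expected events =", smallest_expected_events(grouped))
print("single:  X2 =", round(x2_single, 3),
      "smallest expected events =", smallest_expected_events(single))
```

With one trial per row, the expected number of events in a row equals the fitted probability, so it is always less than 1. That is the situation in which the chi-square approximation is unreliable.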

The Hosmer-Lemeshow test does not depend on the format of the data. When the data have few trials per row, the Hosmer-Lemeshow test is a more trustworthy indicator of how well the model fits the data. For more information, go to How data formats affect goodness-of-fit in binary logistic regression.

Hosmer-Lemeshow Goodness-of-Fit Test

The Hosmer-Lemeshow goodness-of-fit test compares the observed and expected frequencies of events and non-events to assess how well the model fits the data.

Interpretation

Use the goodness-of-fit tests to determine whether the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. If the p-value for the goodness-of-fit test is lower than your chosen significance level, the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. This list provides common reasons for the deviation:

Incorrect link function
Omitted higher-order term for variables in the model
Omitted predictor that is not in the model
Overdispersion

If the deviation is statistically significant, you can try a different link function or change the terms in the model.

The Hosmer-Lemeshow test does not depend on the number of trials per row in the data as the other goodness-of-fit tests do. When the data have few trials per row, the Hosmer-Lemeshow test is a more trustworthy indicator of how well the model fits the data.

Observed and expected frequencies for Hosmer-Lemeshow test

The model predicts the expected frequencies for the Hosmer-Lemeshow test.
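
As a rough illustration of how such a table can be built, the following Python sketch sorts the observations by fitted probability, splits them into groups of roughly equal size, and compares the observed and expected counts in each group. The outcomes, probabilities, number of groups, and function names are all hypothetical. The grouping rule is a simplification: it does not keep tied probabilities together, and the software's own grouping rules may differ.

```python
def hosmer_lemeshow_table(outcomes, probabilities, n_groups):
    """Observed and expected events and nonevents per group."""
    pairs = sorted(zip(probabilities, outcomes))
    n = len(pairs)
    table = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        trials = len(chunk)
        observed_events = sum(y for _, y in chunk)
        # The expected number of events is the sum of fitted probabilities.
        expected_events = sum(p for p, _ in chunk)
        table.append({
            "trials": trials,
            "observed_events": observed_events,
            "expected_events": expected_events,
            "observed_nonevents": trials - observed_events,
            "expected_nonevents": trials - expected_events,
        })
    return table


def hosmer_lemeshow_statistic(table):
    """Chi-square style statistic and its degrees of freedom (groups - 2)."""
    statistic = 0.0
    for row in table:
        statistic += ((row["observed_events"] - row["expected_events"]) ** 2
                      / row["expected_events"])
        statistic += ((row["observed_nonevents"] - row["expected_nonevents"]) ** 2
                      / row["expected_nonevents"])
    return statistic, len(table) - 2


# Hypothetical fitted probabilities and 0/1 outcomes for 20 trials.
probabilities = [(i + 0.5) / 20 for i in range(20)]
outcomes = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1]

table = hosmer_lemeshow_table(outcomes, probabilities, n_groups=4)
statistic, df = hosmer_lemeshow_statistic(table)

for number, row in enumerate(table, start=1):
    print(number, row["observed_events"], round(row["expected_events"], 3))
print("statistic:", round(statistic, 3), "df:", df)
```

Large gaps between the observed and expected columns in a group point to a range of probabilities where the model fits poorly.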

Interpretation

Use the observed and expected frequencies for the Hosmer-Lemeshow test to describe how well the model fits the data or to look for areas of poor fit.

For example, the model with the term X produces goodness-of-fit tests with small p-values, which indicates that the model fits the data poorly. In the table of observed and expected frequencies, the observed and expected numbers of events differ by more than 10 for every group except group 4, where the probability of the event is between 0.32 and 0.325.

When the model includes X and X*X, the goodness-of-fit tests have large p-values. The data do not provide evidence that the estimated probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. The largest difference between the observed and expected number of events is in group 4. This difference is approximately 7.

Model with X

Coefficients

Term	Coef	SE Coef	Z-Value	P-Value	VIF
Constant	-0.800	0.167	-4.79	0.000
X	0.00092	0.00271	0.34	0.735	1.00