Step 1: Determine which terms have the greatest effect on the response
Use a Pareto chart of the standardized effects to compare the
relative magnitude and the statistical significance of main,
square, and interaction effects.
Minitab plots the standardized effects in decreasing order of
their absolute values. The reference line on the chart indicates
which effects are significant. By default, Minitab uses a
significance level of 0.05 to draw the reference line.
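The ordering and the reference line can be sketched in a few lines of Python. This is a minimal illustration, not Minitab's implementation: it assumes the standardized effects are z-statistics and that the reference line sits at the two-sided normal critical value for the chosen significance level. The effect values and the function name `pareto_order` are hypothetical.

```python
from statistics import NormalDist


def pareto_order(effects, alpha=0.05):
    """Rank standardized effects by absolute value and flag those past the reference line.

    Assumption: the reference line is the two-sided standard normal critical value.
    """
    reference = NormalDist().inv_cdf(1 - alpha / 2)
    ranked = sorted(effects.items(), key=lambda item: abs(item[1]), reverse=True)
    return reference, [(term, abs(z), abs(z) > reference) for term, z in ranked]


# Hypothetical standardized effects for main, square, and interaction terms.
effects = {"A": 3.10, "B": -0.85, "A*A": -2.40, "A*B": 1.20}
reference, ranked = pareto_order(effects)

print(f"reference line: {reference:.3f}")
for term, size, significant in ranked:
    print(term, f"{size:.2f}", "significant" if significant else "not significant")
```

Bars that extend past the reference line correspond to the terms flagged as significant in the loop above.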
Step 2: Determine which terms have statistically significant effects on the response
To determine whether the association between the response and each
term in the model is statistically significant, compare the p-value
for the term to your significance level to assess the null
hypothesis. The null hypothesis is that the term's coefficient is
equal to zero, which implies that there is no association between
the term and the response. Usually, a significance level (denoted
as α or alpha) of 0.05 works well. A significance level of 0.05
indicates a 5% risk of concluding that an association exists when
there is no actual association.
P-value ≤ α: The association is statistically significant
If the p-value is less than or equal to the significance level, you
can conclude that there is a statistically significant association
between the response variable and the term.
P-value > α: The association is not statistically significant
If the p-value is greater than the significance level, you cannot
conclude that there is a statistically significant association
between the response variable and the term. You may want to refit
the model without the term.
If there are multiple predictors without a statistically
significant association with the response, you can reduce the model
by removing terms one at a time. For more information on removing
terms from the model, go to Model reduction.
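As a rough illustration of the decision rule, the sketch below computes a two-sided p-value from a coefficient and its standard error and compares it to α. It assumes a Wald z-test, which may differ from the test selected in Minitab, and the coefficients and standard errors are hypothetical.

```python
from statistics import NormalDist


def wald_p_value(coef, se_coef):
    """Two-sided p-value for H0: coefficient = 0, assuming a Wald z-test."""
    z = coef / se_coef
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_significant(p_value, alpha=0.05):
    """Apply the rule from the text: significant when p-value <= alpha."""
    return p_value <= alpha


# Hypothetical (coefficient, standard error) pairs for two terms.
terms = {"A": (1.20, 0.40), "B": (0.15, 0.30)}
for name, (coef, se) in terms.items():
    p = wald_p_value(coef, se)
    verdict = "significant" if is_significant(p) else "not significant"
    print(name, f"p = {p:.4f}", verdict)
```

In this made-up example, term B would be a candidate for removal when you reduce the model.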
If a coefficient is statistically significant, the interpretation depends on the type of term. The interpretations are as follows:
Factors
If a coefficient for a factor is significant, you can conclude that the probability of the event is not the same for all levels of the factor.
Interactions among factors
If a coefficient for an interaction term is significant, the
relationship between a factor and the response depends on the other
factors in the term. In this case, you should not interpret the
main effects without considering the interaction effect.
Squared terms
If a coefficient for a squared term is significant, you can
conclude that the relationship between the factor and the response
follows a curved line.
Covariates
If the coefficient for a covariate is statistically significant, you can conclude that the association between the response and the covariate is statistically significant.
Blocks
If the coefficient for a block is statistically significant, you can conclude that the link function for the block is different from the average value.
Step 3: Understand the effects of the predictors
Use the odds ratio to understand the effect of a predictor. The interpretation of the odds ratio depends on whether the predictor is categorical or continuous. Minitab calculates odds ratios when the model uses the logit link function.
Odds Ratios for Continuous Predictors
Odds ratios that are greater than 1 indicate that the event is more likely to occur as the predictor increases. Odds ratios that are less than 1 indicate that the event is less likely to occur as the predictor increases.
Odds Ratios for Categorical Predictors
For categorical predictors, the odds ratio compares the odds of the event occurring at 2 different levels of the predictor. Minitab sets up the comparison by listing the levels in 2 columns, Level A and Level B. Level B is the reference level for the factor. Odds ratios that are greater than 1 indicate that the event is more likely at level A. Odds ratios that are less than 1 indicate that the event is less likely at level A. For information on coding categorical predictors, go to Coding schemes for categorical predictors.
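With the logit link, an odds ratio is the exponential of a coefficient, which is why values above 1 go with positive coefficients and values below 1 with negative ones. The sketch below shows that relationship. The coefficient values are hypothetical, and the categorical case assumes the coefficient already expresses level A relative to reference level B.

```python
import math


def odds_ratio(coef, unit_change=1.0):
    """Odds ratio under the logit link for a change of `unit_change` in the predictor."""
    return math.exp(coef * unit_change)


# Continuous predictor: hypothetical coefficient per unit of the predictor.
print(odds_ratio(0.25))        # greater than 1: event more likely as the predictor increases
print(odds_ratio(-0.25))       # less than 1: event less likely as the predictor increases
print(odds_ratio(0.25, 10.0))  # odds ratio for a 10-unit increase

# Categorical predictor: hypothetical coefficient for level A versus reference level B.
print(odds_ratio(math.log(2)))  # odds of the event at level A are about twice those at level B
```

A coefficient of exactly 0 gives an odds ratio of 1, meaning the odds do not change with the predictor.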
Step 4: Determine how well the model fits your data
To determine how well the model fits your data, examine the goodness-of-fit statistics in the Model Summary table.
Many of the model summary and goodness-of-fit statistics are
affected by how the data are arranged in the worksheet and whether
there is one trial per row or multiple trials per row. The
Hosmer-Lemeshow test is unaffected by how the data are arranged and
is comparable between one trial per row and multiple trials per
row. For more information, go to
How data formats affect goodness-of-fit in binary logistic regression.
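To see why the arrangement matters, the sketch below computes the binomial deviance for the same hypothetical data and the same fitted probabilities in both formats. The data, the fitted probabilities, and the function names are illustrative assumptions, and the formula is the standard binomial deviance rather than anything taken from Minitab's output.

```python
import math


def xlogy(x, y):
    """Return x * log(y), using the convention that 0 * log(0) is 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_deviance(events, trials, probs):
    """Standard binomial deviance; each row has `trials` trials and `events` events."""
    total = 0.0
    for y, n, p in zip(events, trials, probs):
        total += xlogy(y, y / (n * p)) + xlogy(n - y, (n - y) / (n * (1 - p)))
    return 2 * total


# Multiple trials per row: two hypothetical factor settings, 10 trials each.
events, trials, probs = [3, 7], [10, 10], [0.35, 0.65]
grouped = binomial_deviance(events, trials, probs)

# The same data expanded to one trial per row, with the same fitted probabilities.
row_events, row_probs = [], []
for y, n, p in zip(events, trials, probs):
    row_events += [1] * y + [0] * (n - y)
    row_probs += [p] * n
single = binomial_deviance(row_events, [1] * len(row_events), row_probs)

print(f"multiple trials per row: {grouped:.4f}")
print(f"one trial per row:       {single:.4f}")
```

The fitted probabilities are identical in both arrangements, yet the two deviance values differ, so statistics built on the deviance are not comparable across formats.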
Deviance R-sq
The higher the deviance R² value, the better the model fits your data. Deviance R² is always between 0% and 100%.
Deviance R² never decreases when you add terms to a model. For example, the best five-term model will always have a deviance R² that is at least as high as that of the best four-term model. Therefore, deviance R² is most useful when you compare models of the same size.
The data arrangement affects the deviance R² value. The deviance R²
is usually higher for data with multiple trials per row than for
data with a single trial per row. Deviance R² values are comparable
only between models that use the same data format.
Goodness-of-fit statistics are just one measure of how well the
model fits the data. Even when a model has a desirable value, you
should check the residual plots and goodness-of-fit tests to assess
how well a model fits the data.
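For reference, deviance R-sq is commonly defined as the proportion of the total deviance that the model explains. The sketch below uses that definition as an assumption; the deviance values are hypothetical and the function name is illustrative.

```python
def deviance_r_squared(error_deviance, total_deviance):
    """Percentage of the total deviance explained by the model.

    Assumption: R-sq = 1 - (error deviance / total deviance), expressed as a percentage.
    """
    return 100 * (1 - error_deviance / total_deviance)


# Hypothetical deviances: 120.0 for the model with only a constant, 30.0 for the fitted model.
r_sq = deviance_r_squared(30.0, 120.0)
print(f"Deviance R-sq = {r_sq:.1f}%")
```

Because adding a term cannot increase the error deviance, this quantity cannot go down as the model grows, which is why it is best used to compare models of the same size.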
Deviance R-sq (adj)
Use adjusted deviance R² to compare models that have different numbers of terms. Deviance R² always increases when you add a term to the model. The adjusted deviance R² value incorporates the number of terms in the model to help you choose the correct model.
AIC, AICc and BIC
Use AIC, AICc, and BIC to compare different models. For each
statistic, smaller values are desirable. However, the model with
the smallest value for a set of predictors does not necessarily fit
the data well. Also use goodness-of-fit tests and residual plots to
assess how well a model fits the data.
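The three criteria can be sketched from their textbook definitions. The code below assumes those standard formulas and assumes that the parameter count includes the constant; the log-likelihood, parameter count, and sample size are hypothetical.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Textbook AIC, AICc, and BIC; n_params is assumed to include the constant."""
    aic = -2 * log_likelihood + 2 * n_params
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = -2 * log_likelihood + n_params * math.log(n_obs)
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical fit: log-likelihood of -50 with 3 parameters and 100 observations.
criteria = information_criteria(-50.0, 3, 100)
for name, value in criteria.items():
    print(f"{name}: {value:.4f}")
```

All three penalize extra parameters, and BIC penalizes them more heavily than AIC once the sample is larger than about 8 observations, so the criteria can rank the same candidate models differently.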
Step 5: Determine whether your model does not fit the data
Use the goodness-of-fit tests to determine whether the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. If the p-value for the goodness-of-fit test is lower than your chosen significance level, the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. This list provides common reasons for the deviation:
Incorrect link function
Omitted higher-order term for variables in the model
Omitted predictor
If the deviation is statistically significant, you can try a different link function or change the terms in the model.
The following statistics test goodness-of-fit. The Deviance and Pearson statistics are affected by how the data are arranged in the worksheet and whether there is one trial per row or multiple trials per row.
Deviance: The p-value for the deviance test tends to be lower for data that have a single trial per row than for data that have multiple trials per row, and generally decreases as the number of trials per row decreases. For data with a single trial per row, the Hosmer-Lemeshow results are more trustworthy.
Pearson: The approximation to the chi-square distribution that the Pearson test uses is inaccurate when the expected number of events per row in the data is small. Thus, the Pearson goodness-of-fit test is inaccurate when the data are in the single trial per row format.
Hosmer-Lemeshow: The Hosmer-Lemeshow test does not depend on the number of trials per row in the data as the other goodness-of-fit tests do. When the data have few trials per row, the Hosmer-Lemeshow test is a more trustworthy indicator of how well the model fits the data.
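A Hosmer-Lemeshow statistic can be sketched for one-trial-per-row data as follows. This is a simplified version under stated assumptions: rows are sorted by fitted probability and split into equal-sized groups, the degrees of freedom are the number of groups minus 2, and the p-value helper uses a closed form that is valid only for even degrees of freedom. Minitab's grouping rules may differ, and the data are hypothetical.

```python
import math


def hosmer_lemeshow(outcomes, probs, groups=10):
    """Return (statistic, df) from equal-sized groups of rows sorted by fitted probability."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in the group
        expected = sum(p for p, _ in chunk)   # expected events in the group
        statistic += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return statistic, groups - 2


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Hypothetical data: 20 rows, fitted probabilities at four levels, one trial per row.
probs = [0.2] * 5 + [0.4] * 5 + [0.6] * 5 + [0.8] * 5
outcomes = [1, 1, 0, 0, 0] + [1, 1, 0, 0, 0] + [1, 1, 1, 0, 0] + [1, 1, 1, 0, 0]

statistic, df = hosmer_lemeshow(outcomes, probs, groups=4)
p_value = chi2_upper_tail_even_df(statistic, df)
print(f"statistic = {statistic:.3f}, df = {df}, p-value = {p_value:.3f}")
```

Because the statistic is built from groups of rows rather than from individual rows, regrouping the same trials into multiple-trials-per-row form leaves the observed and expected counts per group unchanged.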