# Graphs for Simple Binary Logistic Regression

Find definitions and interpretation guidance for the graphs.

## Binary fitted line plot

The fitted line plot displays the response and predictor data. The plot includes the regression line, which represents the regression equation. You can also choose to display the confidence interval for the fitted values.

### Interpretation

Use the fitted line plot to examine the relationship between the response variable and the predictor variable.

In these results, the equation is written as the probability of a success. The response value of 1 on the y-axis represents a success. The plot shows that the probability of a success decreases as the temperature increases. When the temperatures in the data are near 50, the slope of the line is not very steep, which indicates that the probability decreases slowly as temperature increases. The line is steeper in the middle portion of the temperature data, which indicates that a change in temperature of 1 degree has a larger effect in this range. When the probability of a success approaches zero at the high end of the temperature range, the line flattens again.
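The change in steepness described above follows from the shape of the logistic curve. The sketch below assumes a logit link and uses hypothetical coefficients (`B0` and `B1` are illustrative values, not estimates from these results). Under that assumption, the slope of the probability curve is `b1 * p * (1 - p)`, which is largest in magnitude where the probability is near 0.5 and shrinks as the probability approaches 0 or 1.

```python
import math

# Hypothetical coefficients for illustration only; not taken from the results above.
B0 = 15.0    # intercept on the logit scale
B1 = -0.23   # change in log-odds per 1-degree increase in temperature


def success_probability(temperature, b0=B0, b1=B1):
    """Probability of a success under a logit link."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * temperature)))


def slope(temperature, b0=B0, b1=B1):
    """Derivative of the probability with respect to temperature."""
    p = success_probability(temperature, b0, b1)
    return b1 * p * (1.0 - p)


# Compare the low end, the middle, and the high end of the temperature range.
for t in (50, 65, 85):
    print(f"temp={t}  p={success_probability(t):.3f}  slope={slope(t):.4f}")
```

With these assumed coefficients, the slope is much larger in magnitude near the middle of the range than at either end, which matches the flat-steep-flat shape of the fitted line.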

If the model fits the data well, then high predicted probabilities show where the event is common. When the temperatures in the data are near 50, the response value of 1 is most common. As the temperature increases, the response value of zero becomes more common.

If you add confidence intervals to the plot, you can use the intervals to assess how precise the estimates of the fitted values are. In the first plot below, the confidence interval stays approximately the same width as the predictor increases. In the second plot, the confidence interval gets wider as the value of the predictor increases. The wide interval is partly due to the small amount of data when the temperature is high.
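One common way to build an interval for a fitted probability is a Wald interval on the logit scale that is then transformed back to the probability scale. The sketch below illustrates that approach. Whether the software uses exactly this method is an assumption, and every number in `EST` is hypothetical.

```python
import math


def wald_interval(x, b0, b1, var_b0, var_b1, cov_b0_b1, z=1.96):
    """Return (lower, fit, upper) for the fitted probability at predictor value x.

    The interval is built on the logit scale and then back-transformed,
    so the bounds always stay between 0 and 1.
    """
    eta = b0 + b1 * x
    # Standard error of the linear predictor b0 + b1 * x.
    se_eta = math.sqrt(var_b0 + x * x * var_b1 + 2.0 * x * cov_b0_b1)

    def inv_logit(value):
        return 1.0 / (1.0 + math.exp(-value))

    return inv_logit(eta - z * se_eta), inv_logit(eta), inv_logit(eta + z * se_eta)


# Hypothetical coefficient estimates, variances, and covariance.
EST = dict(b0=15.0, b1=-0.23, var_b0=9.0, var_b1=0.002, cov_b0_b1=-0.13)

for temperature in (50, 65, 80):
    lower, fit, upper = wald_interval(temperature, **EST)
    print(f"temp={temperature}  fit={fit:.3f}  interval=({lower:.3f}, {upper:.3f})")
```

In this sketch the standard error of the linear predictor grows as the predictor moves away from the region where most of the information lies, which is one reason an interval can widen where data are sparse.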

## Residuals versus fits

The residuals versus fits graph plots the residuals on the y-axis and the fitted values on the x-axis.

### Interpretation

Use the residuals versus fits plot to verify the assumption that the residuals are randomly distributed. Ideally, the points should fall randomly on both sides of 0, with no recognizable patterns in the points.

The residuals versus fits plot is only available when the data are in Event/Trial format.
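To make the Event/Trial requirement concrete, here is a minimal sketch of a Pearson residual for one row of Event/Trial data. The Pearson residual is only one of several residual types, and the choice is an assumption here, as is the function name. Because each row aggregates many trials, the residuals can take many different values, which is what makes a scatter of residuals against fits informative.

```python
import math


def pearson_residual(events, trials, fitted_probability):
    """Pearson residual for one row of Event/Trial data.

    events: observed number of events in the row
    trials: number of trials in the row
    fitted_probability: event probability predicted by the model for the row
    """
    expected = trials * fitted_probability
    variance = trials * fitted_probability * (1.0 - fitted_probability)
    return (events - expected) / math.sqrt(variance)


# Hypothetical rows: (events, trials, fitted probability).
rows = [(8, 10, 0.5), (5, 10, 0.5), (1, 12, 0.25)]
for events, trials, p in rows:
    print(events, trials, p, round(pearson_residual(events, trials, p), 3))
```

A row with more events than the model expects gives a positive residual, and a row with fewer events gives a negative one.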

The patterns in the following table may indicate that the model does not meet the model assumptions.

| Pattern | What the pattern may indicate |
| --- | --- |
| Fanning or uneven spreading of residuals across fitted values | An inappropriate link function |
| Curvilinear | A missing higher-order term or an inappropriate link function |
| A point that is far away from zero | An outlier |
| A point that is far away from the other points in the x-direction | An influential point |

If the pattern indicates that you should fit the model with a different link function, you should use Binary Fitted Line Plot or Fit Binary Logistic Regression in Minitab Statistical Software.
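For orientation, the sketch below shows three link functions that are commonly offered for binary responses: logit, normit (probit), and gompit (complementary log-log). Each inverse link maps the linear predictor `eta` to a probability. The function names are illustrative and are not part of any software's API.

```python
import math
from statistics import NormalDist


def inverse_logit(eta):
    """Logit link: symmetric around a probability of 0.5."""
    return 1.0 / (1.0 + math.exp(-eta))


def inverse_normit(eta):
    """Normit (probit) link: uses the standard normal cumulative distribution."""
    return NormalDist().cdf(eta)


def inverse_gompit(eta):
    """Gompit (complementary log-log) link: asymmetric around 0.5."""
    return 1.0 - math.exp(-math.exp(eta))


# Compare the probabilities that each link assigns to the same linear predictor.
for eta in (-2.0, 0.0, 2.0):
    print(eta, round(inverse_logit(eta), 3), round(inverse_normit(eta), 3),
          round(inverse_gompit(eta), 3))
```

The logit and normit curves are symmetric, while the gompit curve approaches 1 faster than it approaches 0, so the choice of link changes how the fitted probabilities behave in the tails.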

In this residuals versus fits plot, the data appear to be randomly distributed about zero. There is no evidence that the value of the residual depends on the fitted value.

The following graphs show an outlier and a violation of the assumption that the variance of the residuals is constant.

If you identify any patterns or outliers in your residuals versus fits plot, consider the following solutions:

| Issue | Possible solution |
| --- | --- |
| Nonconstant variance | Consider using different terms in the model, a different link function, or weights. You can use different link functions or weights in Minitab Statistical Software. |
| An outlier or influential point | 1. Verify that the observation is not a measurement error or data-entry error. 2. Consider performing the analysis without this observation to determine how it affects your results. |

## Residuals versus order

The residuals versus order plot displays the residuals in the order that the data were collected.

### Interpretation

Use the residuals versus order plot to verify the assumption that the residuals are independent of one another. Independent residuals show no trends or patterns when displayed in time order. Patterns in the points may indicate that residuals near each other are correlated and, thus, not independent. Ideally, the residuals on the plot should fall randomly around the center line.

If you see a pattern, investigate the cause. Patterns such as a trend, a shift, or a cycle may indicate that the residuals are dependent.
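As a numeric complement to the visual check, the lag-1 autocorrelation of the residuals in collection order gives a rough indication of dependence. This is a supplementary sketch, not something the plot itself reports, and the residual values are hypothetical.

```python
def lag1_autocorrelation(residuals):
    """Correlation between each residual and the next one in time order."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denominator = sum(c * c for c in centered)
    numerator = sum(centered[i] * centered[i + 1] for i in range(n - 1))
    return numerator / denominator


# Hypothetical residuals listed in the order that the data were collected.
trending = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]      # steady upward drift
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # sign flips at every step

print(lag1_autocorrelation(trending))
print(lag1_autocorrelation(alternating))
```

A value well above zero is consistent with a trend or shift, and a value well below zero is consistent with rapid alternation. Values near zero are what independent residuals tend to produce, although a small sample can give a noisy estimate.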

## Normal probability plot of the residuals

The normal probability plot of the residuals displays the residuals versus their expected values when the distribution is normal.

### Interpretation

Use the normal probability plot of residuals to verify the assumption that the residuals are normally distributed. The normal probability plot of the residuals should approximately follow a straight line.
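The coordinates behind such a plot can be sketched as follows: sort the residuals, assign each one a plotting position, and convert that position to a standard normal quantile. The plotting-position formula used here, `(rank - 0.375) / (n + 0.25)`, is one common convention and is an assumption. The software may use a different one.

```python
from statistics import NormalDist


def normal_plot_points(residuals):
    """Return (theoretical normal quantile, ordered residual) pairs."""
    n = len(residuals)
    ordered = sorted(residuals)
    standard_normal = NormalDist()
    points = []
    for rank, value in enumerate(ordered, start=1):
        # Plotting position for this rank (an assumed convention).
        position = (rank - 0.375) / (n + 0.25)
        points.append((standard_normal.inv_cdf(position), value))
    return points


# Hypothetical residuals.
for quantile, residual in normal_plot_points([0.3, -1.2, 0.8, -0.1, 1.5]):
    print(round(quantile, 3), residual)
```

If the residuals are approximately normal, plotting the second element of each pair against the first gives points that lie close to a straight line.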

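Systematic departures from the straight line, such as curvature at the ends of the plot or isolated points that lie far from the line, suggest that the residuals are not normally distributed.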

If you see a nonnormal pattern, use the other residual plots to check for other problems with the model, such as missing terms or a time order effect.

## Histogram of residuals

The histogram of the residuals shows the distribution of the residuals for all observations.

### Interpretation

Use the histogram of the residuals to determine whether the data are skewed or include outliers. The patterns in the following table may indicate that the model does not meet the model assumptions.

| Pattern | What the pattern may indicate |
| --- | --- |
| A long tail in one direction | Skewness |
| A bar that is far away from the other bars | An outlier |

Because the appearance of a histogram depends on the number of intervals used to group the data, don't use a histogram to assess the normality of the residuals. Instead, use a normal probability plot.
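The dependence on the number of intervals is easy to see by binning the same values twice. The sketch below uses equal-width bins and hypothetical residuals. The helper name `bin_counts` is illustrative.

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        # The maximum value belongs to the last interval.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical residuals.
RESIDUALS = [-1.9, -1.1, -0.8, -0.6, -0.4, -0.3, -0.1, 0.0, 0.1, 0.2,
             0.3, 0.5, 0.6, 0.9, 1.0, 1.2, 1.4, 2.6]

print(bin_counts(RESIDUALS, 4))
print(bin_counts(RESIDUALS, 8))
```

The same data can look roughly symmetric with a few wide bars and ragged or gapped with many narrow bars, which is why the shape of a histogram alone is a weak basis for judging normality.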

A histogram is most effective when you have approximately 20 or more data points. If the sample is too small, then each bar on the histogram does not contain enough data points to reliably show skewness or outliers.

## Residuals versus the variables

The residuals versus the variables plot displays the residuals versus another variable. The variable could already be included in your model, or it could be a variable that is not in the model but that you suspect affects the response.

### Interpretation

If you see a non-random pattern in the residuals, it indicates that the variable affects the response in a systematic way. Consider including this variable in an analysis.
