A regression coefficient describes the size and direction of the relationship between a predictor and the response variable. Coefficients are the numbers by which the values of the term are multiplied in a regression equation.
Use the coefficient to determine whether a change in a predictor variable makes the event more likely or less likely. The estimated coefficient for a predictor represents the change in the link function for each unit change in the predictor, while the other predictors in the model are held constant. The relationship between the coefficient and the probability depends on several aspects of the analysis, including the link function, the reference event for the response, and the reference levels for categorical predictors that are in the model. Generally, positive coefficients make the event more likely and negative coefficients make the event less likely. An estimated coefficient near 0 implies that the effect of the predictor is small.
The logit link provides the most natural interpretation of the estimated coefficients and is therefore the default link in Minitab. With the logit link, the transformed response is the natural log of the odds of the reference event, where the odds are P(event)/P(not event). The interpretation assumes that the other predictors remain constant. The greater the log odds, the more likely the reference event is. Therefore, positive coefficients indicate that the event becomes more likely and negative coefficients indicate that the event becomes less likely. A summary of interpretations for different types of predictors follows.
The coefficient of a continuous predictor is the estimated change in the natural log of the odds for the reference event for each unit increase in the predictor. For example, if the coefficient for time in seconds is 1.4, then the natural log of the odds increases by 1.4 for each additional second.
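To make the log-odds interpretation concrete, the following minimal sketch steps the log odds up by the coefficient of 1.4 from the example and converts each value back to a probability with the inverse of the logit link. The starting log odds of -2.0 and the function name `log_odds_to_probability` are assumptions for illustration only and do not come from any Minitab output.

```python
import math

def log_odds_to_probability(log_odds: float) -> float:
    """Invert the logit link: p = exp(y) / (1 + exp(y))."""
    return 1.0 / (1.0 + math.exp(-log_odds))

coef_time = 1.4            # coefficient for time in seconds, from the example
baseline_log_odds = -2.0   # hypothetical starting value (assumption)

for extra_seconds in range(4):
    # Each additional second adds exactly coef_time to the log odds.
    log_odds = baseline_log_odds + coef_time * extra_seconds
    p = log_odds_to_probability(log_odds)
    print(f"+{extra_seconds} s: log odds = {log_odds:+.1f}, P(event) = {p:.3f}")
```

The log odds rise by the same amount at every step, while the change in probability varies with the starting point.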
Estimated coefficients can also be used to calculate odds ratios, which are ratios between two odds. To calculate the odds ratio, exponentiate the coefficient for a predictor. The result is the odds ratio for when the predictor is x+1, compared to when the predictor is x. For example, if the odds ratio for mass in kilograms is 0.95, then for each additional kilogram, the odds of the event decrease by about 5%. The corresponding change in the probability depends on the starting probability.
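An odds ratio multiplies the odds, not the probability, so the same odds ratio produces different changes in probability at different baselines. The sketch below illustrates this for the odds ratio of 0.95 from the example. The baseline probabilities and the helper name `apply_odds_ratio` are hypothetical and chosen only for illustration.

```python
import math

def apply_odds_ratio(p: float, odds_ratio: float) -> float:
    """Return the probability after multiplying the odds of p by odds_ratio."""
    odds = p / (1.0 - p)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

odds_ratio_mass = 0.95
# The coefficient is the natural log of the odds ratio.
coef_mass = math.log(odds_ratio_mass)

for p in (0.10, 0.50, 0.90):  # hypothetical baseline probabilities
    new_p = apply_odds_ratio(p, odds_ratio_mass)
    print(f"P = {p:.2f} -> {new_p:.4f} (change of {new_p - p:+.4f})")
```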
The standard error of the coefficient estimates the variability between coefficient estimates that you would obtain if you took samples from the same population again and again. The calculation assumes that the sample size and the coefficients to estimate would remain the same if you sampled again and again.
Use the standard error of the coefficient to measure the precision of the estimate of the coefficient. The smaller the standard error, the more precise the estimate.
The variance inflation factor (VIF) indicates how much the variance of a coefficient is inflated due to multicollinearity.
Use the VIF to describe how much multicollinearity exists in a regression analysis. Multicollinearity is problematic because it can increase the variance of the regression coefficients, making it difficult to evaluate the individual impact that each of the predictors has on the response.
VIF | Multicollinearity |
---|---|
VIF = 1 | None |
1 < VIF < 5 | Moderate |
VIF > 5 | High |
For more information on multicollinearity and how to mitigate the effects of multicollinearity, see Multicollinearity in regression.
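As a rough illustration of where a VIF comes from, the VIF for a predictor equals 1/(1 - R²), where R² is obtained by regressing that predictor on the other predictors. With only two predictors, R² is the squared correlation between them. The sketch below uses that special case with made-up data. The function names, the data values, and the treatment of a VIF of exactly 5 as "High" are assumptions, and this is not Minitab's implementation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - pearson_r(x1, x2) ** 2)

def describe_vif(vif):
    """Map a VIF to the categories in the table (VIF == 5 handling is an assumption)."""
    if math.isclose(vif, 1.0):
        return "None"
    return "Moderate" if vif < 5 else "High"

# Hypothetical predictor values, for illustration only.
dose = [1, 2, 3, 4, 5, 6]
weight = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]   # nearly proportional to dose
age = [40, 25, 61, 33, 58, 29]              # only weakly related to dose

for name, other in (("weight", weight), ("age", age)):
    vif = vif_two_predictors(dose, other)
    print(f"dose vs {name}: VIF = {vif:.2f} ({describe_vif(vif)})")
```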
For binary logistic regression, Minitab shows two types of regression equations. The first equation relates the probability of the event to the transformed response. The form of the first equation depends on the link function. The second equation relates the predictors to the transformed response.
Use the equations to examine the relationship between the response and the predictor variables.
For example, a model uses the dose of a medicine to predict the event that a type of bacteria is not present in a patient. The first equation shows the relationship between the probability and the transformed response. Its form follows from the logit link function. The second equation shows how dose relates to the transformed response. Because the coefficient for dose is positive, when the dose is higher, the bacteria are less likely to be present.
P(No Bacteria) = exp(Y')/(1 + exp(Y'))

Y' = -5.25 + 3.63 Dose (mg)
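The two equations above can be evaluated directly to obtain a predicted probability for a given dose. The following sketch uses the intercept and coefficient from the example. The dose values and the function name `prob_no_bacteria` are hypothetical and are not part of the Minitab output.

```python
import math

INTERCEPT = -5.25   # from the second equation in the example
COEF_DOSE = 3.63    # coefficient for Dose (mg) in the example

def prob_no_bacteria(dose_mg: float) -> float:
    """Evaluate Y' = -5.25 + 3.63 * dose, then P = exp(Y') / (1 + exp(Y'))."""
    y_prime = INTERCEPT + COEF_DOSE * dose_mg
    return math.exp(y_prime) / (1.0 + math.exp(y_prime))

for dose in (0.5, 1.0, 1.5, 2.0):  # hypothetical doses in mg
    print(f"Dose {dose:.1f} mg: P(No Bacteria) = {prob_no_bacteria(dose):.3f}")
```

Because the coefficient is positive, the predicted probability of no bacteria rises with each increase in dose.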
Predictor | Odds Ratio | 95% CI |
---|---|---|
Dose (mg) | 37.5511 | (2.9647, 475.6190) |
The odds ratio compares the odds of two events. The odds of an event are the probability that the event occurs divided by the probability that the event does not occur. Minitab calculates odds ratios when the model uses the logit link function.
Use the odds ratio to understand the effect of a predictor. Odds ratios that are greater than 1 indicate that the event is more likely to occur as the predictor increases. Odds ratios that are less than 1 indicate that the event is less likely to occur as the predictor increases.
In these results, the model uses the dosage level of a medicine to predict the presence or absence of bacteria in adults. Each pill contains a 0.5 mg dose, so the researchers use a unit change of 0.5 mg. The odds ratio is approximately 6. For each additional pill that an adult takes, the odds that the adult does not have the bacteria are multiplied by about 6.
Predictor | Unit of Change | Odds Ratio | 95% CI |
---|---|---|---|
Dose (mg) | 0.5 | 6.1279 | (1.7218, 21.8087) |
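An odds ratio for a unit change of c equals exp(c × coefficient), which is the same as raising the 1-unit odds ratio to the power c. The same rescaling applies to the confidence limits. The sketch below applies it to the 1 mg odds ratio of 37.5511 and its 95% CI reported for Dose (mg), assuming those figures and the 0.5 mg table come from the same fitted model. The function name `rescale_odds_ratio` is hypothetical.

```python
def rescale_odds_ratio(odds_ratio_per_unit: float, unit_change: float) -> float:
    """Odds ratio for a change of unit_change, given the odds ratio per 1 unit."""
    return odds_ratio_per_unit ** unit_change

# Values reported for a 1 mg change in Dose (mg).
or_per_mg = 37.5511
ci_per_mg = (2.9647, 475.6190)

unit_change = 0.5  # one pill contains 0.5 mg

or_per_pill = rescale_odds_ratio(or_per_mg, unit_change)
ci_per_pill = tuple(rescale_odds_ratio(limit, unit_change) for limit in ci_per_mg)

print(f"Odds ratio per pill: {or_per_pill:.4f}")
print(f"95% CI per pill: ({ci_per_pill[0]:.4f}, {ci_per_pill[1]:.4f})")
```

The results agree with the values in the table to the reported precision, up to rounding of the inputs.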