Ordinal logistic regression estimates a coefficient for each term in the model. The coefficients for the terms in the model are the same for each outcome category.
Ordinal logistic regression also estimates a constant coefficient for all but one of the outcome categories. The constant coefficients, in combination with the coefficients for variables, form a set of binary regression equations. The first equation estimates the probability that the first event occurs. The second equation estimates the probability that the first or second event occurs. The third equation estimates the probability that the first, second, or third event occurs, and so on. Minitab labels these constant coefficients as Const(1), Const(2), Const(3), and so on.
Use the coefficients to examine how the probability of an outcome changes as the predictor variables change. The estimated coefficient for a predictor represents the change in the link function for each unit change in the predictor, while the other predictors in the model are held constant. The relationship between the coefficient and the probability of an outcome depends on several aspects of the analysis, including the link function, the order of the response categories, and the reference levels for categorical predictors that are in the model. Generally, positive coefficients make the first event and the events that are closer to it more likely as the predictor increases. Negative coefficients make the last event and the events closer to it more likely as the predictor increases. An estimated coefficient near 0 implies that the effect of the predictor is small.
For example, an analysis of a patient satisfaction survey examines the relationship between the distance a patient traveled and how likely the patient is to return. The first event is first in the response information table. In this case, the first event is "Very Likely" and the last event is "Unlikely." The negative coefficient for distance shows that as distance increases, patients are more likely to respond "Unlikely."
| Variable | Value | Count |
| --- | --- | --- |
| Return Appointment | Very Likely | 19 |
| | Somewhat Likely | 43 |
| | Unlikely | 11 |
| | Total | 73 |
| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Const(1) | -0.505898 | 0.938791 | -0.54 | 0.590 | | | |
| Const(2) | 2.27788 | 0.985924 | 2.31 | 0.021 | | | |
| Distance | -0.0470551 | 0.0797374 | -0.59 | 0.555 | 0.95 | 0.82 | 1.12 |
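The binary equations described above can be evaluated directly. A minimal Python sketch using the coefficients from the table; the distance of 5 miles is an arbitrary illustration, not a value from the source data:

```python
import math

def cumulative_prob(const, coef, x):
    """P(response falls in the first k categories) under the logit link:
    logit(P) = Const(k) + coef * x."""
    return 1.0 / (1.0 + math.exp(-(const + coef * x)))

# Coefficients from the table above
const1, const2 = -0.505898, 2.27788   # Const(1), Const(2)
b_distance = -0.0470551               # coefficient for Distance

x = 5  # hypothetical distance in miles
p_very_likely = cumulative_prob(const1, b_distance, x)       # P(Very Likely)
p_very_or_somewhat = cumulative_prob(const2, b_distance, x)  # P(Very Likely or Somewhat Likely)
p_unlikely = 1.0 - p_very_or_somewhat                        # P(Unlikely)

# Because the Distance coefficient is negative, both cumulative
# probabilities fall as x grows, so "Unlikely" becomes more probable.
```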
For categorical predictors, the change is from the reference level to the level of the predictor that is in the logistic regression table. Generally, positive coefficients indicate that the first event is more likely at the level of the factor that is in the logistic regression table than at the reference level of the factor. Negative coefficients indicate that the last event is more likely at the level of the factor that is in the logistic regression table than at the reference level of the factor.
For example, an analysis of a patient satisfaction survey examines the relationship between a patient's employment status and how likely the patient is to return. The first event is "Very Likely" and the last event is "Unlikely." The employment status can be "Unemployed" or "Employed." The reference level of the predictor, which is not in the logistic regression table, is "Employed." The negative coefficient with the level "Unemployed" indicates that patients who are unemployed are more likely to respond "Unlikely" than employed patients.
| Variable | Value | Count |
| --- | --- | --- |
| Return Appointment | Very Likely | 19 |
| | Somewhat Likely | 43 |
| | Unlikely | 11 |
| | Total | 73 |
| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Const(1) | -0.707512 | 0.352815 | -2.01 | 0.045 | | | |
| Const(2) | 2.12316 | 0.444672 | 4.77 | 0.000 | | | |
| Employment Status | | | | | | | |
| Unemployed | -0.631468 | 0.471078 | -1.34 | 0.180 | 0.53 | 0.21 | 1.34 |
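A short sketch of how the constant and the level coefficient combine under the logit link to give the probability of the first event at each level, using the coefficients from the table above:

```python
import math

def logistic(t):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-t))

const1 = -0.707512          # Const(1) from the table above
b_unemployed = -0.631468    # coefficient for the "Unemployed" level

# P(Very Likely) at the reference level "Employed" (indicator = 0)
p_employed = logistic(const1)
# P(Very Likely) for "Unemployed" (indicator = 1)
p_unemployed = logistic(const1 + b_unemployed)

# The negative coefficient lowers the probability of the first event:
# employed patients are more likely to answer "Very Likely".
```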
The constant coefficients combine with the terms for predictors to estimate probabilities. Minitab can store these probabilities for observations in the worksheet when you perform the analysis. For more information, go to Store statistics for Ordinal Logistic Regression.
The standard error of the coefficient estimates the variability between coefficient estimates that you would obtain if you took samples from the same population again and again. The calculation assumes that the sample size and the coefficients to estimate would remain the same if you sampled again and again.
Use the standard error of the coefficient to measure the precision of the estimate of the coefficient. The smaller the standard error, the more precise the estimate.
The Z-value is a test statistic: the ratio of the coefficient to its standard error.
Minitab uses the Z-value to calculate the p-value, which you use to make a decision about the statistical significance of the terms and the model. The test is accurate when the sample size is large enough that the distribution of the sample coefficients follows a normal distribution.
A Z-value that is sufficiently far from 0 indicates that the coefficient estimate is both large and precise enough to be statistically different from 0. Conversely, a Z-value that is close to 0 indicates that the coefficient estimate is too small or too imprecise to be certain that the term has an effect on the response.
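The Z-value and its two-sided normal p-value can be reproduced from the coefficient and its standard error. A sketch using the Distance row from the first table:

```python
import math

coef, se = -0.0470551, 0.0797374   # Distance row from the table above

z = coef / se                      # Z-value: coefficient divided by its standard error

# Two-sided p-value from the standard normal distribution:
# p = 2 * P(Z > |z|) = erfc(|z| / sqrt(2))
p = math.erfc(abs(z) / math.sqrt(2))

# These reproduce the table values: Z = -0.59, P = 0.555
```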
The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.
The odds ratio compares the odds of two events. The odds of an event are the probability that the event occurs divided by the probability that the event does not occur. Minitab calculates odds ratios when the model uses the logit link function.
Use the odds ratio to understand the effect of a predictor. The interpretation of the odds ratio depends on whether the predictor is categorical or continuous.
Odds ratios that are greater than 1 indicate that the first event and the events closer to the first event are more likely as the predictor increases. Odds ratios that are less than 1 indicate that the last event and the events that are closer to it are more likely as the predictor increases.
For example, an analysis of a patient satisfaction survey examines the relationship between the distance a patient traveled and how likely the patient is to return. The first event is first in the response information table. In this case, the first event is "Very Likely" and the last event is "Unlikely." The odds ratio of 0.95 for distance shows that as distance increases, patients are more likely to respond "Unlikely." For each additional mile a patient travels, the odds that the patient responds with "Very Likely" instead of "Somewhat Likely" or "Unlikely" decrease by about 5%.
| Variable | Value | Count |
| --- | --- | --- |
| Return Appointment | Very Likely | 19 |
| | Somewhat Likely | 43 |
| | Unlikely | 11 |
| | Total | 73 |
| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Const(1) | -0.505898 | 0.938791 | -0.54 | 0.590 | | | |
| Const(2) | 2.27788 | 0.985924 | 2.31 | 0.021 | | | |
| Distance | -0.0470551 | 0.0797374 | -0.59 | 0.555 | 0.95 | 0.82 | 1.12 |
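Under the logit link, the odds ratio for a continuous predictor is the exponential of its coefficient. A quick check against the Distance row:

```python
import math

b_distance = -0.0470551             # coefficient for Distance

odds_ratio = math.exp(b_distance)   # odds ratio = exp(coefficient) under the logit link

# 0.95: each extra mile multiplies the odds of the earlier response
# categories by about 0.95, a decrease of roughly 5%.
```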
For categorical predictors, the odds ratio compares the odds of the event occurring at two different levels of the predictor. Odds ratios that are greater than 1 indicate that the first event and the events closer to it are more likely at the level of the predictor in the logistic regression table than at the reference level of the predictor. Odds ratios that are less than 1 indicate that the last event and the events that are closer to it are more likely at the level of the predictor in the logistic regression table than at the reference level.
For example, an analysis of a patient satisfaction survey examines the relationship between a patient's employment status and how likely the patient is to return. The first event is "Very Likely" and the last event is "Unlikely." The employment status can be "Unemployed" or "Employed." The reference level of the predictor, which is not in the logistic regression table, is "Employed." The odds ratio is less than 1, so an employed patient is more likely to respond that they are "Very Likely" to return than an unemployed patient. The odds that an unemployed patient responds with "Very Likely" instead of "Somewhat Likely" or "Unlikely" are 53% of the odds that an employed patient responds with "Very Likely." Also, the odds that an unemployed patient responds with "Very Likely" or "Somewhat Likely" instead of "Unlikely" are 53% of the odds that an employed patient responds with "Very Likely" or "Somewhat Likely."
| Variable | Value | Count |
| --- | --- | --- |
| Return Appointment | Very Likely | 19 |
| | Somewhat Likely | 43 |
| | Unlikely | 11 |
| | Total | 73 |
| Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Const(1) | -0.707512 | 0.352815 | -2.01 | 0.045 | | | |
| Const(2) | 2.12316 | 0.444672 | 4.77 | 0.000 | | | |
| Employment Status | | | | | | | |
| Unemployed | -0.631468 | 0.471078 | -1.34 | 0.180 | 0.53 | 0.21 | 1.34 |
The odds ratios use the order of the categories, so the ratios do not describe how the odds change for categories that are out of order. For example, the odds ratio does not describe the change in the odds that the patient responds with "Somewhat Likely" instead of "Very Likely" or "Unlikely." To model categories in an arbitrary order, use nominal logistic regression.
These confidence intervals (CI) are ranges of values that are likely to contain the true values of the odds ratios. The calculation of the confidence intervals uses the normal distribution. The confidence interval is accurate if the sample size is large enough that the distribution of the sample odds ratios follows a normal distribution.
Because samples are random, two samples from a population are unlikely to yield identical confidence intervals. However, if you take many random samples, a certain percentage of the resulting confidence intervals contain the unknown population parameter. The percentage of these confidence intervals that contain the parameter is the confidence level of the interval.
Use the confidence interval to assess the estimate of the odds ratio.
For example, with a 95% confidence level, you can be 95% confident that the confidence interval contains the value of the odds ratio for the population. The confidence interval helps you assess the practical significance of your results. Use your specialized knowledge to determine whether the confidence interval includes values that have practical significance for your situation. If the interval is too wide to be useful, consider increasing your sample size.
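The interval for the odds ratio follows from exponentiating a normal interval on the coefficient scale. A sketch reproducing the Distance row from the first table:

```python
import math

coef, se = -0.0470551, 0.0797374   # Distance row from the table above
z_crit = 1.959964                  # two-sided 95% critical value of the standard normal

# CI on the coefficient scale, then exponentiate to the odds-ratio scale
lower = math.exp(coef - z_crit * se)
upper = math.exp(coef + z_crit * se)

# These reproduce the table: the 95% CI for the odds ratio is about (0.82, 1.12).
# The interval contains 1, consistent with the non-significant p-value (0.555).
```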
This test is an overall test that considers all of the coefficients for a categorical predictor simultaneously. The test is for categorical predictors with more than 2 levels.
Use the test to determine whether a categorical predictor with more than 1 coefficient has a statistically significant relationship with the response events. When a categorical predictor has more than 2 levels, the coefficients for individual levels have different p-values. The overall test gives a single answer about whether the predictor is statistically significant.
Minitab maximizes the log-likelihood function to find optimal values of the estimated coefficients.
Use the log-likelihood to compare two models that use the same data to estimate the coefficients. Because the values are negative, the closer to 0 the value is, the better the model fits the data.
The log-likelihood cannot decrease when you add terms to a model. For example, a model with 5 terms has a log-likelihood at least as high as any of the 4-term models that you can make with the same terms. Therefore, log-likelihood is most useful when you compare models of the same size. To make decisions about individual terms, you usually look at the p-values for the term in the different logits.
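As an illustration of how a log-likelihood is computed for an ordinal model: each observation contributes the log of its category probability, which is a difference of adjacent cumulative probabilities. A sketch under the logit link, using the coefficients from the distance example with hypothetical observations (not the source data):

```python
import math

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

def log_likelihood(consts, coef, data):
    """data: list of (x, category) pairs with category = 1..len(consts)+1.
    Category probability = P(Y <= k) - P(Y <= k-1), where the cumulative
    probability is logistic(Const(k) + coef * x)."""
    ll = 0.0
    for x, k in data:
        cum = [0.0] + [logistic(c + coef * x) for c in consts] + [1.0]
        ll += math.log(cum[k] - cum[k - 1])
    return ll

# Hypothetical observations: (distance, response category 1..3)
data = [(5, 1), (20, 2), (40, 3), (8, 2)]
ll = log_likelihood([-0.505898, 2.27788], -0.0470551, data)

# The value is negative; a model that fits better gives a value closer to 0.
```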
This test is an overall test that considers all of the coefficients for predictors in the model.
Use the test to determine whether at least one of the predictors in the model has a statistically significant association with the response events. Usually, you do not interpret the G statistic or the degrees of freedom (DF). The DF are equal to the number of coefficients for predictors in the model.
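G is a likelihood-ratio statistic: twice the difference between the log-likelihood of the fitted model and that of the constants-only model, compared to a chi-square distribution with DF degrees of freedom. A sketch with hypothetical log-likelihood values; for DF = 1 the chi-square tail probability has a closed form:

```python
import math

ll_full = -67.98   # hypothetical log-likelihood of the fitted model
ll_null = -68.16   # hypothetical log-likelihood of the constants-only model

g = 2.0 * (ll_full - ll_null)       # G statistic

# For DF = 1, the chi-square upper-tail probability is erfc(sqrt(G / 2))
p = math.erfc(math.sqrt(g / 2.0))

# If p is below the significance level, at least one predictor has a
# statistically significant association with the response.
```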