The output also identifies which level of the response is the reference event.
Use the response information to examine how much data are in the analysis. Larger random samples with many occurrences of each level usually provide more accurate inferences about the population.
Also use the response information to determine which event is the reference event. The interpretation of statistics like coefficients and odds ratios depends on which event is the reference event.
The factor information table displays the factors in the design, the numbers of levels, and the values of the levels. Factors can assume only a limited number of possible values, known as factor levels. Factor levels can be text or numeric. Numeric factors use a few controlled values in the experiment, even though many values are possible.
Use the factor information table to see the number of levels in the analysis. For example, a quality analyst plans to study factors that could affect plastic strength during the manufacturing process. The analyst includes Additive. Additive is a categorical variable which can be type A or type B.
Factor | Levels | Values |
---|---|---|
Additive | 2 | A, B |
Factors can be crossed or nested. Two factors are crossed when each level of one factor occurs in combination with each level of the other factor. Two factors are nested when each set of levels for one factor appears at only one level of a second factor. For example, if a design contains machine and operator, these factors are crossed if all operators use all machines. However, operator is nested in machine if each machine has a different set of operators.
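The crossed and nested definitions can be checked mechanically from the combinations that occur in a design. The following is a minimal sketch, not part of Minitab; the function name `classify_factors` and the machine and operator names are hypothetical and serve only as illustration.

```python
from collections import defaultdict

def classify_factors(pairs):
    """Classify factor B relative to factor A from observed (a_level, b_level) pairs.

    Returns "crossed" if every level of A occurs with every level of B,
    "nested" if each level of B occurs with exactly one level of A,
    and "neither" otherwise.
    """
    observed = set(pairs)
    a_levels = {a for a, _ in observed}
    b_levels = {b for _, b in observed}

    # Crossed: all combinations of the two factors appear in the design.
    if len(observed) == len(a_levels) * len(b_levels):
        return "crossed"

    # Nested: each level of B appears under only one level of A.
    parents = defaultdict(set)
    for a, b in observed:
        parents[b].add(a)
    if all(len(levels) == 1 for levels in parents.values()):
        return "nested"
    return "neither"

# Hypothetical designs: (machine, operator) combinations.
crossed_design = [("M1", "Ann"), ("M1", "Bob"), ("M2", "Ann"), ("M2", "Bob")]
nested_design = [("M1", "Ann"), ("M1", "Bob"), ("M2", "Cy"), ("M2", "Dee")]

print(classify_factors(crossed_design))
print(classify_factors(nested_design))
```

In the first design all operators use all machines, so the factors are crossed. In the second design each machine has its own operators, so operator is nested in machine.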
In the factor information table, parentheses indicate nested factors. For example, Standard(Appraiser) indicates that Standard is nested within Appraiser. In this context, the nesting indicates that each appraiser has their own set of standard parts. The factor levels of a nested factor are repeated for each level of nesting, which increases the number of levels for the nested factor. In this example, each appraiser has 5 standards, but because standard is nested in appraiser, standard has 20 different levels.
Factor | Levels | Values |
---|---|---|
Standard(Appraiser) | 20 | 1(Amanda), 2(Amanda), 3(Amanda), 4(Amanda), 5(Amanda), 1(Britt), 2(Britt), 3(Britt), 4(Britt), 5(Britt), 1(Eric), 2(Eric), 3(Eric), 4(Eric), 5(Eric), 1(Mike), 2(Mike), 3(Mike), 4(Mike), 5(Mike) |
Appraiser | 4 | Amanda, Britt, Eric, Mike |
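As a quick check on the level count, the nested labels in the table can be rebuilt from the 4 appraisers and the 5 standards per appraiser. This is only an illustrative sketch; the variable names are assumptions.

```python
from itertools import product

appraisers = ["Amanda", "Britt", "Eric", "Mike"]
standards_per_appraiser = 5

# Each standard is labeled with the appraiser it belongs to, so the same
# standard number under a different appraiser counts as a different level.
nested_levels = [
    f"{standard}({appraiser})"
    for appraiser, standard in product(appraisers, range(1, standards_per_appraiser + 1))
]

print(len(nested_levels))
print(nested_levels[:6])
```

The count is the number of appraisers multiplied by the number of standards per appraiser, which is why the nested factor has 20 levels rather than 5.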
For more information on factors, go to "Factors and factor levels," "What are factors, crossed factors, and nested factors?," and "What is the difference between fixed and random factors?"
The nominal logistic equation treats each nominal outcome separately. The logistic regression equation consists of multiple logit functions, one for each value of the response minus one. Each equation has its own coefficients for the predictors. These equations evaluate how the probability of one nominal outcome changes relative to another nominal outcome as the predictor variables change.
Use the coefficients to examine how the probability of an outcome changes as the predictor variables change. The estimated coefficient for a predictor represents the change in the link function for each unit change in the predictor, while the other predictors in the model are held constant. The relationship between the coefficient and the probability of an outcome depends on several aspects of the analysis, including the reference outcome for the response variable and the reference levels for categorical predictors. Generally, positive coefficients make the reference outcome less likely as the predictor increases. Negative coefficients make the reference outcome more likely as the predictor increases. An estimated coefficient near 0 implies that the effect of the predictor is small.
For example, a school administrator wants to assess different teaching methods. She uses age and teaching method to predict which subjects students prefer. The reference outcome for the response variable is the outcome that appears first in the response information table. For these data, the reference outcome is that the student prefers science. Logit 1 compares the probability that a student prefers math to the probability that the student prefers science. In this equation, the p-value for the coefficient for age is greater than 0.7. Such a high p-value suggests that age has little effect on whether a student prefers math to science.
Logit 2 compares arts to science. In this equation, the coefficient for age is larger than the coefficient for age in Logit 1. The coefficient for age is positive. As students get older, they are more likely to prefer arts to science.
The interpretation of the coefficients for categorical predictors depends on the reference level for the factor. In the teaching methods data, the two levels for teaching method are "Demonstrate" and "Explain." "Demonstrate" is not in the coefficients table, so "Demonstrate" is the reference level. The p-value for "Explain" in the equation that compares math to science is greater than 0.5. Such a high p-value suggests that the teaching method has little effect on whether a student prefers math to science.
In Logit 2, the coefficient for "Explain" is larger than the coefficient for "Explain" in Logit 1. The p-value for this coefficient is less than 0.05, so this coefficient is statistically significant at the 0.05 level. The coefficient for "Explain" in this equation is positive. When the teaching method is "Explain," the student is more likely to prefer arts to science.
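To see how the two logits combine into outcome probabilities, the sketch below plugs the coefficients reported for the teaching-method example into the standard multinomial logit formula with science as the reference outcome. It assumes indicator coding for the categorical predictor (1 for "Explain", 0 for the reference level "Demonstrate"), and the age of 12 is a hypothetical value chosen only for illustration.

```python
import math

# Coefficients copied from the logistic regression table of the example.
LOGITS = {
    "Math": {"constant": -1.12266, "explain": -0.563115, "age": 0.124674},
    "Arts": {"constant": -13.8485, "explain": 2.76992, "age": 1.01354},
}

def predicted_probabilities(age, method):
    """Return the probability of each subject; science is the reference outcome."""
    # Assumption: "Demonstrate" is coded 0 and "Explain" is coded 1.
    explain = 1.0 if method == "Explain" else 0.0
    exp_logits = {
        outcome: math.exp(c["constant"] + c["explain"] * explain + c["age"] * age)
        for outcome, c in LOGITS.items()
    }
    denominator = 1.0 + sum(exp_logits.values())
    probabilities = {"Science": 1.0 / denominator}
    probabilities.update({o: v / denominator for o, v in exp_logits.items()})
    return probabilities

# Hypothetical student, age 12, under each teaching method.
for method in ("Demonstrate", "Explain"):
    probs = predicted_probabilities(12, method)
    print(method, {subject: round(p, 3) for subject, p in probs.items()})
```

Each logit is the log of the ratio between the probability of its comparison outcome and the probability of the reference outcome, so a positive coefficient shifts probability toward the comparison outcome.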
| Variable | Value | Count | |
|---|---|---|---|
| Subject | Science | 10 | (Reference Event) |
| | Math | 11 | |
| | Arts | 9 | |
| | Total | 30 | |
Factor | Levels | Values |
---|---|---|
Teaching Method | 2 | Demonstrate, Explain |
Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
---|---|---|---|---|---|---|---|
Logit 1: (Math/Science) | |||||||
Constant | -1.12266 | 4.56425 | -0.25 | 0.806 | |||
Teaching Method | |||||||
Explain | -0.563115 | 0.937591 | -0.60 | 0.548 | 0.57 | 0.09 | 3.58 |
Age | 0.124674 | 0.401079 | 0.31 | 0.756 | 1.13 | 0.52 | 2.49 |
Logit 2: (Arts/Science) | |||||||
Constant | -13.8485 | 7.24256 | -1.91 | 0.056 | |||
Teaching Method | |||||||
Explain | 2.76992 | 1.37209 | 2.02 | 0.044 | 15.96 | 1.08 | 234.90 |
Age | 1.01354 | 0.584494 | 1.73 | 0.083 | 2.76 | 0.88 | 8.66 |
DF | G | P-Value |
---|---|---|
4 | 12.825 | 0.012 |
Method | Chi-Square | DF | P |
---|---|---|---|
Pearson | 6.95295 | 10 | 0.730 |
Deviance | 7.88622 | 10 | 0.640 |
The standard error of the coefficient estimates the variability between coefficient estimates that you would obtain if you took samples from the same population again and again. The calculation assumes that the sample size and the coefficients to estimate would remain the same if you sampled again and again.
Use the standard error of the coefficient to measure the precision of the estimate of the coefficient. The smaller the standard error, the more precise the estimate.
The Z-value is a test statistic that measures the ratio between the coefficient and its standard error.
Minitab uses the Z-value to calculate the p-value, which you use to make a decision about the statistical significance of the terms and the model. The test is accurate when the sample size is large enough that the distribution of the sample coefficients follows a normal distribution.
A Z-value that is sufficiently far from 0 indicates that the coefficient estimate is both large and precise enough to be statistically different from 0. Conversely, a Z-value that is close to 0 indicates that the coefficient estimate is too small or too imprecise to be certain that the term has an effect on the response.
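The Z-values in the teaching-method example can be reproduced by dividing each coefficient by its standard error. The sketch below also derives a two-sided p-value from the standard normal distribution, which is an assumption about the test that is consistent with the values in the table; the function name `z_and_p` is hypothetical.

```python
import math

def z_and_p(coef, se_coef):
    """Return the Z-value and a two-sided p-value from the standard normal distribution."""
    z = coef / se_coef
    # For a standard normal variable, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# (Coef, SE Coef) pairs copied from the logistic regression table of the example.
terms = {
    "Logit 1, Age": (0.124674, 0.401079),
    "Logit 2, Explain": (2.76992, 1.37209),
}
for name, (coef, se_coef) in terms.items():
    z, p = z_and_p(coef, se_coef)
    print(f"{name}: Z = {z:.2f}, P = {p:.3f}")
```

A larger standard error for the same coefficient pulls the Z-value toward 0 and raises the p-value.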
The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.
The odds ratio compares the odds of two outcomes. The odds of an outcome are the probability that the comparison outcome occurs divided by the probability that the reference outcome occurs.
Use the odds ratio to understand the effect of a predictor. The interpretation of the odds ratio depends on whether the predictor is categorical or continuous. In the logistic regression table, the comparison outcome is the first outcome after the logit label and the reference outcome is the second outcome. The reference outcome is the same for every logit.
Odds ratios that are greater than 1 indicate that the comparison outcome becomes more likely relative to the reference outcome as the predictor increases. Odds ratios that are less than 1 indicate that the comparison outcome becomes less likely relative to the reference outcome as the predictor increases.
For example, a school administrator wants to assess different teaching methods. For logit 1, the comparison outcome is math. For logit 2, the comparison outcome is arts. The reference outcome is science. In logit 2, the estimate of the odds ratio for age is 2.76, which is greater than 1. As age increases, a student is more likely to prefer arts to science. For each additional year of age, the odds that a student prefers arts rather than science are multiplied by about 2.76, which is roughly 3.
Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
---|---|---|---|---|---|---|---|
Logit 1: (Math/Science) | |||||||
Constant | -1.12266 | 4.56425 | -0.25 | 0.806 | |||
Teaching Method | |||||||
Explain | -0.563115 | 0.937591 | -0.60 | 0.548 | 0.57 | 0.09 | 3.58 |
Age | 0.124674 | 0.401079 | 0.31 | 0.756 | 1.13 | 0.52 | 2.49 |
Logit 2: (Arts/Science) | |||||||
Constant | -13.8485 | 7.24256 | -1.91 | 0.056 | |||
Teaching Method | |||||||
Explain | 2.76992 | 1.37209 | 2.02 | 0.044 | 15.96 | 1.08 | 234.90 |
Age | 1.01354 | 0.584494 | 1.73 | 0.083 | 2.76 | 0.88 | 8.66 |
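The odds ratios in the table are the exponentiated coefficients. The following sketch reproduces them from the Coef column; the dictionary layout is only an illustrative choice.

```python
import math

# Coefficients copied from the logistic regression table of the example.
coefficients = {
    ("Logit 1 (Math/Science)", "Explain"): -0.563115,
    ("Logit 1 (Math/Science)", "Age"): 0.124674,
    ("Logit 2 (Arts/Science)", "Explain"): 2.76992,
    ("Logit 2 (Arts/Science)", "Age"): 1.01354,
}

# The odds ratio for a one-unit increase in a predictor is exp(coefficient).
odds_ratios = {term: math.exp(coef) for term, coef in coefficients.items()}
for (logit, predictor), ratio in odds_ratios.items():
    print(f"{logit}, {predictor}: {ratio:.2f}")

# For a continuous predictor, a change of k units multiplies the odds by exp(k * coefficient).
two_year_ratio = math.exp(2 * coefficients[("Logit 2 (Arts/Science)", "Age")])
print(f"Arts/Science odds ratio for 2 additional years of age: {two_year_ratio:.2f}")
```

Because the effect is multiplicative, the odds ratio for 2 additional years is the square of the odds ratio for 1 additional year.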
For categorical predictors, the odds ratio compares the odds of the comparison outcome at two different levels of the predictor. The comparison level is in the logistic regression table and has an estimated odds ratio. Odds ratios that are greater than 1 indicate that the comparison outcome becomes more likely relative to the reference outcome when the categorical predictor changes from the reference level to the comparison level. Odds ratios that are less than 1 indicate that the comparison outcome becomes less likely relative to the reference outcome when the categorical predictor changes from the reference level to the comparison level.
For example, a school administrator wants to assess different teaching methods. For logit 1, the comparison outcome is math. For logit 2, the comparison outcome is arts. The reference outcome is science. For logit 2, the estimate of the odds ratio for teaching method is 15.96, which is greater than 1. When the teaching method changes from "Demonstrate" to "Explain," the odds that a student prefers arts rather than science are multiplied by about 16.
Predictor | Coef | SE Coef | Z | P | Odds Ratio | 95% CI Lower | 95% CI Upper |
---|---|---|---|---|---|---|---|
Logit 1: (Math/Science) | |||||||
Constant | -1.12266 | 4.56425 | -0.25 | 0.806 | |||
Teaching Method | |||||||
Explain | -0.563115 | 0.937591 | -0.60 | 0.548 | 0.57 | 0.09 | 3.58 |
Age | 0.124674 | 0.401079 | 0.31 | 0.756 | 1.13 | 0.52 | 2.49 |
Logit 2: (Arts/Science) | |||||||
Constant | -13.8485 | 7.24256 | -1.91 | 0.056 | |||
Teaching Method | |||||||
Explain | 2.76992 | 1.37209 | 2.02 | 0.044 | 15.96 | 1.08 | 234.90 |
Age | 1.01354 | 0.584494 | 1.73 | 0.083 | 2.76 | 0.88 | 8.66 |
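One way to read the odds ratio for a categorical predictor is to compute the odds of arts versus science at both levels of teaching method and divide them. The sketch below does this with the Logit 2 coefficients. It assumes indicator coding (1 for "Explain", 0 for "Demonstrate"), and the ages are hypothetical values used only to show that the ratio does not depend on age.

```python
import math

# Logit 2 (Arts/Science) coefficients copied from the logistic regression table.
ARTS_VS_SCIENCE = {"constant": -13.8485, "explain": 2.76992, "age": 1.01354}

def odds_arts_vs_science(age, method):
    """Odds that a student prefers arts rather than science."""
    # Assumption: "Demonstrate" is coded 0 and "Explain" is coded 1.
    explain = 1.0 if method == "Explain" else 0.0
    c = ARTS_VS_SCIENCE
    return math.exp(c["constant"] + c["explain"] * explain + c["age"] * age)

# Hypothetical ages: the ratio of the odds is the same at each one.
for age in (10, 12, 14):
    ratio = odds_arts_vs_science(age, "Explain") / odds_arts_vs_science(age, "Demonstrate")
    print(age, round(ratio, 2))
```

The odds themselves change with age, but their ratio between the two teaching methods stays at the value reported in the Odds Ratio column.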
These confidence intervals (CI) are ranges of values that are likely to contain the true values of the odds ratios. The calculation of the confidence intervals uses the normal distribution. The confidence interval is accurate if the sample size is large enough that the distribution of the sample odds ratios follows a normal distribution.
Because samples are random, two samples from a population are unlikely to yield identical confidence intervals. However, if you take many random samples, a certain percentage of the resulting confidence intervals contain the unknown population parameter. The percentage of these confidence intervals that contain the parameter is the confidence level of the interval.
Use the confidence interval to assess the estimate of the odds ratio.
For example, with a 95% confidence level, you can be 95% confident that the confidence interval contains the value of the odds ratio for the population. The confidence interval helps you assess the practical significance of your results. Use your specialized knowledge to determine whether the confidence interval includes values that have practical significance for your situation. If the interval is too wide to be useful, consider increasing your sample size.
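The interval limits in the teaching-method example are consistent with a normal-based interval on the coefficient that is then exponentiated. The following is a minimal sketch under that assumption; the function name `odds_ratio_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(coef, se_coef, confidence=0.95):
    """Confidence interval for an odds ratio: exp(coef -/+ z * SE)."""
    # Two-sided critical value of the standard normal distribution (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.exp(coef - z * se_coef), math.exp(coef + z * se_coef)

# (Coef, SE Coef) pairs copied from Logit 2 (Arts/Science) of the example.
for name, coef, se_coef in [("Explain", 2.76992, 1.37209), ("Age", 1.01354, 0.584494)]:
    lower, upper = odds_ratio_ci(coef, se_coef)
    print(f"{name}: ({lower:.2f}, {upper:.2f})")
```

The interval for "Explain" is very wide, which reflects the small sample of 30 students. The interval for age includes 1, so the data are also compatible with age having no effect on the odds of arts versus science.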
This test is an overall test that considers all of the coefficients for a categorical predictor simultaneously. The test is for categorical predictors with more than 2 levels.
Use the test to determine whether a categorical predictor with more than 1 coefficient has a statistically significant relationship with the response events. When a categorical predictor has more than 2 levels, the coefficients for individual levels have different p-values. The overall test gives a single answer about whether the predictor is statistically significant.
Minitab maximizes the log-likelihood function to find optimal values of the estimated coefficients.
Use the log-likelihood to compare two models that use the same data to estimate the coefficients. Because the values are negative, the closer to 0 the value is, the better the model fits the data.
The log-likelihood cannot decrease when you add terms to a model. For example, a model with 5 terms has a log-likelihood that is at least as high as that of any of the 4-term models that you can make with the same terms. Therefore, log-likelihood is most useful when you compare models of the same size. To make decisions about individual terms, you usually look at the p-values for the term in the different logits.
This test is an overall test that considers all of the coefficients for predictors in the model.
Use the test to determine whether at least one of the predictors in the model has a statistically significant association with the response events. Usually, you do not interpret the G statistic or the degrees of freedom (DF). The DF are equal to the number of coefficients for predictors in the model.
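For the teaching-method example, the DF and the p-value of this test can be checked by hand. The sketch below assumes that G is compared with a chi-square distribution, and it uses the closed-form upper-tail probability that is valid only when the degrees of freedom are even. The function name `chi_square_sf_even_df` is hypothetical.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square distribution with an even number of DF."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even number of DF")
    half = x / 2.0
    term, total = 1.0, 1.0
    # For even DF, the tail probability is exp(-x/2) * sum_{k < DF/2} (x/2)^k / k!
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Two logits (3 response outcomes minus 1), each with coefficients for Explain and Age.
n_logits = 3 - 1
coefficients_per_logit = 2
df = n_logits * coefficients_per_logit

g_statistic = 12.825  # copied from the example output
p_value = chi_square_sf_even_df(g_statistic, df)
print(df, round(p_value, 3))
```

The p-value is below 0.05, which suggests that at least one of the predictors is associated with the subject that students prefer.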
The Pearson goodness-of-fit test assesses the discrepancy between the current model and the full model.
The deviance goodness-of-fit test assesses the discrepancy between the current model and the full model.
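The p-values of both goodness-of-fit tests in the teaching-method example can be reproduced from their chi-square statistics and DF. The sketch below assumes a chi-square reference distribution and a significance level of 0.05, and it uses a closed form that holds only for even DF. The helper name is hypothetical.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square distribution with an even number of DF."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even number of DF")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

ALPHA = 0.05  # assumed significance level

# (Chi-Square, DF) pairs copied from the goodness-of-fit table of the example.
tests = {"Pearson": (6.95295, 10), "Deviance": (7.88622, 10)}
results = {name: chi_square_sf_even_df(stat, df) for name, (stat, df) in tests.items()}
for name, p in results.items():
    verdict = "no evidence of lack of fit" if p > ALPHA else "evidence of lack of fit"
    print(f"{name}: P = {p:.3f} ({verdict})")
```

A large p-value here means that the data do not show a significant discrepancy between the current model and the full model.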