Coefficients for Fit Poisson Model

Find definitions and interpretation guidance for every statistic in the Coefficients table.

In This Topic

Coef
SE Coef
Confidence interval for coefficient (95% CI)
Z-Value
P-Value
VIF
Coded Coefficients
Regression Equation

Coef

A regression coefficient describes the size and direction of the relationship between a predictor and the response variable. Coefficients are the numbers by which the values of the term are multiplied in a regression equation.

Interpretation

Use the coefficient to determine whether a change in a predictor variable makes the event more likely or less likely. The estimated coefficient for a predictor represents the change in the link function for each unit change in the predictor, while the other predictors in the model are held constant. The relationship between the coefficient and the number of events depends on several aspects of the analysis, including the link function and the reference levels for categorical predictors that are in the model. Generally, positive coefficients make the event more likely and negative coefficients make the event less likely. An estimated coefficient near zero implies that the effect of the predictor is small, or nonexistent.

The interpretation of the estimated coefficients for categorical predictors is relative to the reference level of the predictor. Positive coefficients indicate that the event is more likely at that level of the predictor than at the reference level. Negative coefficients indicate that the event is less likely at that level of the predictor than at the reference level.
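As a rough illustration of how a coefficient relates to the number of events, the sketch below assumes the natural log link. Under that assumption, exponentiating a coefficient gives the multiplicative change in the mean number of events for a one-unit increase in the predictor. The function name `rate_ratio` is hypothetical. The coefficient value is the temperature coefficient from the regression equation example in this topic.

```python
import math

def rate_ratio(coef, delta=1.0):
    """Multiplicative change in the mean count for a `delta`-unit increase
    in a predictor, ASSUMING a natural log link: exp(coef * delta)."""
    return math.exp(coef * delta)

# Temperature coefficient for large screws, from the example equation in this topic.
temperature_coef = -0.003285

per_degree = rate_ratio(temperature_coef)            # 1-degree increase
per_ten_degrees = rate_ratio(temperature_coef, 10)   # 10-degree increase

print(f"Rate ratio per 1 degree:   {per_degree:.4f}")
print(f"Rate ratio per 10 degrees: {per_ten_degrees:.4f}")
```

A ratio below 1 corresponds to a negative coefficient (fewer events), a ratio above 1 to a positive coefficient (more events), and a ratio of exactly 1 to a coefficient of 0. With other link functions, this conversion does not apply.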

SE Coef

The standard error of the coefficient estimates the variability between coefficient estimates that you would obtain if you took samples from the same population again and again. The calculation assumes that the sample size and the coefficients to estimate would remain the same if you sampled again and again.

Interpretation

Use the standard error of the coefficient to measure the precision of the estimate of the coefficient. The smaller the standard error, the more precise the estimate.

Confidence interval for coefficient (95% CI)

These confidence intervals (CI) are ranges of values that are likely to contain the true value of the coefficient for each term in the model. The calculation of the confidence intervals uses the normal distribution. The confidence interval is accurate if the sample size is large enough that the distribution of the sample coefficient follows a normal distribution.

Because samples are random, two samples from a population are unlikely to yield identical confidence intervals. However, if you take many random samples, a certain percentage of the resulting confidence intervals contain the unknown population parameter. The percentage of these confidence intervals that contain the parameter is the confidence level of the interval.

The confidence interval is composed of the following two parts:

Point estimate: This single value estimates a population parameter by using your sample data. The confidence interval is centered around the point estimate.
Margin of error: The margin of error defines the width of the confidence interval and is determined by the observed variability in the sample, the sample size, and the confidence level. To calculate the upper limit of the confidence interval, the margin of error is added to the point estimate. To calculate the lower limit of the confidence interval, the margin of error is subtracted from the point estimate.

Interpretation

Use the confidence interval to assess the estimate of the population coefficient for each term in the model.

For example, with a 95% confidence level, you can be 95% confident that the confidence interval contains the value of the coefficient for the population. The confidence interval helps you assess the practical significance of your results. Use your specialized knowledge to determine whether the confidence interval includes values that have practical significance for your situation. If the interval is too wide to be useful, consider increasing your sample size.
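The point estimate and margin of error described above can be sketched as a normal-approximation (Wald-type) interval. This is only an illustration under that assumption, not necessarily the exact calculation that Minitab performs. The coefficient and standard error are hypothetical values.

```python
from statistics import NormalDist

def wald_confidence_interval(coef, se_coef, confidence=0.95):
    """Normal-approximation interval: point estimate +/- margin of error."""
    # Critical value of the standard normal distribution, about 1.96 for 95%.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z_crit * se_coef
    return coef - margin_of_error, coef + margin_of_error

# Hypothetical coefficient and standard error, for illustration only.
coef, se_coef = 0.50, 0.10
lower, upper = wald_confidence_interval(coef, se_coef)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Because the margin of error is proportional to the standard error, a larger sample, which usually reduces the standard error, yields a narrower interval.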

Z-Value

The Z-value is a test statistic for Wald tests that measures the ratio between the coefficient and its standard error.

Interpretation

Minitab uses the Z-value to calculate the p-value, which you use to make a decision about the statistical significance of the terms and the model. The Wald test is accurate when the sample size is large enough that the distribution of the sample coefficients follows a normal distribution.

A Z-value that is sufficiently far from 0 indicates that the coefficient estimate is both large and precise enough to be statistically different from 0. Conversely, a Z-value that is close to 0 indicates that the coefficient estimate is too small or too imprecise to be certain that the term has an effect on the response.

The tests in the Deviance table are likelihood ratio tests. The tests in the expanded display of the Coefficients table are Wald approximation tests. The likelihood ratio tests are more accurate for small samples than the Wald approximation tests.
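The following sketch shows the Wald calculation in its usual textbook form: the Z-value is the coefficient divided by its standard error, and the two-sided p-value comes from the standard normal distribution. The function name and input values are hypothetical.

```python
from statistics import NormalDist

def wald_z_and_p(coef, se_coef):
    """Return the Wald Z-value and its two-sided normal p-value."""
    z = coef / se_coef
    # Probability of a standard normal value at least as far from 0 as z.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical coefficient and standard error, for illustration only.
z, p = wald_z_and_p(0.50, 0.10)
print(f"Z = {z:.2f}, p = {p:.2e}")
```

A coefficient of the same size with a larger standard error gives a Z-value closer to 0 and a larger p-value, which matches the description of an estimate that is too imprecise to show an effect.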

P-Value

The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.

Interpretation

To determine whether the association between the response and each term in the model is statistically significant, compare the p-value for the term to your significance level to assess the null hypothesis. The null hypothesis is that the term's coefficient is equal to zero, which implies that there is no association between the term and the response. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that an association exists when there is no actual association.

P-value ≤ α: The association is statistically significant. If the p-value is less than or equal to the significance level, you can conclude that there is a statistically significant association between the response variable and the term.
P-value > α: The association is not statistically significant. If the p-value is greater than the significance level, you cannot conclude that there is a statistically significant association between the response variable and the term. You may want to refit the model without the term. If there are multiple predictors without a statistically significant association with the response, you can reduce the model by removing terms one at a time. For more information on removing terms from the model, go to Model reduction.

If a model term is statistically significant, the interpretation depends on the type of term. The interpretations are as follows:

If a continuous predictor is significant, you can conclude that the coefficient for the predictor is different from zero.
If a categorical predictor is significant, the conclusion depends on the coding for the categorical variable. With (0, 1) coding, you can conclude that the mean number of events for that level is different from the mean number of events for the reference level. With (-1, 0, +1) coding, you can conclude that the mean number of events for that level is different from the baseline mean number of events. In either case, you can conclude that not all of the levels have the same mean number of events.
If an interaction term is significant, you can conclude that the relationship between the predictor and the number of events depends on the other predictors in the term.
If a polynomial term is significant, you can conclude that the relationship between a predictor and the number of events depends on the magnitude of the predictor.

VIF

The variance inflation factor (VIF) indicates how much the variance of a coefficient is inflated due to the correlations among the predictors in the model.

Interpretation

Use the VIF to describe how much multicollinearity (which is correlation between predictors) exists in a regression analysis. Multicollinearity is problematic because it can increase the variance of the regression coefficients, making it difficult to evaluate the individual impact that each of the correlated predictors has on the response.

Use the following guidelines to interpret the VIF:

VIF	Status of predictor
VIF = 1	Not correlated
1 < VIF < 5	Moderately correlated
VIF > 5	Highly correlated

A VIF value greater than 5 suggests that the regression coefficient is poorly estimated due to severe multicollinearity.

For more information on multicollinearity and how to mitigate the effects of multicollinearity, see Multicollinearity in regression.
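As a sketch of where the VIF comes from, the textbook definition for a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all the other predictors in the model. This topic does not state the exact formula that Minitab uses for a Poisson model, so treat the code as an illustration of the concept only. The R² values are hypothetical.

```python
def vif_from_r_squared(r_squared):
    """Textbook VIF: 1 / (1 - R^2), where R^2 is from regressing one
    predictor on all of the other predictors in the model."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in the interval [0, 1)")
    return 1.0 / (1.0 - r_squared)

# Hypothetical R^2 values, for illustration only.
for r_squared in (0.0, 0.5, 0.8, 0.95):
    vif = vif_from_r_squared(r_squared)
    flag = "highly correlated" if vif > 5 else "not flagged"
    print(f"R^2 = {r_squared:.2f} -> VIF = {vif:.2f} ({flag})")
```

Under this definition, the guideline value of 5 corresponds to an R² of 0.8 between a predictor and the other predictors.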

Coded Coefficients

When you standardize the continuous variables, the coefficients represent a one-unit change in the standardized variables. Usually, you standardize the continuous predictors to reduce multicollinearity or to put the variables on a common scale.

Interpretation

How you use the coded coefficients depends on the standardization method. The exact interpretation of the coefficients also depends on aspects of the analysis like the link function. Positive coefficients make the event more likely. Negative coefficients make the event less likely. An estimated coefficient near 0 implies that the effect of the predictor is small.

Specify low and high levels to code as −1 and +1

Each coefficient represents the expected change in the mean of the transformed response given that the predictor changes by 1 unit on the coded scale.

For example, a model uses temperature in degrees Celsius and time in seconds. For temperature, the coding makes 0 correspond to 50 degrees Celsius and 1 correspond to 100 degrees Celsius. For time, the coding makes 0 correspond to 30 seconds and 1 correspond to 60 seconds. The coefficient for temperature represents an increase of 50 degrees Celsius. The coefficient for time represents an increase of 30 seconds.
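The low and high coding in this example can be sketched as follows. The low and high levels are assumptions inferred from the example: 0 and 100 degrees Celsius for temperature, and 0 and 60 seconds for time. The function name `code_low_high` is hypothetical.

```python
def code_low_high(x, low, high):
    """Map `low` to -1 and `high` to +1; the midpoint maps to 0."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (x - center) / half_range

# Assumed levels: temperature from 0 to 100 degrees Celsius.
print(code_low_high(50, low=0, high=100))    # midpoint  -> 0.0
print(code_low_high(100, low=0, high=100))   # high level -> 1.0

# Assumed levels: time from 0 to 60 seconds.
print(code_low_high(30, low=0, high=60))     # midpoint  -> 0.0
print(code_low_high(60, low=0, high=60))     # high level -> 1.0
```

One coded unit equals half the distance between the low and high levels, which is why the temperature coefficient represents a change of 50 degrees Celsius and the time coefficient represents a change of 30 seconds.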

Subtract the mean, then divide by the standard deviation

Each coefficient represents the expected change in the mean of the transformed response given that the predictor variable changes by 1 standard deviation.

For example, a model uses temperature in degrees Celsius and time in seconds. The standard deviation of temperature is 3.7 degrees Celsius. The standard deviation of time is 18.3 seconds. The coefficient for temperature represents an increase of 3.7 degrees Celsius. The coefficient for time represents an increase of 18.3 seconds.

Subtract the mean

Each coefficient represents the expected change in the mean of the transformed response given that the predictor changes by 1.

For example, a model uses temperature in degrees Celsius and time in seconds. The coefficient for temperature represents an increase of 1 degree Celsius. The coefficient for time represents an increase of 1 second.

Divide by the standard deviation

Each coefficient represents the expected change in the mean of the transformed response given that the predictor variable changes by 1 standard deviation.

Subtract a specified value, then divide by another

Each coefficient represents the expected change in the mean of the transformed response given that the predictor variable changes by the divisor.
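The remaining standardization methods all have the form "subtract one value, then divide by another", so a single sketch covers them. The helper names, the mean of 75, and the uncoded coefficient of 0.2 are hypothetical. The standard deviation of 3.7 comes from the temperature example above. The scaling relationship in `coded_coefficient` assumes a term that enters the model only as a main effect, not through interactions or polynomial terms.

```python
def code_value(x, subtract=0.0, divide_by=1.0):
    """General coding: (x - subtract) / divide_by."""
    return (x - subtract) / divide_by

def coded_coefficient(uncoded_coef, divide_by):
    """For a main-effect term, one coded unit equals `divide_by` uncoded
    units, so the coded coefficient is the uncoded coefficient * divide_by."""
    return uncoded_coef * divide_by

mean_temp, sd_temp = 75.0, 3.7   # the mean is a hypothetical value

# Raising temperature by one standard deviation raises the coded value by 1.
step = code_value(mean_temp + sd_temp, mean_temp, sd_temp) - code_value(mean_temp, mean_temp, sd_temp)
print(f"Change in coded units: {step:.3f}")

# Hypothetical uncoded coefficient of 0.2 per degree Celsius.
print(f"Coded coefficient: {coded_coefficient(0.2, sd_temp):.2f}")
```

Subtracting a value shifts where zero falls on the coded scale but does not change the size of a coded unit. Only the divisor changes what a one-unit change means.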

Interpretation for the logit link function

The logit link function provides the most natural interpretation of the estimated coefficients and is therefore the default link in Minitab. For the logit link function, the transformed response variable is the natural log of the odds for the event. A summary of the interpretations for the different standardization methods follows.

Specify low and high levels to code as −1 and +1

Each coefficient represents the expected change in the mean of the transformed response given that the predictor changes by 1 unit on the coded scale.

For example, a model uses temperature in degrees Celsius. The coding makes 0 correspond to 50 degrees Celsius and 1 correspond to 100 degrees Celsius. The coefficient for temperature represents an increase of 50 degrees Celsius. The coefficient for temperature is 1.8. When temperature increases by 1 coded unit, temperature increases by 50 degrees Celsius and the natural log of the odds increases by 1.8.

Subtract the mean, then divide by the standard deviation

Each coefficient represents the expected change in the natural log for the odds of the event given that the predictor variable changes by 1 standard deviation.

For example, a model uses temperature in degrees Celsius. The standard deviation of temperature is 3.7 degrees Celsius. The coded coefficient for temperature is 1.4. When temperature increases by 1 coded unit, temperature increases by 3.7 degrees Celsius and the natural log of the odds increases by 1.4.

Subtract the mean

Each coefficient represents the expected change in the natural log for the odds of the event given that the predictor changes by 1.

For example, a model uses temperature in degrees Celsius. The coefficient for temperature represents an increase of 1 degree Celsius. The coefficient for temperature is 2.3. When temperature increases by 1 coded unit, temperature increases by 1 degree Celsius and the natural log of the odds increases by 2.3.

Divide by the standard deviation

Each coefficient represents the expected change in the natural log for the odds of the event given that the predictor variable changes by 1 standard deviation.

For example, a model uses temperature in degrees Celsius. The standard deviation of temperature is 3.7 degrees Celsius. The coefficient for temperature is 1.4. When temperature increases by 1 coded unit, temperature increases by 3.7 degrees Celsius and the natural log of the odds increases by 1.4.

Subtract a specified value, then divide by another

Each coefficient represents the expected change in the natural log for the odds of the event given that the predictor variable changes by the divisor.

For example, a model uses length in meters and electric current in amperes. The divisor is 1,000. The coefficient for length represents an increase of 1,000 meters (1 kilometer). The coefficient for length is 5.6. When length increases by 1 coded unit, the length increases by 1 kilometer and the natural log of the odds increases by 5.6. The coefficient for electric current represents an increase of 1,000 amperes (1 kiloampere).

Regression Equation

For Poisson regression, Minitab shows two types of regression equations. The first equation relates the number of events to the transformed response. The form of the first equation depends on the link function.

The second equation relates the predictors to the transformed response. If the model contains both continuous and categorical predictors, the second equation can be separated for each combination of categories. For more information on how to choose the number of equations to display, go to Select the results to display for Fit Poisson Model.

Interpretation

Use the equations to examine the relationship between the response and the predictor variables.

For example, a model to predict the number of discoloration defects in resin parts contains these terms:

Size of Screw
Temperature

Because the model uses the natural log link function, the first equation gives the number of events as the exponential of the transformed response, Y'.

The second set of equations shows how the size of the screw and the temperature are related to the transformed response. When the size of the screw is large, the coefficient for temperature is about −0.003. When the size of the screw is small, the coefficient is about −0.0005. For these equations, the higher the temperature, the fewer defects occur. However, temperature has a stronger effect on the number of defects when the size of the screw is large.

Regression Equation

Discoloration Defects = exp(Y')

Size of Screw
large    Y' = 4.649 - 0.003285 Temperature
small    Y' = 4.105 - 0.000481 Temperature
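To see how the two types of equations work together, the sketch below evaluates the example equations: it first computes the transformed response Y' for a screw size and temperature, then applies exp(Y') to get the mean number of discoloration defects. The coefficients are taken from the output above. The function name and the temperature values are hypothetical.

```python
import math

# (constant, temperature coefficient) for each screw size, from the output above.
EQUATIONS = {
    "large": (4.649, -0.003285),
    "small": (4.105, -0.000481),
}

def predicted_defects(size_of_screw, temperature):
    """Mean number of discoloration defects: exp(Y')."""
    constant, temp_coef = EQUATIONS[size_of_screw]
    y_prime = constant + temp_coef * temperature   # second equation
    return math.exp(y_prime)                       # first equation

# Hypothetical temperatures, for illustration only.
for temperature in (50, 100, 150):
    large = predicted_defects("large", temperature)
    small = predicted_defects("small", temperature)
    print(f"Temperature {temperature}: large = {large:.1f}, small = {small:.1f}")
```

Both predictions fall as temperature rises, and the prediction for large screws falls faster, which reflects the larger negative temperature coefficient.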

If your model is nonhierarchical and you standardized the continuous predictors, the regression equation is in coded units. For more information, see the section on Coded Coefficients. For more information about hierarchy, go to What are hierarchical models?.