Find definitions and interpretation guidance for every statistic in the Coefficients table.

A regression coefficient describes the size and direction of the relationship between a predictor and the response variable. Coefficients are the numbers by which the values of the term are multiplied in a regression equation.

The interpretation of each coefficient depends on whether it is the coefficient for the continuous time variable or a coefficient for the categorical batch factor.

- Time: The coefficient for the time variable represents the change in the mean response for a one-unit change in time. If the coefficient is negative, the mean response decreases as time passes. If the coefficient is positive, the mean response increases as time passes.
- Batch: A coefficient is listed for every level of the batch factor except one. The omitted level is the reference level for the batch factor. Each coefficient represents the difference between the mean for that level and the mean for the reference level.

In the presence of interactions, the interpretation of the coefficients is more complex because the effect of one term depends on the value of another.

In these results, a quality engineer wants to estimate the shelf life of a new drug. The negative coefficient for Batch 1 indicates that, at time zero, the drug in Batch 1 has less potency than the drug in the reference level, which is Batch 6. However, the coefficient for the Month by Batch interaction for Batch 1 is positive. The effect of time depends on the batch, so the difference between Batch 1 and Batch 6 changes over time.

Coefficients

Term | Coef | SE Coef | T-Value | P-Value | VIF |
---|---|---|---|---|---|
Constant | 100.085 | 0.143 | 701.82 | 0.000 | |
Month | -0.13633 | 0.00769 | -17.74 | 0.000 | 1.07 |
Batch | | | | | |
1 | -0.232 | 0.292 | -0.80 | 0.432 | 3.85 |
2 | 0.068 | 0.292 | 0.23 | 0.818 | 3.85 |
3 | 0.394 | 0.275 | 1.43 | 0.162 | 3.41 |
4 | -0.317 | 0.292 | -1.08 | 0.287 | 3.85 |
5 | 0.088 | 0.275 | 0.32 | 0.752 | * |
Month*Batch | | | | | |
1 | 0.0454 | 0.0164 | 2.76 | 0.010 | 4.52 |
2 | -0.0241 | 0.0164 | -1.47 | 0.152 | 4.52 |
3 | -0.0267 | 0.0136 | -1.96 | 0.060 | 3.65 |
4 | 0.0014 | 0.0164 | 0.08 | 0.935 | 4.52 |
5 | 0.0040 | 0.0136 | 0.30 | 0.769 | * |
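To make the interaction concrete, the following Python sketch evaluates the fitted lines for Batch 1 and for the reference level, Batch 6, at a few months. It assumes the reference-level coding that the text describes, so Batch 6 contributes 0 to both the intercept and the slope. The function and constant names are illustrative and are not part of Minitab.

```python
# Coefficients copied from the table (Batch 1 rows and the shared terms).
CONSTANT = 100.085
MONTH = -0.13633
BATCH_1 = -0.232
MONTH_BY_BATCH_1 = 0.0454


def fitted_mean(month, batch_coef=0.0, interaction_coef=0.0):
    """Fitted mean response at `month` for one batch.

    Assumes reference-level coding: the reference batch (Batch 6)
    uses 0.0 for both batch-specific coefficients.
    """
    intercept = CONSTANT + batch_coef
    slope = MONTH + interaction_coef
    return intercept + slope * month


def batch_1_minus_reference(month):
    """Difference between the Batch 1 and Batch 6 fitted means at `month`."""
    return fitted_mean(month, BATCH_1, MONTH_BY_BATCH_1) - fitted_mean(month)


# Month at which the two fitted lines cross (difference equals zero).
crossover_month = -BATCH_1 / MONTH_BY_BATCH_1

for month in (0, 3, 6, 12):
    diff = batch_1_minus_reference(month)
    print(f"month {month:>2}: Batch 1 - Batch 6 = {diff:+.3f}")
print(f"fitted lines cross near month {crossover_month:.2f}")
```

Under these assumptions the difference starts negative and shrinks as the months pass, which is why the Batch 1 coefficient alone does not describe how the two batches compare at later times.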

The size of the coefficient is usually a good way to assess the practical significance of the effect that a term has on the response variable. However, the size of the coefficient does not indicate whether a term is statistically significant because the calculations for significance also consider the variation in the response data. To determine statistical significance, examine the p-value for the term.

The standard error of the coefficient quantifies the uncertainty that arises from estimating the coefficient from sample data.

Use the standard error of the coefficient to measure the precision of the estimate of the coefficient. The smaller the standard error, the more precise the estimate. Dividing the coefficient by its standard error calculates a t-value. If the p-value associated with this t-statistic is less than your significance level (denoted as alpha or α), you conclude that the coefficient is statistically significant.

The t-value is the ratio of the coefficient to its standard error.
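As a quick check of this ratio, the sketch below recomputes the t-value for three rows of the Coefficients table. The helper name `t_value` is illustrative. Because the table prints rounded coefficients and standard errors, the recomputed values can differ from the printed T-Value column in the second decimal place.

```python
def t_value(coef, se_coef):
    """Return the t-value: the coefficient divided by its standard error."""
    return coef / se_coef


# (term, Coef, SE Coef, T-Value as printed in the table)
rows = [
    ("Month", -0.13633, 0.00769, -17.74),
    ("Batch 1", -0.232, 0.292, -0.80),
    ("Month*Batch 1", 0.0454, 0.0164, 2.76),
]

for term, coef, se_coef, printed in rows:
    computed = t_value(coef, se_coef)
    print(f"{term:<14} computed t = {computed:8.2f}   table t = {printed:8.2f}")
```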

Minitab uses the t-value to calculate the p-value, which you use to test whether the coefficient is significantly different from 0.

You can use the t-value to determine whether to reject the null hypothesis. However, the p-value is used more often because the threshold for the rejection of the null hypothesis does not depend on the degrees of freedom. For more information on using the t-value, go to Using the t-value to determine whether to reject the null hypothesis.

The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.

The null hypothesis for each term is that the term's coefficient is equal to zero. For a stability study, the coefficients table contains only terms with p-values less than the significance level for the analysis. The default significance level is 0.25. A significance level of 0.25 indicates a 25% risk of concluding that an association exists when there is no actual association.

If a model term is statistically significant, the interpretation depends on the type of term:

- If time is significant, then the response changes over time.
- If batch is significant, then the mean response is different in different batches.
- If the time by batch interaction is significant, then how fast the response changes over time depends on the batch.

These confidence intervals (CI) are ranges of values that are likely to contain the true value of the coefficient for each term in the model.

Because samples are random, two samples from a population are unlikely to yield identical confidence intervals. However, if you take many random samples, a certain percentage of the resulting confidence intervals contain the unknown population parameter. The percentage of these confidence intervals that contain the parameter is the confidence level of the interval.
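The repeated-sampling idea can be illustrated with a small simulation. This sketch is a simplification: it builds intervals for a population mean with a known standard deviation rather than for a regression coefficient, and the population values, sample size, and function name are hypothetical choices made only for illustration.

```python
import random
from statistics import NormalDist, mean


def coverage(confidence=0.95, true_mean=100.0, sigma=1.0,
             sample_size=30, trials=2000, seed=0):
    """Fraction of simulated intervals that contain `true_mean`."""
    rng = random.Random(seed)
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / sample_size ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(sample_size)]
        estimate = mean(sample)
        if estimate - margin <= true_mean <= estimate + margin:
            hits += 1
    return hits / trials


print(f"share of 95% intervals containing the true mean: {coverage():.3f}")
```

The printed share should land close to the requested confidence level, and it moves closer as the number of trials grows.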

The confidence interval is composed of the following two parts:

- Point estimate: This single value estimates a population parameter by using your sample data. The confidence interval is centered around the point estimate.
- Margin of error: The margin of error defines the width of the confidence interval and is determined by the observed variability in the sample, the sample size, and the confidence level. To calculate the upper limit of the confidence interval, the margin of error is added to the point estimate. To calculate the lower limit of the confidence interval, the margin of error is subtracted from the point estimate.

Use the confidence interval to assess the estimate of the population coefficient for each term in the model.

For example, with a 95% confidence level, you can be 95% confident that the confidence interval contains the value of the coefficient for the population. The confidence interval helps you assess the practical significance of your results. Use your specialized knowledge to determine whether the confidence interval includes values that have practical significance for your situation. If the interval is too wide to be useful, consider increasing your sample size.
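The point estimate and margin of error described above can be combined as in the following sketch, which uses the Month row of the Coefficients table. It makes one labeled assumption: the error degrees of freedom are not shown in the table, so a standard normal critical value stands in for the t-distribution critical value that a regression analysis would normally use. The resulting interval is therefore only approximate, and the function name is illustrative.

```python
from statistics import NormalDist


def confidence_interval(coef, se_coef, confidence=0.95):
    """Return (lower, upper) for a coefficient using a normal critical value.

    Assumption: the normal distribution stands in for the t-distribution
    because the error degrees of freedom are not shown in the table.
    """
    critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = critical * se_coef  # margin of error
    return coef - margin, coef + margin


lower, upper = confidence_interval(-0.13633, 0.00769)  # Month row
print(f"approximate 95% CI for Month: ({lower:.4f}, {upper:.4f})")
```

With few error degrees of freedom, the t-based interval would be somewhat wider than this approximation.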

The variance inflation factor (VIF) indicates how much the variance of a coefficient is inflated due to the correlations among the predictors in the model.

Use the VIF to describe how much multicollinearity (which is correlation between predictors) exists in a regression analysis. Multicollinearity is problematic because it can increase the variance of the regression coefficients, making it difficult to evaluate the individual impact that each of the correlated predictors has on the response.

Use the following guidelines to interpret the VIF:

VIF | Status of predictor |
---|---|
VIF = 1 | Not correlated |
1 < VIF < 5 | Moderately correlated |
VIF > 5 | Highly correlated |

A VIF value greater than 5 suggests that the regression coefficient is poorly estimated due to severe multicollinearity.
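For reference, the VIF for a predictor is commonly computed as 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors in the model. The sketch below applies that formula and maps a VIF to the status labels in the guidelines. The function names are illustrative, and the treatment of a VIF of exactly 5, which the guidelines leave unassigned, is an assumption of this sketch.

```python
def vif_from_r_squared(r_squared):
    """VIF for a predictor, given the R-squared from regressing that
    predictor on all the other predictors in the model."""
    return 1.0 / (1.0 - r_squared)


def vif_status(vif):
    """Map a VIF to the status labels in the guideline table."""
    if vif <= 1.0:
        return "Not correlated"
    if vif < 5.0:
        return "Moderately correlated"
    # Assumption: VIF = 5 exactly is grouped here; the table leaves it unassigned.
    return "Highly correlated"


# VIF values taken from the Coefficients table.
for term, vif in [("Month", 1.07), ("Batch 1", 3.85), ("Month*Batch 1", 4.52)]:
    print(f"{term:<14} VIF = {vif:.2f}: {vif_status(vif)}")
```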

For more information on multicollinearity and how to mitigate the effects of multicollinearity, see Multicollinearity in regression.