All statistics for Predict

Regression equation

Use the regression equation to describe the relationship between the response and the terms in the model. The regression equation is an algebraic representation of the regression line. For a linear model with a single term, the regression equation takes the following form: Y = b0 + b1x1. In the regression equation, Y is the response variable, b0 is the constant or intercept, b1 is the estimated coefficient for the linear term (also known as the slope of the line), and x1 is the value of the term.

The regression equation with more than one term takes the following form:

y = b0 + b1X1 + b2X2 + ... + bkXk

In the regression equation, the letters represent the following:
  • y is the response variable
  • b0 is the constant
  • b1, b2, ..., bk are the coefficients
  • X1, X2, ..., Xk are the values of the terms. Each term can be a single predictor, a polynomial term, or an interaction term.

Minitab uses the equation and the variable settings to calculate the fit.
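
As a minimal sketch of how a fit follows from the equation, the Python function below evaluates b0 + b1X1 + ... + bkXk for one set of term values. The function name fitted_value and all coefficient values are hypothetical and chosen only for illustration; this is not Minitab's implementation.

```python
def fitted_value(b0, coefficients, term_values):
    """Evaluate y = b0 + b1*X1 + ... + bk*Xk for one set of term values."""
    if len(coefficients) != len(term_values):
        raise ValueError("need exactly one term value per coefficient")
    return b0 + sum(b * x for b, x in zip(coefficients, term_values))


# Hypothetical model with two predictors and their interaction:
# y = 1.0 + 2.0*X1 + 0.5*X2 + 0.25*(X1*X2)
x1, x2 = 4.0, 2.0
terms = [x1, x2, x1 * x2]  # the interaction term is the product of its predictors
fit = fitted_value(1.0, [2.0, 0.5, 0.25], terms)
print(fit)
```

The variable settings determine the term values: a polynomial or interaction term is computed from the predictor settings before it is multiplied by its coefficient.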

Variable settings

Minitab uses the regression equation and the variable settings to calculate the fit. If the variable settings are unusual compared to the data that was used to estimate the model, a warning is displayed below the prediction.
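
The criterion that Minitab applies to decide whether settings are unusual is not described here. As one simple illustration of the idea, and only under that assumption, the sketch below flags any predictor whose setting falls outside the range of the data used to estimate the model. The function name, predictor names, and data values are hypothetical.

```python
def settings_outside_data_range(settings, data):
    """Return the names of predictors whose setting lies outside the observed range."""
    flagged = []
    for name, value in settings.items():
        observed = data[name]
        if value < min(observed) or value > max(observed):
            flagged.append(name)
    return flagged


# Hypothetical data used to estimate a model, and hypothetical variable settings.
data = {
    "x1": [8.0, 12.5, 17.0, 21.0, 24.5],
    "x2": [0.5, 0.6, 0.7, 0.8, 0.9],
}
settings = {"x1": 30.0, "x2": 0.65}

flagged = settings_outside_data_range(settings, data)
print(flagged)
```

A range check of this kind looks at one predictor at a time, so it does not detect a combination of settings that is unusual even though each value is within its own range.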

Use the variable settings table to verify that you performed the analysis as you intended.

Fit

Fitted values are also called fits or ŷ (y-hat). The fitted values are point estimates of the mean response for given values of the predictors. The values of the predictors are also called x-values. Minitab uses the regression equation and the variable settings to calculate the fit.

The type of fitted values that Minitab displays depends on the type of response variable in your model. For instance, Minitab displays means, probabilities, or standard deviations depending on whether you have continuous or count measurements, binary data, or models that use Analyze Variability.

Interpretation

Fitted values are calculated by entering x-values into the model equation for a response variable.

For example, if the equation is y = 5 + 10x, the fitted value for the x-value, 2, is 25 (25 = 5 + 10(2)).

SE Fit

The standard error of the fit (SE fit) estimates the variation in the estimated mean response for the specified variable settings. The calculation of the confidence interval for the mean response uses the standard error of the fit. Standard errors are always non-negative.

Interpretation

Use the standard error of the fit to measure the precision of the estimate of the mean response. The smaller the standard error, the more precise the predicted mean response. For example, an analyst develops a model to predict delivery time. For one set of variable settings, the model predicts a mean delivery time of 3.80 days. The standard error of the fit for these settings is 0.08 days. For a second set of variable settings, the model produces the same mean delivery time with a standard error of the fit of 0.02 days. The analyst can be more confident that the mean delivery time for the second set of variable settings is close to 3.80 days.

With the fitted value, you can use the standard error of the fit to create a confidence interval for the mean response. For example, depending on the number of degrees of freedom, a 95% confidence interval extends approximately two standard errors above and below the predicted mean. For the delivery times, the 95% confidence interval for the predicted mean of 3.80 days when the standard error is 0.08 is (3.64, 3.96) days. You can be 95% confident that the population mean is within this range. When the standard error is 0.02, the 95% confidence interval is (3.76, 3.84) days. The confidence interval for the second set of variable settings is narrower because the standard error is smaller.
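
The delivery-time intervals can be reproduced with a short Python sketch. It uses the approximate multiplier of 2 from the text; the exact multiplier is a t critical value that depends on the degrees of freedom, which are not given here. The function name approximate_ci is hypothetical.

```python
def approximate_ci(fit, se_fit, multiplier=2.0):
    """Return (lower, upper) as fit -/+ multiplier * se_fit.

    The default multiplier of 2 is the rough 95% value mentioned in the text,
    not an exact t critical value.
    """
    half_width = multiplier * se_fit
    return (fit - half_width, fit + half_width)


# Predicted mean delivery time of 3.80 days with two different standard errors.
lower_a, upper_a = approximate_ci(3.80, 0.08)
lower_b, upper_b = approximate_ci(3.80, 0.02)

print(round(lower_a, 2), round(upper_a, 2))  # 3.64 3.96
print(round(lower_b, 2), round(upper_b, 2))  # 3.76 3.84
```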

95% CI

The confidence interval for the fit provides a range of likely values for the mean response given the specified settings of the predictors.

Interpretation

Use the confidence interval to assess the estimate of the fitted value for the specified values of the variables.

For example, with a 95% confidence level, you can be 95% confident that the confidence interval contains the population mean for the specified values of the variables in the model. The confidence interval helps you assess the practical significance of your results. Use your specialized knowledge to determine whether the confidence interval includes values that have practical significance for your situation. A wide confidence interval indicates that you can be less confident about the mean of future values. If the interval is too wide to be useful, consider increasing your sample size.

95% PI

The prediction interval is a range that is likely to contain a single future response for a selected combination of variable settings.

Interpretation

Use the prediction intervals (PI) to assess the precision of the predictions. The prediction intervals help you assess the practical significance of your results. If a prediction interval extends outside of acceptable boundaries, the predictions might not be sufficiently precise for your requirements.

With a 95% PI, you can be 95% confident that a single response will be contained in the interval given the settings of the predictors that you specified. The prediction interval is always wider than the confidence interval because of the added uncertainty involved in predicting a single response versus the mean response.
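
To see why the prediction interval is wider, consider an ordinary least-squares model with a continuous response. There, the standard error for a single new response combines the standard error of the fit with the residual standard deviation S, as sqrt(S² + (SE fit)²). The Python sketch below compares the two intervals under that assumption. The function names and all input values, including the t critical value, are hypothetical placeholders.

```python
import math


def confidence_interval(fit, se_fit, t_crit):
    """Interval for the mean response: fit -/+ t_crit * se_fit."""
    half = t_crit * se_fit
    return (fit - half, fit + half)


def prediction_interval(fit, se_fit, s, t_crit):
    """Interval for a single new response.

    Adds the residual variance s**2 to the variance of the fit, which
    accounts for the scatter of individual responses around the mean.
    """
    half = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
    return (fit - half, fit + half)


# Hypothetical inputs: fitted value, SE fit, residual standard deviation S,
# and a t critical value for a 95% interval.
fit, se_fit, s, t_crit = 50.0, 1.5, 6.0, 2.05

ci = confidence_interval(fit, se_fit, t_crit)
pi = prediction_interval(fit, se_fit, s, t_crit)
ci_width = ci[1] - ci[0]
pi_width = pi[1] - pi[0]
print(ci_width < pi_width)  # True
```

Because S² is added under the square root, the prediction interval from this formula is wider than the confidence interval whenever S is greater than zero.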

For example, a materials engineer at a furniture manufacturer develops a simple regression model to predict the stiffness of particleboard from the density of the board. The engineer verifies that the model meets the assumptions of the analysis. Then, the engineer uses the model to predict the stiffness.

The regression equation predicts that the stiffness for a new observation with a density of 25 is -21.53 + 3.541*25, or 66.995. Although such an observation is unlikely to have a stiffness of exactly 66.995, the prediction interval indicates that the engineer can be 95% confident that the actual value will be between approximately 48 and 86.