# Simple Regression

## Summary

Simple regression provides a mathematical method for establishing the best-fit straight-line equation relating a process output Y to a process input X. It allows you to predict the value of the output Y for any value of the input X within the sampled range.

Answers the questions:

• Does a linear relationship exist between two variables (usually a process output Y and a process input X)?
• What is the equation (Y = f(X)) for the relationship between Y and X?
• How much of the variation in the output Y can be explained by varying the input X?
• What value of the process input X results in the optimal process output Y?
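
The questions above correspond to the standard least-squares quantities, sketched here for reference. The symbols $b_0$, $b_1$, $\hat{y}_i$, $\bar{x}$, and $\bar{y}$ are notation introduced for this sketch and do not appear in the original text.

$$
\hat{Y} = b_0 + b_1 X, \qquad
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
b_0 = \bar{y} - b_1 \bar{x}
$$

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
$$

Here $R^2$ is the fraction of the variation in Y explained by X. Because a straight line with nonzero slope is either increasing or decreasing in X, the best predicted Y within the sampled range falls at one end of that range.
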

| When to Use | Purpose |
| --- | --- |
| Start of project | Can be very useful for comparing a proposed or existing gage to a highly qualified device (such as a certified lab). A high r-squared value provides reasonable certainty that the test gage matches the reference gage. |
| Start of project | Assists in developing alternative measurement systems in cases when a variable is difficult or expensive to measure; highly correlated and logically linked alternative variables can be used as substitute variables. |
| Mid-project | Investigate the relationship between a process input and the process output, either keeping the input as a potential leverage variable or setting it aside as most likely not important. |
| Mid-project | Find the optimal setting of the input variable. |
| End of project | If used earlier as part of the validation of the measurement system, it should be reapplied to the improved process to again validate the measurement system. |

### Data

Continuous Y, numeric X

## How-To

1. Verify the measurement systems for the Y data and the input X are adequate.
2. Develop a data collection strategy (who should collect the data, as well as where and when; how many data values are needed; the required precision of the data; how to record the data, and so on).
3. Enter the Y data into a single column. These are the response data.
4. In a second column, enter the input (X) data. These are the predictor data.
5. In Minitab, choose Stat > Regression > Regression or Stat > Regression > Fitted Line Plot (discussed separately).
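
For readers who want to see what the fit computes, the two-column layout from steps 3 and 4 can be mirrored outside Minitab. The following is a minimal Python sketch, not Minitab's implementation: the names `y_col`, `x_col`, and `fit_line` are hypothetical, and the data values are made up purely for illustration.

```python
# Illustrative data only (not from any real process).
y_col = [2.1, 3.9, 6.2, 7.8, 10.1]   # response (Y) column
x_col = [1.0, 2.0, 3.0, 4.0, 5.0]    # predictor (X) column


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared: share of the total variation in y explained by the line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - sse / sst
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(x_col, y_col)
print(f"Y = {intercept:.3f} + {slope:.3f} X, R-squared = {r_squared:.4f}")
```

The function returns the same three quantities a regression report centers on (slope, intercept, and R-squared), so it can serve as a quick cross-check on small data sets.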

## Guidelines

• Take samples across the entire inference space.
• Do not extrapolate; do not use the equation to predict Y values outside the range of sampled X's.
• Check for possible outliers in the unusual observations table (Session Window output).
• The residuals must be independent, be reasonably normal, and have reasonably equal variances. Simple regression is quite robust to nonnormality. For simple regression, the residuals are usually analyzed by a histogram, normal probability plot, residuals versus fits, and residuals versus order, which can be run at one time using the Four in one option.
• When evaluating only two variables, consider the Fitted Line Plot: it provides both the graphical output (from which you can easily identify outliers) and the ability to quickly model quadratic and cubic relationships.
• If you have discrete numeric data from which you can obtain every equally spaced value and you have measured at least 10 possible values, you can evaluate these data as if they are continuous.
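
Two of the guidelines above, avoiding extrapolation and checking for outliers, can be sketched numerically. The Python below is a rough illustration under stated assumptions: the function names are hypothetical, the data are made up, the residuals are scaled only by the residual standard deviation (a simplification that ignores leverage, so the values will differ from Minitab's standardized residuals), and the cutoff of 2 is a common rule of thumb rather than a fixed rule. It does not replace the residual plots.

```python
import math


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict_within_range(x_new, x, slope, intercept):
    """Predict Y, refusing to extrapolate beyond the sampled X range."""
    if not min(x) <= x_new <= max(x):
        raise ValueError(
            f"x = {x_new} is outside the sampled range [{min(x)}, {max(x)}]"
        )
    return intercept + slope * x_new


def scaled_residuals(x, y, slope, intercept):
    """Residuals divided by the residual standard deviation (n - 2 df).

    Simplified: no leverage adjustment.
    """
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(r * r for r in residuals) / (len(x) - 2))
    return [r / s for r in residuals]


# Illustrative data only; the seventh point is deliberately off the line.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.0, 4.1, 5.9, 8.0, 10.1, 12.0, 18.0, 16.1, 17.9, 20.0]

slope, intercept = fit_line(x, y)
scaled = scaled_residuals(x, y, slope, intercept)

# Flag observations whose scaled residual exceeds the rule-of-thumb cutoff.
flagged = [i for i, r in enumerate(scaled) if abs(r) > 2.0]
print("Possible outliers at row indexes:", flagged)
```

A flagged row is a prompt to investigate the observation, not a reason to delete it automatically.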