Use binary logistic regression analysis to evaluate multiple process inputs when the process output is defined as the probability of a desired outcome. The experiment can be either a designed experiment using controlled factors or an uncontrolled experiment. Binary logistic regression analysis turns the discrete responses of good/bad, yes/no, or buy/don't buy into a continuous % likelihood of yes, good, or buy, then uses regression methods to build a predictive model.

Answers the questions:

- Which process inputs have the largest effects on the odds of a desirable outcome (which inputs are the key inputs)?
- What effect does a change in an input have on the odds of a desirable outcome?
- What settings of the key inputs would result in the optimal process output?
- What is the equation (Y = f(X)) relating the process output to the settings of the inputs?
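
As an illustration of the last two questions, the fitted equation of a binary logistic model is linear in the log-odds, so a predicted probability and the effect of a change in one input can be sketched as follows. The coefficient values and the function names are hypothetical and chosen only for this example; they are not output from any real analysis.

```python
import math

def predicted_probability(intercept, coefficients, inputs):
    """Probability of the desired outcome from a logit model.

    log-odds = b0 + b1*x1 + ... + bk*xk, and p = 1 / (1 + exp(-log-odds)).
    """
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, inputs))
    return 1.0 / (1.0 + math.exp(-log_odds))

def odds_ratio(coefficient, change=1.0):
    """Multiplicative change in the odds when one input moves by `change` units."""
    return math.exp(coefficient * change)

# Hypothetical coefficients (assumed for illustration only).
b0 = -2.0
b = [0.8, -0.5]

p_before = predicted_probability(b0, b, [3.0, 1.0])
p_after = predicted_probability(b0, b, [4.0, 1.0])  # first input raised by 1 unit

print(f"P(desired outcome) before: {p_before:.3f}")
print(f"P(desired outcome) after:  {p_after:.3f}")
print(f"Odds ratio for a 1-unit increase in the first input: {odds_ratio(b[0]):.3f}")
```

The odds ratio depends only on the coefficient, not on the starting settings, which is why the questions above are phrased in terms of odds rather than probability.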

When to Use | Purpose |
---|---|
Mid-project | Evaluate the effects of multiple process inputs on a process output, which is defined as the probability of one of two possible outcomes (good/bad, yes/no, or buy/don't buy). |
Mid-project | Determine which inputs are the key inputs. |
Mid-project | Build a predictive model using the key inputs. |
Mid-project | Determine the settings of the key inputs that will result in the optimal process output. |

Your response (Y) data must be discrete with exactly two levels. Your inputs (Xs) can be categorical or numeric.

- You should take samples across the entire inference space.
- It is assumed that you are presenting a well-fitting model that predicts well and has no significant outliers.
- Various goodness-of-fit tests are presented; a high p-value (above your chosen significance level) means the test found no evidence of lack of fit, which indicates a good model.
- In the measures of association table, a high % of concordance indicates a good fit.
- Two charts evaluate outliers: delta chi-sq versus probability (using a suggested upper limit of 3.84) and delta chi-sq versus leverage (with a limit of the smaller of 0.99 or 3p/n, where p is the number of coefficients including the constant and n is the number of covariate groups).
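
The two outlier limits described above can be sketched in a few lines. This is a minimal example, assuming you have already exported the delta chi-square and leverage values from your analysis with one value per covariate group; the function names and the sample values are hypothetical.

```python
DELTA_CHI_SQ_LIMIT = 3.84  # suggested upper limit for delta chi-square

def leverage_limit(num_coefficients, num_covariate_groups):
    """Smaller of 0.99 or 3p/n, where p counts the constant and n is the number of covariate groups."""
    return min(0.99, 3 * num_coefficients / num_covariate_groups)

def flag_outliers(delta_chi_sq, leverage, num_coefficients):
    """Return indexes of covariate groups that exceed either limit.

    Assumes both lists hold one entry per covariate group, in the same order.
    """
    limit = leverage_limit(num_coefficients, len(delta_chi_sq))
    return [
        i
        for i, (d, h) in enumerate(zip(delta_chi_sq, leverage))
        if d > DELTA_CHI_SQ_LIMIT or h > limit
    ]

# Made-up diagnostic values for three covariate groups and a two-coefficient model.
flagged = flag_outliers([1.0, 5.2, 0.4], [0.10, 0.20, 0.995], num_coefficients=2)
print("Covariate groups to investigate:", flagged)
```

A flagged group is a candidate for investigation, not an automatic deletion.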

- Binary logistic regression analysis usually requires relatively large sample sizes. Evaluate the required sample size before creating a binary logistic regression design. The recommended minimum sample size for each possible response (for example, # of fail or # of red) is at least 10 times the number of coefficients in your prediction equation.
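
The sample size guideline above can be checked before data collection with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, and whether the constant is counted among the coefficients is left to the caller because the guideline does not specify it.

```python
def minimum_count_per_outcome(num_coefficients, multiplier=10):
    """Recommended minimum number of observations for each possible response."""
    return multiplier * num_coefficients

def sample_size_is_adequate(outcome_counts, num_coefficients):
    """True if every response level meets the 10-times-coefficients guideline.

    outcome_counts maps each response level to its expected or observed count.
    """
    required = minimum_count_per_outcome(num_coefficients)
    return all(count >= required for count in outcome_counts.values())

# Illustrative counts: a model with 3 coefficients needs at least 30 of each outcome.
planned = {"pass": 180, "fail": 25}
print("Required per outcome:", minimum_count_per_outcome(3))
print("Adequate:", sample_size_is_adequate(planned, 3))
```

Because the rarer outcome usually drives the requirement, a low failure rate can push the total sample size far above the per-outcome minimum.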

- Verify that the measurement systems for the Y data and the process inputs are adequate.
- Develop a data collection strategy (who should collect the data, where and when to collect it, the precision required, how to record the data, and so on).
- Enter your response (Y) data into a single column.
- In additional columns, enter the data for each input (X), then include each input in the model in the Minitab command. Note: If a variable is discrete or categorical, you must also enter it as a factor.
- As with all regression analyses, you should reduce the model using p-values to obtain a final reduced model.

For more information, go to Insert an analysis capture tool.