# Binary logistic regression analysis

Use binary logistic regression analysis to evaluate multiple process inputs when the process output is defined as the probability of a desired outcome. The experiment can be either a designed experiment using controlled factors or an uncontrolled experiment. Binary logistic regression analysis turns the discrete responses of good/bad, yes/no, or buy/don't buy into a continuous % likelihood of yes, good, or buy, then uses regression methods to build a predictive model.

Binary logistic regression analysis answers the following questions:

• Which process inputs have the largest effects on the odds of a desirable outcome (which inputs are the key inputs)?
• What effect does a change in an input have on the odds of a desirable outcome?
• What settings of the key inputs would result in the optimal process output?
• What is the equation (Y = f(X)) relating the process output to the settings of the inputs?

| When to Use | Purpose |
| --- | --- |
| Mid-project | Evaluate the effects of multiple process inputs on a process output, which is defined as the probability of one of two possible outcomes (good/bad, yes/no, or buy/don't buy). |
| Mid-project | Determine which inputs are the key inputs. |
| Mid-project | Build a predictive model using the key inputs. |
| Mid-project | Determine the settings of the key inputs that will result in the optimal process output. |
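To make the Y = f(X) relationship and the idea of "effect on the odds" concrete, the following is a minimal Python sketch of how a fitted logistic equation turns input settings into a probability. The coefficient values, the input names, and the helper functions `predicted_probability`, `odds`, and `odds_ratio` are hypothetical and chosen only for illustration; they do not come from any real analysis.

```python
import math

def predicted_probability(intercept, coefficients, inputs):
    """Return P(event) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(b * x for b, x in zip(coefficients, inputs))
    return 1.0 / (1.0 + math.exp(-linear_predictor))

def odds(probability):
    """Convert a probability into odds, p / (1 - p)."""
    return probability / (1.0 - probability)

def odds_ratio(coefficient, change=1.0):
    """Multiplicative change in the odds when one input changes by `change` units."""
    return math.exp(coefficient * change)

# Hypothetical fitted model: intercept, then one numeric input and one 0/1 indicator.
intercept = -4.0
coefficients = [0.05, 1.2]

# Compare two settings of the numeric input with the indicator held at 0.
p_low = predicted_probability(intercept, coefficients, [60.0, 0.0])
p_high = predicted_probability(intercept, coefficients, [70.0, 0.0])

print(f"P(event) at 60: {p_low:.3f}")
print(f"P(event) at 70: {p_high:.3f}")
print(f"Odds ratio for a 10-unit increase: {odds_ratio(coefficients[0], 10.0):.3f}")
```

In this sketch, the ratio of the odds at the two settings equals `exp(coefficient * change)`, which is why each coefficient can be read as an effect on the odds rather than directly on the probability.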

## Data

The response (Y) must be discrete with exactly two levels. The inputs (Xs) can be categorical or numeric.
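If the data are prepared outside Minitab, the two-level requirement can be checked before analysis. The sketch below is one possible check in Python; the function name `encode_binary_response` and the sample labels are assumptions made for illustration.

```python
def encode_binary_response(values, event):
    """Check that the response has exactly two levels and code the event as 1."""
    levels = set(values)
    if len(levels) != 2:
        found = sorted(str(level) for level in levels)
        raise ValueError(f"Response must have exactly two levels, found: {found}")
    if event not in levels:
        raise ValueError(f"Event level {event!r} does not appear in the response")
    # 1 marks the outcome of interest, 0 marks the other outcome.
    return [1 if value == event else 0 for value in values]

# Hypothetical inspection results.
responses = ["good", "bad", "good", "good", "bad"]
print(encode_binary_response(responses, event="good"))
```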

## Guidelines

• You should take samples across the entire inference space.
• These guidelines assume that the model you present fits the data well, predicts well, and has no significant outliers.
• The output presents several goodness-of-fit tests; a high p-value indicates no evidence that the model fits the data poorly.
• In the measures of association table, a high percentage of concordant pairs indicates a good fit.
• Two charts evaluate outliers: delta chi-square versus probability (using a suggested upper limit of 3.84) and delta chi-square versus leverage (with a leverage limit of the smaller of 0.99 or 3p/n, where p is the number of coefficients including the constant and n is the number of covariate groups).
• Binary logistic regression analysis usually requires relatively large sample sizes. Evaluate the required sample size before creating a binary logistic regression design. The recommended minimum sample size for each possible response (for example, the number of fails or the number of reds) is at least 10 times the number of coefficients in your prediction equation.
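The numeric limits in these guidelines can be written as simple checks. The following Python sketch encodes the sample-size rule and the two outlier limits exactly as stated above; the function names and the example numbers are hypothetical, and flagging a point when it exceeds either limit is an assumption about how the two charts would be read together.

```python
DELTA_CHI_SQUARE_LIMIT = 3.84  # suggested upper limit from the guidelines

def minimum_count_per_outcome(n_coefficients, multiplier=10):
    """Recommended minimum count for each possible response."""
    return multiplier * n_coefficients

def sample_size_is_adequate(outcome_counts, n_coefficients):
    """True when every response level meets the recommended minimum count."""
    needed = minimum_count_per_outcome(n_coefficients)
    return all(count >= needed for count in outcome_counts.values())

def leverage_limit(n_coefficients, n_covariate_groups):
    """Smaller of 0.99 or 3p/n (p includes the constant, n = covariate groups)."""
    return min(0.99, 3 * n_coefficients / n_covariate_groups)

def flag_outliers(delta_chi_square, leverage, n_coefficients):
    """Return indices of covariate groups that exceed either limit.

    Assumes one delta chi-square value and one leverage value per covariate group.
    """
    limit = leverage_limit(n_coefficients, len(delta_chi_square))
    return [
        index
        for index, (d, h) in enumerate(zip(delta_chi_square, leverage))
        if d > DELTA_CHI_SQUARE_LIMIT or h > limit
    ]

# Hypothetical study: 3 coefficients (constant plus two inputs).
counts = {"pass": 120, "fail": 25}
print(minimum_count_per_outcome(3))
print(sample_size_is_adequate(counts, 3))
print(leverage_limit(3, 20))
```

In this hypothetical study the "fail" count falls short of the recommended minimum, so more data would be needed before fitting a three-coefficient model.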

## How-to

1. Verify that the measurement systems for the Y data and the process inputs are adequate.
2. Develop a data collection strategy (who should collect the data, where and when to collect it, the precision required, how to record the data, and so on).
3. Enter your response (Y) data into a single column.
4. In additional columns, enter the data for each input (X), then include each input in the model in the Minitab command. Note: If a variable is discrete or categorical, you must also enter it as a factor.
5. As with all regression analyses, reduce the model using p-values: remove terms that are not statistically significant and refit until you obtain a final reduced model.
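For readers who want to see what the fitting and p-value steps involve outside Minitab, the following is a minimal Python sketch. It fits a model with a constant and a single numeric input by Newton-Raphson and reports a Wald p-value for the slope. This is an illustrative technique, not a description of how Minitab computes its results; the data, the function names, and the 0.05 significance level are all assumptions.

```python
import math

def score_and_information(b0, b1, x, y):
    """Return the gradient (g0, g1) and information matrix entries (h00, h01, h11)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # predicted P(Y = 1)
        w = p * (1.0 - p)                            # binomial variance weight
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11

def fit_logistic(x, y, iterations=25):
    """Fit logit(p) = b0 + b1 * x; return (b0, b1, se_b0, se_b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1, h00, h01, h11 = score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # Newton-Raphson step: coefficients += inverse(information) * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard errors come from the inverse information at the final estimates.
    _, _, h00, h01, h11 = score_and_information(b0, b1, x, y)
    det = h00 * h11 - h01 * h01
    return b0, b1, math.sqrt(h11 / det), math.sqrt(h00 / det)

def wald_p_value(coefficient, standard_error):
    """Two-sided p-value for the Wald z statistic (normal approximation)."""
    z = coefficient / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical data: one numeric input and a 0/1 response in a single column each.
x_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_values = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1, se_b0, se_b1 = fit_logistic(x_values, y_values)
p_value = wald_p_value(b1, se_b1)
print(f"constant = {b0:.3f}, slope = {b1:.3f}, slope p-value = {p_value:.3f}")

ALPHA = 0.05  # assumed significance level for deciding whether to keep the term
print("keep input" if p_value < ALPHA else "candidate for removal")
```

With several inputs, the same kind of p-value would be computed for each term, and the least significant terms would be the candidates for removal when reducing the model.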