Binary Logistic Regression

Summary

Provides a method of evaluating multiple process inputs when the process output is defined as the probability of a desired outcome. The experiment can be either a designed experiment using controlled factors or an uncontrolled experiment. Simply put, binary logistic regression expresses the discrete responses good/bad, yes/no, or buy/don't buy as a continuous probability (percent likelihood) of yes, good, or buy, then uses regression methods to build a predictive model.

Answers the questions:
  • Which process inputs have the largest effects on the odds of a desirable outcome (which inputs are the key inputs)?
  • What effect does a change in an input have on the odds of a desirable outcome?
  • What settings of the key inputs would result in the optimal process output?
  • What is the equation (Y = f(X)) relating the process output to the settings of the inputs?
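
For reference, the Y = f(X) equation in a binary logistic regression usually takes the standard logit form sketched below. The symbols are notation chosen for this sketch rather than anything taken from the tool's output: p is the probability of the desired outcome, X_1 ... X_k are the inputs, and the betas are the fitted coefficients.

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k)}}
```

Under this form, a one-unit increase in X_j (with the other inputs held fixed) multiplies the odds p/(1-p) by e^{\beta_j}, which is how a change in an input translates into a change in the odds of a desirable outcome.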

When to Use   Purpose
Mid-project   Evaluate the effects of multiple process inputs on a process output, which is defined as the probability of one of two possible outcomes (good/bad, yes/no, or buy/don't buy).
Mid-project   Determine which inputs are the key inputs.
Mid-project   Build a predictive model using the key inputs.
Mid-project   Determine the settings of the key inputs that will result in the optimal process output.

Data

Discrete Y at two levels, categorical or numeric X's

How-To

  1. Verify that the measurement systems for the Y data and the process inputs are adequate.
  2. Develop a data collection strategy (who should collect the data, as well as where and when; the precision of the data; how to record the data, and so on).
  3. Enter your Y (response) data into a single column.
  4. In additional columns, enter the data for each input (X), then enter those inputs into the model in the Minitab command. Note: If a variable is discrete or categorical, you must also enter it as a factor.
  5. As with all regression analyses, reduce the model by removing terms with nonsignificant p-values to obtain a final reduced model.
  6. If you want to look at the predicted event probabilities, in Stat > Regression > Binary Logistic Regression, click Storage and then check Event probability.
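
Outside Minitab, the fit, p-value, and stored event probabilities from the steps above can be sketched in plain Python. This is only an illustration under stated assumptions: a single numeric input, a made-up data set, a hand-written Newton-Raphson fit, and a Wald test for the slope. The function names and data are hypothetical, and this is not a description of how Minitab computes its results.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to an event probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_one_input(x, y, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of logit(p) = b0 + b1*x.

    Returns (b0, b1, se_b1), where se_b1 is the Wald standard error of b1.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix X'WX (2 x 2, symmetric)
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Solve the 2 x 2 system (X'WX) * step = gradient.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Variance of b1 is the lower-right entry of the inverse information
    # matrix, taken at the last iterate before the final (tiny) step.
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# Hypothetical data: one response column (1 = event) and one input column.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1, se_b1 = fit_one_input(x, y)

# Two-sided Wald p-value for the slope, used when reducing the model.
z = b1 / se_b1
p_value = math.erfc(abs(z) / math.sqrt(2.0))

# Counterpart of storing "Event probability" for each row.
event_prob = [logistic(b0 + b1 * xi) for xi in x]

print("b0 =", round(b0, 4), "b1 =", round(b1, 4), "p-value =", round(p_value, 4))
print("odds ratio per unit of x =", round(math.exp(b1), 4))
```

With several inputs, the same idea applies but the 2 x 2 solve becomes a general linear solve, so a statistics package is the more practical choice.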

Guidelines

  • You should take samples across the entire inference space.
  • It is assumed that you are presenting a well-fitting model that predicts well and has no significant outliers.
    • Various goodness-of-fit tests are presented; a high p-value indicates that the test found no evidence of lack of fit.
    • In the measures of association table, a high % of concordance indicates a good fit.
    • Two charts evaluate outliers: Delta Chi-Sq vs. Probability (using a suggested upper limit of 3.84) and Delta Chi-Sq vs. Leverage (with a limit of the smaller of 0.99 or 3p/n, where p is the number of coefficients including the constant and n is the number of covariate groups).
  • Binary logistic regression usually requires relatively large sample sizes. Evaluate the required sample size before creating a binary logistic regression design. The recommended minimum number of observations for each possible response (for example, number of fails or number of reds) is 10 times the number of coefficients in your prediction equation.
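
The numeric rules in these guidelines are simple enough to check by hand; the Python sketch below just restates them. The function names and example values are hypothetical, and the concordance calculation uses one common definition (the share of event/non-event pairs in which the event observation has the higher predicted probability), which is assumed here to match the measures of association table.

```python
# Suggested upper limit for Delta Chi-Square, as quoted in the guidelines.
DELTA_CHI_SQ_LIMIT = 3.84

def leverage_limit(p, n):
    """Leverage limit: the smaller of 0.99 or 3p/n.

    p = number of coefficients including the constant,
    n = number of covariate groups.
    """
    return min(0.99, 3.0 * p / n)

def min_count_per_outcome(n_coefficients):
    """Recommended minimum observations for each possible response."""
    return 10 * n_coefficients

def percent_concordant(event_probs, outcomes):
    """Percent of (event, non-event) pairs ranked correctly by the model.

    A pair is counted as concordant when the observation that had the
    event (outcome 1) has the higher predicted event probability.
    """
    events = [p for p, o in zip(event_probs, outcomes) if o == 1]
    non_events = [p for p, o in zip(event_probs, outcomes) if o == 0]
    pairs = len(events) * len(non_events)
    concordant = sum(1 for e in events for ne in non_events if e > ne)
    return 100.0 * concordant / pairs

# Hypothetical example: 3 coefficients (constant + 2 inputs), 50 covariate groups.
print(leverage_limit(3, 50))
print(min_count_per_outcome(3))
print(percent_concordant([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

For a model with 3 coefficients, the sample-size rule asks for at least 30 observations of each outcome, so a rare failure mode can drive the total sample size far above that figure.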