Example of Fit Binary Logistic Model

A marketing consultant for a cereal company investigates the effectiveness of a TV advertisement for a new cereal product. The consultant shows the advertisement in a specific community for one week. Then the consultant randomly samples adults as they leave a local supermarket to ask whether they saw the advertisement and bought the new cereal. The consultant also asks the adults whether they have children and what their annual household income is.

Because the response is binary, the consultant uses binary logistic regression to determine how the advertisement, having children, and annual household income are related to whether or not the adults sampled bought the cereal.

  1. Open the sample data, CerealPurchase.MTW.
  2. Choose Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model.
  3. From the drop-down list, select Response in binary response/frequency format.
  4. In Response, enter Bought.
  5. In Continuous predictors, enter Income.
  6. In Categorical predictors, enter Children ViewAd.
  7. Click Options. Under Confidence level for all intervals, enter 90.
  8. Click OK in each dialog box.
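The point-and-click steps above can be sketched in code as well. Below is a minimal pure-NumPy version, assuming hypothetical simulated data in place of the CerealPurchase.MTW worksheet (the column names, sample size, and effect sizes in the simulation are made up for illustration):

```python
import numpy as np

# Simulate a stand-in for the CerealPurchase worksheet: the real data is
# not reproduced here, so Bought, Income, Children, and ViewAd are
# generated with assumed effect sizes.
rng = np.random.default_rng(0)
n = 71
income = rng.normal(45, 10, n)        # continuous predictor
children = rng.integers(0, 2, n)      # 1 = has children
view_ad = rng.integers(0, 2, n)       # 1 = saw the advertisement
X = np.column_stack([np.ones(n), income, children, view_ad])
lin = -3.0 + 0.014 * income + 1.4 * children + 1.0 * view_ad
bought = rng.random(n) < 1 / (1 + np.exp(-lin))   # binary response

# Fit by Newton-Raphson (iteratively reweighted least squares), the
# standard way to maximize the binomial log-likelihood with a logit link.
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    W = p * (1 - p)
    step = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (bought - p))
    beta += step
    if np.max(np.abs(step)) < 1e-8:
        break
print(beta)   # intercept, Income, Children, ViewAd coefficients
```

The fitted coefficients are on the log-odds scale, which is why the output below reports exponentiated coefficients as odds ratios.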

Interpret the results

The Analysis of Variance table shows which predictors have a statistically significant relationship with the response. The consultant uses a significance level of 0.10, and the results indicate that the predictors Children and ViewAd have a statistically significant relationship with the response. Income does not have a statistically significant relationship with the response because its p-value is greater than 0.10. The consultant may want to refit the model without the Income variable.
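As a quick check, the overall regression p-value can be recomputed from the Wald chi-square statistic reported in the Analysis of Variance table (8.79 on 3 degrees of freedom). For 3 degrees of freedom the chi-square survival function has a closed form, so no statistics library is needed:

```python
import math

# For a chi-square variable with df = 3, the upper-tail probability is
#   P(X > x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
x = 8.79   # Wald chi-square for Regression, df = 3
p_value = math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
print(round(p_value, 3))   # matches the 0.032 shown for Regression
```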

The odds ratio for Children indicates that the odds of purchasing the cereal for adults with children are approximately 4.2 times the odds for adults without children. The odds ratio for ViewAd indicates that the odds of purchasing the cereal for adults who saw the advertisement are approximately 2.8 times the odds for adults who did not see it.
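These odds ratios can be reproduced directly from the Coefficients table: an odds ratio for a categorical term is simply the exponentiated coefficient.

```python
import math

# Coefficients from the output: Children Yes = 1.433, ViewAd Yes = 1.034.
or_children = math.exp(1.433)
or_view_ad = math.exp(1.034)
print(round(or_children, 2), round(or_view_ad, 2))   # 4.19 2.81
```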

The p-values for the goodness-of-fit tests are all greater than the significance level of 0.05, which indicates that there is not enough evidence to conclude that the model does not fit the data. The deviance R-squared value indicates that the model explains approximately 12.7% of the total deviance in the response.
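Deviance R-squared is one minus the ratio of the residual deviance to the null-model deviance. The null deviance is not printed in the output, so the value used below is back-calculated from the reported 12.66% and the residual deviance of 76.77; treat it as an assumption for illustration.

```python
# Deviance R-sq = 1 - (residual deviance / null deviance)
residual_deviance = 76.77   # from the Goodness-of-Fit Tests table
null_deviance = 87.90       # assumed: back-calculated, not in the output
r_sq = 1 - residual_deviance / null_deviance
print(round(100 * r_sq, 2))   # 12.66
```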

Binary Logistic Regression: Bought versus Income, Children, ViewAd

Method

Link function                 Logit
Categorical predictor coding  (1, 0)
Rows used                     71

Response Information

Variable  Value  Count
Bought    1         22  (Event)
          0         49
          Total     71

Regression Equation

P(1) = exp(Y')/(1 + exp(Y'))

Children  ViewAd
No        No      Y' = -3.016 + 0.01374 Income
No        Yes     Y' = -1.982 + 0.01374 Income
Yes       No      Y' = -1.583 + 0.01374 Income
Yes       Yes     Y' = -0.5490 + 0.01374 Income

Coefficients

Term        Coef  SE Coef  Z-Value  P-Value   VIF
Constant  -3.016    0.939    -3.21    0.001
Income    0.0137   0.0195     0.71    0.481  1.15
Children
  Yes      1.433    0.856     1.67    0.094  1.12
ViewAd
  Yes      1.034    0.572     1.81    0.070  1.03

Odds Ratios for Continuous Predictors

        Odds Ratio            90% CI
Income      1.0138  (0.9819, 1.0469)

Odds Ratios for Categorical Predictors

          Level A  Level B  Odds Ratio             90% CI
Children  Yes      No           4.1902  (1.0245, 17.1386)
ViewAd    Yes      No           2.8128   (1.0982, 7.2044)

Odds ratio for level A relative to level B

Model Summary

Deviance   Deviance                        Area Under
    R-Sq  R-Sq(adj)    AIC   AICc    BIC    ROC Curve
  12.66%      9.25%  84.77  85.37  93.82       0.7333

Goodness-of-Fit Tests

Test             DF  Chi-Square  P-Value
Deviance         67       76.77    0.194
Pearson          67       76.11    0.209
Hosmer-Lemeshow   8        5.58    0.694

Analysis of Variance

                 Wald Test
Source      DF  Chi-Square  P-Value
Regression   3        8.79    0.032
Income       1        0.50    0.481
Children     1        2.80    0.094
ViewAd       1        3.27    0.070

Fits and Diagnostics for Unusual Observations

        Observed
Obs  Probability    Fit  Resid  Std Resid
 50        1.000  0.062  2.357       2.40  R
 68        1.000  0.091  2.189       2.28  R

R  Large residual
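The regression equations in the output can be used to score an individual adult. A minimal sketch, using the Children = Yes, ViewAd = Yes equation and a hypothetical income value of 40:

```python
import math

# Y' comes from the Children = Yes, ViewAd = Yes equation in the output;
# the income value of 40 is a made-up example, not from the data.
income = 40
y_prime = -0.5490 + 0.01374 * income
p_buy = math.exp(y_prime) / (1 + math.exp(y_prime))   # P(1) in the output
print(round(p_buy, 3))
```

Here Y' is approximately 0, so the predicted probability of purchase is close to 0.5.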