Example of Fit Binary Logistic Model

A marketing consultant for a cereal company investigates the effectiveness of a TV advertisement for a new cereal product. The consultant shows the advertisement in a specific community for one week. Then the consultant randomly samples adults as they leave a local supermarket and asks whether they saw the advertisement and whether they bought the new cereal. The consultant also asks the adults whether they have children and what their annual household income is.

Because the response is binary, the consultant uses binary logistic regression to determine how the advertisement, having children, and annual household income are related to whether or not the adults sampled bought the cereal.

  1. Open the sample data, CerealPurchase.MTW.
  2. Choose Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model.
  3. From the drop-down list, select Response in binary response/frequency format.
  4. In Response, enter Bought.
  5. In Continuous predictors, enter Income.
  6. In Categorical predictors, enter Children ViewAd.
  7. Click Options. Under Confidence level for all intervals, enter 90.
  8. Click OK in each dialog box.

Interpret the results

The Analysis of Variance table shows which predictors have a statistically significant relationship with the response. The consultant uses a 0.10 significance level. The p-values for Children (0.094) and ViewAd (0.070) are less than 0.10, so these predictors have a statistically significant relationship with the response. The p-value for Income (0.481) is greater than 0.10, so Income does not have a statistically significant relationship with the response. The consultant may want to refit the model without the Income variable.
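The comparison described above can be checked by hand. The following is a minimal sketch, not Minitab's own computation: it assumes each Wald chi-square statistic has 1 degree of freedom (as the table reports) and recomputes the p-values from the rounded statistics, so the results may differ from the table in the third decimal place. The names `wald_chi_square`, `chi_square_1df_p_value`, and `ALPHA` are illustrative only.

```python
import math

# Wald chi-square statistics (1 DF each), copied from the Analysis of Variance table.
wald_chi_square = {"Income": 0.50, "Children": 2.80, "ViewAd": 3.27}
ALPHA = 0.10  # significance level that the consultant uses


def chi_square_1df_p_value(statistic: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With 1 DF, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


p_values = {term: chi_square_1df_p_value(stat) for term, stat in wald_chi_square.items()}
significant = {term: p < ALPHA for term, p in p_values.items()}

for term, p in p_values.items():
    verdict = "significant" if significant[term] else "not significant"
    print(f"{term}: p = {p:.3f} ({verdict} at alpha = {ALPHA})")
```

Because the inputs are rounded to two decimal places, treat the recomputed p-values as approximations of the values that Minitab reports.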

The odds ratios indicate that the odds of purchasing the cereal are approximately 4.2 times higher for adults with children than for adults without children, and approximately 2.8 times higher for adults who saw the ad than for adults who did not see the ad. Neither 90% confidence interval contains 1, which agrees with the significance of these predictors at the 0.10 level.
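The odds ratios follow directly from the coefficients: each one is the exponential of the corresponding coefficient. The sketch below recomputes them along with approximate 90% confidence intervals. It assumes the intervals are Wald intervals of the form exp(Coef ± z × SE Coef), which matches the reported values closely but is an assumption, not something the output states. The inputs are the rounded values from the Coefficients table, so small differences in the last digits are expected.

```python
import math
from statistics import NormalDist

# (Coef, SE Coef) pairs copied from the Coefficients table.
coefficients = {
    "Income": (0.0137, 0.0195),
    "Children (Yes vs. No)": (1.433, 0.856),
    "ViewAd (Yes vs. No)": (1.034, 0.572),
}
CONFIDENCE = 0.90  # confidence level entered in the Options dialog box

# Two-sided critical value of the standard normal distribution (about 1.645 for 90%).
z_crit = NormalDist().inv_cdf(1 - (1 - CONFIDENCE) / 2)

odds_ratios = {}
for term, (coef, se) in coefficients.items():
    estimate = math.exp(coef)
    lower = math.exp(coef - z_crit * se)  # assumed Wald-type lower limit
    upper = math.exp(coef + z_crit * se)  # assumed Wald-type upper limit
    odds_ratios[term] = (estimate, lower, upper)
    print(f"{term}: odds ratio = {estimate:.4f}, 90% CI = ({lower:.4f}, {upper:.4f})")
```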

The p-values for the goodness-of-fit tests (0.194, 0.209, and 0.694) are all greater than the significance level of 0.05, which indicates that there is not enough evidence to conclude that the model does not fit the data. The deviance R-sq value indicates that the model explains approximately 12.7% of the deviance in the response.

Method

Link function                 Logit
Categorical predictor coding  (1, 0)
Rows used                     71

Response Information

Variable  Value  Count
Bought    1         22  (Event)
          0         49
          Total     71

Regression Equation

P(1) = exp(Y')/(1 + exp(Y'))

Children  ViewAd
No        No      Y' = -3.016 + 0.01374 Income
No        Yes     Y' = -1.982 + 0.01374 Income
Yes       No      Y' = -1.583 + 0.01374 Income
Yes       Yes     Y' = -0.5490 + 0.01374 Income
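To see how the regression equation turns predictor values into a predicted probability of purchase, the following sketch evaluates P(1) for each combination of Children and ViewAd. The income value of 40 is hypothetical and is expressed in whatever units the Income column uses; the function name `purchase_probability` is also illustrative only.

```python
import math

INCOME_COEF = 0.01374  # slope for Income, the same in all four equations

# Intercepts of Y' for each (Children, ViewAd) combination, from the Regression Equation.
INTERCEPTS = {
    ("No", "No"): -3.016,
    ("No", "Yes"): -1.982,
    ("Yes", "No"): -1.583,
    ("Yes", "Yes"): -0.5490,
}


def purchase_probability(children: str, view_ad: str, income: float) -> float:
    """Evaluate P(1) = exp(Y') / (1 + exp(Y')) for one set of predictor values."""
    y_prime = INTERCEPTS[(children, view_ad)] + INCOME_COEF * income
    return math.exp(y_prime) / (1.0 + math.exp(y_prime))


EXAMPLE_INCOME = 40.0  # hypothetical value, chosen only for illustration

for (children, view_ad) in INTERCEPTS:
    p = purchase_probability(children, view_ad, EXAMPLE_INCOME)
    print(f"Children={children}, ViewAd={view_ad}: P(1) = {p:.3f}")
```

At any fixed income, the predicted probability is lowest for adults without children who did not see the ad and highest for adults with children who saw the ad, because only the intercept changes between the four equations.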

Coefficients

Term       Coef    SE Coef  Z-Value  P-Value  VIF
Constant   -3.016  0.939    -3.21    0.001
Income     0.0137  0.0195   0.71     0.481    1.15
Children
  Yes      1.433   0.856    1.67     0.094    1.12
ViewAd
  Yes      1.034   0.572    1.81     0.070    1.03

Odds Ratios for Continuous Predictors

        Odds Ratio  90% CI
Income  1.0138      (0.9819, 1.0469)

Odds Ratios for Categorical Predictors

Level A   Level B  Odds Ratio  90% CI
Children
  Yes     No       4.1902      (1.0245, 17.1386)
ViewAd
  Yes     No       2.8128      (1.0982, 7.2044)

Odds ratio for level A relative to level B

Model Summary

Deviance R-Sq  Deviance R-Sq(adj)  AIC    AICc   BIC    Area Under ROC Curve
12.66%         9.25%               84.77  85.37  93.82  0.7333
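Several of the Model Summary statistics can be approximately reproduced from other numbers in the output. The sketch below is a consistency check under stated assumptions, not Minitab's documented method: it assumes that each row is a single binary observation, so that the deviance goodness-of-fit statistic (76.77) equals -2 times the model log-likelihood, and it uses the standard textbook definitions of deviance R-sq, AIC, AICc, and BIC with 4 estimated parameters (the constant plus three coefficients).

```python
import math

N_ROWS = 71             # rows used
EVENTS = 22             # count of Bought = 1
NON_EVENTS = 49         # count of Bought = 0
N_PARAMETERS = 4        # constant, Income, Children, ViewAd
MODEL_DEVIANCE = 76.77  # deviance chi-square from the Goodness-of-Fit Tests table

# Deviance of the intercept-only model, from the event proportion alone.
p_event = EVENTS / N_ROWS
null_deviance = -2.0 * (
    EVENTS * math.log(p_event) + NON_EVENTS * math.log(1.0 - p_event)
)

# Proportion of the null deviance that the fitted model explains.
deviance_r_sq = 1.0 - MODEL_DEVIANCE / null_deviance

# Information criteria, assuming -2 log-likelihood equals the model deviance.
aic = MODEL_DEVIANCE + 2 * N_PARAMETERS
aicc = aic + 2 * N_PARAMETERS * (N_PARAMETERS + 1) / (N_ROWS - N_PARAMETERS - 1)
bic = MODEL_DEVIANCE + N_PARAMETERS * math.log(N_ROWS)

print(f"Deviance R-Sq = {deviance_r_sq:.2%}")
print(f"AIC = {aic:.2f}, AICc = {aicc:.2f}, BIC = {bic:.2f}")
```

The recomputed values agree with the table to within rounding of the input deviance.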

Goodness-of-Fit Tests

Test             DF  Chi-Square  P-Value
Deviance         67  76.77       0.194
Pearson          67  76.11       0.209
Hosmer-Lemeshow   8   5.58       0.694

Analysis of Variance



Wald Test

Source      DF  Chi-Square  P-Value
Regression   3  8.79        0.032
  Income     1  0.50        0.481
  Children   1  2.80        0.094
  ViewAd     1  3.27        0.070

Fits and Diagnostics for Unusual Observations

Obs  Observed Probability  Fit    Resid  Std Resid
50   1.000                 0.062  2.357  2.40       R
68   1.000                 0.091  2.189  2.28       R

R  Large residual
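The two flagged observations are adults who bought the cereal even though the model assigned them a low probability of doing so. The sketch below recomputes the Resid column under the assumption that it contains deviance residuals for single binary observations; the output does not name the residual type, so this is an inference from the numbers, which match to within rounding of the fitted values.

```python
import math

# (observed response, fitted probability) for the flagged rows, from the table above.
unusual_observations = {50: (1, 0.062), 68: (1, 0.091)}


def deviance_residual(observed: int, fitted: float) -> float:
    """Deviance residual for one binary observation (observed is 0 or 1)."""
    log_likelihood = observed * math.log(fitted) + (1 - observed) * math.log(1.0 - fitted)
    sign = 1.0 if observed > fitted else -1.0
    return sign * math.sqrt(-2.0 * log_likelihood)


residuals = {
    obs: deviance_residual(observed, fitted)
    for obs, (observed, fitted) in unusual_observations.items()
}

for obs, resid in residuals.items():
    print(f"Obs {obs}: deviance residual = {resid:.3f}")
```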