Example of Fit Binary Logistic Model

A marketing consultant for a cereal company investigates the effectiveness of a TV advertisement for a new cereal product. The consultant shows the advertisement in a specific community for one week. Then the consultant randomly samples adults as they leave a local supermarket and asks whether they saw the advertisement and whether they bought the new cereal. The consultant also asks the adults whether they have children and what their annual household income is.

Because the response is binary, the consultant uses binary logistic regression to determine how the advertisement, having children, and annual household income are related to whether or not the adults sampled bought the cereal.

Open the sample data, CerealPurchase.MTW.
Choose Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model.
From the drop-down list, select Response in binary response/frequency format.
In Response, enter Bought.
In Continuous predictors, enter Income.
In Categorical predictors, enter Children ViewAd.
Click Options. Under Confidence level for all intervals, enter 90.
Click OK in each dialog box.

Interpret the results

The Analysis of Variance table shows which predictors have a statistically significant relationship with the response. The consultant uses a 0.10 significance level. At that level, Children (p = 0.094) and ViewAd (p = 0.070) have a statistically significant relationship with the response. Income does not, because its p-value (0.481) is greater than 0.10. The consultant may want to refit the model without the Income variable.

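The odds ratios compare the odds of purchase between groups. For adults with children, the odds of purchasing the cereal are approximately 4.2 times the odds for adults without children. For adults who saw the ad, the odds of purchasing the cereal are approximately 2.8 times the odds for adults who did not see the ad. An odds ratio is a ratio of odds, not of probabilities, so these values do not mean that purchase is 4.2 or 2.8 times as probable.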

The p-values for the goodness-of-fit tests (0.194, 0.209, and 0.694) are all greater than the significance level of 0.05, which indicates that there is not enough evidence to conclude that the model does not fit the data. The Deviance R-Sq value indicates that the model explains approximately 12.7% of the deviance in the response.

Method

Link function	Logit
Categorical predictor coding	(1, 0)
Rows used	71

Response Information

Variable	Value	Count
Bought	1	22	(Event)
	0	49
	Total	71

Regression Equation

P(1)	=	exp(Y')/(1 + exp(Y'))

Children	ViewAd
No	No	Y'	=	-3.016 + 0.01374 Income

No	Yes	Y'	=	-1.982 + 0.01374 Income

Yes	No	Y'	=	-1.583 + 0.01374 Income

Yes	Yes	Y'	=	-0.5490 + 0.01374 Income
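
The regression equation can be evaluated directly to get a predicted purchase probability. The sketch below is illustrative only: the function name and the income value of 40 are hypothetical, and the coefficients are copied from the rounded output, so results may differ slightly from what the software reports.

```python
import math

# Coefficients copied from the output above (rounded values).
INTERCEPT = -3.016       # baseline: Children = No, ViewAd = No
B_INCOME = 0.01374       # slope for Income
B_CHILDREN_YES = 1.433   # shift in Y' when Children = Yes
B_VIEWAD_YES = 1.034     # shift in Y' when ViewAd = Yes


def purchase_probability(income, children, view_ad):
    """Return P(Bought = 1) = exp(Y') / (1 + exp(Y'))."""
    y = INTERCEPT + B_INCOME * income
    if children:
        y += B_CHILDREN_YES
    if view_ad:
        y += B_VIEWAD_YES
    return math.exp(y) / (1 + math.exp(y))


# Hypothetical respondent: Income = 40 (in the units of the Income column),
# has children, and saw the ad.
p = purchase_probability(income=40, children=True, view_ad=True)
print(f"Predicted probability of purchase: {p:.3f}")
```

Adding the two categorical shifts to the baseline intercept reproduces the four intercepts listed in the regression equation, for example -3.016 + 1.433 + 1.034 = -0.549 for Children = Yes and ViewAd = Yes.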

Coefficients

Term	Coef	SE Coef	Z-Value	P-Value	VIF
Constant	-3.016	0.939	-3.21	0.001
Income	0.0137	0.0195	0.71	0.481	1.15
Children
Yes	1.433	0.856	1.67	0.094	1.12
ViewAd
Yes	1.034	0.572	1.81	0.070	1.03

Odds Ratios for Continuous Predictors

	Odds Ratio	90% CI
Income	1.0138	(0.9819, 1.0469)

Odds Ratios for Categorical Predictors

Level A	Level B	Odds Ratio	90% CI
Children
Yes	No	4.1902	(1.0245, 17.1386)
ViewAd
Yes	No	2.8128	(1.0982, 7.2044)
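
The odds ratios and their 90% confidence intervals can be approximately reproduced from the Coefficients table by exponentiating each coefficient and its interval endpoints. This is a minimal sketch that assumes the intervals are normal-approximation (Wald) intervals with a critical value of about 1.645; the function name is hypothetical, and small differences from the table come from rounding of the printed coefficients.

```python
import math

Z_90 = 1.645  # approximate standard normal critical value for a 90% interval


def odds_ratio_ci(coef, se, z=Z_90):
    """Return (odds ratio, lower limit, upper limit) for one coefficient."""
    return (
        math.exp(coef),
        math.exp(coef - z * se),
        math.exp(coef + z * se),
    )


# Coef and SE Coef copied from the Coefficients table.
income = odds_ratio_ci(0.0137, 0.0195)
children = odds_ratio_ci(1.433, 0.856)
view_ad = odds_ratio_ci(1.034, 0.572)

for name, (ratio, low, high) in [
    ("Income", income),
    ("Children Yes vs No", children),
    ("ViewAd Yes vs No", view_ad),
]:
    print(f"{name}: {ratio:.4f} ({low:.4f}, {high:.4f})")
```

For Income, the interval contains 1, which is consistent with its p-value being greater than 0.10. For Children and ViewAd, the intervals exclude 1.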

Model Summary

Deviance R-Sq	Deviance R-Sq(adj)	AIC	AICc	BIC	Area Under ROC Curve
12.66%	9.25%	84.77	85.37	93.82	0.7333
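
Several Model Summary values can be checked by hand from the response counts and the deviance statistic in the Goodness-of-Fit Tests table. The sketch below assumes one row per respondent (so the deviance equals -2 times the log-likelihood), counts 4 estimated parameters (the constant plus three coefficients), and uses the common definitions of Deviance R-Sq, AIC, and BIC. It is not a statement of how the software computes them internally.

```python
import math

# Values copied from the output.
n_events, n_nonevents = 22, 49
n = n_events + n_nonevents      # 71 rows used
residual_deviance = 76.77       # Deviance chi-square from the goodness-of-fit table
n_params = 4                    # constant + Income + Children + ViewAd

# Deviance of the intercept-only model, which predicts p_hat for every row.
p_hat = n_events / n
null_deviance = -2 * (
    n_events * math.log(p_hat) + n_nonevents * math.log(1 - p_hat)
)

deviance_r_sq = 1 - residual_deviance / null_deviance
aic = residual_deviance + 2 * n_params
bic = residual_deviance + n_params * math.log(n)

print(f"Deviance R-Sq: {100 * deviance_r_sq:.2f}%")
print(f"AIC: {aic:.2f}")
print(f"BIC: {bic:.2f}")
```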

Goodness-of-Fit Tests

Test	DF	Chi-Square	P-Value
Deviance	67	76.77	0.194
Pearson	67	76.11	0.209
Hosmer-Lemeshow	8	5.58	0.694

Analysis of Variance

		Wald Test
Source	DF	Chi-Square	P-Value
Regression	3	8.79	0.032
Income	1	0.50	0.481
Children	1	2.80	0.094
ViewAd	1	3.27	0.070
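
For a term with 1 degree of freedom, the Wald chi-square is the square of the Z-value from the Coefficients table, and the p-value is the two-sided normal tail probability. The following sketch recomputes both from the rounded Coef and SE Coef values using only the standard library; the function name is hypothetical, and results match the table only up to rounding.

```python
import math


def wald_test(coef, se):
    """Return (chi-square, p-value) for a single-coefficient Wald test."""
    z = coef / se
    chi_square = z * z
    # Two-sided standard normal tail probability: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return chi_square, p_value


# Coef and SE Coef copied from the Coefficients table.
results = {
    "Income": wald_test(0.0137, 0.0195),
    "Children": wald_test(1.433, 0.856),
    "ViewAd": wald_test(1.034, 0.572),
}

for term, (chi_square, p_value) in results.items():
    print(f"{term}: chi-square = {chi_square:.2f}, p = {p_value:.3f}")
```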

Fits and Diagnostics for Unusual Observations

Obs	Observed Probability	Fit	Resid	Std Resid
50	1.000	0.062	2.357	2.40	R
68	1.000	0.091	2.189	2.28	R

R  Large residual
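
The Resid values for these two observations are consistent with deviance residuals computed from the fitted probabilities. The sketch below assumes that residual type and uses the fits as printed (rounded to three decimals), so agreement is approximate. The function name is hypothetical.

```python
import math


def deviance_residual(observed, fit):
    """Deviance residual for a single binary observation (observed is 0 or 1)."""
    # Log-likelihood contribution of this row under the fitted probability.
    log_lik = observed * math.log(fit) + (1 - observed) * math.log(1 - fit)
    sign = 1 if observed >= fit else -1
    return sign * math.sqrt(-2 * log_lik)


# Observations flagged in the table: both bought the cereal (observed = 1)
# even though the model gave them a low fitted probability.
resid_50 = deviance_residual(1, 0.062)
resid_68 = deviance_residual(1, 0.091)

print(f"Obs 50: {resid_50:.3f}")
print(f"Obs 68: {resid_68:.3f}")
```

Both respondents purchased the cereal despite fitted probabilities below 0.10, which is why their residuals are large and positive.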