Receiver Operating Characteristic (ROC) curve chart for Fit Binary Logistic Model and Binary Logistic Regression

Training data or no validation

For the chart for a training data set, each point on the chart represents a distinct fitted event probability. The highest event probability is the first point on the chart and appears leftmost. The other points follow in order of decreasing event probability.

Use the following process to find the x- and y-coordinates for the chart.

Use every distinct event probability as a threshold. For a specific threshold, cases with an estimated event probability greater than or equal to the threshold get 1 (event) as the predicted class; all other cases get 0 (nonevent). Then, you can form a 2x2 table for all cases, with observed classes as rows and predicted classes as columns, to calculate the false positive rate and the true positive rate for each event probability. The false positive rate is the proportion of observed nonevents that are predicted as events. The true positive rate is the proportion of observed events that are predicted as events. The false positive rates are the x-coordinates for the chart. The true positive rates are the y-coordinates.
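The thresholding process can be sketched in a few lines of Python. This is a minimal illustration, not the software's implementation; the function name roc_points and its arguments are hypothetical, and the sketch assumes that the data contain at least one event and at least one nonevent.

```python
def roc_points(probabilities, observed):
    """Return (threshold, false positive rate, true positive rate) tuples.

    probabilities: fitted event probability for each case
    observed: 1 if the case is an observed event, 0 if it is a nonevent
    Assumes both classes are present (otherwise a rate divides by zero).
    """
    n_events = sum(observed)
    n_nonevents = len(observed) - n_events
    points = []
    # Each distinct probability serves as a threshold, highest first.
    for threshold in sorted(set(probabilities), reverse=True):
        # Predicted class is 1 when the probability is >= the threshold.
        predicted = [1 if p >= threshold else 0 for p in probabilities]
        true_pos = sum(1 for y, yhat in zip(observed, predicted)
                       if y == 1 and yhat == 1)
        false_pos = sum(1 for y, yhat in zip(observed, predicted)
                        if y == 0 and yhat == 1)
        points.append((threshold,
                       false_pos / n_nonevents,   # x-coordinate
                       true_pos / n_events))      # y-coordinate
    return points
```

Each tuple gives one point on the chart, with the points listed from left to right.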

For example, suppose the following table summarizes a model with two 2-level categorical predictors. These predictors give four distinct event probabilities, which are rounded to 2 decimal places:

A: Order	B: Predictor 1	C: Predictor 2	D: Number of events	E: Number of nonevents	F: Number of trials	G: Threshold (D/F)
1	1	1	18	12	30	0.60
2	1	2	25	42	67	0.37
3	2	1	12	44	56	0.21
4	2	2	4	32	36	0.11
Totals			59	130	189

The following are the corresponding four tables with their respective false positive rates and true positive rates rounded to 2 decimal places:

Table 1. Threshold = 0.60.
False positive rate = 12 / (12 + 118) = 0.09

True positive rate = 18 / (18 + 41) = 0.31
		Predicted
		event	nonevent
Observed	event	18	41
Observed	nonevent	12	118

Table 2. Threshold = 0.37.
False positive rate = (12 + 42) / 130 = 0.42

True positive rate = (18 + 25) / 59 = 0.73
		Predicted
		event	nonevent
Observed	event	43	16
Observed	nonevent	54	76

Table 3. Threshold = 0.21.
False positive rate = (12 + 42 + 44) / 130 = 0.75

True positive rate = (18 + 25 + 12) / 59 = 0.93
		Predicted
		event	nonevent
Observed	event	55	4
Observed	nonevent	98	32

Table 4. Threshold = 0.11.
False positive rate = (12 + 42 + 44 + 32) / 130 = 1

True positive rate = (18 + 25 + 12 + 4) / 59 = 1
		Predicted
		event	nonevent
Observed	event	59	0
Observed	nonevent	130	0
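As a check on the four tables, the same rates can be reproduced from the event and nonevent counts in the summary table. The following Python sketch is illustrative only, and its variable names are assumptions. It relies on the fact that, at each threshold, the cases predicted as events are exactly the cases in that row and in all rows with higher event probabilities, so cumulative sums of the counts give the numerators.

```python
# (number of events, number of nonevents) for each row of the summary table
groups = [(18, 12), (25, 42), (12, 44), (4, 32)]

total_events = sum(e for e, _ in groups)       # 59
total_nonevents = sum(n for _, n in groups)    # 130

# Order the rows by decreasing event probability (events / trials).
ordered = sorted(groups, key=lambda g: g[0] / (g[0] + g[1]), reverse=True)

roc = []
cum_events = 0
cum_nonevents = 0
for events, nonevents in ordered:
    # Every case at or above this threshold is predicted as an event.
    cum_events += events
    cum_nonevents += nonevents
    threshold = events / (events + nonevents)
    fpr = cum_nonevents / total_nonevents
    tpr = cum_events / total_events
    roc.append((round(threshold, 2), round(fpr, 2), round(tpr, 2)))

for row in roc:
    print(row)
# (0.6, 0.09, 0.31)
# (0.37, 0.42, 0.73)
# (0.21, 0.75, 0.93)
# (0.11, 1.0, 1.0)
```

The printed values match the thresholds, false positive rates, and true positive rates in Tables 1 through 4.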

Separate test data set

Use the same steps as the training data set procedure, but calculate the event probabilities from the cases for the test data set.

Test with k-fold cross-validation

Use the same steps as the training data set procedure, but calculate the event probabilities from the cases for the cross-validated data.