This command is available with the Predictive Analytics Module.
For a given tree in the forest, a class vote for a row in the out-of-bag data is the predicted class for the row from the single tree. The predicted class for a row in out-of-bag data is the class with the highest vote across all trees in the forest. The predicted class probability for a row in the out-of-bag data is the ratio of the number of votes for the class and the total votes for the row.
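The vote counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the software's implementation: the function name `oob_prediction`, the example votes, and the tie-breaking behavior (the first class to reach the highest count wins) are assumptions made for the example.

```python
from collections import Counter


def oob_prediction(votes):
    """Return (predicted_class, class_probabilities) for one out-of-bag row.

    `votes` holds one predicted class per tree for which the row was
    out-of-bag (hypothetical input format).
    """
    counts = Counter(votes)
    total = sum(counts.values())
    # Predicted class probability = votes for the class / total votes for the row.
    probabilities = {cls: n / total for cls, n in counts.items()}
    # Predicted class = class with the highest vote.
    # Assumption: ties go to the class that appears first in `votes`.
    predicted = max(counts, key=counts.get)
    return predicted, probabilities


# Hypothetical row that was out-of-bag for five trees.
votes = ["event", "nonevent", "event", "event", "nonevent"]
predicted, probabilities = oob_prediction(votes)
print(predicted, probabilities)
```

With three of five votes for the event, the predicted class is the event and its predicted probability is 3/5.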
On the curve for the out-of-bag data, each point on the chart represents a distinct predicted class probability. The highest event probability is the first point on the chart and appears leftmost. The remaining points follow from left to right in order of decreasing probability.
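Use the following process to find the x- and y-coordinates for the chart. Treat each distinct event probability as a threshold: rows with an event probability greater than or equal to the threshold are classified as events, and all other rows are classified as nonevents. For each threshold, cross-tabulate the observed classes and the predicted classes in a 2×2 table. The false positive rate from the table is the x-coordinate, and the true positive rate is the y-coordinate.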
For example, suppose the following table summarizes a simple model with two 2-level categorical predictors. These predictors give four distinct event probabilities, which are rounded to 2 decimal places:
A: Order | B: Predictor 1 | C: Predictor 2 | D: Number of events | E: Number of nonevents | F: Number of trials | G: Threshold (Fitted event probability) |
---|---|---|---|---|---|---|
1 | 1 | 1 | 18 | 12 | 30 | 0.60 |
2 | 1 | 2 | 25 | 42 | 67 | 0.37 |
3 | 2 | 1 | 12 | 44 | 56 | 0.21 |
4 | 2 | 2 | 4 | 32 | 36 | 0.11 |
Totals | | | 59 | 130 | 189 | |
The following are the corresponding four tables with their respective false positive rates and true positive rates rounded to 2 decimal places:
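Table 1. Threshold = 0.60. False positive rate = 12 / (12 + 118) = 0.09. True positive rate = 18 / (18 + 41) = 0.31.

Observed | Predicted event | Predicted nonevent |
---|---|---|
event | 18 | 41 |
nonevent | 12 | 118 |

Table 2. Threshold = 0.37. False positive rate = 54 / (54 + 76) = 0.42. True positive rate = 43 / (43 + 16) = 0.73.

Observed | Predicted event | Predicted nonevent |
---|---|---|
event | 43 | 16 |
nonevent | 54 | 76 |

Table 3. Threshold = 0.21. False positive rate = 98 / (98 + 32) = 0.75. True positive rate = 55 / (55 + 4) = 0.93.

Observed | Predicted event | Predicted nonevent |
---|---|---|
event | 55 | 4 |
nonevent | 98 | 32 |

Table 4. Threshold = 0.11. False positive rate = 130 / (130 + 0) = 1.00. True positive rate = 59 / (59 + 0) = 1.00.

Observed | Predicted event | Predicted nonevent |
---|---|---|
event | 59 | 0 |
nonevent | 130 | 0 |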
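The four tables above can be reproduced from the event and nonevent counts in the summary table. The following Python sketch accumulates the counts from the highest threshold downward, which is equivalent to classifying every row with an event probability at or above the threshold as an event. The function name `roc_points` and the list layout are assumptions made for the illustration.

```python
# (number of events, number of nonevents) for each distinct event probability,
# ordered from the highest threshold (0.60) to the lowest (0.11).
groups = [(18, 12), (25, 42), (12, 44), (4, 32)]


def roc_points(groups):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    total_events = sum(events for events, _ in groups)
    total_nonevents = sum(nonevents for _, nonevents in groups)
    points = []
    true_positives = 0
    false_positives = 0
    for events, nonevents in groups:
        # Lowering the threshold to this group's probability adds its rows
        # to the "predicted event" column of the 2x2 table.
        true_positives += events
        false_positives += nonevents
        fpr = false_positives / total_nonevents  # x-coordinate
        tpr = true_positives / total_events      # y-coordinate
        points.append((round(fpr, 2), round(tpr, 2)))
    return points


points = roc_points(groups)
print(points)
# [(0.09, 0.31), (0.42, 0.73), (0.75, 0.93), (1.0, 1.0)]
```

The first pair corresponds to the threshold 0.60 and is the leftmost point on the chart. The last pair corresponds to the threshold 0.11, where every row is classified as an event.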
Use the same steps as the out-of-bag procedure, but calculate the event probabilities from the cases in the test set.