Confusion matrix for Random Forests® Classification

Note

This command is available with the Predictive Analytics Module.

The confusion matrix shows how well the model separates the classes, using these metrics:
  • True positive rate (TPR) — the probability that an event case is predicted correctly
  • False positive rate (FPR) — the probability that a nonevent case is predicted incorrectly
  • False negative rate (FNR) — the probability that an event case is predicted incorrectly
  • True negative rate (TNR) — the probability that a nonevent case is predicted correctly
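As a minimal sketch of how these four rates relate to the cell counts of a two-class confusion matrix, the following Python function computes them as percentages. The function name `rates`, its argument names, and the counts in the usage lines are hypothetical and chosen only for illustration; they are not part of the software's output.

```python
def rates(tp, fn, fp, tn):
    """Return the four rates, as percentages, from confusion-matrix counts.

    tp: event cases predicted as events
    fn: event cases predicted as nonevents
    fp: nonevent cases predicted as events
    tn: nonevent cases predicted as nonevents
    """
    events = tp + fn       # all actual event cases
    nonevents = fp + tn    # all actual nonevent cases
    return {
        "TPR": 100 * tp / events,
        "FNR": 100 * fn / events,
        "FPR": 100 * fp / nonevents,
        "TNR": 100 * tn / nonevents,
    }


# Hypothetical counts, used only to show the calculation.
example = rates(tp=40, fn=10, fp=5, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.2f}%")
```

Because TPR and FNR share the same denominator (all actual events), they sum to 100%. The same holds for FPR and TNR, which share the nonevent denominator.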

Interpretation

Random Forests® Classification: Heart Diseas vs Age, Rest Blood P, ...

Confusion Matrix

                  Predicted Class (Out-of-Bag)
Actual Class   Count    Yes     No   % Correct
Yes (Event)      139    109     30       78.42
No               164     26    138       84.15
All              303    135    168       81.52

Out-of-Bag Statistics (%)

True positive rate (sensitivity or power)    78.42
False positive rate (type I error)           15.85
False negative rate (type II error)          21.58
True negative rate (specificity)             84.15

In this example, the analysis uses out-of-bag data to validate the model. The out-of-bag data contain 139 Yes events and 164 No nonevents.
  • Of the 139 Yes events, 109 are predicted correctly as Yes, which is 78.42% correct.
  • Of the 164 No nonevents, 138 are predicted correctly as No, which is 84.15% correct.

Overall, the % Correct for the out-of-bag data is 81.52%, because 247 of the 303 cases (109 + 138) are predicted correctly. Use the results for the out-of-bag data to evaluate the prediction accuracy for new observations.
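The % Correct values can be checked by hand from the counts in the confusion matrix above. The sketch below does that arithmetic in Python. The counts come from the example output; the names `matrix` and `percent_correct` are hypothetical helpers for this illustration, not part of the software.

```python
# Counts from the example confusion matrix: outer keys are actual classes,
# inner keys are out-of-bag predicted classes.
matrix = {
    "Yes": {"Yes": 109, "No": 30},
    "No": {"Yes": 26, "No": 138},
}


def percent_correct(matrix):
    """Return % Correct for each actual class and for all cases combined."""
    result = {}
    correct = 0
    total = 0
    for actual, predicted in matrix.items():
        row_total = sum(predicted.values())    # cases with this actual class
        result[actual] = 100 * predicted[actual] / row_total
        correct += predicted[actual]           # diagonal cell of the matrix
        total += row_total
    result["All"] = 100 * correct / total
    return result


result = percent_correct(matrix)
for label, pct in result.items():
    print(f"{label}: {pct:.2f}")
# Prints:
# Yes: 78.42
# No: 84.15
# All: 81.52
```

The overall % Correct is not the simple average of the two class percentages. It weights each class by its number of cases, so here it is 247 / 303.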

A low value for % Correct usually indicates a deficient fitted model, which can have several causes. If the % Correct is very low, consider changing the minimum number of cases required to split an internal node or the number of predictors that the analysis considers for splitting a node.