Effectiveness of classification for best and worst terminal nodes for CART® Classification

Use the effectiveness of classification statistics to characterize nodes of special interest because of their performance.

Minitab displays a table for each response level. Each row of the table summarizes one node. For both binary and multinomial responses, Minitab sorts the nodes on the class probabilities using the expression Abs(event probability – 0.5). The higher the value, the better the terminal node. The best nodes are listed in order from best to worst. The worst nodes are listed in order from worst to best.

If nodes tie on the class probabilities, then Minitab uses % of N as the second sort key, and the terminal node with the highest % of N is first. If nodes are still tied after this sort, then Minitab displays the smallest terminal node first in the "Best", "Worst", and "Best and Worst" scenarios.

Note

Even in the "Worst" node ordering, the tie-breaker lists the largest % of N first. As a result, the "Best" and "Worst" orderings are not always exact opposites of one another.
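The sorting rules above can be sketched in Python. This is an illustration of the described ordering, not Minitab's implementation. The `Node` class, its field names, and the function names are hypothetical, and the sketch assumes that "smallest terminal node" means the smallest terminal node ID.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int       # terminal node identifier (hypothetical field name)
    event_prob: float  # event or class probability for the node
    pct_of_n: float    # percent of the data in the node


def sort_score(node):
    # Distance of the event probability from 0.5; higher is better.
    return abs(node.event_prob - 0.5)


def best_nodes(nodes):
    # Highest score first. Ties: largest % of N first, then smallest node ID
    # (assumed meaning of "smallest terminal node").
    return sorted(nodes, key=lambda n: (-sort_score(n), -n.pct_of_n, n.node_id))


def worst_nodes(nodes):
    # Lowest score first. The tie-breakers are the same as for the best nodes.
    return sorted(nodes, key=lambda n: (sort_score(n), -n.pct_of_n, n.node_id))


nodes = [
    Node(1, 0.75, 10.0),
    Node(2, 0.25, 30.0),   # ties with node 1 on score, larger % of N
    Node(3, 0.5, 20.0),
    Node(4, 0.875, 5.0),
    Node(5, 0.5, 20.0),    # ties with node 3 on score and % of N
]

print([n.node_id for n in best_nodes(nodes)])
print([n.node_id for n in worst_nodes(nodes)])
```

Because the tie-breakers keep the same direction in both orderings, the worst ordering in this example is not the reverse of the best ordering: nodes 2 and 1, and nodes 3 and 5, keep the same relative order in both lists.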

When you use a test data set, Minitab calculates separate statistics for the training and test data. You can compare the statistics to examine the relative performance of the tree on the training data and on new data. The test statistics are usually a better measure of how the tree performs for new data. The terminal nodes for Training and Test are ranked separately, based on the event probability from each data set. Terminal nodes that have no observations in the Test data have no event probability, so these nodes are not considered in the Test ranking.
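A minimal sketch of the separate rankings follows, under stated assumptions. The function names and the dictionary layout are hypothetical. The event probability is taken to be the event count divided by the total count in the node, which this page does not state explicitly. Within one data set, % of N is proportional to the total count in the node, so the sketch uses the total count as the tie-breaker.

```python
def event_prob(event_count, total_count):
    # A node with no observations has no event probability.
    if total_count == 0:
        return None
    return event_count / total_count


def rank_best(counts):
    """Rank node IDs from best to worst for one data set.

    counts maps node ID -> (event count, total count) for that data set.
    """
    scored = []
    for node_id, (events, total) in counts.items():
        p = event_prob(events, total)
        if p is None:
            continue  # skip nodes with no observations in this data set
        scored.append((-abs(p - 0.5), -total, node_id))
    return [node_id for _, _, node_id in sorted(scored)]


# Hypothetical counts for the same three terminal nodes.
training = {1: (30, 40), 2: (5, 40), 3: (10, 20)}
test = {1: (6, 8), 2: (4, 8), 3: (0, 0)}  # node 3 has no test observations

print(rank_best(training))
print(rank_best(test))
```

In this example node 2 ranks first for the training data but last for the test data, and node 3 does not appear in the test ranking at all.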

Terminal Node: The identification of the terminal node.
Event Count or Class Count: The count is the number of cases in the node for the event or nonevent or for the class. If the analysis includes weights, then the count is the weighted count. Terminal nodes with many cases can be of special interest because these nodes typically represent more common cases.
Total Count: The total count is the total of event and nonevent cases or the total of all the class counts.
% of N: The percent of the data in the node.
Event Prob or Class Prob: The event probability is for binary response variables and the class probability is for multinomial response variables.
Non-Event Prob or Non-Class Prob: The non-event probability is for binary response variables and the non-class probability is for multinomial response variables.
Odds: The odds are the ratio of the probability of the event to the probability of the non-event, or of the class to the non-class.
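The relationships among these columns can be sketched for a binary response. The function name `node_summary` and its parameter names are hypothetical. The sketch assumes that the probabilities are simple within-node proportions and that % of N is the node total divided by the size of the data set, neither of which this page states as a formula. Weighted analyses would use weighted counts instead.

```python
def node_summary(event_count, nonevent_count, n_total):
    """Summarize one terminal node for a binary response.

    n_total is the number of cases in the whole data set (training or test).
    """
    total_count = event_count + nonevent_count
    event_prob = event_count / total_count
    nonevent_prob = nonevent_count / total_count
    # Odds: probability of the event relative to the non-event.
    # A node with no non-events has unbounded odds.
    odds = event_prob / nonevent_prob if nonevent_prob > 0 else float("inf")
    return {
        "total_count": total_count,
        "pct_of_n": 100.0 * total_count / n_total,
        "event_prob": event_prob,
        "nonevent_prob": nonevent_prob,
        "odds": odds,
    }


# A node with 30 events and 10 non-events in a data set of 200 cases.
summary = node_summary(30, 10, 200)
print(summary)
```

For this node, the total count is 40, % of N is 20, the event probability is 0.75, the non-event probability is 0.25, and the odds are 3.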
