The R-squared vs Number of Terminal Nodes Plot displays the R2 value for each tree. By default, the initial regression tree is the smallest tree with an R2 value within 1 standard error of the value for the tree that maximizes the R2 value. When the analysis uses cross-validation or a test data set, the R2 value is from the validation sample. The values for the validation sample typically level off and eventually start to decline as the tree grows larger.
Click Select Alternative Tree to open an interactive plot that includes a table of model summary statistics. Use the plot to investigate alternative trees with similar performance.
The regression tree with 21 terminal nodes has an R2 value of approximately 0.78. This tree has the label "Optimal" because the criterion for the creation of the tree was the smallest tree with an R2 value within 1 standard deviation of the maximum R2 value. Because this chart shows that the R2 values are relatively stable between trees with about 20 nodes to trees with about 70 nodes, the researchers want to look at the performance of some of the even smaller trees that are similar to the tree in the results. Compare the next graph to see results for a tree with 17 nodes.
The regression tree with 17 terminal nodes has an R2 value of 0.7661. The tree from the initial results keeps the label "Optimal" when you use Select Alternative Tree to create results for a different tree.
After you select a tree, investigate the distinctive terminal nodes on the tree diagram. For example, you might be interested in nodes with large means or with small standard deviations. From the detailed view, you can see the mean, standard deviation, and total counts for each node.
Right-click the tree diagram to perform the following interactions:
Nodes continue to split until the terminal nodes cannot be split into further groupings. Explore other nodes to see which variables are most interesting.
The tree diagram shows all 4453 cases from the full data set. You can toggle views of the tree between the detailed and node split view.
Then, Node 2 splits by Frequency of Substance Abuse and Node 8 splits by the Alcohol Use. Terminal Node 17 has the cases for Planned Medication Therapy = 2, Alcohol Use = 1, and Referral Source = 3, 5, 6, 100, 300, 400, 600, 700, or 800. The researchers note that Terminal Node 17 has the highest mean, the smallest standard deviation, and the most cases.
Terminal Node 1 has the smallest mean and a standard deviation of about 4.3. Because the mean of Terminal Node 1 is about 5.9 and the response values cannot be negative, the node statistics suggest that data in Terminal Node 1 are probably right-skewed.
Use the relative variable importance chart to see which predictors are the most important variables to the tree.
Important variables are a primary or surrogate splitters in the tree. The variable with the highest improvement score is set as the most important variable, and the other variables are ranked accordingly. Relative variable importance standardizes the importance values for ease of interpretation. Relative importance is defined as the percent improvement with respect to the most important predictor.
Relative variable importance values range from 0% to 100%. The most important variable always has a relative importance of 100%. If a variable is not in the tree, that variable is not important.
Although these results include 33 variables with positive importance, the relative rankings provide information about how many variables to control or monitor for a certain application. Steep drops in the relative importance values from one variable to the next variable can guide decisions about which variables to control or monitor. For example, in these data, the three most important variables have importance values that are relatively close together before a drop of almost 40% in relative importance to the next variable. Similarly, three variables have similar importance values near 50%. You can remove variables from different groups and redo the analysis to evaluate how variables in various groups affect the prediction accuracy values in the model summary table.