Select an alternative tree for CART® Regression

Run Stat > Predictive Analytics > CART® Regression. Click the Select an Alternative Tree button for the R-squared vs. number of terminal nodes plot or the Mean absolute deviation vs. number of terminal nodes plot.

Overview

By default, Minitab Statistical Software produces results for the smallest tree with a criterion value within one standard error of the best value. The criterion is either the least squared error or the least absolute deviation, depending on your choice. Minitab lets you explore other trees from the sequence that led to the identification of the optimal tree. Typically, you select an alternative tree for one of the following two reasons:
  • The tree that Minitab selects is part of a pattern where the criterion improves. One or more trees that have a few more nodes are part of the same pattern. Typically, you want to make predictions from a tree with as much prediction accuracy as possible, so a slightly larger tree from that pattern can be the better choice.
  • The tree that Minitab selects is part of a pattern where the criterion is relatively flat. One or more trees with similar model summary statistics have many fewer nodes than the optimal tree. Typically, a tree with fewer terminal nodes gives a clearer picture of how each predictor variable affects the response values. A smaller tree also makes it easier to identify a few target groups for further studies. If the difference in prediction accuracy for a smaller tree is negligible, you can also use the smaller tree to evaluate the relationships between the response and the predictor variables.
For example, the following plot accompanies results for the tree with 21 nodes. Other trees in the sequence have similar R² values.
[Plot: R-squared vs. number of terminal nodes]
The 17-node tree has an R² value that is almost as high as that of the 21-node tree. If that reduction in prediction accuracy is negligible for your purposes, you can use the 17-node tree to evaluate the relationships between the response and the predictor variables. With fewer terminal nodes, the effect of each predictor variable and a few target groups for further studies are easier to identify.
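The comparison in this example can be sketched outside Minitab. The following Python snippet is only an illustration: the (terminal nodes, R-squared) pairs are hypothetical values, not Minitab output, and the function name and the tolerance are assumptions chosen for the sketch. It lists the trees that are smaller than the selected tree and lose no more than the tolerance in R-squared.

```python
# Hypothetical (terminal nodes, R-squared) pairs for a tree sequence.
# These numbers are made up for illustration; they are not Minitab output.
trees = [(5, 0.70), (9, 0.78), (13, 0.81), (17, 0.835), (21, 0.84), (25, 0.838)]


def smaller_candidates(trees, selected_nodes, tolerance):
    """Return trees with fewer terminal nodes than the selected tree whose
    R-squared is at most `tolerance` below the selected tree's R-squared."""
    selected_r2 = dict(trees)[selected_nodes]
    return [
        (nodes, r2)
        for nodes, r2 in trees
        if nodes < selected_nodes and selected_r2 - r2 <= tolerance
    ]


# The tolerance of 0.01 is an arbitrary choice for what counts as negligible.
candidates = smaller_candidates(trees, selected_nodes=21, tolerance=0.01)
print(candidates)
```

What counts as a negligible loss depends on the application, so the tolerance is a judgment call rather than a fixed rule.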

Perform the analysis

Click Select an Alternative Tree in the output. A dialog box opens that shows the plot and a model summary table. The dialog box provides three ways to select alternative trees:
  • Click a point on the graph.
  • Click the arrow buttons under the model summary table to select a tree that is one tree larger or smaller than the current selection.
  • Click a button to select a tree that is a common choice. The choices depend on whether the criterion for the optimal tree is the least squared error or the least absolute deviation. When the analysis does not use validation, the buttons that refer to the standard error do not apply.
    Least squared error
      Max R-squared: Select the tree with the largest R² value on the plot.
      1-SE Max R-sq: Select the smallest tree that has an R² value within one standard error of the largest R² value.
      2-SE Max R-sq: Select the smallest tree that has an R² value within 2 standard errors of the largest R² value.
    Least absolute deviation
      Min MAD: Select the tree with the smallest Mean Absolute Deviation (MAD) value on the plot.
      1-SE MAD: Select the smallest tree that has an MAD value within one standard error of the smallest MAD value.
      2-SE MAD: Select the smallest tree that has an MAD value within 2 standard errors of the smallest MAD value.
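The logic behind these buttons can be sketched in a few lines of Python. This is a minimal illustration, not Minitab's implementation: the function name, the tuple layout, and all numeric values are hypothetical, and the sketch assumes that the standard error used for the threshold is the one attached to the best tree's criterion value.

```python
def select_within_k_se(trees, k, higher_is_better):
    """Pick the smallest tree whose criterion is within k standard errors
    of the best criterion value.

    trees: list of (terminal_nodes, criterion, standard_error) tuples.
    k: number of standard errors (0 reproduces the Max R-squared / Min MAD choice).
    higher_is_better: True for R-squared, False for MAD.
    Assumption: the threshold uses the standard error of the best tree.
    """
    if higher_is_better:
        best = max(trees, key=lambda t: t[1])
        threshold = best[1] - k * best[2]
        eligible = [t for t in trees if t[1] >= threshold]
    else:
        best = min(trees, key=lambda t: t[1])
        threshold = best[1] + k * best[2]
        eligible = [t for t in trees if t[1] <= threshold]
    # Among the eligible trees, return the one with the fewest terminal nodes.
    return min(eligible, key=lambda t: t[0])


# Hypothetical tree sequences: (terminal nodes, criterion, standard error).
r2_trees = [(5, 0.70, 0.03), (9, 0.78, 0.02), (13, 0.81, 0.02),
            (17, 0.835, 0.02), (21, 0.84, 0.02), (25, 0.838, 0.02)]
mad_trees = [(5, 4.0, 0.2), (9, 3.2, 0.2), (13, 3.0, 0.2), (17, 2.9, 0.2)]

max_r2 = select_within_k_se(r2_trees, k=0, higher_is_better=True)
one_se_r2 = select_within_k_se(r2_trees, k=1, higher_is_better=True)
two_se_r2 = select_within_k_se(r2_trees, k=2, higher_is_better=True)
one_se_mad = select_within_k_se(mad_trees, k=1, higher_is_better=False)
```

With these made-up values, a larger k admits smaller trees, which mirrors the trade-off between tree size and prediction accuracy that the buttons offer.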

Click Create Tree to create and store results for an alternative tree that you choose. The selections for results and storage are the same as for the original tree. The graphs and tables for the alternative tree are in a new output tab. The stored columns are in the worksheet with the original data.
