By default, Minitab Statistical Software produces results for the smallest
tree with a criterion value within one standard error of the best value. The
criterion is either the least squared error or the least absolute deviation,
depending on your choice. Minitab lets you explore other trees from the
sequence that led to the identification of the optimal tree. Typically, you
select an alternative tree for one of the following two reasons:
- The tree that Minitab selects
is part of a pattern where the criterion improves. One or more trees that have
a few more nodes are part of the same pattern. Typically, you want to make
predictions from a tree with as much prediction accuracy as possible.
- The tree that Minitab selects
is part of a pattern where the criterion is relatively flat. One or more trees
with similar model summary statistics have far fewer nodes than the optimal
tree. Typically, a tree with fewer terminal nodes gives a clearer picture of
how each predictor variable affects the response values. A smaller tree also
makes it easier to identify a few target groups for further studies. If the
difference in prediction accuracy for a smaller tree is negligible, you can
also use the smaller tree to evaluate the relationships between the response
and the predictor variables.
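The default selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not Minitab's implementation: the `TreeSummary` class, the `select_within_one_se` function, and all of the numbers are hypothetical, and the sketch assumes an error criterion where lower values are better and where the standard error of the best tree sets the tolerance.

```python
from dataclasses import dataclass


@dataclass
class TreeSummary:
    terminal_nodes: int
    criterion: float  # error criterion; lower is better (assumption)
    std_error: float  # standard error of the criterion estimate


def select_within_one_se(trees):
    """Return the smallest tree whose criterion is within one
    standard error of the best (lowest) criterion in the sequence."""
    best = min(trees, key=lambda t: t.criterion)
    threshold = best.criterion + best.std_error
    candidates = [t for t in trees if t.criterion <= threshold]
    return min(candidates, key=lambda t: t.terminal_nodes)


# Hypothetical tree sequence; these values are made up for illustration.
sequence = [
    TreeSummary(terminal_nodes=5, criterion=0.52, std_error=0.02),
    TreeSummary(terminal_nodes=11, criterion=0.41, std_error=0.02),
    TreeSummary(terminal_nodes=17, criterion=0.37, std_error=0.02),
    TreeSummary(terminal_nodes=21, criterion=0.35, std_error=0.02),
    TreeSummary(terminal_nodes=30, criterion=0.34, std_error=0.02),
]

selected = select_within_one_se(sequence)
print(selected.terminal_nodes)
```

In this made-up sequence, the 30-node tree has the best criterion, but the 21-node tree is the smallest one inside the one-standard-error band, so it is the tree the rule returns. The 17-node tree falls just outside the band, which is the kind of near miss that can make an alternative tree worth a look.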
For example, the following plot accompanies results for the tree with 21
nodes. Other trees in the sequence have similar R² values.
The 17-node tree has an R² value that is almost as high as that of the
21-node tree. Typically, a tree with fewer terminal nodes gives a clearer
picture of how each predictor variable affects the response values. A smaller
tree also makes it easier to identify a few target groups for further studies.
If the reduction in prediction accuracy from a much smaller tree is negligible,
you can use the much smaller tree to evaluate the relationships between the
response and the predictor variables.
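One way to make "negligible" concrete is to list the smaller trees whose R² falls within a chosen tolerance of the selected tree. The following Python sketch assumes that you have copied the number of terminal nodes and the R² value for each tree into a dictionary. The function name `smaller_alternatives`, the tolerance, and the R² values are all hypothetical and do not come from Minitab output.

```python
def smaller_alternatives(r2_by_nodes, selected_nodes, max_r2_loss):
    """List trees with fewer terminal nodes than the selected tree
    whose R-squared is at most max_r2_loss below the selected tree's."""
    selected_r2 = r2_by_nodes[selected_nodes]
    return sorted(
        nodes
        for nodes, r2 in r2_by_nodes.items()
        if nodes < selected_nodes and selected_r2 - r2 <= max_r2_loss
    )


# Hypothetical R-squared values keyed by number of terminal nodes.
r2_by_nodes = {5: 0.48, 11: 0.59, 17: 0.64, 21: 0.65, 30: 0.655}

# Accept a loss of at most 0.02 in R-squared (an arbitrary tolerance).
alternatives = smaller_alternatives(r2_by_nodes, selected_nodes=21, max_r2_loss=0.02)
print(alternatives)
```

With these made-up values, only the 17-node tree qualifies. The tolerance is a judgment call that depends on how much prediction accuracy you are willing to trade for a simpler tree.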
In addition to the criterion values for alternative trees, you can also
compare the complexity of trees and the usefulness of different nodes. Consider
the following examples of reasons that an analyst might choose a particular
tree when it does not sacrifice performance compared to other trees:
- The analyst chooses a smaller
tree that provides a clearer view of the most important variables.
- The analyst chooses a tree
because the splits are on variables that are easier to measure than the
variables in another tree.
- The analyst chooses a tree
because a particular terminal node is of interest.