Find definitions and interpretations for every statistic in the
Model summary table.

Total predictors is the number of predictors available for the tree. It is the sum of the continuous predictors and the categorical predictors that you specify.

Important predictors is the number of predictors that are important in the tree. Important predictors are the variables that are used as primary or surrogate splitters.

You can use the Relative Variable Importance plot to display the predictors in order of relative importance. For instance, if 10 of 20 predictors are important in the tree, the Relative Variable Importance plot displays the variables in importance order.

A terminal node is a final node that cannot be split further.

You can use terminal node information to make predictions.

The minimum terminal node size is the number of cases in the terminal node that has the fewest cases.

By default, Minitab sets the minimum number of cases allowed for a terminal node as 3 cases; however, the minimum terminal node size in a tree can be larger than the minimum number that the analysis allows. You can change this threshold value in the Options subdialog box.

R^{2} is the percentage of variation in the response that the model
explains. Outliers have a greater effect on R^{2} than on MAD and MAPE.

When you use a validation method, the table includes an R^{2}
statistic for the training data set and an R^{2} statistic for the test
data set. When the validation method is k-fold cross-validation, each fold
serves as the test data set in turn, and the tree is built without that fold.
The test R^{2} statistic is typically a better measure of how the model works
for new data.
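The fold rotation described above can be sketched in a few lines of Python. This is a minimal illustration, not Minitab's implementation: the function name `k_fold_splits` is hypothetical, and the interleaved assignment of cases to folds is an assumption made to keep the example deterministic (software typically assigns cases to folds at random).

```python
def k_fold_splits(n_cases, k):
    """Yield (train_indices, test_indices) once for each of the k folds.

    Assumption for illustration: case i belongs to fold i % k.
    """
    indices = list(range(n_cases))
    for fold in range(k):
        # The held-out fold is the test data set for this round.
        test = indices[fold::k]
        held_out = set(test)
        # The tree would be built on every case that is not in the fold.
        train = [i for i in indices if i not in held_out]
        yield train, test


splits = list(k_fold_splits(n_cases=10, k=5))
for train, test in splits:
    print("build on", train, "-> test on", test)
```

Each case is in the test data set exactly once, so every test prediction comes from a tree that did not use that case.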

Use R^{2} to determine how well the model fits your data. The
higher the R^{2} value, the better the model fits your data.
R^{2} is always between 0% and 100%.

You can graphically illustrate the meaning of different R^{2}
values. The first plot illustrates a simple regression model that explains
85.5% of the variation in the response. The second plot illustrates a model
that explains 22.6% of the variation in the response. The more variation that
is explained by the model, the closer the data points fall to the fitted
values. Theoretically, if a model can explain 100% of the variation, the fitted
values would always equal the observed values and all of the data points would
fall on the line y = x.

A test R^{2} that is substantially less than the training
R^{2} indicates that the tree might not predict the response values for
new cases as well as the tree fits the current data set.
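As a rough illustration of the comparison above, the following Python sketch computes R^{2} with the common definition 1 - SSE/SST, expressed as a percentage. The definition, the function name `r_squared`, and all data values are assumptions for the example; the software's exact calculation may differ.

```python
def r_squared(actual, fitted):
    """Return R-squared as a percentage, using 100 * (1 - SSE / SST)."""
    mean_actual = sum(actual) / len(actual)
    # SSE: variation that the fitted values leave unexplained.
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    # SST: total variation of the response around its mean.
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 100.0 * (1.0 - sse / sst)


# Hypothetical response values and fitted values.
train_actual = [10.0, 12.0, 15.0, 19.0, 24.0]
train_fitted = [10.5, 11.5, 15.0, 19.5, 23.5]
test_actual = [11.0, 14.0, 18.0, 22.0]
test_fitted = [13.0, 12.0, 20.0, 19.0]

train_r2 = r_squared(train_actual, train_fitted)
test_r2 = r_squared(test_actual, test_fitted)
print(f"training R-sq = {train_r2:.1f}%, test R-sq = {test_r2:.1f}%")
```

In this made-up data the test value is well below the training value, which is the pattern that suggests the tree fits the current data better than it predicts new cases.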

The root mean square error (RMSE) measures the accuracy of the tree. Outliers have a greater effect on RMSE than on MAD and MAPE.

When you use a validation method, the table includes an RMSE statistic for the training data set and an RMSE statistic for the test data set. When the validation method is k-fold cross-validation, each fold serves as the test data set in turn, and the tree is built without that fold. The test RMSE statistic is typically a better measure of how the model works for new data.

Use RMSE to compare the fits of different trees. Smaller values indicate a better fit. A test RMSE that is substantially greater than the training RMSE indicates that the tree might not predict the response values for new cases as well as the tree fits the current data set.

The mean square error (MSE) measures the accuracy of the tree. Outliers have a greater effect on MSE than on MAD and MAPE.

When you use a validation method, the table includes an MSE statistic for the training data set and an MSE statistic for the test data set. When the validation method is k-fold cross-validation, each fold serves as the test data set in turn, and the tree is built without that fold. The test MSE statistic is typically a better measure of how the model works for new data.

Use MSE to compare the fits of different trees. Smaller values indicate a better fit. A test MSE that is substantially greater than the training MSE indicates that the tree might not predict the response values for new cases as well as the tree fits the current data set.
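For reference, MSE and RMSE can be sketched as follows. This assumes the usual definitions (the average squared error over all cases, and its square root); the function names and data are hypothetical, and the software's exact divisor is not stated in this topic.

```python
import math


def mse(actual, fitted):
    """Mean square error: the average of the squared errors."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)


def rmse(actual, fitted):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(actual, fitted))


# Hypothetical actual and fitted response values.
actual = [3.0, 5.0, 7.0, 9.0]
fitted = [2.0, 5.0, 8.0, 11.0]

print("MSE  =", mse(actual, fitted))
print("RMSE =", round(rmse(actual, fitted), 4))
```

Because RMSE is the square root of MSE, the two statistics always rank trees in the same order; RMSE is in the units of the response, while MSE is in squared units.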

The mean absolute deviation (MAD) expresses accuracy in the same units as
the data, which helps conceptualize the amount of error. Outliers have less of
an effect on MAD than on R^{2}, RMSE, and MSE.

When you use a validation method, the table includes an MAD statistic for the training data set and an MAD statistic for the test data set. When the validation method is k-fold cross-validation, each fold serves as the test data set in turn, and the tree is built without that fold. The test MAD statistic is typically a better measure of how the model works for new data.

Use MAD to compare the fits of different trees. Smaller values indicate a better fit. A test MAD that is substantially greater than the training MAD indicates that the tree might not predict the response values for new cases as well as the tree fits the current data set.
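The following sketch illustrates why outliers affect MAD less than RMSE. It assumes MAD is the average absolute error and RMSE is the square root of the average squared error; the function names and data are hypothetical.

```python
import math


def mad(actual, fitted):
    """Mean absolute deviation: the average of the absolute errors."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def rmse(actual, fitted):
    """Root mean square error, shown here for comparison only."""
    squared = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return math.sqrt(squared / len(actual))


fitted = [11.0, 11.0, 15.0, 15.0, 19.0]
clean = [10.0, 12.0, 14.0, 16.0, 18.0]         # every error has size 1
with_outlier = [10.0, 12.0, 14.0, 16.0, 28.0]  # last error has size 9

for label, actual in (("clean", clean), ("outlier", with_outlier)):
    print(label, "MAD =", mad(actual, fitted),
          "RMSE =", round(rmse(actual, fitted), 3))
```

With one outlying case, the MAD rises from 1 to 2.6 while the RMSE rises from 1 to about 4.12, because squaring gives the large error more weight.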

The mean absolute percent error (MAPE) expresses the size of the error as a
percentage of the actual value. Because the MAPE is a percentage, it can be
easier to understand than the other accuracy measure statistics. For example,
if the MAPE is 0.05, then the average ratio between the absolute error and the
actual value across all cases is 5%. Outliers have less of an effect on MAPE
than on R^{2}, RMSE, and MSE.

However, sometimes you may see a very large MAPE value even though the tree appears to fit the data well. Examine the fitted vs actual response value plot to see if any data values are close to 0. Because MAPE divides the absolute error by the actual data, values close to 0 can greatly inflate the MAPE.
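The inflation effect can be seen in a small sketch. This assumes MAPE is the average of |actual - fitted| / |actual| over all cases, reported as a proportion; the function name and data are hypothetical.

```python
def mape(actual, fitted):
    """Mean absolute percent error as a proportion (0.05 means 5%).

    Every actual value must be nonzero, because each error is divided
    by its actual value.
    """
    ratios = [abs((a - f) / a) for a, f in zip(actual, fitted)]
    return sum(ratios) / len(ratios)


# Hypothetical data: every fitted value is off by 5% of the actual value.
actual = [100.0, 200.0, 400.0, 500.0]
fitted = [95.0, 210.0, 380.0, 525.0]
print("MAPE =", round(mape(actual, fitted), 4))

# Add one case with an actual value close to 0. The absolute error is
# only 0.5, but it is 5 times the size of the actual value.
actual_near_zero = actual + [0.1]
fitted_near_zero = fitted + [0.6]
print("MAPE with near-zero case =",
      round(mape(actual_near_zero, fitted_near_zero), 4))
```

A single case with a small absolute error moves the MAPE from about 0.05 to about 1.04 in this example, which is why a large MAPE alone does not prove that the tree fits poorly.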

When you use a validation method, the table includes an MAPE statistic for the training data set and an MAPE statistic for the test data set. When the validation method is k-fold cross-validation, each fold serves as the test data set in turn, and the tree is built without that fold. The test MAPE statistic is typically a better measure of how the model works for new data.

Use MAPE to compare the fits of different trees. Smaller values indicate a better fit. A test MAPE that is substantially greater than the training MAPE indicates that the tree might not predict the response values for new cases as well as the tree fits the current data set.