Methods and formulas for the model summary in CART® Regression

Important predictors

The number of predictors with positive relative importance.

Any regression tree is a collection of splits. Each split improves the tree, and each split also has surrogate splits that improve the tree. The importance of a variable is the sum of all of its improvements, counted whenever the tree uses the variable to split a node or as a surrogate to split a node when another variable has a missing value. The following formula gives the improvement at a single node:

$$\Delta I(s, t) = I(t) - p_{Left}\, I(t_{Left}) - p_{Right}\, I(t_{Right})$$

The values of $I(t)$, $p_{Left}$, and $p_{Right}$ depend on the criterion for splitting the nodes. For more information, go to Node splitting methods in CART® Regression.
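As an illustration, the improvement at a single node can be sketched in Python. This sketch assumes a least-squares style criterion in which I(t) is the mean squared deviation of the responses from the node mean, and pLeft and pRight are the fractions of the node's records sent to each child. The function names and the sample data are hypothetical, and other splitting criteria would define these quantities differently.

```python
def node_impurity(y):
    """Assumed I(t): mean squared deviation of the responses from the node mean."""
    mean_y = sum(y) / len(y)
    return sum((v - mean_y) ** 2 for v in y) / len(y)


def split_improvement(parent, left, right):
    """Improvement at one node: I(t) - pLeft * I(tLeft) - pRight * I(tRight)."""
    p_left = len(left) / len(parent)    # assumed: share of records going left
    p_right = len(right) / len(parent)  # assumed: share of records going right
    return (node_impurity(parent)
            - p_left * node_impurity(left)
            - p_right * node_impurity(right))


# Hypothetical responses in a parent node and its two children.
parent = [1.0, 2.0, 9.0, 10.0]
left, right = [1.0, 2.0], [9.0, 10.0]
improvement = split_improvement(parent, left, right)
print(improvement)
```

A split that separates low responses from high responses, as in this sample, leaves little variation inside each child, so most of the parent's impurity counts as improvement.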

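The relative importance of the qth predictor scales its importance by the importance of the most important predictor:

$$\text{Relative importance}(x_q) = \frac{\text{Importance}(x_q)}{\max_j \text{Importance}(x_j)} \times 100\%$$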
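The scaling step can be sketched as follows. The sketch assumes that each predictor's raw importance has already been summed over its primary and surrogate splits, and that relative importance is reported as a percentage of the largest value. The predictor names and importance values are made up for illustration.

```python
def relative_importance(importance):
    """Scale each raw importance by the largest one, as a percentage."""
    top = max(importance.values())
    if top <= 0:
        # No predictor improved the tree, so nothing has positive importance.
        return {name: 0.0 for name in importance}
    return {name: 100.0 * value / top for name, value in importance.items()}


# Hypothetical raw importances (sums of improvements per predictor).
raw = {"x1": 8.0, "x2": 2.0, "x3": 0.0}
rel = relative_importance(raw)

# Important predictors: those with positive relative importance.
important_count = sum(1 for value in rel.values() if value > 0)
print(rel, important_count)
```

With these sample values, the most important predictor scores 100, and the predictor that never improves the tree is excluded from the count of important predictors.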

R-squared

R² is also known as the coefficient of determination.

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$$
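A minimal Python sketch of this statistic, assuming the usual definition of the coefficient of determination as one minus the ratio of the residual sum of squares to the total sum of squares about the mean response. The function name and the data are hypothetical.

```python
def r_squared(observed, fitted):
    """1 - SS_residual / SS_total, with SS_total taken about the mean response."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed responses and fitted values from a tree.
observed = [3.0, 5.0, 7.0, 9.0]
fitted = [4.0, 4.0, 8.0, 8.0]
r2 = r_squared(observed, fitted)
print(r2)
```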

Root mean squared error (RMSE)

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{N}}$$

Mean squared error (MSE)

$$MSE = \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{N}$$

Mean absolute deviation (MAD)

$$MAD = \frac{\sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|}{N}$$

Mean absolute percent error (MAPE)

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$
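The four error measures can be computed together, as in the sketch below. It assumes the standard definitions: MSE averages the squared residuals, RMSE is its square root, MAD averages the absolute residuals, and MAPE averages the absolute residuals relative to the observed values. MAPE is returned here as a proportion, which is an assumption; multiply by 100 for a percentage. The function name and the data are hypothetical.

```python
import math


def error_metrics(observed, fitted):
    """Return MSE, RMSE, MAD, and MAPE for paired observed and fitted responses."""
    n = len(observed)
    residuals = [y - f for y, f in zip(observed, fitted)]
    mse = sum(r ** 2 for r in residuals) / n
    rmse = math.sqrt(mse)
    mad = sum(abs(r) for r in residuals) / n
    # MAPE is undefined when any observed response equals zero.
    mape = sum(abs(r / y) for r, y in zip(residuals, observed)) / n
    return {"MSE": mse, "RMSE": rmse, "MAD": mad, "MAPE": mape}


# Hypothetical observed responses and fitted values.
observed = [3.0, 5.0, 7.0, 9.0]
fitted = [4.0, 4.0, 8.0, 8.0]
metrics = error_metrics(observed, fitted)
print(metrics)
```

Because MSE and RMSE square the residuals, they penalize large errors more heavily than MAD does, while MAPE expresses each error relative to the size of the observed response.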

Notation

Term          Description
$y_i$         ith observed response value
$\bar{y}$     mean response
$\hat{y}_i$   ith fitted response
$N$           number of records