Methods and formulas for the percent of error statistics due to largest residuals in CART® Regression


Each percent of error statistic depends on the percentage of the largest residuals that the calculation includes. The following formulas assume that the residuals are sorted in decreasing order of absolute value, so that i = 1 represents the residual with the greatest absolute value and i = N represents the residual with the least absolute value.
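The ordering described above can be sketched in a few lines of Python. This is only an illustration: the function name `order_by_abs_residual` and the sample values are hypothetical, and the residual is assumed to be the observed response minus the fitted response.

```python
def order_by_abs_residual(observed, fitted):
    """Return (residual, observed, fitted) triples, largest |residual| first.

    Assumption: residual = observed response - fitted response.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    triples = [(y - f, y, f) for y, f in zip(observed, fitted)]
    # sorted() is stable, so tied absolute residuals keep their record order.
    return sorted(triples, key=lambda t: abs(t[0]), reverse=True)


# Hypothetical data: four records with their fitted values.
observed = [10.0, 12.0, 9.0, 20.0]
fitted = [11.0, 12.5, 9.0, 14.0]
ordered = order_by_abs_residual(observed, fitted)
```

After sorting, position i = 1 in the formulas corresponds to `ordered[0]` and position i = N corresponds to `ordered[-1]`.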

When you use k-fold cross validation, the training statistics include the fitted values from the final tree for the full data set. The test statistics use fitted values from the validation process that can have different trees for each fold.
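The difference between the two sets of fitted values can be illustrated with a minimal sketch. It does not fit a regression tree: as a loudly stated simplification, the "model" is just the mean of the responses it was trained on, and folds are assigned by record index modulo k. Both choices, and the function name, are assumptions made for illustration only.

```python
def training_and_test_fits(y, k):
    """Contrast full-data fitted values with out-of-fold fitted values.

    Stand-in model (assumption): predict the mean of the training responses.
    Fold assignment (assumption): record i belongs to fold i % k.
    """
    n = len(y)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of records")

    # Training fits: one model fitted to the full data set.
    full_mean = sum(y) / n
    training = [full_mean] * n

    # Test fits: each record is predicted by a model that excluded its fold,
    # so different folds can yield different models.
    test = [None] * n
    for fold in range(k):
        kept = [y[i] for i in range(n) if i % k != fold]
        fold_mean = sum(kept) / len(kept)
        for i in range(n):
            if i % k == fold:
                test[i] = fold_mean
    return training, test


training, test = training_and_test_fits([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], k=3)
```

In this sketch every record shares one training fit, while the test fits vary by fold because each fold's model was built without that fold's records.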

When you use a test data set for validation, the test statistics use fitted values for the test data set only.

% MSE
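Assuming that % MSE is the share of the total squared error contributed by the c largest residuals, with the residuals ordered as described above and \hat{y}_i denoting the i th fitted response, the formula can be written as:

```latex
\[
\%\,\mathrm{MSE} \;=\;
\frac{\sum_{i=1}^{c} \left( y_i - \hat{y}_i \right)^2}
     {\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}
\times 100
\]
```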

% MAD
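Assuming that % MAD is the share of the total absolute error contributed by the c largest residuals, with the residuals ordered as described above and \hat{y}_i denoting the i th fitted response, the formula can be written as:

```latex
\[
\%\,\mathrm{MAD} \;=\;
\frac{\sum_{i=1}^{c} \left| y_i - \hat{y}_i \right|}
     {\sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|}
\times 100
\]
```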

% MAPE
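Assuming that % MAPE is the share of the total absolute percentage error contributed by the c largest residuals, with the residuals ordered by absolute value as described above and \hat{y}_i denoting the i th fitted response, the formula can be written as follows. It is defined only when every observed response y_i is nonzero.

```latex
\[
\%\,\mathrm{MAPE} \;=\;
\frac{\sum_{i=1}^{c} \left| \dfrac{y_i - \hat{y}_i}{y_i} \right|}
     {\sum_{i=1}^{N} \left| \dfrac{y_i - \hat{y}_i}{y_i} \right|}
\times 100
\]
```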

Notation

Term    Description
c       count of largest residuals for the percentage
y_i     i th observed response value
ȳ       mean response
ŷ_i     i th fitted response
N       number of records
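The terms above can be combined into a short Python sketch. It assumes that each statistic is the share, in percent, of the corresponding total error (squared, absolute, or absolute percentage) that comes from the c records with the largest absolute residuals. The function name, the dictionary keys, and the sample data are hypothetical, and c is taken as an input because the rule that converts a percentage into c is not given here.

```python
def percent_error_due_to_largest(observed, fitted, c):
    """Percent of each error total contributed by the c largest residuals.

    Assumption: records are ordered by |observed - fitted|, largest first,
    and the same ordering is used for all three statistics.
    """
    n = len(observed)
    if len(fitted) != n:
        raise ValueError("observed and fitted must have the same length")
    if not 1 <= c <= n:
        raise ValueError("c must be between 1 and N")
    if any(y == 0 for y in observed):
        raise ValueError("percentage error is undefined for a zero response")

    pairs = sorted(zip(observed, fitted),
                   key=lambda p: abs(p[0] - p[1]), reverse=True)
    squared = [(y - f) ** 2 for y, f in pairs]
    absolute = [abs(y - f) for y, f in pairs]
    abs_pct = [abs((y - f) / y) for y, f in pairs]

    def share(terms):
        total = sum(terms)
        if total == 0:
            raise ValueError("all residuals are zero; the share is undefined")
        return 100.0 * sum(terms[:c]) / total

    return {
        "pct_mse": share(squared),
        "pct_mad": share(absolute),
        "pct_mape": share(abs_pct),
    }


# Hypothetical data: the single largest residual (c = 1) out of N = 4 records.
result = percent_error_due_to_largest(
    observed=[10.0, 12.0, 9.0, 20.0],
    fitted=[11.0, 12.5, 9.0, 14.0],
    c=1,
)
```

With c = N, every statistic in this sketch equals 100, which is a quick sanity check on the ordering and the denominators.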