Any regression tree is a collection of splits. Each split improves the tree. Each split also has surrogate splits, which also improve the tree. The importance of a variable is the sum of its improvements over every node where the tree uses the variable either to split the node or as a surrogate to split the node when another variable has a missing value. The following formula gives the improvement at a single node:

ΔI(t) = I(t) − pLeft I(tLeft) − pRight I(tRight)

Here t is the node being split, and tLeft and tRight are its left and right child nodes.
The values of I(t), pLeft, and pRight depend on the criterion for splitting the nodes. For more information, go to Node splitting methods in CART® Regression.
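The improvement at a node and the resulting variable importance can be sketched in a few lines of Python. This is a minimal illustration, not the software's implementation. It assumes the standard CART form I(t) − pLeft·I(tLeft) − pRight·I(tRight), a least-squares impurity (mean squared deviation from the node mean), and pLeft and pRight taken as the fractions of the node's records sent to each child. The function names are hypothetical.

```python
from statistics import fmean


def node_impurity(y):
    """Assumed least-squares impurity: mean squared deviation from the node mean."""
    mean = fmean(y)
    return fmean([(v - mean) ** 2 for v in y])


def split_improvement(y_left, y_right):
    """Improvement I(t) - pLeft*I(tLeft) - pRight*I(tRight) for one split of node t."""
    y_parent = list(y_left) + list(y_right)
    # Assumption: proportions are relative to the records in the parent node.
    p_left = len(y_left) / len(y_parent)
    p_right = len(y_right) / len(y_parent)
    return (
        node_impurity(y_parent)
        - p_left * node_impurity(y_left)
        - p_right * node_impurity(y_right)
    )


def variable_importance(improvements):
    """Sum the improvements credited to each variable.

    improvements: iterable of (variable_name, improvement) pairs, one pair
    per node where the variable is the primary or a surrogate splitter.
    """
    totals = {}
    for name, gain in improvements:
        totals[name] = totals.get(name, 0.0) + gain
    return totals
```

For example, `split_improvement([1, 2, 3], [7, 8, 9])` separates two well-spaced groups, so almost all of the parent node's impurity is removed by the split. A different splitting criterion would change `node_impurity` but leave the rest of the calculation the same.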
R² is also known as the coefficient of determination.
|Term|Description|
|---|---|
|yi|ith observed response value|
|ŷi|ith fitted response|
|N|number of records|
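As a rough illustration of how R² relates the observed and fitted responses, the sketch below assumes the usual definition: one minus the ratio of the residual sum of squares to the total sum of squares around the mean response. The function name `r_squared` is hypothetical, and the software may compute the statistic in an equivalent but differently arranged form.

```python
def r_squared(y, y_fitted):
    """Coefficient of determination under the assumed definition 1 - SSE/SST."""
    n = len(y)  # N, the number of records
    y_bar = sum(y) / n  # mean observed response
    # Residual sum of squares: observed minus fitted responses.
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_fitted))
    # Total sum of squares: observed responses around their mean.
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst
```

With observed values `[1, 2, 3, 4]` and fitted values `[1.5, 1.5, 3.5, 3.5]` (two terminal nodes that each predict their own mean), the residual sum of squares is 1 and the total sum of squares is 5, so the function returns 0.8.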