Optimization of hyperparameters for Fit Model and Discover Key Predictors with TreeNet® Regression

Note

This command is available with the Predictive Analytics Module. Click here for more information about how to activate the module.

Use the results to compare how well models perform with different settings for the hyperparameters. To evaluate additional values of the hyperparameters, click Tune Hyperparameters to Identify a Better Model.

Optimal number of trees

The optimal number of trees usually differs at each step. When the optimal number is close to the maximum number of trees for the analysis, increasing the number of trees is more likely to improve that model than a model whose optimal number is far from the maximum. Consider whether to explore further any alternative model that seems likely to improve.
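As a rough illustration of this check, the sketch below finds the tree count with the smallest test loss and flags a model whose optimal count sits near the maximum. The loss values are made up, and the 90% cutoff for "close to the maximum" is an assumption for illustration only; the software does not define a specific threshold here.

```python
def optimal_trees(test_loss):
    """Return the 1-based tree count that has the smallest test loss."""
    best_index = min(range(len(test_loss)), key=test_loss.__getitem__)
    return best_index + 1


def near_maximum(optimal, max_trees, threshold=0.9):
    """Flag an optimal tree count at or above `threshold` of the maximum.

    The 0.9 default is a hypothetical cutoff, not a documented rule.
    """
    return optimal >= threshold * max_trees


# Hypothetical test-set loss after 1, 2, ..., 10 trees for two models.
still_improving = [9.0, 7.5, 6.4, 5.6, 5.0, 4.6, 4.3, 4.1, 4.0, 3.95]
levelled_off = [9.0, 6.0, 4.5, 4.2, 4.3, 4.5, 4.8, 5.0, 5.3, 5.5]

for name, losses in [("still_improving", still_improving),
                     ("levelled_off", levelled_off)]:
    best = optimal_trees(losses)
    print(name, best, near_maximum(best, max_trees=len(losses)))
```

In this sketch the first model is still improving at the last tree, so it is the better candidate for a rerun with a larger maximum number of trees.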

R-squared (%)

R² is the percentage of variation in the response that the model explains. Outliers have a greater effect on R² than on MAD.

When you use the squared error loss function or the Huber loss function, the table includes the R² value for each model. The results that follow are for the model with the highest R² value.
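The following sketch computes R-squared as a percentage, assuming the usual definition of 1 minus the ratio of the error sum of squares to the total sum of squares. The function name and the data are hypothetical, and the software's exact calculation may differ in detail.

```python
def r_squared_percent(actual, predicted):
    """Percentage of variation in the response explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 100.0 * (1.0 - ss_error / ss_total)


# Hypothetical response values and model predictions.
actual = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 12.0, 13.0, 16.0, 18.0]

print(round(r_squared_percent(actual, predicted), 2))
```

Because the errors are squared before they are summed, a single large error contributes disproportionately to the error sum of squares.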

MAD

The mean absolute deviation (MAD) is the average of the absolute values of the differences between the predicted values and the actual values. The smaller the MAD, the better the model fits the data. The MAD expresses accuracy in the same units as the data, which helps conceptualize the amount of error. Outliers have less of an effect on MAD than on R².

When you use the absolute deviation loss function, the table includes the MAD value for each model. The full results that follow the table are for the model with the smallest MAD value.
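A minimal sketch of the MAD calculation follows, with made-up data. It also compares MAD with the mean squared error (the quantity that squared-error measures such as R-squared build on) to show how one outlier affects each. The function names are hypothetical.

```python
def mean_absolute_deviation(actual, predicted):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference, shown only for comparison with MAD."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


predicted = [21.0, 21.0, 25.0, 25.0]
clean = [20.0, 22.0, 24.0, 26.0]         # every error is 1 unit
with_outlier = [20.0, 22.0, 24.0, 34.0]  # last error is 9 units

for name, actual in [("clean", clean), ("with_outlier", with_outlier)]:
    print(name,
          mean_absolute_deviation(actual, predicted),
          mean_squared_error(actual, predicted))
```

With these numbers the outlier triples the MAD (from 1 to 3) but multiplies the mean squared error by 21 (from 1 to 21).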

Learning rate

Low learning rates weight each new tree in the model less than higher learning rates do. A model with a low learning rate has less chance of overfitting the training data set, but generally uses more trees to reach the optimal number of trees.
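To see why a lower learning rate tends to need more trees, consider the toy sketch below. It is not the TreeNet® algorithm: each "tree" is replaced by an idealized learner that predicts the current residual exactly, and the learning rate scales that prediction before it is added to the model. The function name, target, and tolerance are assumptions for illustration.

```python
def steps_to_fit(target, learning_rate, tolerance=0.01, max_steps=10_000):
    """Count the additive steps needed to bring the residual within tolerance."""
    prediction = 0.0
    for step in range(1, max_steps + 1):
        residual = target - prediction
        # Each new "tree" is down-weighted by the learning rate.
        prediction += learning_rate * residual
        if abs(target - prediction) <= tolerance:
            return step
    return max_steps


fast = steps_to_fit(target=10.0, learning_rate=0.5)
slow = steps_to_fit(target=10.0, learning_rate=0.1)
print(fast, slow)
```

Each step removes only a fraction of the remaining residual equal to the learning rate, so the lower rate takes several times as many steps to reach the same tolerance.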

Subsample fraction

The subsample fraction is the proportion of the data that the analysis uses to build each tree.
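The sketch below shows one way to draw such a subsample for each tree. It assumes that rows are sampled at random without replacement and that a fresh sample is drawn for every tree; the source does not state the sampling scheme, and the function name and sizes are hypothetical.

```python
import random


def subsample_rows(n_rows, fraction, rng):
    """Pick the row indices used to build one tree."""
    size = max(1, round(n_rows * fraction))
    # Assumption: sampling without replacement.
    return sorted(rng.sample(range(n_rows), size))


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [subsample_rows(n_rows=10, fraction=0.5, rng=rng) for _ in range(3)]

for tree_number, rows in enumerate(samples, start=1):
    print(tree_number, rows)
```

With a subsample fraction of 0.5 and 10 rows, each tree in this sketch is built from 5 distinct rows.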

Maximum terminal nodes per tree

TreeNet® Regression combines many small CART® trees into a powerful model. You can specify either the maximum number of terminal nodes or the maximum tree depth for these smaller CART® trees. Trees with more terminal nodes can model more complex interactions. In general, values above 12 could slow the analysis without much benefit to the model.

Maximum tree depth

TreeNet® Regression combines many small CART® trees into a powerful model. You can specify either the maximum number of terminal nodes or the maximum tree depth for these smaller CART® trees. Deeper trees can model more complex interactions. Values from 4 to 6 are adequate for many datasets.
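The two settings are related because each split in a binary tree produces two child nodes. The sketch below gives the largest number of terminal nodes that a binary tree of a given maximum depth can have. It assumes that the root node counts as depth 1; check the convention in your software, because under a root-at-depth-0 convention the counts double. The function name is hypothetical.

```python
def max_terminal_nodes(max_depth, root_depth=1):
    """Upper bound on terminal nodes for a binary tree of the given depth.

    `root_depth` is the depth assigned to the root node (assumed to be 1).
    """
    if max_depth < root_depth:
        raise ValueError("max_depth must be at least root_depth")
    return 2 ** (max_depth - root_depth)


for depth in (4, 5, 6):
    print(depth, max_terminal_nodes(depth))
```

A depth limit is only an upper bound on the number of terminal nodes: a tree that stops splitting early on some branches has fewer.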