Run Tune Hyperparameters in the results.
This command is available with the Predictive Analytics Module.
The performance of TreeNet® models is generally sensitive to the values of the learning rate, the subsample fraction, and the complexity of the individual trees that form the model. In the results for a model, click Tune Hyperparameters to evaluate multiple values of these hyperparameters and learn which combination produces the best value of an accuracy criterion, such as the maximum R2 value. Better values of these hyperparameters can significantly improve prediction accuracy, so the exploration of different values is a common step in the analysis.
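Because TreeNet is a gradient boosting method, this tuning step amounts to a small grid search over boosting hyperparameters. The sketch below illustrates the idea with scikit-learn's GradientBoostingRegressor standing in for TreeNet; the library, the simulated data, and the candidate values are assumptions for illustration, not Minitab's implementation.

```python
# Minimal sketch: grid search over learning rate, subsample fraction, and
# tree depth, scored by R2. scikit-learn stands in for TreeNet here.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

best = None
for lr in (0.01, 0.1):            # learning rate
    for frac in (0.5, 0.7):       # subsample fraction
        for depth in (2, 4):      # complexity of the individual trees
            model = GradientBoostingRegressor(
                learning_rate=lr, subsample=frac, max_depth=depth,
                n_estimators=300, random_state=1,
            ).fit(X_train, y_train)
            r2 = r2_score(y_test, model.predict(X_test))
            if best is None or r2 > best[0]:
                best = (r2, lr, frac, depth)

print("Best R2 %.3f at learning_rate=%s, subsample=%s, max_depth=%s" % best)
```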
You can also adjust the number of predictors for node splitting and the number of trees that the model includes. Usually, the analysis works well when you consider all the predictors at every node. However, some data sets have associations among the predictors that lead to improved model performance when the analysis considers a different random subset of predictors at each node.
In general, 300 trees are enough to distinguish values of the hyperparameters. Increase the number of trees when the optimal number of trees for one or more models of interest is close to the maximum number of trees, because an optimum near the maximum indicates that additional trees are likely to improve the performance of the model.
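One way to check whether the optimal number of trees sits near the maximum is to score the model after each additional tree. The sketch below does this with scikit-learn's staged_predict as a stand-in for Minitab's optimal-tree selection; the 90% threshold is an illustrative heuristic, not Minitab's rule.

```python
# Sketch: find the tree count with the best test R2 and compare it to the
# maximum number of trees. An illustration, not Minitab's method.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

max_trees = 300
model = GradientBoostingRegressor(n_estimators=max_trees, learning_rate=0.05,
                                  random_state=1).fit(X_train, y_train)

# R2 on the test set after each additional tree
r2_by_stage = [r2_score(y_test, pred) for pred in model.staged_predict(X_test)]
best_n = int(np.argmax(r2_by_stage)) + 1

print(f"Optimal number of trees: {best_n} of {max_trees}")
if best_n > 0.9 * max_trees:  # heuristic threshold, not Minitab's rule
    print("Optimum is close to the maximum; consider more trees.")
```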
Specify one or more values for each hyperparameter to evaluate. The analysis evaluates the hyperparameters to find the combination with the best value of the accuracy criterion. If you enter no values for a hyperparameter, the evaluation uses the value for that hyperparameter from the model in the results. If the response is binary and the original model specifies the proportion of events and nonevents to sample, the evaluation always uses the proportions from the original model.
Learning rate: Enter up to 10 values. Eligible values are from 0.0001 to 1. Unless you select Evaluate complete parameter combinations, the evaluation of the learning rate is first. If the evaluation happens first, then the evaluation of each learning rate uses the least value of the subsample fraction.
Subsample fraction: Enter up to 10 values. Eligible values are greater than 0 and less than or equal to 1. Unless you select Evaluate complete parameter combinations, the evaluation of the subsample fraction is second. If the evaluation happens second, then the evaluation of each subsample fraction uses the best value of the learning rate from the first evaluation. A sketch of this evaluation order follows the note below.
Subsample fraction is disabled when the original model specifies the proportion of events and nonevents to sample for a binary response.
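A minimal sketch of this two-step order, again using scikit-learn as a stand-in for TreeNet: the learning rates are scored with the subsample fraction fixed at its least value, and then the subsample fractions are scored with the best learning rate fixed.

```python
# Sketch of the sequential (non-complete) evaluation order described above.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

def score(lr, frac):
    """Fit one model and return the accuracy criterion (test R2)."""
    m = GradientBoostingRegressor(learning_rate=lr, subsample=frac,
                                  n_estimators=300, random_state=1)
    return r2_score(y_test, m.fit(X_train, y_train).predict(X_test))

learning_rates = [0.01, 0.05, 0.1]
subsample_fractions = [0.5, 0.7, 0.9]

# Step 1: evaluate learning rates with the subsample fraction at its least value.
best_lr = max(learning_rates, key=lambda lr: score(lr, min(subsample_fractions)))

# Step 2: evaluate subsample fractions with the best learning rate from step 1.
best_frac = max(subsample_fractions, key=lambda f: score(best_lr, f))

print(f"best learning rate {best_lr}, best subsample fraction {best_frac}")
```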
Number of predictors for node splitting: Enter up to 3 values. Eligible values are between 1 and the total number of predictors. Usually, the analysis works well when you consider the total number of predictors. However, some data sets have associations among the predictors that lead to improved model performance when the analysis considers a smaller number of predictors for each node.
Maximum number of trees: Enter a value between 1 and 5000 to set the maximum number of trees to build. The default value of 300 usually provides useful results for the evaluation of the hyperparameter values.
If one or more models of interest have a number of trees that is close to the number of trees that you specify, then consider increasing the number of trees. The closer the number of trees is to the maximum, the more likely an increase in the number of trees is to improve the performance of the model.
For example, with 3 values of the learning rate, 3 values of the subsample fraction, and 2 values of the number of predictors, an analysis that does not evaluate the complete set of parameter combinations includes 3 + 3 + 2 = 8 models in the evaluation table. An analysis of all the parameter combinations evaluates 3 × 3 × 2 = 18 combinations and takes longer to calculate.
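The two model counts follow from how the evaluations combine: a sequential evaluation fits one model per candidate value, while a complete evaluation fits one model per combination. A two-line check:

```python
# Sequential evaluation fits sum(counts) models; complete fits prod(counts).
from math import prod

counts = {"learning_rate": 3, "subsample_fraction": 3, "predictors": 2}

sequential = sum(counts.values())   # 3 + 3 + 2 = 8 models
complete = prod(counts.values())    # 3 * 3 * 2 = 18 models
print(sequential, complete)
```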
After you specify the values to examine, click Display Results. In a new set of results, Minitab produces a table that compares the accuracy criterion across the hyperparameter combinations, along with the full results for the model that has the best value of the accuracy criterion.
Minitab recreates the same tables and graphs for the new model as for the original model, in a new set of results. Storage matches the original analysis, and the storage columns are in the same worksheet. For example, if the original analysis stored the fitted values in a column titled "Fit," then the new analysis stores the fitted values in an empty column titled "Fit_1."