Model evaluation by eliminating unimportant or important predictors for Discover Key Predictors with TreeNet® Regression

Note

This command is available with the Predictive Analytics Module. Click here for more information about how to activate the module.

Find definitions and interpretation guidance for the model evaluation table.

Note

When you specify the options for Discover Key Predictors, you can choose to display model selection results for both the training data and the test data. The test results indicate whether the model can adequately predict the response values for new observations and properly summarize the relationships between the response and the predictor variables. The training results are generally for reference only.

Use the results to compare the models from different steps. To further explore an alternative model from the table, click Select an Alternative Model. Minitab produces a full set of results for the alternative model. You can tune the hyperparameters and make predictions accordingly.

Optimal number of trees

The optimal number of trees usually differs at each step. If the optimal number is close to the total number of trees for the analysis, the model is more likely to improve with additional trees. Consider whether to further explore an alternative model that seems likely to improve.
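As a rough illustration of this check, the following Python sketch flags the steps where the optimal number of trees is close to the total. The function name, the 90% cutoff for "close", and the example values are assumptions for illustration only. They are not Minitab settings or output.

```python
def near_tree_limit(optimal_trees, total_trees, closeness=0.9):
    """Return True when the optimal tree count is close to the total grown.

    `closeness` is a hypothetical cutoff, not a Minitab setting.
    """
    return optimal_trees >= closeness * total_trees


# Hypothetical table rows: (step, optimal number of trees), with 300 trees grown.
steps = [(1, 120), (2, 285), (3, 300)]
flagged = [step for step, optimal in steps if near_tree_limit(optimal, 300)]
print(flagged)
```

Steps that this sketch flags are candidates for further exploration as alternative models.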

R-squared (%)

R² is the percentage of variation in the response that the model explains. R² values range from 0% to 100%.
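The following Python sketch shows one common way to calculate this percentage, assuming the usual definition of 1 minus the ratio of the error sum of squares to the total sum of squares. The function name and the data are illustrative. The sketch does not reproduce Minitab's internal calculations.

```python
def r_squared_percent(actual, predicted):
    """R-squared as a percentage, assuming R2 = 1 - SSE / SST."""
    mean_actual = sum(actual) / len(actual)
    # Sum of squared errors between actual and predicted values.
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares around the mean of the actual values.
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 100.0 * (1.0 - sse / sst)


# Made-up response values and predictions.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(r_squared_percent(actual, predicted))
```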

When you use the squared error loss function or the Huber loss function, the table includes the R² value for each model. The full results that follow the table are for the model with the highest R² value. If a model with fewer predictors has an R² value that is close to the optimal value, consider whether to further explore that alternative model. A model with fewer predictors is easier to interpret and allows you to work with a smaller number of predictors.
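One way to apply this guidance is sketched below in Python. The sketch picks the model with the fewest predictors among the models that fall within a tolerance of the best value. The function name, the tolerance of 1 percentage point, and the table values are hypothetical. What counts as "close" depends on your application.

```python
def simplest_near_optimal(models, tolerance=1.0):
    """Pick the smallest model whose R-squared is close to the best.

    models: list of (predictor_count, r_squared_percent) tuples.
    tolerance: hypothetical cutoff in percentage points for "close".
    """
    best = max(r2 for _, r2 in models)
    candidates = [m for m in models if best - m[1] <= tolerance]
    return min(candidates, key=lambda m: m[0])


# Hypothetical rows from a model evaluation table.
table = [(20, 88.4), (15, 89.1), (10, 88.6), (5, 84.0)]
print(simplest_near_optimal(table))
```

With these made-up values, the 15-predictor model has the highest value, but the 10-predictor model is within the tolerance and is a candidate for further exploration.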

MAD

The mean absolute deviation (MAD) is the average of the absolute values of the differences between the predicted values and the actual values. The smaller the MAD, the better the model fits the data. The MAD expresses accuracy in the same units as the data, which helps conceptualize the amount of error.
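The definition can be sketched in Python as follows. The function name and the data are illustrative only.

```python
def mean_absolute_deviation(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up response values and predictions, in the units of the response.
actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(mean_absolute_deviation(actual, predicted))
```

In this example, the absolute differences are 2, 2, and 3, so the MAD is 7/3, or about 2.33 in the units of the response.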

When you use the absolute deviation loss function, the table includes the MAD value for each model. The full results that follow the table are for the model with the smallest MAD value. If a model with fewer predictors has a MAD value that is close to the optimal value, consider whether to further explore that alternative model. A model with fewer predictors is easier to interpret and allows you to work with a smaller number of predictors.

Predictor count

The predictor count is the number of predictors in the model. The predictor count in the first row of the table always equals the total number of predictors that the analysis considers. After the first row, the number of predictors depends on whether the analysis eliminates unimportant predictors or important predictors.

When the analysis removes the least important predictors, the number of predictors decreases at each step by the specified number of predictors, plus any predictors that have importance scores of 0. For example, suppose that the analysis starts with 900 predictors, eliminates 10 predictors per step, and finds 450 predictors with importance scores of 0 in the initial model. The first row of the table has 900 predictors. The second row has 440 predictors because the analysis removes the 450 predictors with importance scores of 0 and the 10 least important predictors.
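The arithmetic in this example can be sketched in Python. The function and parameter names are hypothetical, and the sketch covers only the count for the next row, not the importance calculations themselves.

```python
def next_predictor_count(current, per_step, zero_importance=0,
                         remove_least_important=True):
    """Predictor count for the next row of the table.

    When the analysis removes the least important predictors, predictors
    with importance scores of 0 are removed in addition to `per_step`.
    When it removes the most important predictors, only `per_step` are removed.
    """
    removed = per_step
    if remove_least_important:
        removed += zero_importance
    return max(current - removed, 0)


# Example from the text: 900 predictors, 450 with importance scores of 0,
# 10 predictors eliminated per step.
print(next_predictor_count(900, 10, zero_importance=450))
```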

When the analysis removes the most important predictors, then the number of predictors decreases by the specified number of predictors at each step. Predictors that have 0 importance remain in the model.

Eliminated predictors

The column shows the predictors that the analysis eliminates at each step. The list shows at most 25 predictor names at a step. The first row always shows "None" because the model has all the predictors. After the first row, the contents of the list depend on whether the analysis eliminates unimportant predictors or important predictors.

When the analysis removes the least important predictors, the list at each step contains the specified number of least important predictors, plus any predictors that have importance scores of 0. If the analysis eliminates predictors that have importance scores of 0, then those predictors are first in the list. When the analysis eliminates more than one predictor in either category, the order of the names is the order of the predictors from the worksheet.
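The ordering rules for this case can be sketched in Python as follows. The function name, the predictor names, and the importance scores are hypothetical. The handling of tied importance scores (earlier worksheet columns first) is an assumption that the description above does not specify.

```python
def eliminated_list(worksheet_order, importance, per_step, max_shown=25):
    """Build the displayed list of eliminated predictors for one step.

    worksheet_order: predictor names in worksheet order.
    importance: dict that maps each predictor name to its importance score.
    """
    # Predictors with importance scores of 0 come first, in worksheet order.
    zero = [name for name in worksheet_order if importance[name] == 0]
    nonzero = [name for name in worksheet_order if importance[name] > 0]
    # Find the per_step least important of the remaining predictors.
    # sorted() is stable, so ties keep worksheet order (an assumption).
    least = set(sorted(nonzero, key=lambda name: importance[name])[:per_step])
    # List the least important predictors in worksheet order.
    least_in_order = [name for name in nonzero if name in least]
    # The list shows at most 25 names at a step.
    return (zero + least_in_order)[:max_shown]


# Hypothetical worksheet columns and importance scores.
order = ["A", "B", "C", "D", "E"]
scores = {"A": 50.0, "B": 0.0, "C": 5.0, "D": 100.0, "E": 2.0}
print(eliminated_list(order, scores, per_step=2))
```

In this example, B has an importance score of 0 and appears first. C and E are the two least important of the remaining predictors and follow in worksheet order.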

When the analysis removes the most important predictors, then the list shows the eliminated predictors from each step. When the analysis eliminates more than one important predictor at a step, then the order of the names in the list is the order of the predictors from the worksheet.