Specify the default settings for TreeNet® Classification

File > Options > Predictive Analytics > TreeNet® Classification

Specify the default methods for TreeNet® Classification. The changes you make to the defaults remain until you change them again, even after you exit Minitab.

Criterion for selecting optimal number of trees with binary response
Choose the method to generate your optimal model. You can compare the results from several methods to determine the best choice for your application.
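  • Maximum loglikelihood: Select this option to display results for the model that has the largest loglikelihood. The loglikelihood measures how closely the predicted probabilities agree with the observed responses.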
  • Maximum area under ROC curve: The maximum area under ROC curve method works well across many applications. The area under the ROC curve measures how well the model ranks rows from most likely to least likely to produce an event. This option is available with a binary response.
  • Minimum misclassification rate: Select this option to display results for the model that minimizes the misclassification rate. The misclassification rate is based on a simple count of how often the model predicts a case correctly or incorrectly.
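The three criteria can select different numbers of trees for the same data. The following Python sketch illustrates this with small, made-up predictions. It is not Minitab's implementation: the data, the function names, and the 0.5 classification threshold are assumptions for illustration only.

```python
import math

def log_likelihood(y, p):
    """Bernoulli loglikelihood: y holds 0/1 responses, p holds predicted event probabilities."""
    eps = 1e-12  # guard against log(0)
    return sum(math.log(max(pi if yi == 1 else 1 - pi, eps)) for yi, pi in zip(y, p))

def roc_auc(y, p):
    """Share of (event, non-event) pairs where the event row gets the higher probability; ties count half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in events for b in non_events)
    return wins / (len(events) * len(non_events))

def misclassification_rate(y, p, threshold=0.5):
    """Proportion of rows predicted incorrectly (threshold of 0.5 is an assumption)."""
    wrong = sum((pi >= threshold) != (yi == 1) for yi, pi in zip(y, p))
    return wrong / len(y)

# Hypothetical responses and predicted probabilities after 50, 100, and 200 trees.
y = [1, 1, 1, 1, 0, 0, 0, 0]
predictions_by_tree_count = {
    50:  [0.48, 0.46, 0.44, 0.42, 0.40, 0.38, 0.36, 0.34],   # perfect ranking, weak probabilities
    100: [0.90, 0.85, 0.80, 0.45, 0.55, 0.15, 0.10, 0.05],   # good probabilities, two errors
    200: [0.99, 0.98, 0.97, 0.01, 0.04, 0.03, 0.02, 0.005],  # one confident error
}

def best_tree_count(metric, pick):
    """Return the number of trees whose predictions optimize the metric (pick is max or min)."""
    return pick(predictions_by_tree_count, key=lambda n: metric(y, predictions_by_tree_count[n]))

best_by_loglik = best_tree_count(log_likelihood, max)            # 100
best_by_auc = best_tree_count(roc_auc, max)                      # 50
best_by_misclass = best_tree_count(misclassification_rate, min)  # 200
```

In this example, each criterion prefers a different model, which is why comparing the results from several methods can be informative.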
Criterion for selecting optimal number of trees with multinomial response
Choose the method to generate your optimal model. You can compare the results from several methods to determine the best choice for your application.
  • Minimum misclassification rate: Select this option to display results for the model that minimizes the misclassification rate. The misclassification rate is based on a simple count of how often the model predicts a case correctly or incorrectly.
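  • Maximum loglikelihood: Select this option to display results for the model that has the largest loglikelihood. The loglikelihood measures how closely the predicted probabilities agree with the observed responses.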
Maximum terminal nodes per tree and Maximum tree depth
Choose one of the following options to limit the size of the trees.
  • Maximum terminal nodes per tree: Enter a value between 2 and 2000 to represent the maximum number of terminal nodes of a tree. Usually, 6 provides a good balance between calculation speed and the investigation of interactions among variables. A value of 2 eliminates the investigation of interactions.
  • Maximum tree depth: Enter a value between 2 and 1000 to represent the maximum depth of a tree. The root node corresponds to a depth of 1. In many applications, depths from 4 to 6 give reasonably good models.
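To relate the two limits, the following Python sketch assumes that every split is binary and that the root node has a depth of 1, as stated above. The function names are hypothetical and are not Minitab settings.

```python
def max_terminal_nodes_for_depth(depth):
    """Largest number of terminal nodes in a binary tree when the root is at depth 1."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return 2 ** (depth - 1)

def number_of_splits(terminal_nodes):
    """A binary tree with T terminal nodes contains exactly T - 1 splits."""
    if terminal_nodes < 1:
        raise ValueError("terminal_nodes must be at least 1")
    return terminal_nodes - 1

# Depths from 4 to 6 allow at most 8 to 32 terminal nodes.
limits = {depth: max_terminal_nodes_for_depth(depth) for depth in (2, 4, 6)}

# With 2 terminal nodes, each tree has a single split on a single predictor,
# so no tree can represent an interaction between predictors.
splits_for_two_nodes = number_of_splits(2)
```

Under these assumptions, a maximum depth of 2 and a maximum of 2 terminal nodes describe the same single-split trees.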
Missing value penalty
Enter a penalty value for a predictor with missing values. Because it is easier to be a good splitter with less data, predictors with missing data have an advantage over predictors without missing data. Use this option to penalize predictors with missing data.
The penalty value, K, must satisfy 0.0 ≤ K ≤ 2.0. For example:
  • K = 0: Specifies no penalty.
  • K = 2: Specifies the highest penalty.
High level category penalty
Enter a penalty value for categorical predictors that have many levels. Because categorical predictors with many levels have more ways to split the data, they have an advantage over predictors with fewer levels and can distort a tree. Use this option to penalize predictors with many levels.
The penalty value, K, must satisfy 0.0 ≤ K ≤ 5.0. For example:
  • K = 0: Specifies no penalty.
  • K = 5: Specifies the highest penalty.