Select the options for Random Forests® Classification

Predictive Analytics Module > Random Forests® Classification > Options
Note

This command is available with the Predictive Analytics Module. The module must be activated before you can use this command.

Number of bootstrap samples to grow trees
Enter the number of bootstrap samples to draw. The analysis grows one tree from each bootstrap sample, so this value is also the number of trees in the forest. The value must be between 3 and 3000.
Specify a bootstrap sample size less than the training data size
Select to enter a value that sets the bootstrap sample size. You must enter a value greater than or equal to 5. If you enter a size that is greater than the training data size, Minitab uses a sample size equal to the training data size.
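As a rough illustration of the two settings above, the following Python sketch draws bootstrap samples of row indices. The function names `effective_sample_size` and `draw_bootstrap_samples` are hypothetical, and the logic only mirrors the rules stated here (3 to 3000 samples, a size of at least 5, capped at the training data size). It is not Minitab's implementation.

```python
import random


def effective_sample_size(requested, n_train):
    """Return the bootstrap sample size, capped at the training data size."""
    if requested < 5:
        raise ValueError("bootstrap sample size must be at least 5")
    # A request larger than the training data falls back to the training size.
    return min(requested, n_train)


def draw_bootstrap_samples(n_train, n_samples, sample_size=None, seed=None):
    """Draw n_samples lists of row indices, each sampled with replacement."""
    if not 3 <= n_samples <= 3000:
        raise ValueError("number of bootstrap samples must be between 3 and 3000")
    # Assumption: when no size is given, each sample has as many rows as the training data.
    if sample_size is None:
        size = n_train
    else:
        size = effective_sample_size(sample_size, n_train)
    rng = random.Random(seed)
    return [[rng.randrange(n_train) for _ in range(size)] for _ in range(n_samples)]
```

For example, `draw_bootstrap_samples(100, 3, sample_size=500)` would return 3 samples of 100 row indices each, because the requested size of 500 exceeds the 100 training rows.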
Number of predictors for node splitting
Specify the number of predictors to consider for each node split. Typically, the analysis works well when it considers a number of predictors equal to the square root of the total number of predictors. However, some data sets have associations among the predictors that lead to improved model performance when the analysis considers a larger or smaller number of predictors for each node split. After you fit a model with the square root setting and review the results, consider whether changing the number of predictors could improve the performance of the model.
  • Square root of the total number of predictors: Select to use the square root of the total number of predictors for splitting nodes.
  • Total number of predictors, producing a bootstrap forest: Select to use all the predictors for splitting nodes. The forest created by this option is called a bootstrap forest.
  • K percent of the total number of predictors; K =: Select to use a percentage of the predictors for splitting nodes, then enter the percentage K.
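The three options above can be sketched as a small Python function that returns how many predictors are considered at each split. The function name `predictors_per_split`, the option labels, and the rounding are assumptions made for illustration. This page does not say how Minitab rounds a fractional result.

```python
import math


def predictors_per_split(total, option="sqrt", k_percent=None):
    """Return the number of predictors to consider at each node split."""
    if total < 1:
        raise ValueError("total number of predictors must be at least 1")
    if option == "sqrt":
        # Square root of the total number of predictors.
        m = round(math.sqrt(total))
    elif option == "all":
        # All predictors at every split (a bootstrap forest).
        m = total
    elif option == "percent":
        if k_percent is None or not 0 < k_percent <= 100:
            raise ValueError("k_percent must be greater than 0 and at most 100")
        m = round(total * k_percent / 100)
    else:
        raise ValueError(f"unknown option: {option}")
    # Assumption: always consider at least 1 predictor and never more than the total.
    return max(1, min(total, m))
```

With 16 predictors, the square root option considers 4 predictors per split, the bootstrap forest option considers all 16, and K = 25 also considers 4.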
Base for random number generator
Specify a base (seed) for the random number generator that selects the bootstrap samples and the subset of predictors for each node split. Typically, you do not need to change the base. Change the base to explore how sensitive the results are to the random selections, or keep the same base to get the same random selections in repeated analyses.
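The effect of a fixed base can be illustrated with Python's standard `random` module. This is a sketch of the general idea of seeding only. The helper `select_predictors` is hypothetical and does not reproduce Minitab's generator or its selections.

```python
import random


def select_predictors(total, m, base):
    """Randomly pick m of the predictor indices 0..total-1 using a fixed base (seed)."""
    rng = random.Random(base)
    return sorted(rng.sample(range(total), m))


# The same base gives the same selection on every run.
first = select_predictors(10, 3, base=42)
second = select_predictors(10, 3, base=42)
print(first == second)
```

Running the selection with a different base typically gives a different subset, which is one way to see how sensitive the results are to the random selections.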
Minimum number of cases to split an internal node
Enter the minimum number of cases a node can have and still split into more nodes. When the sample size is 2,000 or less, the default is 2 so that all nodes can be split into smaller nodes until another split is impossible. For larger sample sizes, the default value is 5. If the model performance is inadequate, consider whether to change this value to see the effect on the performance.
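The default rule described above can be written as a short Python sketch. The function names are hypothetical, and the code restates only the rule on this page (a default of 2 for 2,000 cases or fewer, otherwise 5).

```python
def default_min_cases_to_split(n_cases):
    """Return the default minimum node size for splitting, given the sample size."""
    return 2 if n_cases <= 2000 else 5


def can_split(node_size, min_cases):
    """A node is eligible for splitting only if it has at least min_cases cases."""
    return node_size >= min_cases
```

For example, with 5,000 cases the default is 5, so a node with 4 cases becomes a terminal node while a node with 5 cases can still be split.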