Choose the loss function to create your model. You can compare the results from several functions to determine the best choice for your application.
Squared error: The squared error function is the default function. This is a mean-based loss function. This loss function works well across many applications.
Absolute deviation: The absolute deviation function is a median-based loss function.
Huber: The Huber function is a hybrid of the squared error and the absolute deviation function.
With the Huber function, specify a Switching value. The loss function starts as the squared error and remains the squared error as long as the squared error is less than the switching value. If the squared error exceeds the switching value, then the loss function becomes the absolute deviation. If the absolute deviation becomes less than the switching value, then the loss function becomes the squared error again.
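For orientation, the hybrid behavior can be illustrated with the textbook form of the Huber loss. This is a minimal sketch and not Minitab's implementation: the function name `huber_loss` and the parameter name `switching_value` are hypothetical, and the sketch compares the absolute residual to the threshold, which is the common textbook convention and may differ from the rule that Minitab applies.

```python
def huber_loss(residual, switching_value):
    """Textbook Huber loss for a single residual (illustrative only)."""
    abs_r = abs(residual)
    if abs_r <= switching_value:
        # Small residuals: behaves like the squared error.
        return 0.5 * residual ** 2
    # Large residuals: grows linearly, like the absolute deviation.
    return switching_value * (abs_r - 0.5 * switching_value)
```

The two branches meet at the switching value, so the loss is continuous while large residuals have less influence than they would under the squared error alone.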
Number of trees
Enter a value between 1 and 5000 to set the number of trees to build. The default value of 300 provides useful initial results.
If the number of trees in the initially selected model is close to the number of trees that you specify, then consider whether to increase the number of trees to look for a better model.
Maximum terminal nodes per tree and Maximum tree depth
You can also limit the size of the trees. Choose one of the following options.
Maximum terminal nodes per tree: Enter a value between 2 and 2000 to represent the maximum number of terminal nodes of a tree. Usually, the default value of 6 provides a good balance between calculation speed and the investigation of interactions among variables. A value of 2 eliminates the investigation of interactions.
Maximum tree depth: Enter a value between 2 and 1000 to represent the maximum depth of a tree. The root node corresponds to a depth of 1. The default depth is 4. In many applications, depths from 4 to 6 give reasonably good models.
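To compare the two size limits, the following sketch relates a maximum depth to the largest number of terminal nodes that depth allows. It assumes that every split is binary, which this page does not state, and it uses the convention above that the root node has a depth of 1. The function name `max_terminal_nodes_for_depth` is hypothetical.

```python
def max_terminal_nodes_for_depth(depth):
    """Upper bound on terminal nodes for a binary tree whose root is at depth 1."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # Each additional level can at most double the number of nodes.
    return 2 ** (depth - 1)
```

Under this assumption, the default depth of 4 allows at most 8 terminal nodes, and a depth of 2 allows a single split.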
Minimum number of cases allowed for a terminal node
Enter the minimum number of cases for a terminal node. For example, if the minimum size is 3 and a split would create a node with fewer than 3 cases, then Minitab does not perform the split.
Overfitting protection
Use the following options to minimize overfitting of the model.
Learning rate
The learning rate is one of the two extremely important hyperparameters that you can tune to identify an optimal model for your data.
By default, if the number of cases in your training data is 1000 or less, Minitab uses 0.01 as the learning rate. For data sets with more than 1000 cases, the default learning rate is max[0.01, 0.1 * min(1.0, N/10000)], where N is the number of cases. For example, when the data set has 9000 cases, the learning rate is 0.1 * 0.9 = 0.09.
If the initial model doesn't predict your data well, consider increasing or decreasing the learning rate by a factor of 5 or 10 to see whether you can get a better model.
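The default rule above can be written as a short function. This is a sketch that follows the formula as stated on this page. The function name `default_learning_rate` is hypothetical and is not part of Minitab.

```python
def default_learning_rate(n_cases):
    """Default learning rate from the rule stated above (illustrative)."""
    if n_cases <= 1000:
        return 0.01
    # max[0.01, 0.1 * min(1.0, N/10000)] for more than 1000 cases
    return max(0.01, 0.1 * min(1.0, n_cases / 10000))
```

For example, `default_learning_rate(9000)` is approximately 0.09, and the result never exceeds 0.1 because N/10000 is capped at 1.0.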
Subsample fraction
Specify the proportion of the learning data to randomly select to build each tree in the analysis. Usually, the fraction of 0.5 works well. Consider increasing the fraction from the default value of 0.5 to 0.70 or higher if the initial model doesn't fit your data well.
Number of predictors for node splitting
Specify the number of predictors to consider for each node split. Typically, the analysis works well when you consider all the predictors at every node. However, some data sets have associations among the predictors that lead to improved model performance when the analysis considers a different random subset of predictors at each node. For such cases, the square root of the total number of predictors is a typical starting point. After you use the square root and view the model, you can consider whether to specify a larger or smaller number of predictors with a percentage of the total.
Total number of predictors: Select to use all the predictors for splitting nodes.
Square root of the total number of predictors: Select to use the square root of the total number of predictors for splitting nodes.
K percent of the total number of predictors; K =: Select to use a percentage of predictors for splitting nodes.
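The three options above translate into a count of predictors per split roughly as follows. This is a sketch only: the function name, the option labels, and the rounding to the nearest whole number are assumptions, because this page does not say how Minitab rounds fractional counts.

```python
import math

def predictors_per_split(total, option, k_percent=None):
    """Number of predictors to consider at each node split (illustrative)."""
    if option == "total":
        return total
    if option == "sqrt":
        count = round(math.sqrt(total))  # rounding rule is an assumption
    elif option == "percent":
        if k_percent is None:
            raise ValueError("k_percent is required for the percent option")
        count = round(total * k_percent / 100)  # rounding rule is an assumption
    else:
        raise ValueError(f"unknown option: {option}")
    # Keep the count between 1 and the total number of predictors.
    return max(1, min(total, count))
```

With 16 predictors, the square root option considers 4 predictors at each node. With 20 predictors and K = 25, the percentage option considers 5.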
Base for random number generator
You can specify a base for the random number generator to randomly select the subsamples and the subset of predictors. Typically, you do not need to change the base. You can change the base to explore how sensitive the results are to the random selections or to ensure the same random selection for repeated analyses.
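The effect of the base on repeated analyses can be illustrated with a seeded generator that draws one subsample of cases for each tree. This sketch uses Python's standard `random` module, not Minitab's generator. The function name `draw_subsamples` and the truncation of the subsample size to a whole number are assumptions.

```python
import random

def draw_subsamples(n_cases, fraction, n_trees, base):
    """Draw one random subsample of case indices per tree (illustrative)."""
    rng = random.Random(base)  # the base fixes the sequence of random draws
    size = max(1, int(n_cases * fraction))  # truncation is an assumption
    # Sample without replacement, once for each tree.
    return [sorted(rng.sample(range(n_cases), size)) for _ in range(n_trees)]
```

Calling the function twice with the same base returns identical subsamples, which is what makes a repeated analysis reproducible. Changing the base gives a different sequence of selections, which is one way to explore how sensitive the results are.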
Weights
Enter a column that contains the case weights. The column must have the same number of rows as the response column. Values must be ≥ 0. Minitab omits rows that contain missing values or zeros from the analysis.
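The rules for the weights column can be sketched as a simple filter. This sketch handles only the weights column and represents missing values as `None` or NaN, which is an assumption about how the data are stored. The function name `usable_rows` is hypothetical.

```python
import math

def usable_rows(weights):
    """Return the row indices that remain after applying the weight rules."""
    kept = []
    for i, w in enumerate(weights):
        if w is None or (isinstance(w, float) and math.isnan(w)):
            continue  # a missing weight: the row is omitted
        if w < 0:
            raise ValueError(f"weight in row {i} is negative")
        if w == 0:
            continue  # a zero weight: the row is omitted
        kept.append(i)
    return kept
```

For example, the weights `[1.0, 0, None, 2.5]` leave rows 0 and 3 in the analysis.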