Specify the model terms for Discover Best Model (Continuous Response)

Predictive Analytics Module > Automated Machine Learning > Discover Best Model (Continuous Response) > Terms

Note

This command is available with the Predictive Analytics Module.

Specify how to determine the terms in the regression model. Usually, an analysis that considers linear terms and terms of order 2 in combination with stepwise model selection provides a model with good predictive capability. You can select Forward selection with validation to determine whether the method produces a model with higher prediction accuracy.

If you have a large number of predictors, selecting the final model can take a long time when the analysis considers linear terms and terms of order 2 with stepwise model selection. If the number of predictors is greater than 15, the default selection is to consider only linear terms. To evaluate some higher-order terms in addition to the linear terms, select Specify terms.

Terms to include in the regression model

Select whether to use the default terms or to specify your own set of terms.

Linear terms and terms of order 2

The analysis uses all of the linear terms and terms of order 2. Terms of order 2 include all of the interactions between 2 linear terms and square terms for the continuous predictors.

Linear terms

The analysis uses all the linear terms.

Specify terms

You can add interaction terms and polynomial terms to your model. The initial model depends on the number of predictors that you enter in the main dialog box. If the number of predictors is 15 or less, the model contains the linear terms and terms of order 2 for the predictors. If the number of predictors is greater than 15, then the model contains the linear terms. Click Default to return to the initial model.
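As a rough illustration of the initial model described above, the following Python sketch builds the term list from the predictor count. The function name `initial_terms` and the string form of the terms are assumptions made for this example; they are not part of Minitab.

```python
from itertools import combinations

# Threshold stated in the text: with more than 15 predictors,
# the initial model contains only the linear terms.
MAX_PREDICTORS_FOR_ORDER_2 = 15


def initial_terms(continuous, categorical):
    """Return the initial model terms as strings such as 'X' or 'X*Y'.

    Hypothetical helper for illustration only.
    """
    predictors = list(continuous) + list(categorical)
    terms = list(predictors)  # the linear terms are always present
    if len(predictors) <= MAX_PREDICTORS_FOR_ORDER_2:
        # Interactions between every pair of linear terms.
        terms += [f"{a}*{b}" for a, b in combinations(predictors, 2)]
        # Square terms exist only for the continuous predictors.
        terms += [f"{x}*{x}" for x in continuous]
    return terms


print(initial_terms(["X", "Y", "Z"], ["A", "B"]))
```

With 3 continuous and 2 categorical predictors, the sketch returns 5 linear terms, 10 two-way interactions, and 3 square terms. The count of order 2 terms grows roughly with the square of the number of predictors, which is why a large predictor set slows the analysis.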

You can add terms in several ways, which the following examples illustrate. For the examples, assume that the Predictors list has 3 continuous variables (X, Y, Z) and 2 categorical variables (A, B).

Add terms using selected predictors and model terms

To add terms to the model, select at least one predictor or term. To select multiple items or deselect an item, press the Ctrl key while you click the predictors or terms.

Interactions through order

Add all interactions through the specified order. Suppose you select predictors X, Y, A and add interactions through order 3. When you click Add, Minitab adds X*Y, X*A, Y*A, X*Y*A.
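The example above can be reproduced with a short Python sketch. The function name `interactions_through_order` is hypothetical, and the sketch only mirrors the behavior described in the text.

```python
from itertools import combinations


def interactions_through_order(selected, order):
    """List every interaction among the selected predictors, from
    two-way interactions up to the requested order."""
    terms = []
    for size in range(2, order + 1):
        for combo in combinations(selected, size):
            terms.append("*".join(combo))
    return terms


print(interactions_through_order(["X", "Y", "A"], 3))
# ['X*Y', 'X*A', 'Y*A', 'X*Y*A']
```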

Terms through order

Use this option to model curvature. It adds powers and interactions through the specified order. Powers are added only for continuous predictors. Suppose you select X, Y, A and terms through order 3. When you click Add, Minitab adds the power terms for X and Y: X*X, Y*Y, X*X*X, Y*Y*Y. Minitab also adds interactions for the predictor variables and powers: X*Y, X*A, Y*A, X*X*Y, X*Y*Y, X*X*A, X*Y*A, Y*Y*A.
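One way to reason about which terms this option produces is to enumerate every product of up to the specified number of predictors, allowing a continuous predictor to repeat but a categorical predictor to appear at most once. The Python sketch below follows that rule. The rule is inferred from the example in the text, and the function name `terms_through_order` is hypothetical.

```python
from itertools import combinations_with_replacement


def terms_through_order(continuous, categorical, order):
    """Return (powers, interactions) for the selected predictors.

    Assumption for this sketch: a categorical predictor never appears
    more than once in a term, so A*A and X*A*A are not generated.
    """
    predictors = list(continuous) + list(categorical)
    powers, interactions = [], []
    for size in range(2, order + 1):
        for combo in combinations_with_replacement(predictors, size):
            if any(combo.count(c) > 1 for c in categorical):
                continue  # skip repeated categorical predictors
            term = "*".join(combo)
            if len(set(combo)) == 1:
                powers.append(term)        # for example, X*X*X
            else:
                interactions.append(term)  # for example, X*X*A
    return powers, interactions


powers, interactions = terms_through_order(["X", "Y"], ["A"], 3)
print(powers)
print(interactions)
```

For X, Y, A through order 3, the sketch yields the same 4 power terms and 8 interaction terms as the example, although the listing order within the interactions can differ.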

Cross predictors and terms in the model

This option can be used in the following ways:

You can cross two or more predictors. Suppose you select X, Y, Z. When you click Add, Minitab adds X*Y*Z.
You can cross two or more terms that are already in the model. Suppose X*A and X*B are in the model. If you select only these terms and click Add, Minitab adds X*X*A*B.
You can cross predictors with terms in the model. Suppose X*X and Y*Y are in the model. If you select these terms and predictors A, B, and then click Add, Minitab adds X*X*A, X*X*B, Y*Y*A, Y*Y*B. Each predictor is crossed with each model term. The predictors are not crossed with themselves. The model terms are not crossed with themselves.
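The three crossing cases can be sketched in Python as follows. This is an illustration under stated assumptions, not Minitab's implementation: it assumes that crossing items of the same kind multiplies all of them into a single term (as in the X*A and X*B example), and that factors within a term are shown in Predictors list order. The names `PREDICTOR_ORDER`, `product`, and `cross` are hypothetical.

```python
# Assumed display order of factors within a term: the Predictors list order.
PREDICTOR_ORDER = ["X", "Y", "Z", "A", "B"]


def product(*items):
    """Multiply predictors and/or terms into one term string."""
    factors = [f for item in items for f in item.split("*")]
    factors.sort(key=PREDICTOR_ORDER.index)
    return "*".join(factors)


def cross(selected_predictors, selected_terms):
    """Return the terms that crossing the current selection would add."""
    if selected_predictors and selected_terms:
        # Each predictor is crossed with each model term; predictors are
        # not crossed with each other, nor terms with each other.
        return [product(t, p) for t in selected_terms for p in selected_predictors]
    # Only predictors or only terms are selected: one combined product.
    items = list(selected_predictors) + list(selected_terms)
    return [product(*items)] if len(items) >= 2 else []


print(cross([], ["X*A", "X*B"]))            # ['X*X*A*B']
print(cross(["A", "B"], ["X*X", "Y*Y"]))    # ['X*X*A', 'X*X*B', 'Y*Y*A', 'Y*Y*B']
```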

Note

You may need to deselect predictors or terms so that only the terms you want to cross are selected. To deselect items, press the Ctrl key while you click the predictors or terms.

Terms in the model

When you add terms to the model, the terms are listed in the white space in the dialog box. In this white space, you can select individual terms or groups of terms to remove or reorder.

Default: If the number of predictors is 15 or less, this selection populates the model with linear terms and terms of order 2. If the number of predictors is more than 15, this selection populates the model with the linear terms.
Delete terms: You can delete one or more terms from the model. Select the terms and click Delete (the "X") in the dialog. You can also double-click a term to delete it.
Reorder terms: To move a term, select it, then click one of the arrow buttons in the dialog box to move the term up or down. You can also move a contiguous block of terms. Click the first term, then hold the Shift key and click the last term to select the whole block. Then click the appropriate arrow to move the block.

Regression model selection method

Specify whether to use a model selection method. The selections that Minitab presents depend on the size of the data set. The selections combine with selections on the Validation subdialog to provide an analysis that balances rigor and calculation speed:

N < 1,500: The validation method on the Validation subdialog is K-fold cross-validation. The number of folds is 5. The Regression model selection method on the Terms subdialog is Stepwise.
1,500 ≤ N < 2,000: The validation method on the Validation subdialog is K-fold cross-validation. The number of folds is 5. The Regression model selection method on the Terms subdialog is Forward selection with validation.
2,000 ≤ N: The validation method on the Validation subdialog is Validation with a test set. The proportion of data in the test set is 0.3. The Regression model selection method on the Terms subdialog is Forward selection with validation.
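The three cases above amount to a lookup on the number of rows. A minimal Python sketch, with a hypothetical function name and dictionary keys chosen for this example:

```python
def default_selections(n_rows):
    """Return the default validation and model selection settings
    for a data set with n_rows rows, following the list above."""
    if n_rows < 1500:
        return {
            "validation": "K-fold cross-validation",
            "folds": 5,
            "selection": "Stepwise",
        }
    if n_rows < 2000:
        return {
            "validation": "K-fold cross-validation",
            "folds": 5,
            "selection": "Forward selection with validation",
        }
    return {
        "validation": "Validation with a test set",
        "test_fraction": 0.3,
        "selection": "Forward selection with validation",
    }


print(default_selections(1200)["selection"])
print(default_selections(2500)["validation"])
```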

Stepwise: This method starts with an empty model. Then, Minitab adds or removes a term for each step. Minitab stops when all variables that are not in the model have p-values that are greater than 0.15 and when all variables that are in the model have p-values that are less than or equal to 0.15.
Forward selection with validation: When you select Forward selection with validation, choose the validation method to test your model. Usually, with smaller samples, the K-fold cross-validation method is appropriate. With larger samples, you can divide the data into a training data set and a test data set. The procedure is similar to forward selection. At the end of each step, Minitab calculates the test R² statistic. At the end of the forward selection procedure, the model with the greatest test R² value is the final model.
The procedure continues until one of the following conditions occurs:
- The procedure does not find an improvement of the criterion for 8 consecutive steps.
- The procedure fits the full model.
- The procedure fits a model that leaves 1 degree of freedom for error.
None: Fit the model with all the terms for the regression model.
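The stopping rule for Forward selection with validation can be illustrated with a short Python sketch. It takes the validation R² value recorded after each step and returns the step whose model would be kept. The sketch assumes that "improvement" means exceeding the best value seen so far, and that the input list simply ends when the full model is fit or when 1 degree of freedom remains for error. The function name `pick_final_step` is hypothetical.

```python
def pick_final_step(r2_by_step, patience=8):
    """Return (step, r2) for the model with the greatest validation R².

    r2_by_step[i] is the validation R² after step i + 1. The search stops
    early after `patience` consecutive steps without improvement.
    """
    best_step, best_r2 = None, float("-inf")
    steps_without_improvement = 0
    for step, r2 in enumerate(r2_by_step, start=1):
        if r2 > best_r2:
            best_step, best_r2 = step, r2
            steps_without_improvement = 0
        else:
            steps_without_improvement += 1
            if steps_without_improvement == patience:
                break  # 8 consecutive steps with no improvement
    return best_step, best_r2


# Eight non-improving steps follow step 2, so the later 0.9 is never reached.
print(pick_final_step([0.2, 0.5] + [0.4] * 8 + [0.9]))  # (2, 0.5)
```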