Method table for MARS® Regression


This command is available with the Predictive Analytics Module.

Find definitions and interpretation guidance for the Method table.

Criterion for selecting the best model

The criterion that MARS® Regression uses to create the model. MARS® Regression uses either the maximum R-squared (default) or the least mean absolute deviation to select the best model. Compared to the R-squared criterion, the mean absolute deviation criterion attempts to decrease the influence of the points with the worst fits.
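
To illustrate why the two criteria can prefer different models, the following sketch compares two hypothetical sets of fitted values on invented data. It uses the textbook definitions of R-squared and of the mean absolute deviation of the residuals. These definitions, the function names, and the data are assumptions for illustration only and are not taken from the software.

```python
def r_squared(actual, predicted):
    """Textbook R-squared: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_deviation(actual, predicted):
    """Mean of the absolute residuals (assumed definition for this sketch)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented data: the last response is much larger than the others.
actual = [1.0, 2.0, 3.0, 4.0, 20.0]

# Hypothetical model A fits four points exactly and misses the last by 10.
model_a = [1.0, 2.0, 3.0, 4.0, 10.0]

# Hypothetical model B is off by 2 on four points and by 4 on the last.
model_b = [3.0, 4.0, 5.0, 6.0, 16.0]

for name, fitted in (("A", model_a), ("B", model_b)):
    print(
        name,
        "R-squared:", round(r_squared(actual, fitted), 3),
        "MAD:", round(mean_absolute_deviation(actual, fitted), 3),
    )
```

In this example, R-squared favors model B because squaring the residuals penalizes the single large miss of model A heavily, while the mean absolute deviation favors model A because the one poorly fit point carries less weight.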

Model validation

MARS® Regression uses the cross-validation method or a separate test set to validate the model. With cross-validation, you can specify the rows for each fold, or allow a random selection. With a separate test set, you can specify the rows for both training and test sets or allow a random selection.
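
The random-selection options described above can be sketched as follows. This is a minimal illustration of random fold assignment and a random training/test split in general. The function names, the seed handling, and the balancing of fold sizes are assumptions and do not describe how the software makes its selection.

```python
import random


def assign_folds(n_rows, k, seed=1):
    """Randomly assign each row a fold number from 1 to k, with near-equal fold sizes."""
    rng = random.Random(seed)
    fold_ids = [i % k + 1 for i in range(n_rows)]
    rng.shuffle(fold_ids)
    return fold_ids


def split_train_test(n_rows, test_fraction, seed=1):
    """Randomly split row indices into sorted training and test lists."""
    rng = random.Random(seed)
    rows = list(range(n_rows))
    rng.shuffle(rows)
    n_test = round(n_rows * test_fraction)
    return sorted(rows[n_test:]), sorted(rows[:n_test])


folds = assign_folds(n_rows=10, k=5)
train_rows, test_rows = split_train_test(n_rows=10, test_fraction=0.3)
print("fold of each row:", folds)
print("training rows:", train_rows)
print("test rows:", test_rows)
```

Specifying the rows yourself corresponds to supplying the fold numbers or the training/test labels directly instead of generating them at random.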

Maximum number of basis functions

The analysis fits this number of basis functions before using backward elimination of basis functions to select the best model. The default value is 30. Larger values indicate that the analysis made a more thorough search for the optimal model.

Minimum number of observations between knots

A knot is a data point where the basis functions change. By default, the analysis uses sample size and model complexity to automatically select a minimum number. Otherwise, the table displays the specific number for the analysis. A value of 1 indicates that consecutive data points are eligible to be knots. The value of 1 allows the most rapid changes in the model predictions. Consider different values to see the effect on the fit of the model. For example, for some data, larger values create smoother models that are less likely to overfit the training data. However, such smoother models are sometimes less accurate over certain ranges of the data.
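
The following sketch shows the general idea of a knot and of spacing between candidate knots. It assumes the standard hinge form of a MARS basis function pair, max(0, x - knot) and max(0, knot - x), and it approximates the minimum spacing by taking every n-th sorted data point as a candidate. The function names and the spacing rule are assumptions for illustration and are not the software's actual knot search.

```python
def hinge_pair(x, knot):
    """Return the two hinge basis values for x at the given knot."""
    return max(0.0, x - knot), max(0.0, knot - x)


def eligible_knots(values, min_obs_between):
    """Return candidate knots, keeping every min_obs_between-th sorted value.

    A value of 1 keeps every data point as a candidate (simplified rule).
    """
    if min_obs_between < 1:
        raise ValueError("min_obs_between must be at least 1")
    ordered = sorted(values)
    return ordered[::min_obs_between]


predictor = [5, 1, 4, 2, 3, 6]
print("spacing 1:", eligible_knots(predictor, 1))
print("spacing 2:", eligible_knots(predictor, 2))
print("hinge values at x=7, knot=5:", hinge_pair(7.0, 5.0))
print("hinge values at x=3, knot=5:", hinge_pair(3.0, 5.0))
```

With fewer candidate knots, the fitted function has fewer places where its slope can change, which is why larger spacing values tend to give smoother models.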

Rows used

The number of response observations that the analysis uses to fit and evaluate the model.

Rows unused

The number of missing response observations. This count also includes rows with missing values or zeros in the weight column.
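
As a rough illustration of how the two row counts relate, the sketch below classifies each row as used or unused from its response and weight values. It represents a missing value as None and treats the weight column as optional. The function name and the data are hypothetical, and the sketch covers only the rules stated above.

```python
def count_rows(responses, weights=None):
    """Count rows used and unused based on the response and optional weight values."""
    used = 0
    unused = 0
    for i, response in enumerate(responses):
        weight = 1.0 if weights is None else weights[i]
        # A row is unused if the response is missing, or if the weight
        # is missing or zero.
        if response is None or weight is None or weight == 0:
            unused += 1
        else:
            used += 1
    return used, unused


responses = [1.0, None, 3.0, 4.0]
weights = [1.0, 1.0, 0.0, None]
rows_used, rows_unused = count_rows(responses, weights)
print("Rows used:", rows_used)
print("Rows unused:", rows_unused)
```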