Total predictors | 77 |
---|---|
Important predictors | 10 |
Maximum number of basis functions | 30 |
Optimal number of basis functions | 13 |

Statistics | Training | Test |
---|---|---|
R-squared | 89.61% | 87.61% |
Root mean squared error (RMSE) | 25836.5197 | 27855.6550 |
Mean squared error (MSE) | 667525749.7185 | 775937512.8264 |
Mean absolute deviation (MAD) | 17506.0038 | 17783.5549 |

In these results, the test R-squared is about 88%. The test root mean squared error is about 27,856. The test mean squared error is about 775,937,513. The test mean absolute deviation is about 17,784.
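As a rough illustration of how the four statistics in the model summary relate to one another, the following sketch computes them from actual and predicted values. It assumes the usual definitions (R-squared as 1 minus the ratio of error sum of squares to total sum of squares, MAD as the mean absolute error). The function name `regression_metrics` and the sample values are hypothetical and do not come from the housing data.

```python
import math

def regression_metrics(actual, predicted):
    """Return R-squared, RMSE, MSE, and MAD under the usual definitions (assumed)."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    ss_error = sum(r * r for r in residuals)
    mean_actual = sum(actual) / n
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    mse = ss_error / n
    return {
        "r_squared": 1.0 - ss_error / ss_total,
        "rmse": math.sqrt(mse),                     # RMSE is the square root of MSE
        "mse": mse,
        "mad": sum(abs(r) for r in residuals) / n,  # mean absolute deviation of the errors
    }

# Hypothetical values, for illustration only.
metrics = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])

# Consistency check on the reported test statistics: RMSE should equal sqrt(MSE).
reported_test_rmse = math.sqrt(775937512.8264)
```

The last line reproduces the reported test RMSE of 27855.6550 from the reported test MSE, to within rounding.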
Use the relative variable importance chart to see which predictors are the most important variables to the model.
Important variables are in at least 1 basis function in the model. The variable with the highest improvement score is set as the most important variable and the other variables are ranked accordingly. Relative variable importance standardizes the importance values for ease of interpretation. Relative importance is defined as the percent improvement with respect to the most important predictor.
Relative variable importance values range from 0% to 100%. The most important variable always has a relative importance of 100%. If a variable is not in a basis function, that variable is not important.
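The standardization described above can be sketched as follows. The raw improvement scores and predictor names are hypothetical, as is the function name `relative_importance`; only the rule (percent of the top score) comes from the text.

```python
def relative_importance(scores):
    """Express each improvement score as a percent of the largest score."""
    top = max(scores.values())
    if top <= 0:
        raise ValueError("at least one variable must have a positive improvement score")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {name: 100.0 * score / top for name, score in ranked}

# Hypothetical raw improvement scores; a score of 0 means the variable
# does not appear in any basis function.
raw_scores = {"Predictor A": 50.0, "Predictor B": 25.0, "Predictor C": 0.0}
relative = relative_importance(raw_scores)
```

With these inputs, the top-ranked predictor gets 100%, the second gets 50%, and the predictor that is in no basis function gets 0%.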
Although these results include 10 variables with positive importance, the relative rankings provide information about how many variables to control or monitor for a certain application. Steep drops in the relative importance values from one variable to the next can guide decisions about which variables to control or monitor. For example, in these data, the 2 most important variables have importance values that are relatively close together before a drop of over 20% in relative importance to the next variable. Similarly, 2 variables have importance values that are close together, both above 60%. You can remove variables from different groups and redo the analysis to evaluate how variables in various groups affect the prediction accuracy values in the model summary table.
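One way to locate steep drops programmatically is to rank the relative importance values and compare each value with the next. This is a minimal sketch. The values, the names `X1` through `X5`, the function `steep_drops`, and the 20-point threshold are all assumptions for illustration, not the values from the chart.

```python
def steep_drops(relative_importance, threshold=20.0):
    """Return (higher, lower, drop) for consecutive ranked variables whose gap exceeds threshold."""
    ranked = sorted(relative_importance.items(), key=lambda item: item[1], reverse=True)
    drops = []
    for (name_hi, value_hi), (name_lo, value_lo) in zip(ranked, ranked[1:]):
        gap = value_hi - value_lo
        if gap > threshold:
            drops.append((name_hi, name_lo, gap))
    return drops

# Hypothetical relative importance values, in percent.
example = {"X1": 100.0, "X2": 95.0, "X3": 68.0, "X4": 64.0, "X5": 30.0}
groups = steep_drops(example)
```

Here the gaps after `X2` and after `X4` exceed the threshold, which suggests three groups of variables to add or remove together when you redo the analysis.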
Use the partial dependence plots, the basis functions, and the coefficients in the regression equation to determine the effect of the predictors. The effects of the predictors explain the relationship between the predictors and the response. Consider all the basis functions for a predictor to understand the effect of the predictor on the response variable.
In addition, consider the use of the important predictors and the forms of their relationships when you build other models. For example, if the MARS® regression model includes interactions, consider whether to include those interactions in a least-squares regression model to compare the performance of the two types of models. In applications where you control the predictors, the effects provide a natural way to optimize the settings to achieve a goal for the response variable.
In an additive model, one-predictor partial dependence plots show how the important continuous predictors affect the predicted response. A one-predictor partial dependence plot indicates how the response is expected to change as the level of the predictor changes. For MARS® Regression, the values on the plot come from the basis functions for the predictor on the x-axis. The contribution on the y-axis is standardized so that the minimum value on the plot is 0.
This plot illustrates that Sale Price increases as the Living Area SF increases from the minimum square footage in the data set to about 3,000 square feet. After Living Area SF reaches 3,000 square feet, the contribution to Sale Price becomes flat at approximately $152,000.
In these results, BF2 has a negative coefficient in the regression equation. The coefficient for the basis function is -57.6167. The arrangement of the basis function is max(0, c - X). In this arrangement, the value of the basis function decreases by 1 for each unit increase in the predictor, until the predictor reaches c. The combination of this arrangement and the negative coefficient creates a positive relationship between the predictor variable and the response variable, because (-57.6167) × (-1) = 57.6167. The slope of Living Area SF is therefore 57.6167 from 438 to 3,078 square feet.
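The following sketch works through the arithmetic for BF2. It assumes that the knot c is 3,078 (the upper end of the range stated above), that 438 is the minimum Living Area SF in the data, and that BF2 is the only basis function that involves Living Area SF. The function names are hypothetical.

```python
COEFFICIENT = -57.6167  # coefficient of BF2 in the regression equation
KNOT = 3078.0           # assumed value of c, taken from the stated slope range
X_MIN = 438.0           # assumed minimum Living Area SF in the data

def bf2(x):
    """Basis function in the arrangement max(0, c - X)."""
    return max(0.0, KNOT - x)

def contribution(x):
    """Contribution of BF2 to the predicted response."""
    return COEFFICIENT * bf2(x)

def standardized_contribution(x):
    """Shift the contribution so that its minimum over the data range is 0."""
    return contribution(x) - contribution(X_MIN)

slope_below_knot = contribution(1001.0) - contribution(1000.0)
plateau = standardized_contribution(KNOT)
beyond_knot = standardized_contribution(4000.0)
```

Under these assumptions, the slope below the knot is 57.6167 per square foot, and the standardized contribution levels off at 57.6167 × (3,078 - 438) ≈ 152,108, which agrees with the plateau of approximately $152,000 on the partial dependence plot.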
For more examples of common basis functions, go to Regression equation for MARS® Regression.