One predictor partial dependence plots for MARS® Regression

Note

This command is available with the Predictive Analytics Module. Click here for more information about how to activate the module.

Assume there are m predictors in a training data set, denoted as X1, X2, ..., Xm. First, sort the distinct values of predictor X1 in the training data set in increasing order. Denote x11 as the first distinct value of X1. Denote x1N as the last distinct value of X1. Then, x11 is the x-coordinate for the leftmost point on the plot.

Use the following steps to find the y-coordinate at x11.
  1. Find the fitted value for x11 from only the basis functions that involve the predictor for the plot.
  2. Find the fitted values at evenly distributed points from x11 to x1N, again from only the basis functions that involve the predictor for the plot.
  3. Subtract the minimum of the fitted values from step 2 from the fit at x11.
For example, suppose that a model has the following 2 basis functions:
  • BF1 = max(0, x1 - 350)
  • BF2 = max(0, x2 - 500)

Also suppose that the model has the following regression equation:

Y = 1000 - 5 * BF1 + 3 * BF2

Finally, suppose that x11 = 400 and that the minimum fit across the evenly distributed points from x11 to x1N is 100.

To find the y-coordinate for a partial dependence plot for X1, consider only the basis functions that involve X1. The second basis function involves only X2, so its term drops out of the calculation. Then the fit for x11 that considers only the basis function for X1 comes from:

1000 - 5 * max(0, 400 - 350) = 1000 - 5 * 50 = 750.

Then the y-coordinate for x11 is 750 - 100 = 650.
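
The arithmetic in this example can be checked with a short Python sketch. The function names bf1 and partial_fit_x1 are hypothetical names chosen for illustration and are not part of the software. The minimum fit of 100 is taken as given from the example rather than computed.

```python
def bf1(x1):
    """Basis function from the example: max(0, x1 - 350)."""
    return max(0.0, x1 - 350.0)


def partial_fit_x1(x1):
    """Fit that keeps only the constant and the term that involves X1.

    The term for the second basis function is dropped because it
    involves only X2.
    """
    return 1000.0 - 5.0 * bf1(x1)


x11 = 400.0
min_fit = 100.0  # given in the example, not computed here

fit_at_x11 = partial_fit_x1(x11)     # 1000 - 5 * 50
y_coordinate = fit_at_x11 - min_fit  # shift so the lowest point sits at 0

print(fit_at_x11, y_coordinate)  # 750.0 650.0
```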

Replacing x11 with each of the evenly distributed values from x11 to x1N gives the y-coordinates for the rest of the points on the plot. These points allow you to investigate the y-coordinates on the plot in detail. The patterns on the plot are approximately the same as on a plot with lines that connect the points where the basis functions change. The calculations for the rest of the predictors follow the same steps.
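
As a rough illustration of the full curve, the following Python sketch repeats the calculation over an evenly spaced grid. It reuses the regression equation from the example. The value x1N = 530, the choice of 14 grid points, and the function names are assumptions made for illustration only: the documentation does not state how many points the software uses, and 530 is chosen because it reproduces the minimum fit of 100 from the example.

```python
def partial_fit_x1(x1):
    """Fit from the constant and the X1 basis function only (example model)."""
    return 1000.0 - 5.0 * max(0.0, x1 - 350.0)


def partial_dependence_curve(x_min, x_max, n_points):
    """Return (x, y) pairs for a one-predictor partial dependence plot.

    x_min and x_max play the roles of x11 and x1N. n_points is a
    hypothetical grid size; the actual number of points is not documented.
    """
    step = (x_max - x_min) / (n_points - 1)
    grid = [x_min + i * step for i in range(n_points)]
    fits = [partial_fit_x1(x) for x in grid]
    min_fit = min(fits)  # minimum fit over the evenly distributed points
    return [(x, fit - min_fit) for x, fit in zip(grid, fits)]


# Assumed values: x11 = 400 from the example, x1N = 530 for illustration.
curve = partial_dependence_curve(400.0, 530.0, 14)
for x, y in curve:
    print(x, y)
```

With these assumed inputs, the first point is (400, 650), which matches the worked example, and the last point sits at 0 because it has the minimum fit.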