Methods and formulas for partial dependence plots in Fit Model and Discover Key Predictors with TreeNet® Classification

Note

This command is available with the Predictive Analytics Module.

One predictor partial dependence plots

Assume there are m predictors in a training data set, denoted as x₁, x₂, ..., x_m. First, sort the distinct values of predictor x₁ in the training data set in increasing order. Let x₁₁ denote the first (smallest) distinct value of x₁. Then x₁₁ is the x-coordinate of the leftmost point on the plot.

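The y-coordinate at x₁ = x₁₁ equals

\[ y(x_{11}) = \frac{1}{N} \sum_{j=1}^{N} f(x_{11}, x_{2j}, \ldots, x_{mj}) \]

Term	Description
N	the total number of rows in the training data set
x_2j, ..., x_mj	the observed values of x₂, ..., x_m in row j of the training data set
j	the index of an individual row, j = 1, ..., N
f(x₁₁, x_2j, ..., x_mj)	the fitted value from the model when x₁ = x₁₁, x₂ = x_2j, ..., x_m = x_mj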

Replacing x₁₁ by each of the distinct values of x₁, we get the y-coordinates for the rest of the points on the plot. The calculations for the rest of the predictors are done similarly.
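The averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not the software's implementation: the function name partial_dependence_1d, the toy_model stand-in for a fitted model, and the three-row data set are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple


def partial_dependence_1d(
    predict: Callable[[Sequence[float]], float],
    rows: List[List[float]],
    col: int,
) -> List[Tuple[float, float]]:
    """Return the (x, y) points of a one-predictor partial dependence plot."""
    n = len(rows)
    points = []
    # The sorted distinct values of the chosen predictor are the x-coordinates.
    for value in sorted({row[col] for row in rows}):
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[col] = value  # hold the predictor at this value
            total += predict(modified)  # other predictors keep their observed values
        points.append((value, total / n))  # average fitted value over all N rows
    return points


# Hypothetical stand-in for a fitted model: f(x1, x2) = 2*x1 + x2.
def toy_model(x: Sequence[float]) -> float:
    return 2.0 * x[0] + x[1]


training_rows = [[1.0, 10.0], [2.0, 20.0], [1.0, 30.0]]
pd_points = partial_dependence_1d(toy_model, training_rows, col=0)
print(pd_points)  # [(1.0, 22.0), (2.0, 24.0)]
```

The cost is one model evaluation per row for every distinct value, which is why the direct calculation becomes slow on large data sets.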

Calculating the y-coordinates for every distinct value of a predictor can be time consuming with large data sets. For TreeNet®, there is a faster way to do the calculations. Refer to Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), page 1221.

The calculations for the multinomial response case are similar. In that case, the fitted value comes from the model for each individual class, so each class has its own set of y-coordinates.
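For a multinomial response, the same average can be taken separately for each class. The sketch below assumes a hypothetical model function, toy_scores, that returns one fitted value per class; the function name partial_dependence_by_class and the data are likewise invented for illustration.

```python
from typing import Callable, Dict, List, Sequence


def partial_dependence_by_class(
    predict_scores: Callable[[Sequence[float]], List[float]],
    rows: List[List[float]],
    col: int,
) -> Dict[float, List[float]]:
    """Map each distinct value of predictor `col` to per-class average fitted values."""
    n = len(rows)
    curves: Dict[float, List[float]] = {}
    for value in sorted({row[col] for row in rows}):
        sums: List[float] = []
        for row in rows:
            modified = list(row)
            modified[col] = value  # hold the predictor at this value
            scores = predict_scores(modified)  # one fitted value per class
            if not sums:
                sums = [0.0] * len(scores)
            for k, score in enumerate(scores):
                sums[k] += score
        curves[value] = [s / n for s in sums]  # average within each class
    return curves


# Hypothetical three-class model; each entry is the fitted value for one class.
def toy_scores(x: Sequence[float]) -> List[float]:
    return [x[0] + x[1], x[0] - x[1], 0.5 * x[1]]


class_rows = [[0.0, 1.0], [1.0, 3.0]]
class_curves = partial_dependence_by_class(toy_scores, class_rows, col=0)
print(class_curves)  # {0.0: [2.0, -2.0, 1.0], 1.0: [3.0, -1.0, 1.0]}
```

Each position in the returned lists corresponds to one class, so plotting position k against the keys gives the curve for class k.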

Two predictor partial dependence plots

Assume there are m predictors in a training data set, denoted as x₁, x₂, ..., x_m. First, sort the distinct values of predictors x₁ and x₂ in the training data set in increasing order. Let (x₁₁, x₂₁) denote one of the distinct pairs. Each pair supplies the x-coordinate and the y-coordinate of a point on the surface plot.

The z-coordinate at x₁ = x₁₁, x₂ = x₂₁ equals

\[ z(x_{11}, x_{21}) = \frac{1}{N} \sum_{j=1}^{N} f(x_{11}, x_{21}, x_{3j}, \ldots, x_{mj}) \]

Term	Description
N	the total number of rows in the training data set in which x₁ = x₁₁ and x₂ = x₂₁
x_3j, ..., x_mj	the observed values of x₃, ..., x_m in row j of the training data set
j	the index of an individual row, j = 1, ..., N
f(x₁₁, x₂₁, x_3j, ..., x_mj)	the fitted value from the model when x₁ = x₁₁, x₂ = x₂₁, x₃ = x_3j, ..., x_m = x_mj

Repeating the calculation for every distinct combination of x₁ and x₂ produces all the z-coordinates for the contour or surface plot. For large data sets, the calculations for all distinct pairs of x₁ and x₂ are time consuming. For TreeNet® models, there is a faster way to do the calculations. Refer to Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), page 1221.
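The two-predictor calculation can be sketched the same way. This example reads the definition of N literally, so by default it averages only over the rows that contain the pair; the match_pair_only flag is an assumption added here so that the average over every row, the usual textbook form of partial dependence, can be compared. The function name, the toy model, and the data are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def partial_dependence_2d(
    predict: Callable[[Sequence[float]], float],
    rows: List[List[float]],
    col_a: int,
    col_b: int,
    match_pair_only: bool = True,
) -> List[Tuple[float, float, float]]:
    """Return (x, y, z) points for a two-predictor partial dependence plot."""
    points = []
    # Distinct observed pairs, sorted in increasing order.
    for a, b in sorted({(row[col_a], row[col_b]) for row in rows}):
        if match_pair_only:
            # Literal reading of N: only rows where both predictors equal the pair.
            used = [r for r in rows if r[col_a] == a and r[col_b] == b]
        else:
            # Assumed alternative: average over every row in the data set.
            used = rows
        total = 0.0
        for row in used:
            modified = list(row)
            modified[col_a] = a  # hold both predictors at the pair
            modified[col_b] = b
            total += predict(modified)  # remaining predictors stay as observed
        points.append((a, b, total / len(used)))
    return points


# Hypothetical stand-in for a fitted model: f = x1 + 10*x2 + 100*x3.
def toy_surface_model(x: Sequence[float]) -> float:
    return x[0] + 10.0 * x[1] + 100.0 * x[2]


surface_rows = [[1.0, 1.0, 1.0], [1.0, 1.0, 3.0], [2.0, 1.0, 5.0]]
literal_points = partial_dependence_2d(toy_surface_model, surface_rows, 0, 1)
all_row_points = partial_dependence_2d(
    toy_surface_model, surface_rows, 0, 1, match_pair_only=False
)
print(literal_points)  # [(1.0, 1.0, 211.0), (2.0, 1.0, 512.0)]
print(all_row_points)  # [(1.0, 1.0, 311.0), (2.0, 1.0, 312.0)]
```

The two settings give different surfaces whenever the remaining predictors are distributed differently across pairs, so it is worth confirming which averaging set applies before comparing results with the software's plots.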
