This command is available with the Predictive Analytics Module.
A team of researchers collects data from the sale of individual residential properties in Ames, Iowa. The researchers want to identify the variables that affect the sale price. Variables include the lot size and various features of the residential property.
After initial exploration with CART® Regression to identify the important predictors, the team uses Random Forests® Regression to create a more intensive model from the same data set. The team compares the model summary table and the R² plot from the results to evaluate which model provides a better prediction outcome.
These data were adapted from a public data set of Ames housing data. Original data from DeCock, Truman State University.
Model validation | Validation with out-of-bag data |
---|---|
Number of bootstrap samples | 300 |
Sample size | Same as training data size of 2930 |
Number of predictors selected for node splitting | 30% of the total number of predictors = 23 |
Minimum internal node size | 5 |
Rows used | 2930 |
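
The settings above (300 bootstrap samples, each the same size as the 2930-row training data, validated with out-of-bag data) can be illustrated with a minimal sketch. It is not the software's implementation. It only shows, using the Python standard library, how sampling rows with replacement leaves some rows out of each sample, and that those out-of-bag rows are the ones available to validate the tree grown on that sample. The function name `bootstrap_split` and the random seed are assumptions made for illustration.

```python
import random


def bootstrap_split(n_rows, rng):
    """Draw one bootstrap sample of row indices (with replacement).

    Returns the in-bag indices and the sorted out-of-bag indices,
    i.e. the rows that were never drawn for this sample.
    """
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag


N_ROWS = 2930        # rows used, from the table above
N_BOOTSTRAP = 300    # number of bootstrap samples, from the table above
rng = random.Random(0)  # arbitrary seed, only for reproducibility

oob_counts = [0] * N_ROWS   # how many samples leave each row out-of-bag
oob_fractions = []          # out-of-bag share of rows for each sample

for _ in range(N_BOOTSTRAP):
    in_bag, out_of_bag = bootstrap_split(N_ROWS, rng)
    oob_fractions.append(len(out_of_bag) / N_ROWS)
    for row in out_of_bag:
        oob_counts[row] += 1

mean_oob_fraction = sum(oob_fractions) / N_BOOTSTRAP
# Probability that a given row is missed by all N_ROWS draws of one sample.
expected_oob_fraction = (1 - 1 / N_ROWS) ** N_ROWS

print(f"mean out-of-bag fraction: {mean_oob_fraction:.3f}")
print(f"theoretical fraction:     {expected_oob_fraction:.3f}")
print(f"fewest samples any row is out-of-bag for: {min(oob_counts)}")
```

Roughly a third of the rows are out-of-bag for any one sample, so with 300 samples every row is out-of-bag many times. This is why out-of-bag validation can produce a prediction for every row without a separate test set.
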
Mean | StDev | Minimum | Q1 | Median | Q3 | Maximum |
---|---|---|---|---|---|---|
180796 | 79886.7 | 12789 | 129500 | 160000 | 213500 | 755000 |
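
A few derived quantities make the summary statistics above easier to read. The sketch below uses only the values in the table and assumes, since the source does not label it, that the table describes the response (sale price). The 1.5 × IQR fences are a common rule of thumb for flagging unusual values, not something the source prescribes.

```python
# Values copied from the summary table above.
mean, stdev = 180796, 79886.7
minimum, q1, median, q3, maximum = 12789, 129500, 160000, 213500, 755000

iqr = q3 - q1                    # spread of the middle half of the data
upper_fence = q3 + 1.5 * iqr     # rule-of-thumb cutoff for high values
lower_fence = q1 - 1.5 * iqr     # rule-of-thumb cutoff for low values
coef_var = stdev / mean          # standard deviation relative to the mean

print(f"IQR: {iqr}")
print(f"fences: [{lower_fence}, {upper_fence}]")
print(f"mean exceeds median: {mean > median}")
print(f"maximum above upper fence: {maximum > upper_fence}")
print(f"minimum below lower fence: {minimum < lower_fence}")
print(f"coefficient of variation: {coef_var:.2f}")
```

The mean is above the median and the maximum lies far beyond the upper fence, which is consistent with a right-skewed distribution with a few very expensive properties.
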