With Minitab Model Ops, we can easily compare an operationalized robust Minitab model, such as a Random Forests® Regression model, with a non-Minitab model.
In this example, we use the Ames housing data set to compare a Multi-layer Perceptron Regressor (MLPRegressor) created in scikit-learn to an operationalized Random Forests Regression model created in Minitab Statistical Software.
A multi-layer perceptron (MLP) model is a neural network with at least three layers of nodes: an input layer, an output layer, and one or more hidden layers. While MLP regression models are deep learning models that work well with very large data sets, they often require extensive computational resources to reach their best performance.
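To make the layer structure concrete, here is a minimal sketch that fits a scikit-learn `MLPRegressor` with a single hidden layer. The data are synthetic stand-ins, not the Ames housing data, and the layer size, iteration limit, and random seed are illustrative assumptions rather than the settings used in this notebook.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data (NOT the Ames housing data): 200 rows, 4 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)

# One hidden layer of 16 nodes gives the minimal three-layer network:
# input layer (4 nodes) -> hidden layer (16 nodes) -> output layer (1 node).
# The sizes and seed are assumptions chosen only for illustration.
mlp = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
mlp.fit(X, y)

# n_layers_ counts the input, hidden, and output layers together.
print("layers:", mlp.n_layers_)
# coefs_ holds one weight matrix per connection between adjacent layers.
print("weight matrix shapes:", [w.shape for w in mlp.coefs_])
```

Adding more entries to `hidden_layer_sizes` adds hidden layers, which is one reason the computing cost can grow quickly.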
Random Forests, meanwhile, is a decision tree-based algorithm that combines the predictions of several individual decision trees into a single output. Random forest models are machine learning models that are quite powerful with large data sets and usually require fewer computing resources than MLP regression models.
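The idea of combining trees into a single output can be sketched with scikit-learn's open-source `RandomForestRegressor`. This is not Minitab's proprietary Random Forests® implementation, and the synthetic data and settings below are assumptions for illustration only. For regression, the scikit-learn forest's prediction is the average of the predictions of its individual trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (NOT the Ames housing data).
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)

# 25 trees is an arbitrary illustrative choice.
forest = RandomForestRegressor(n_estimators=25, random_state=1)
forest.fit(X, y)

# Predict the first five rows with every tree, then average across trees.
per_tree = np.array([tree.predict(X[:5]) for tree in forest.estimators_])
averaged = per_tree.mean(axis=0)

# Check whether the hand-computed average matches the forest's own prediction.
matches = np.allclose(averaged, forest.predict(X[:5]))
print("average of trees equals forest prediction:", matches)
```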
To compare the models, we can use performance statistics, such as R², to assess which model is better. For models with continuous responses, Minitab Model Ops calculates R² and the Mean Absolute Deviation (MAD) on a rolling basis to detect model drift or data drift. When drift is present, data scientists consider whether to retrain or replace the model.
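As a rough illustration of rolling performance statistics, the sketch below computes R² and MAD over a sliding window of rows. It assumes that R² is 1 minus the ratio of the residual sum of squares to the total sum of squares, and that MAD is the mean absolute difference between actual and predicted values. The window size, the function names, and the numbers are made up for this example, and the exact windowing that Minitab Model Ops uses is not described here.

```python
from statistics import mean

def r_squared(actual, predicted):
    """R-squared as 1 - SS_residual / SS_total (assumed definition)."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mean_absolute_deviation(actual, predicted):
    """MAD as the mean absolute prediction error (assumed definition)."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def rolling_metrics(actual, predicted, window):
    """Yield (start_index, R-squared, MAD) for each full window of rows."""
    for start in range(len(actual) - window + 1):
        a = actual[start:start + window]
        p = predicted[start:start + window]
        yield start, r_squared(a, p), mean_absolute_deviation(a, p)

# Made-up values: predictions track the response at first, then degrade.
actual = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
predicted = [10.5, 11.5, 14.5, 15.5, 20.0, 17.0, 25.0, 20.0]

results = list(rolling_metrics(actual, predicted, window=4))
for start, r2, mad in results:
    print(f"rows {start}-{start + 3}: R2 = {r2:.3f}, MAD = {mad:.3f}")
```

A falling R² together with a rising MAD across successive windows is the kind of pattern that would prompt a closer look for drift.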
For the MLP model, this notebook calculates one R² value and one MAD value at a single point in time, to demonstrate the use and operationalization of proprietary Minitab algorithms in conjunction with freely available Python models.
These data were adapted from the public Ames housing data set. Original data from DeCock, Truman State University.
To learn more about creating a Random Forests® Regression model in Minitab, go to Example of Random Forests® Regression.