Example of prediction with Discover Best Model (Binary Response)

Note

This command is available with the Predictive Analytics Module. Click here for more information about how to activate the module.

A team of researchers collects and publishes detailed information about factors that affect heart disease. Variables include age, sex, cholesterol levels, maximum heart rate, and more. This example is based on a public data set; the original data are from archive.ics.uci.edu.

The researchers can use the Random Forests® classification model to predict response class probabilities for new observations.

  1. Complete Example of Discover Best Model (Binary Response).
  2. In the Navigator, select the results for Discover Best Model (Binary Response).
  3. Click the Predict button at the bottom of the results.
  4. From the drop-down list, select Enter individual values.
  5. Enter the following values. This example uses 2 values for each predictor, but you can use up to 3 values. This example also intentionally uses missing values for Exercise Angina.
    Age                   35    35
    Rest Blood Pressure   140   140
    Cholesterol           233   233
    Max Heart Rate        150   165
    Old Peak              2.3   2.3
    Sex                   0     1
    Chest Pain Type       2     1
    Fasting Blood Sugar   1     1
    Rest ECG              0     1
    Exercise Angina
    Slope                 1     2
    Major Vessels         0     2
    Thal                  0     0
  6. Click OK.
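As a quick cross-check of the values entered in step 5, the two prediction sets can be written out as plain records. The sketch below is illustrative Python, not Minitab syntax. The names `PREDICTORS`, `rows`, `missing`, and `differing` are hypothetical, and `None` is simply this sketch's stand-in for a cell that is left blank.

```python
# Values from step 5: one tuple per predictor, (first set, second set).
# None stands in for the cells left blank on purpose (Exercise Angina).
PREDICTORS = {
    "Age": (35, 35),
    "Rest Blood Pressure": (140, 140),
    "Cholesterol": (233, 233),
    "Max Heart Rate": (150, 165),
    "Old Peak": (2.3, 2.3),
    "Sex": (0, 1),
    "Chest Pain Type": (2, 1),
    "Fasting Blood Sugar": (1, 1),
    "Rest ECG": (0, 1),
    "Exercise Angina": (None, None),
    "Slope": (1, 2),
    "Major Vessels": (0, 2),
    "Thal": (0, 0),
}

# Build one record per prediction set.
rows = [{name: values[i] for name, values in PREDICTORS.items()} for i in range(2)]

# Predictors with a missing value in either set.
missing = [name for name in PREDICTORS if None in PREDICTORS[name]]

# Predictors whose values differ between the two sets.
differing = [name for name in PREDICTORS if rows[0][name] != rows[1][name]]

print("Missing:", missing)
print("Differing:", differing)
```

Listing the differing predictors makes it easier to see which inputs can account for the gap between the two predicted probabilities.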

Interpret the results

Minitab uses the Random Forests® model in the results to estimate the class probability of a heart disease diagnosis event for the two sets of prediction values. The researchers find that the probability of a heart disease diagnosis event using the specified settings is approximately 0.63 for the first set and 0.52 for the second set.

Random Forests® Classification: Heart Diseas vs Age, Rest Blood P, ...

Method
Model validation                                    Validation with out-of-bag data
Number of bootstrap samples                         300
Sample size                                         Same as training data size of 303
Number of predictors selected for node splitting    Square root of the total number of predictors = 3
Minimum internal node size                          8
Rows used                                           303

Binary Response Information
Variable         Class        Count        %
Heart Disease    1 (Event)      165    54.46
                 0              138    45.54
                 All            303   100.00
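The figures in the Method and Binary Response Information tables can be checked with a few lines of arithmetic. This is a minimal sketch in Python. The predictor count of 13 comes from the list of predictors entered in this example, and rounding the square root down is an assumption that happens to match the reported value of 3.

```python
import math

# 13 predictors are entered in this example (Age through Thal).
n_predictors = 13

# Assumption: the square root is rounded down; sqrt(13) is about 3.61.
n_split = math.floor(math.sqrt(n_predictors))

# Class counts from the Binary Response Information table.
counts = {"1 (Event)": 165, "0": 138}
total = sum(counts.values())

# Percentage of rows in each class, rounded to two decimals as in the table.
percents = {cls: round(100 * n / total, 2) for cls, n in counts.items()}

print(n_split, total, percents)
```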

Random Forests® Classification Predict

Prediction for Heart Disease

Settings
Age = 35, Rest Blood Pressure = 140, Cholesterol = 233, Max Heart Rate = 150, Old Peak = 2.3, Sex = 0, Chest Pain Type = 2, Fasting Blood Sugar = 1, Rest ECG = 0, Exercise Angina = *, Slope = 1, Major Vessels = 0, Thal = 0

Prediction
Obs    Class    Prob (Class = 1)    Prob (Class = 0)
1      1        0.626667            0.373333

Prediction for Heart Disease

Settings
Age = 35, Rest Blood Pressure = 140, Cholesterol = 233, Max Heart Rate = 165, Old Peak = 2.3, Sex = 1, Chest Pain Type = 1, Fasting Blood Sugar = 1, Rest ECG = 1, Exercise Angina = *, Slope = 2, Major Vessels = 2, Thal = 0

Prediction
Obs    Class    Prob (Class = 1)    Prob (Class = 0)
2      1        0.516667            0.483333
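The reported probabilities can be sanity-checked in a few lines. The sketch below, in Python, confirms that the two class probabilities for each observation sum to 1 and that the predicted class is the one with the larger probability. It also shows that both event probabilities are consistent with a simple vote fraction over the 300 bootstrap samples (188/300 and 155/300). The vote-fraction reading is an assumption made for illustration; the output does not state how the probabilities are computed.

```python
N_TREES = 300  # "Number of bootstrap samples" from the Method table

# Observation -> (Prob (Class = 1), Prob (Class = 0)) as reported.
reported = {1: (0.626667, 0.373333), 2: (0.516667, 0.483333)}

checks = {}
for obs, (p_event, p_nonevent) in reported.items():
    # Hypothetical number of trees voting for class 1 (assumption, see above).
    votes = round(p_event * N_TREES)
    checks[obs] = {
        "votes": votes,
        "vote_fraction": round(votes / N_TREES, 6),
        "complement": round(1 - p_event, 6),
        "predicted_class": 1 if p_event > p_nonevent else 0,
    }

for obs, result in checks.items():
    print(obs, result)
```

Under that assumption, 188 of 300 trees would favor the event for the first observation and 155 of 300 for the second, which is why the second prediction sits much closer to an even split.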