This macro computes the multiple-case extension of Cook's single-case distance measure. If the data set is small enough (see Data set size), the distance measure can be computed for all case pairs and triples. It can also be computed for user-selected subsets of up to ten cases. The graphs produced include a plot of Cook's distance for single cases against case number, an ID plot of influential case pairs, and fixed-pair effect plots, which show the effect (the change in Cook's distance) of adding a third case to a fixed pair of cases. The same functionality is available for models with no constant term.
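The macro's internal computations are not shown here, so the following is only a minimal Python sketch of the usual deletion form of the measure: fit the model to all cases, refit without the chosen subset, and scale the resulting change in the coefficients by X'X, the number of parameters, and the full-data mean squared error. The function names (solve, ols, cooks_distance) and the small data set are hypothetical and are not part of the macro.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:  # crude absolute tolerance, sketch only
            raise ValueError("X'X is singular; its inverse does not exist")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * v for x, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares fit with a constant term; returns (coefficients, X'X)."""
    x = [[1.0] + list(r) for r in rows]
    p = len(x[0])
    xtx = [[sum(xi[j] * xi[k] for xi in x) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(x, y)) for j in range(p)]
    return solve(xtx, xty), xtx


def cooks_distance(rows, y, subset):
    """Cook's distance for deleting the 1-based case numbers in `subset`."""
    n = len(y)
    beta, xtx = ols(rows, y)
    p = len(beta)
    resid = [yi - beta[0] - sum(b * v for b, v in zip(beta[1:], r))
             for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # full-data mean squared error
    keep = [i for i in range(n) if i + 1 not in subset]
    beta_del, _ = ols([rows[i] for i in keep], [y[i] for i in keep])
    d = [b - bd for b, bd in zip(beta, beta_del)]
    quad = sum(d[j] * xtx[j][k] * d[k] for j in range(p) for k in range(p))
    return quad / (p * s2)


# Hypothetical data (not the wood specific gravity data): one predictor, six cases.
rows = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (6.0,)]
y = [1.1, 1.9, 3.2, 3.9, 5.3, 9.0]
print(cooks_distance(rows, y, {6}))     # single case
print(cooks_distance(rows, y, {5, 6}))  # case pair
```

Refitting from scratch for each subset keeps the sketch short; it is not meant to reflect how the macro organizes its computations.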
Be sure that Minitab knows where to find your downloaded macro. Choose . Under Macro location browse to the location where you save macro files.
If you use an older web browser, clicking the Download button may open the file in QuickTime, which shares the .mac file extension with Minitab macros. To save the macro instead, right-click the Download button and choose Save target as.
The syntax used to run the macro varies slightly depending on the version you are using.
The following example uses the sample data, the "Modified Data on Wood Specific Gravity" data set of twenty cases and five predictors from Rousseeuw and Leroy (1987). The computed results for the five selected case subsets match those given in Seaver, Triantis, and Reeves (1999).
Suppose the values of the response Y (specific gravity) are in C1 and the values of the five predictors, X1-X5, are in C2-C6. Five case subsets are selected, as listed in the subcommands below.
%MULTDIST C1-C6;
SUB1 5;
SUB2 8 19;
SUB3 6 8 19;
SUB4 4 8 19;
SUB5 4 6 8 19.
Click Run.
Here is what the macro will produce.
Multiple Case Cook's Distance

Model Information
------------------------
Response: Y
Predictors: X1 , X2 , X3 , X4 , X5
Parameters: 6
Threshold value: 1.00
------------------------

*** Cook's Distance for Case Pairs ***

Cases      Cook's Distance
7 , 11     1.03

*** Cook's Distance for a Subset ***

Cases: 5
Cook's Distance: 0.06

Cases: 8 , 19
Cook's Distance: 0.33

Cases: 6 , 8 , 19
Cook's Distance: 1.99

Cases: 4 , 8 , 19
Cook's Distance: 0.49

Cases: 4 , 6 , 8 , 19
Cook's Distance: 53.93
Graph output not shown.
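The fixed-pair effect plotted by the macro is the change in Cook's distance from adding a third case to a fixed pair. As a rough illustration, the short Python sketch below takes the subset distances from the example output and computes that change for the fixed pair (8, 19). The variable names are hypothetical; only the distance values come from the output.

```python
# Cook's distances copied from the example subset output.
pair_distance = 0.33                     # cases 8, 19
triple_distance = {6: 1.99, 4: 0.49}     # third case added to the pair (8, 19)

# Fixed-pair effect: change in Cook's distance due to the third case.
effects = {case: round(d - pair_distance, 2)
           for case, d in triple_distance.items()}
print(effects)  # {6: 1.66, 4: 0.16}
```

In this example, adding case 6 to the pair changes the distance far more than adding case 4 does.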
Data set size
The data set size limit is 60 cases for computing Cook's distance for all case pairs and 30 cases for all case triples. The data set size limit for case subset computations is 500 cases. You can change the pair and triple limits within the macro: go to the section of the macro code labeled "MSE check, triple, nopair" and change 60 and 30 to the sizes you want. Note that computing time increases with data set size, especially when computing all triples.
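To get a feel for why triples are limited more tightly than pairs, the Python sketch below counts how many subsets have to be evaluated at the default limits. It only counts subsets and assumes nothing about how the macro evaluates each one.

```python
from math import comb

pairs_at_limit = comb(60, 2)     # all case pairs for 60 cases
triples_at_limit = comb(30, 3)   # all case triples for 30 cases
triples_at_60 = comb(60, 3)      # all case triples if the limit were raised to 60

print(pairs_at_limit, triples_at_limit, triples_at_60)  # 1770 4060 34220
```

Raising the triples limit from 30 to 60 cases multiplies the number of triples by more than eight, which is worth keeping in mind before editing the limits.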
Inverse does not exist
If you are analyzing a mixture model, you must specify the noconstant subcommand. If you do not, you will get an error message indicating that the inverse of the XTX matrix does not exist. You will usually get the same error message if any predictors are perfectly, or nearly perfectly, correlated.
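The reason a constant term breaks a mixture model can be sketched in a few lines of Python, assuming a mixture design whose component proportions sum to 1 in every row. The constant column then equals the sum of the component columns, so the cross-product matrix has a zero determinant and no inverse. The design points and the helper names (gram, det) are hypothetical and only illustrate the linear dependence.

```python
from fractions import Fraction as F


def gram(x):
    """Cross-product matrix X'X for a list of rows."""
    p = len(x[0])
    return [[sum(r[j] * r[k] for r in x) for k in range(p)] for j in range(p)]


def det(m):
    """Exact determinant by Gaussian elimination on Fractions."""
    m = [row[:] for row in m]
    n, d = len(m), F(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return F(0)  # no usable pivot: the matrix is singular
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return d


# Hypothetical three-component mixture design; each row sums to 1.
half, third = F(1, 2), F(1, 3)
mixture = [
    [F(1), F(0), F(0)], [F(0), F(1), F(0)], [F(0), F(0), F(1)],
    [half, half, F(0)], [half, F(0), half], [F(0), half, half],
    [third, third, third],
]
with_constant = [[F(1)] + row for row in mixture]

print(det(gram(with_constant)))  # 0: the inverse does not exist
print(det(gram(mixture)) != 0)   # True: invertible without the constant
```

Exact fractions are used so that the singular case comes out as exactly zero rather than as a tiny rounding residue.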
Missing values
The macro handles missing data by removing any row that contains a missing value. The removal is reflected in the output and in the graphs.
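Row-wise removal of this kind can be sketched in Python as follows. The function name is hypothetical, None stands in for a missing value, and returning the original case numbers alongside the kept rows is an assumption made for illustration, not a description of the macro's behavior.

```python
def drop_missing_rows(rows, missing=None):
    """Return (kept_case_numbers, kept_rows), dropping rows with any missing value."""
    kept = [(i, r) for i, r in enumerate(rows, start=1)
            if all(v is not missing for v in r)]
    return [i for i, _ in kept], [r for _, r in kept]


# Hypothetical worksheet rows: (response, predictor).
data = [(1.0, 2.0), (None, 3.0), (4.0, 5.0), (6.0, None)]
case_numbers, complete_rows = drop_missing_rows(data)
print(case_numbers)   # [1, 3]
print(complete_rows)  # [(1.0, 2.0), (4.0, 5.0)]
```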
References
Rousseeuw, P. J. and Leroy, A. M. (1987), Robust Regression & Outlier Detection, John Wiley & Sons, Inc.
Seaver, B., Triantis, K., and Reeves, C. (1999), The Identification of Influential Subsets in Regression Using a Fuzzy Clustering Strategy, Technometrics, 41, 340-351.