Interpret all statistics and graphs for Discriminant Analysis

Find definitions and interpretation guidance for every statistic and graph that is provided with discriminant analysis.

True group

The actual group to which an observation belongs. The true group is determined by the values in the grouping column of the worksheet.

Interpretation

To assess the classification of the observations into each group, compare the groups that the observations were put into with their true groups.

Summary of Classification

                   True Group
Put into Group      1      2      3
1                  59      5      0
2                   1     53      3
3                   0      2     57
Total N            60     60     60
N correct          59     53     57
Proportion      0.983  0.883  0.950

Column 2 of this Summary of Classification table shows that 53 observations from Group 2 were correctly assigned to Group 2. However, 5 observations from Group 2 were instead put into Group 1, and 2 observations from Group 2 were put into Group 3. Therefore, 7 of the 60 observations from Group 2 were incorrectly classified into other groups.

Summary of Misclassified Observations

              True   Pred             Squared
Observation  Group  Group  Group     Distance  Probability
4**              1      2      1        3.524        0.438
                               2        3.028        0.562
                               3       25.579        0.000
65**             2      1      1        2.764        0.677
                               2        4.244        0.323
                               3       29.419        0.000
71**             2      1      1        3.357        0.592
                               2        4.101        0.408
                               3       27.097        0.000
78**             2      1      1        2.327        0.775
                               2        4.801        0.225
                               3       29.695        0.000
79**             2      1      1        1.528        0.891
                               2        5.732        0.109
                               3       32.524        0.000
100**            2      1      1        5.016        0.878
                               2        8.962        0.122
                               3       38.213        0.000
107**            2      3      1      39.0226        0.000
                               2       7.3604        0.032
                               3       0.5249        0.968
116**            2      3      1       31.898        0.000
                               2        7.913        0.285
                               3        6.070        0.715
123**            3      2      1       30.164        0.000
                               2        5.662        0.823
                               3        8.738        0.177
124**            3      2      1       26.328        0.000
                               2        4.054        0.918
                               3        8.887        0.082
125**            3      2      1       28.542        0.000
                               2        3.059        0.521
                               3        3.230        0.479

Row 1 of this Summary of Misclassified Observations table shows that observation 4 was predicted to belong to Group 2, but actually belongs to Group 1.
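The Probability column can be reproduced from the Squared Distance column under two assumptions that this page does not state explicitly: the linear function is used, and the groups have equal prior probabilities. In that case, the probability for each group is proportional to exp(-d²/2), where d² is the squared distance. The following Python sketch applies that relationship to observation 4. The function name is hypothetical and is not part of Minitab.

```python
import math

def posterior_from_squared_distances(sq_dist):
    """Convert squared distances to group probabilities.

    Assumes equal priors and a shared (pooled) covariance matrix, so the
    probability for group g is proportional to exp(-d_g^2 / 2).
    """
    weights = {g: math.exp(-0.5 * d2) for g, d2 in sq_dist.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Squared distances for observation 4, taken from the table above.
obs4 = {1: 3.524, 2: 3.028, 3: 25.579}

probs = posterior_from_squared_distances(obs4)
predicted = min(obs4, key=obs4.get)  # the smallest squared distance wins

print({g: round(p, 3) for g, p in probs.items()})  # {1: 0.438, 2: 0.562, 3: 0.0}
print(predicted)                                   # 2
```

The rounded values match the Probability column for observation 4, and the group with the smallest squared distance (Group 2) is the predicted group.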

Put into Group

The group to which an observation is predicted to belong, based on the discriminant analysis.

Interpretation

To assess the classification of the observations into each group, compare the groups that the observations were put into with their true groups. For example, row 2 of the following Summary of classification table shows that a total of 1 + 53 + 3 = 57 observations were put into Group 2. Of those 57 observations, 53 observations were correctly assigned to Group 2. However, 1 observation that was put into Group 2 was actually from Group 1, and 3 observations that were put into Group 2 were actually from Group 3. Therefore, 4 of the observations predicted to belong to Group 2 were actually from other groups.

Summary of Classification

                   True Group
Put into Group      1      2      3
1                  59      5      0
2                   1     53      3
3                   0      2     57
Total N            60     60     60
N correct          59     53     57
Proportion      0.983  0.883  0.950
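The row and column readings of this table can be checked with a few lines of Python. This is only an arithmetic sketch that uses the counts from the table above. The variable names are illustrative.

```python
# Rows are "Put into Group", columns are "True Group" (counts from the table above).
table = [
    [59, 5, 0],   # put into Group 1
    [1, 53, 3],   # put into Group 2
    [0, 2, 57],   # put into Group 3
]

group = 2
i = group - 1

put_into = sum(table[i])                    # row total for Group 2
true_total = sum(row[i] for row in table)   # column total for Group 2 (Total N)
correct = table[i][i]                       # diagonal entry (N correct)

from_other_groups = put_into - correct      # put into Group 2 but truly from elsewhere
missed = true_total - correct               # truly Group 2 but put elsewhere

print(put_into, from_other_groups, true_total, missed)  # 57 4 60 7
```

Reading across row 2 gives the 4 observations that were wrongly put into Group 2. Reading down column 2 gives the 7 observations from Group 2 that were put into other groups.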

Total N

The total number of observations in each true group.

N correct

The number of observations correctly placed into each true group. Minitab displays the N correct for each true group and the total N correct for all the groups.

Interpretation

Use the N correct value to determine how many observations in each true group are predicted to belong to that group. For example, for Group 1, suppose the N correct value is 52 and the Total N value is 60. This indicates that 60 observations are identified as belonging to Group 1 based on the values in the grouping column of the worksheet. Of those 60 observations, 52 are predicted to belong to Group 1 based on the discriminant function used for the analysis. Therefore, the number of observations that are correctly placed into Group 1 is 52.

Proportion

The proportion of observations correctly placed in each true group.

Interpretation

Use the proportion of observations correctly placed in each group to evaluate how well your observations are classified. For example, the proportions in the Summary of classification table indicate the following:

  • 98.3% of the observations in group 1 are correctly placed.
  • 88.3% of the observations in group 2 are correctly placed.
  • 95% of the observations in group 3 are correctly placed.

Therefore, Group 2 has the lowest proportion of correctly placed observations, so observations from Group 2 are the most difficult to classify.

Summary of Classification

                   True Group
Put into Group      1      2      3
1                  59      5      0
2                   1     53      3
3                   0      2     57
Total N            60     60     60
N correct          59     53     57
Proportion      0.983  0.883  0.950
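As a quick check, the Total N, N correct, and Proportion rows can be recomputed from the counts in the table above. The following Python sketch does only that arithmetic. The variable names are illustrative.

```python
# Rows are "Put into Group", columns are "True Group" (counts from the table above).
table = [
    [59, 5, 0],
    [1, 53, 3],
    [0, 2, 57],
]
n_groups = len(table)

# Total N is the column total; N correct is the diagonal entry.
total_n = [sum(row[j] for row in table) for j in range(n_groups)]
n_correct = [table[j][j] for j in range(n_groups)]
proportion = [c / n for c, n in zip(n_correct, total_n)]

# Group (1-based) with the lowest proportion of correctly placed observations.
worst_group = min(range(n_groups), key=lambda j: proportion[j]) + 1

print(total_n)                             # [60, 60, 60]
print(n_correct)                           # [59, 53, 57]
print([round(p, 3) for p in proportion])   # [0.983, 0.883, 0.95]
print(worst_group)                         # 2
```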

N

The number of non-missing values in the data set. N equals the total number of observations in all of the groups.

Proportion Correct

The proportion of correct classifications for all groups. This value equals the number of correctly placed observations (N Correct) divided by the total number of observations (N).
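For the Summary of Classification table shown earlier on this page, the calculation can be sketched as follows. The counts come from that table. The variable names are illustrative.

```python
# N correct and Total N for Groups 1, 2, and 3 from the Summary of Classification table.
n_correct_by_group = [59, 53, 57]
total_n_by_group = [60, 60, 60]

n_correct = sum(n_correct_by_group)   # total N Correct across all groups
n = sum(total_n_by_group)             # N, the total number of observations

proportion_correct = n_correct / n

print(n, n_correct, round(proportion_correct, 3))  # 180 169 0.939
```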

Squared Distance Between Groups

The squared distance from one group center (mean) to another group center (mean). An observation is classified into the group whose center (mean) has the smallest squared distance (also called the squared Mahalanobis distance) from the observation.

Note

If you use the quadratic function, Minitab displays the Generalized Squared Distance table. For more information on how squared distances are calculated for each function, go to Distance and discriminant functions for Discriminant Analysis.

Interpretation

Although the distance values are not very informative by themselves, you can compare the distances to see how different the groups are. For example, the following results indicate that the greatest distance is between groups 1 and 3 (48.0911). The distance between groups 1 and 2 is 12.9853, and the distance between groups 2 and 3 is 11.3197.

Squared Distance Between Groups

         1        2        3
1   0.0000  12.9853  48.0911
2  12.9853   0.0000  11.3197
3  48.0911  11.3197   0.0000
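To illustrate what a squared distance between two group centers measures, the following Python sketch computes the squared Mahalanobis distance for two predictors. The means and covariance matrix are hypothetical values chosen for easy arithmetic; they are not taken from the results above, and the function name is not part of Minitab.

```python
def squared_mahalanobis_2d(mean_a, mean_b, cov):
    """Squared Mahalanobis distance between two 2-variable group means.

    cov is a symmetric 2x2 covariance matrix [[s11, s12], [s12, s22]].
    """
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if s11 <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d1 = mean_a[0] - mean_b[0]
    d2 = mean_a[1] - mean_b[1]
    # d' * inverse(cov) * d, written out for the 2x2 case
    return (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det

# Hypothetical group means and covariance matrix (not from the results above).
mean_1 = (10.0, 5.0)
mean_2 = (7.0, 4.0)
cov = [[4.0, 1.0], [1.0, 2.0]]

d2_12 = squared_mahalanobis_2d(mean_1, mean_2, cov)
print(round(d2_12, 4))  # 2.2857
```

The distance is symmetric and is zero when the two means coincide, which is why the table above is symmetric with zeros on the diagonal.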

Linear Discriminant Function for Groups

The linear discriminant function for groups indicates the linear equation associated with each group. The linear discriminant scores for each group correspond to the regression coefficients in multiple regression analysis.

Interpretation

Use the linear discriminant function for groups to determine how the predictor variables differentiate between the groups. For example, when you have three groups, Minitab estimates a function for discriminating between the following groups:
  • Group 1 and groups 2 and 3
  • Group 2 and groups 1 and 3
  • Group 3 and groups 1 and 2

The groups with the largest linear discriminant function values, or regression coefficients, contribute most to the classification of observations. For example, in the following results, group 1 has the largest coefficient (17.4) for test scores, which indicates that test scores for group 1 contribute more than those of group 2 or group 3 to the classification of group membership. Group 3 has the largest coefficient in absolute value (-4.3) for motivation, which indicates that the motivation scores of group 3 contribute more than those of group 1 or group 2 to the classification of group membership.

Linear Discriminant Function for Groups

                  1        2        3
Constant    -9707.5  -9269.0  -8921.1
Test Score     17.4     17.0     16.7
Motivation     -3.2     -3.7     -4.3
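To show how these equations are evaluated, the following Python sketch computes each group's function value for one hypothetical observation and picks the group with the largest value. It assumes that an observation is assigned to the group with the highest linear discriminant function value. The observation is made up for illustration. Because the displayed coefficients are rounded to one decimal place and test scores are near 1100, values computed from this table can differ from the exact ones by tens of units, so treat this as an illustration of the arithmetic rather than a reliable classifier.

```python
# Coefficients from the table above: (constant, test score, motivation).
ldf = {
    1: (-9707.5, 17.4, -3.2),
    2: (-9269.0, 17.0, -3.7),
    3: (-8921.1, 16.7, -4.3),
}

def ldf_scores(test_score, motivation):
    """Evaluate each group's linear equation for one observation."""
    return {
        g: const + b_test * test_score + b_motiv * motivation
        for g, (const, b_test, b_motiv) in ldf.items()
    }

# Hypothetical observation (assumption: not a row from the worksheet).
scores = ldf_scores(test_score=1127.4, motivation=53.6)
predicted = max(scores, key=scores.get)  # assumed rule: largest function value wins

print({g: round(s, 2) for g, s in scores.items()})
print(predicted)
```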

Pooled mean

The pooled mean is the weighted average of the means of each true group. To display the pooled mean, you must click Options and select Above plus mean, std. dev., and covariance summary when you perform the analysis.

Interpretation

Use the pooled mean to describe the center of all the observations in the data. For example, in the following results, the overall test score mean for all the groups is 1102.1.

Group Means

            Pooled       Means for Group
Variable      Mean        1        2        3
Test Score  1102.1   1127.4   1100.6   1078.3
Motivation  47.056   53.600   47.417   40.150
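The pooled means in this table can be reproduced as a weighted average of the group means, with each group weighted by its number of observations. The following Python sketch assumes 60 observations per group, as in the Summary of Classification table on this page. The function name is illustrative.

```python
def pooled_mean(group_means, group_sizes):
    """Weighted average of group means, weighted by group size."""
    total = sum(group_sizes)
    return sum(m * n for m, n in zip(group_means, group_sizes)) / total

# Assumption: 60 observations in each true group.
sizes = [60, 60, 60]

test_score = pooled_mean([1127.4, 1100.6, 1078.3], sizes)
motivation = pooled_mean([53.600, 47.417, 40.150], sizes)

print(round(test_score, 1))   # 1102.1
print(round(motivation, 3))   # 47.056
```

When the groups are the same size, as they are here, the pooled mean reduces to the simple average of the group means.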

Means for Group

The sum of the values in each true group divided by the number of (non-missing) values in each true group. To display the means for groups, you must click Options and select Above plus mean, std. dev., and covariance summary when you perform the analysis.

Interpretation

Use group means to describe each true group with a single value that represents the center of the data. For example, in the following results, group 1 has the highest mean test score (1127.4), while group 3 has the lowest mean test score (1078.3). The mean test score for Group 2 is in the middle (1100.6).

Group Means

            Pooled       Means for Group
Variable      Mean        1        2        3
Test Score  1102.1   1127.4   1100.6   1078.3
Motivation  47.056   53.600   47.417   40.150

Pooled StDev

The pooled standard deviation is the square root of a weighted average of the variances of each true group. To display the pooled standard deviation, you must click Options and select Above plus mean, std. dev., and covariance summary when you perform the analysis.

Interpretation

Use the pooled standard deviation to determine how spread out the individual data points are about their true group mean. For example, in the following results, the pooled standard deviation for the test scores for all the groups is 8.109.

Group Standard Deviations

            Pooled       StDev for Group
Variable     StDev        1        2        3
Test Score   8.109    8.308    9.266    6.511
Motivation   2.994    2.409    3.243    3.251
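The pooled values in this table are reproduced by a common definition of the pooled standard deviation: the group variances are averaged with weights equal to each group's degrees of freedom (n - 1), and the square root of the result is taken. The following Python sketch assumes 60 observations per group, as in the Summary of Classification table on this page. The function name is illustrative.

```python
import math

def pooled_stdev(group_stdevs, group_sizes):
    """Square root of the degrees-of-freedom weighted average of group variances."""
    dof = [n - 1 for n in group_sizes]
    weighted = sum(w * s ** 2 for w, s in zip(dof, group_stdevs))
    return math.sqrt(weighted / sum(dof))

# Assumption: 60 observations in each true group.
sizes = [60, 60, 60]

test_score = pooled_stdev([8.308, 9.266, 6.511], sizes)
motivation = pooled_stdev([2.409, 3.243, 3.251], sizes)

print(round(test_score, 3))   # 8.109
print(round(motivation, 3))   # 2.994
```

A simple average of the three test score standard deviations gives about 8.028 rather than 8.109, which is why the variances are pooled instead of the standard deviations.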

StDev for Groups

The most common measure of dispersion, or how spread out the data are about the mean. The standard deviation of the groups is the standard deviation of each true group. To display the standard deviations for groups, you must click Options and select Above plus mean, std. dev., and covariance summary when you perform the analysis.

Interpretation

Use the standard deviation for the groups to determine how spread out the data are from the mean in each true group. For example, in the following results, the test scores for group 2 have the highest standard deviation (9.266). This indicates that the test scores for Group 2 have the greatest variability of the three groups. Group 3 has the lowest standard deviation (6.511) and the lowest variability of test scores of the three groups.

Group Standard Deviations

            Pooled       StDev for Group
Variable     StDev        1        2        3
Test Score   8.109    8.308    9.266    6.511
Motivation   2.994    2.409    3.243    3.251

Pooled Covariance Matrix

A weighted matrix of the relationship between all observations in all groups. The pooled covariance matrix is calculated by averaging the individual group covariance matrices element by element.

To display the pooled covariance matrix, you must click Options and select Above plus mean, std. dev., and covariance summary when you perform the analysis.
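The element-by-element averaging can be sketched in Python as follows. The sketch assumes that each group's covariance matrix is weighted by its degrees of freedom (n - 1), which is the usual convention for a pooled covariance matrix. The matrices and group sizes are hypothetical, and the function name is illustrative.

```python
def pooled_covariance(group_covs, group_sizes):
    """Average the group covariance matrices element by element,
    weighting each group by its degrees of freedom (n - 1)."""
    dof = [n - 1 for n in group_sizes]
    total_dof = sum(dof)
    p = len(group_covs[0])
    return [
        [
            sum(w * cov[r][c] for w, cov in zip(dof, group_covs)) / total_dof
            for c in range(p)
        ]
        for r in range(p)
    ]

# Hypothetical covariance matrices for two groups (not from the results above).
cov_a = [[4.0, 1.0], [1.0, 2.0]]   # group with 11 observations
cov_b = [[6.0, 2.0], [2.0, 3.0]]   # group with 21 observations

pooled = pooled_covariance([cov_a, cov_b], [11, 21])
print([[round(v, 4) for v in row] for row in pooled])
# [[5.3333, 1.6667], [1.6667, 2.6667]]
```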

Covariance matrix

A nonstandardized matrix that indicates the relationship between each pair of variables. The covariance is similar to the correlation coefficient, which is the covariance divided by the product of the standard deviations of the variables.

To display the covariance matrix for each group, you must click Options and select Above plus mean, std. dev., and covariance summary when you perform the analysis.
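The relationship between covariance and correlation can be sketched in Python as follows. The covariance matrix is hypothetical, and the function name is illustrative.

```python
import math

def covariance_to_correlation(cov):
    """Divide each covariance by the product of the two standard deviations."""
    stdevs = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [
        [cov[r][c] / (stdevs[r] * stdevs[c]) for c in range(len(cov))]
        for r in range(len(cov))
    ]

# Hypothetical covariance matrix: variances 9 and 4, covariance 3.
cov = [[9.0, 3.0], [3.0, 4.0]]

corr = covariance_to_correlation(cov)
print(corr)  # [[1.0, 0.5], [0.5, 1.0]]
```

Unlike the covariance, the correlation does not depend on the units of the variables and always lies between -1 and 1.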

Observation

Observation number for each observation. The observation number corresponds to the row of the classified observation in the Minitab worksheet. Minitab displays the symbols ** after the observation number if the observation was misclassified (that is, if the true group differs from the predicted group).

To see the predicted and true group for every observation in your data set, you must click Options and select Above plus complete classification summary when you perform the analysis.

Pred Group

The predicted group for each observation is the group membership that Minitab assigns to the observation based on the predicted squared distance. To see the predicted and true group for each observation in your data set, you must click Options and select Above plus complete classification summary when you perform the analysis.

Interpretation

Compare the predicted group and the true group for each observation to determine whether the observation was classified correctly. If the predicted group differs from the true group, then the observation was misclassified.

X-val Group

The predicted group using cross-validation (X-val) is the group membership that Minitab assigns to the observation based on the predicted squared distance using cross-validation. To see the predicted group using cross-validation for each observation, you must select Use cross validation in the main dialog box, and then click Options and select Above plus complete classification summary, when you perform the analysis.

Interpretation

Compare the predicted group using cross-validation and the true group for each observation to determine whether the observation was classified correctly. If the predicted group using cross-validation differs from the true group, then the observation was misclassified.

Important

The predicted group using cross-validation omits an observation to create the discrimination rule and then sees how well the rule works for that specific observation. When you don't use cross-validation, you bias the discrimination rule by using that observation to create the rule.
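The difference between the two approaches can be illustrated with a small Python sketch. It uses a deliberately simple stand-in rule (assign each observation to the group with the nearest mean on a single variable), not Minitab's discriminant function, and the data are hypothetical. All function names are illustrative.

```python
def nearest_mean_rule(train):
    """Build a classifier from (value, group) pairs: pick the nearest group mean."""
    sums, counts = {}, {}
    for x, g in train:
        sums[g] = sums.get(g, 0.0) + x
        counts[g] = counts.get(g, 0) + 1
    means = {g: sums[g] / counts[g] for g in sums}
    return lambda x: min(means, key=lambda g: (x - means[g]) ** 2)

def predictions_without_cross_validation(data):
    """Build the rule from all observations, then classify those same observations."""
    rule = nearest_mean_rule(data)
    return [rule(x) for x, _ in data]

def predictions_with_cross_validation(data):
    """For each observation, build the rule without it, then classify it."""
    preds = []
    for i, (x, _) in enumerate(data):
        held_out = data[:i] + data[i + 1:]
        preds.append(nearest_mean_rule(held_out)(x))
    return preds

# Hypothetical data: (value, true group). The value 9.0 is an unusual member of group 1.
data = [(0.0, 1), (2.0, 1), (9.0, 1), (14.0, 2), (15.0, 2), (16.0, 2)]

print(predictions_without_cross_validation(data))  # [1, 1, 1, 2, 2, 2]
print(predictions_with_cross_validation(data))     # [1, 1, 2, 2, 2, 2]
```

Without cross-validation, the third observation pulls the group 1 mean toward itself and is classified correctly. With cross-validation, the rule is built without that observation, and it is put into group 2.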

Squared Distance

The predicted squared distance values for each observation from each group. The squared distance value indicates how far away an observation is from each group mean. To see the squared distance for each observation in your data, you must click Options and select Above plus complete classification summary when you perform the analysis.

Note

If you use cross-validation when you perform the analysis, Minitab calculates the predicted squared distance for each observation both with cross-validation (X-val) and without cross-validation (Pred). For more information on how the squared distances are calculated, go to Distance and discriminant functions for Discriminant Analysis.
