- In Groups, enter the column that indicates the group that each observation belongs to. The column may be numeric, text, or date/time, and may contain up to 20 groups.
###### Note

If you want to change the order in which text groups are processed from their default alphabetized order, you can define your own order. For more information, go to Change the display order of text values in Minitab output.

- In Predictors, enter the column or columns that contain the numeric measurement variables that will be used to determine how the groups differ.

In this worksheet, Track is the grouping column, and indicates the current educational track into which a school administrator has placed each student. Test Score and Motivation are the predictor variables used to determine the educational track of each student.

C1 | C2 | C3 |
---|---|---|
Track | Test Score | Motivation |
3 | 1021 | 44 |
2 | 1152 | 56 |
1 | 1224 | 61 |
3 | 1077 | 46 |
2 | 1149 | 55 |
2 | 1192 | 49 |

Select the discriminant function to use for the analysis.

- Linear: Perform a linear discriminant analysis if you can assume that the groups have the same covariance matrix. For more information, go to What is linear discriminant analysis?
- Quadratic: Perform a quadratic discriminant analysis if you cannot assume that the groups have the same covariance matrix. For more information, go to What is quadratic discriminant analysis?
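
To make the difference between the two options concrete, the following is one common textbook formulation, given here as an assumption and not quoted from Minitab's methods documentation. An observation x is assigned to the group i with the smallest squared distance. The symbols are illustrative: m_i is the mean vector of group i, S_p is the pooled covariance matrix, S_i is the covariance matrix of group i, and prior probabilities are taken as equal.

```latex
% Linear: every group shares the pooled covariance matrix S_p
d_i^2(x) = (x - m_i)^{\top} S_p^{-1} (x - m_i)

% Quadratic: each group uses its own covariance matrix S_i
d_i^2(x) = (x - m_i)^{\top} S_i^{-1} (x - m_i) + \ln \lvert S_i \rvert
```

When all groups have the same covariance matrix, the extra term is the same for every group and the two rules give the same classification.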

You may want to run the analysis twice, using each discriminant function, and then compare the results to determine which function works best for your data. A common method to evaluate the discriminant function is to compare the proportion of correct classifications. Another method is to treat some observations that have known groups as if the groups are unknown and then determine how well the discriminant function predicts the known groups.

Select the cross-validation option to compensate for an optimistic apparent error rate. The apparent error rate is the percentage of observations that are misclassified. It tends to be optimistic because the data being classified are the same data used to build the classification function.

With cross-validation, Minitab omits each observation one at a time and calculates the discriminant function with the remaining observations. Then Minitab predicts the group for the omitted observation. If the proportion of correctly predicted groups is high, you can have more confidence in the predictions.

If you use cross-validation, Minitab displays an additional summary table and adds cross-validation information to the Summary of Misclassified Observations table.
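
The difference between the apparent error rate and the cross-validated error rate can be sketched in a few lines of Python. This is a minimal illustration of the leave-one-out procedure described above, not Minitab's implementation. To stay short, it swaps in a nearest-group-mean classifier as a stand-in for the discriminant function, and the function names and data values are made up for the example.

```python
def fit_group_means(rows, groups):
    """Stand-in model (not Minitab's discriminant function): per-group predictor means."""
    sums, counts = {}, {}
    for row, g in zip(rows, groups):
        acc = sums.setdefault(g, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[g] = counts.get(g, 0) + 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


def predict_group(means, row):
    """Assign the row to the group with the nearest mean (squared Euclidean distance)."""
    return min(means, key=lambda g: sum((x - m) ** 2 for x, m in zip(row, means[g])))


def apparent_error_rate(rows, groups):
    """Classify the same observations that were used to fit the model."""
    means = fit_group_means(rows, groups)
    wrong = sum(predict_group(means, r) != g for r, g in zip(rows, groups))
    return wrong / len(rows)


def loo_error_rate(rows, groups):
    """Omit each observation in turn, refit on the rest, then predict the omitted one."""
    wrong = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_groups = groups[:i] + groups[i + 1:]
        means = fit_group_means(train_rows, train_groups)
        if predict_group(means, rows[i]) != groups[i]:
            wrong += 1
    return wrong / len(rows)


# Hypothetical (Test Score, Motivation) values, made up for illustration.
rows = [(1200, 60), (1220, 62), (1150, 55), (1100, 50), (1090, 48), (1110, 52)]
groups = [1, 1, 1, 2, 2, 2]

apparent = apparent_error_rate(rows, groups)
cross_validated = loo_error_rate(rows, groups)
print(f"apparent: {apparent:.3f}, cross-validated: {cross_validated:.3f}")
```

In this made-up data the third observation sits between the two groups. It is classified correctly when it helps define its own group mean, but incorrectly when it is omitted, so the cross-validated error rate is higher than the apparent one.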

Another way to calculate a more realistic error rate is to split your data into two parts. Use one part to create the discriminant function, and the other part as a validation set. Predict group membership for the validation set and calculate the error rate as the percentage of these data that are misclassified.
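
The split-sample approach can be sketched in the same way. The code below is an illustration under stated assumptions, not a Minitab feature: the 50/50 split, the seed, the function names, and the data are all hypothetical, and a nearest-group-mean classifier again stands in for the discriminant function.

```python
import random


def fit_group_means(rows, groups):
    """Stand-in model (not Minitab's discriminant function): per-group predictor means."""
    means = {}
    for g in sorted(set(groups)):
        members = [r for r, label in zip(rows, groups) if label == g]
        means[g] = [sum(col) / len(members) for col in zip(*members)]
    return means


def predict_group(means, row):
    """Assign the row to the group with the nearest mean (squared Euclidean distance)."""
    return min(means, key=lambda g: sum((x - m) ** 2 for x, m in zip(row, means[g])))


def split_indices(n_rows, validation_fraction=0.5, seed=0):
    """Shuffle row indices and split them into (training, validation) lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_validation = max(1, int(n_rows * validation_fraction))
    return indices[n_validation:], indices[:n_validation]


def holdout_error_rate(rows, groups, validation_fraction=0.5, seed=0):
    """Fit on the training part only; return the misclassified share of the validation part."""
    training, validation = split_indices(len(rows), validation_fraction, seed)
    means = fit_group_means([rows[i] for i in training], [groups[i] for i in training])
    wrong = sum(predict_group(means, rows[i]) != groups[i] for i in validation)
    return wrong / len(validation)


# Hypothetical (Test Score, Motivation) values, made up for illustration.
rows = [(1200, 60), (1220, 62), (1190, 58), (1230, 63),
        (1100, 50), (1090, 48), (1110, 52), (1080, 47)]
groups = [1, 1, 1, 1, 2, 2, 2, 2]

training, validation = split_indices(len(rows))
rate = holdout_error_rate(rows, groups)
print(f"validation error rate: {rate:.1%}")
```

A random split can leave a small group underrepresented in the training part, so with few observations per group it may be worth splitting within each group instead.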

You can save results from your analysis to the worksheet so that you can use them in other analyses, graphs, and macros. Minitab stores the selected results in the column that you enter. The names of the storage columns for the fits and cross-validated fits end with a number that increases as you store those results multiple times.

- Linear discriminant function: Enter columns to store the coefficients from the linear discriminant function. Enter one column for each group. Minitab stores one column for each function and one row for each coefficient. The constant is stored in the first row of each column.
- Fits: Select to store the fitted values. The fitted value for an observation is the group into which it is classified. Minitab stores the group identifiers in the FITS1 column, with one row for each observation in the data.
- Fits from cross validation: Select to store the fitted values when the discrimination is performed using cross-validation. The fitted value for an observation is the group into which it is classified. Minitab stores the group identifiers for cross-validation in the XFIT1 column, with one row for each observation in the data.
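
As a sketch of how stored coefficient columns might be used outside Minitab, the Python below scores a new observation against each group's column and picks the group with the highest score. It assumes the layout described above (constant in the first row, then one coefficient per predictor) and the usual convention that the largest linear discriminant function value wins. The coefficient values and function names are invented for illustration and do not come from a real analysis.

```python
def discriminant_score(column, predictors):
    """Constant (first row) plus the sum of coefficient * predictor value."""
    constant, *coefficients = column
    if len(coefficients) != len(predictors):
        raise ValueError("expected one coefficient per predictor")
    return constant + sum(c * x for c, x in zip(coefficients, predictors))


def classify(stored_columns, predictors):
    """Return the group whose discriminant score is highest for this observation."""
    return max(stored_columns,
               key=lambda g: discriminant_score(stored_columns[g], predictors))


# Hypothetical stored columns, one per Track group: [constant, Test Score, Motivation].
# These numbers are made up; real values would come from the storage columns.
stored = {
    1: [-170.0, 0.15, 0.6],
    2: [-130.0, 0.12, 0.5],
    3: [-92.0, 0.09, 0.4],
}

low_scorer = classify(stored, (1021, 44))
high_scorer = classify(stored, (1224, 61))
print(low_scorer, high_scorer)
```

With these invented coefficients, the observation with the lower test score is assigned to group 3 and the one with the higher test score to group 1.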