The squared distance (also called the squared Mahalanobis distance) from observation x to the center (mean) of group t for the linear discriminant function is given by the following general form:
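
In the notation of the term table for this section, this distance is conventionally written as the quadratic form below. It is a sketch of the standard expression, with the pooled covariance matrix S_{p} shared by all groups and the superscript ⊤ denoting the transpose.

```latex
d_t^2(\mathbf{x}) = (\mathbf{x} - \mathbf{m}_t)^{\top} \, \mathbf{S}_p^{-1} \, (\mathbf{x} - \mathbf{m}_t)
```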

The squared Mahalanobis distance from x to group t for the quadratic discriminant function is calculated as follows:
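
For the quadratic case, the standard form has the same structure, but each group uses its own covariance matrix S_{t} in place of the pooled matrix:

```latex
d_t^2(\mathbf{x}) = (\mathbf{x} - \mathbf{m}_t)^{\top} \, \mathbf{S}_t^{-1} \, (\mathbf{x} - \mathbf{m}_t)
```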

The generalized squared distance from x to group t for the linear discriminant function is calculated as follows:
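
A common way to write this, assuming the usual definition, is the squared Mahalanobis distance with a penalty for the prior probability q_{t}. The symbol D_t^2 is introduced here only to distinguish the generalized distance from the plain squared distance.

```latex
D_t^2(\mathbf{x}) = (\mathbf{x} - \mathbf{m}_t)^{\top} \, \mathbf{S}_p^{-1} \, (\mathbf{x} - \mathbf{m}_t) - 2 \ln q_t
```

Because 0 < q_{t} ≤ 1, the term −2 ln q_{t} is nonnegative, so groups with small priors appear farther away.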

The generalized squared distance from x to group t for the quadratic discriminant function is calculated as follows:
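
In the quadratic case the usual form adds the log-determinant of the group covariance matrix, which is why |S_{t}| appears in the term table. The symbol D_t^2 is an illustrative name for the generalized distance.

```latex
D_t^2(\mathbf{x}) = (\mathbf{x} - \mathbf{m}_t)^{\top} \, \mathbf{S}_t^{-1} \, (\mathbf{x} - \mathbf{m}_t) + \ln \lvert \mathbf{S}_t \rvert - 2 \ln q_t
```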

The posterior probability for x belonging to group t is calculated as follows:
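
Writing D_t^2(x) for the generalized squared distance from x to group t (an illustrative symbol), the posterior probability under the normal model is commonly expressed as follows, where the sum in the denominator runs over all groups k:

```latex
P(t \mid \mathbf{x}) = \frac{\exp\!\left[-\tfrac{1}{2} D_t^2(\mathbf{x})\right]}{\sum_{k} \exp\!\left[-\tfrac{1}{2} D_k^2(\mathbf{x})\right]}
```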

The linear discriminant scores are calculated as follows:
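
One common form of the linear discriminant score for group t, consistent with the terms defined in this section, is shown below. Treat it as a sketch of the standard expression; the symbol L_t is an illustrative name.

```latex
L_t(\mathbf{x}) = \mathbf{m}_t^{\top} \mathbf{S}_p^{-1} \mathbf{x} - \tfrac{1}{2} \, \mathbf{m}_t^{\top} \mathbf{S}_p^{-1} \mathbf{m}_t + \ln q_t
```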

Term | Description |
---|---|
x | column vector of length p containing the values of the predictors for this observation (this column vector is stored as one row) |
p | number of predictors |
n | total number of observations |
t | group subscript |
n_{t} | number of observations in group t |
q_{t} | the prior probability of group t, which equals n_{t}/n |
S_{p} | pooled covariance matrix for linear discriminant analysis |
S_{i} | covariance matrix of group i for quadratic discriminant analysis |
m_{t} | column vector of length p containing the means of the predictors calculated from the data in group t |
S_{t} | covariance matrix of group t |
\|S_{t}\| | determinant of S_{t} |
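
The linear-case quantities above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the software's implementation: the helper names (`solve`, `squared_distance`, `generalized_squared_distance`, `posteriors`) and the numbers are made up for the example, the generalized distance is taken to be the squared distance minus 2 ln q_{t}, and a small Gaussian-elimination routine stands in for a linear algebra library.

```python
import math


def solve(a, b):
    """Solve the linear system a * y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented copy [a | b] so the caller's lists are not modified.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (m[r][n] - tail) / m[r][r]
    return y


def squared_distance(x, mean, cov):
    """Squared Mahalanobis distance (x - m)' S^{-1} (x - m)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d * y for d, y in zip(diff, solve(cov, diff)))


def generalized_squared_distance(x, mean, pooled_cov, prior):
    """Linear case (assumed form): squared distance minus 2 ln(prior)."""
    return squared_distance(x, mean, pooled_cov) - 2.0 * math.log(prior)


def posteriors(gen_dists):
    """Posterior probabilities from generalized squared distances."""
    # Subtracting the smallest distance leaves the ratios unchanged
    # and keeps exp() away from underflow.
    base = min(gen_dists)
    weights = [math.exp(-0.5 * (d - base)) for d in gen_dists]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up illustration: two groups, two predictors, equal priors.
means = [[0.0, 0.0], [2.0, 2.0]]
pooled_cov = [[2.0, 0.0], [0.0, 1.0]]
priors = [0.5, 0.5]
x = [1.0, 0.0]

dists = [squared_distance(x, m, pooled_cov) for m in means]
gen_dists = [generalized_squared_distance(x, m, pooled_cov, q)
             for m, q in zip(means, priors)]
post = posteriors(gen_dists)
print(dists)  # squared distances to each group mean
print(post)   # posterior probabilities, one per group
```

With equal priors the penalty term is the same for every group, so the ordering of the generalized distances matches the ordering of the plain squared distances. For the quadratic case, the same sketch would pass each group's own covariance matrix and add the log-determinant term.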

The linear discriminant function corresponds to the regression coefficients in multiple regression and is calculated as follows:
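
Using the terms defined in the table for this section, the standard form of the function for group i is shown below; the symbol L_i is an illustrative name. The vector S_{p}^{-1} m_{i} plays the role of the coefficients and the remaining two terms form the constant.

```latex
L_i(\mathbf{x}) = \mathbf{m}_i^{\top} \mathbf{S}_p^{-1} \mathbf{x} - \tfrac{1}{2} \, \mathbf{m}_i^{\top} \mathbf{S}_p^{-1} \mathbf{m}_i + \ln p_i
```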

For a given **x**, this rule allocates **x** to the group with the largest value of the linear discriminant function.

Term | Description |
---|---|
x | column vector of length p containing the values of the predictors for this observation (this column vector is stored as one row) |
m_{i} | column vector of length p containing the means of the predictors calculated from the data in group i |
S_{p} | pooled covariance matrix |
ln p_{i} | natural log of the prior probability for group i |
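
The allocation rule can be sketched as follows. This is a minimal illustration, assuming the standard form of the function (coefficients S_{p}^{-1} m_{i}, constant −½ m_{i}' S_{p}^{-1} m_{i} + ln p_{i}) and assuming the inverse of the pooled covariance matrix is already available; the function names and numbers are hypothetical.

```python
import math


def matvec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def linear_discriminant_function(mean, s_p_inv, prior):
    """Return (coefficients, constant) so that score(x) = constant + coefficients . x."""
    coef = matvec(s_p_inv, mean)  # S_p^{-1} m_i
    const = -0.5 * sum(m * c for m, c in zip(mean, coef)) + math.log(prior)
    return coef, const


def allocate(x, means, s_p_inv, priors):
    """Score x against every group and return (index of best group, all scores)."""
    scores = []
    for mean, prior in zip(means, priors):
        coef, const = linear_discriminant_function(mean, s_p_inv, prior)
        scores.append(const + sum(c * xi for c, xi in zip(coef, x)))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Made-up illustration: two groups, two predictors, equal priors.
means = [[0.0, 0.0], [2.0, 2.0]]
s_p_inv = [[0.5, 0.0], [0.0, 1.0]]  # inverse of the pooled covariance [[2, 0], [0, 1]]
priors = [0.5, 0.5]

best, scores = allocate([1.0, 0.0], means, s_p_inv, priors)
print(best, scores)
```

Separating the coefficients from the constant mirrors the regression analogy: each group has one constant and one coefficient per predictor, and a new observation is scored with a single dot product per group.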

The generalized squared distance is used as the quadratic distance measure and is calculated as follows:
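
A common form of this measure is sketched below. It assumes that, in the quadratic case, each group uses its own covariance matrix, written here as S_{i}; that symbol is an assumption of this sketch.

```latex
d_i^2(\mathbf{x}) = (\mathbf{x} - \mathbf{m}_i)^{\top} \, \mathbf{S}_i^{-1} \, (\mathbf{x} - \mathbf{m}_i) + \ln \lvert \mathbf{S}_i \rvert - 2 \ln p_i
```

Under this rule, **x** is allocated to the group with the smallest generalized squared distance.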

Term | Description |
---|---|
x | column vector of length p containing the values of the predictors for this observation (this column vector is stored as one row) |
m_{i} | column vector of length p containing the means of the predictors calculated from the data in group i |
S_{p} | pooled covariance matrix |
ln p_{i} | natural log of the prior probability for group i |

The posterior probability is the probability of group i given the data and is calculated as follows:
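
By Bayes' rule, with p_{i} the prior probability and f_{i}(x) the density of group i, this probability can be written as follows, where the sum runs over all groups j:

```latex
P(i \mid \mathbf{x}) = \frac{p_i \, f_i(\mathbf{x})}{\sum_{j} p_j \, f_j(\mathbf{x})}
```

The denominator is the same for every group, which is why ranking groups by posterior probability is the same as ranking them by p_{i} f_{i}(x).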

The largest posterior probability is equivalent to the largest value of ln[p_{i} f_{i}(x)]

where (if the distribution is normal):
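
The multivariate normal density with sample estimates substituted has the standard form below. Here S stands for the covariance estimate in use, assumed to be the pooled matrix in the linear case and the group matrix in the quadratic case, and p in the exponent is the number of predictors.

```latex
f_i(\mathbf{x}) = (2\pi)^{-p/2} \, \lvert \mathbf{S} \rvert^{-1/2} \exp\!\left[-\tfrac{1}{2} (\mathbf{x} - \mathbf{m}_i)^{\top} \mathbf{S}^{-1} (\mathbf{x} - \mathbf{m}_i)\right]
```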

and
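
Taking logs of p_{i} f_{i}(x) under a pooled covariance matrix S_{p} (an assumption of this sketch) gives the relation below, where d_i^2(x) is the squared Mahalanobis distance from x to the mean of group i and c collects the terms that do not depend on the group:

```latex
\ln\left[p_i \, f_i(\mathbf{x})\right] = -\tfrac{1}{2}\left[d_i^2(\mathbf{x}) - 2 \ln p_i\right] - c,
\qquad c = \tfrac{p}{2} \ln(2\pi) + \tfrac{1}{2} \ln \lvert \mathbf{S}_p \rvert
```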

Term | Description |
---|---|
p_{i} | prior probability of group i |
f_{i}(x) | the joint density for the data in group i (with the population parameters replaced by the sample estimates) |
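
The equivalence between the largest posterior probability and the largest ln[p_{i} f_{i}(x)] can be checked numerically. The sketch below uses made-up log values and a hypothetical helper name; it normalizes on the log scale, a common way to avoid overflow, and shows that a constant added to every log term cancels.

```python
import math


def posterior_from_log_terms(log_terms):
    """Turn values of ln[p_i f_i(x)] into posterior probabilities P(i | x)."""
    top = max(log_terms)
    # Shifting by the maximum cancels in the ratio and avoids overflow.
    weights = [math.exp(v - top) for v in log_terms]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up values of ln[p_i f_i(x)] for three groups.
log_terms = [-3.2, -1.1, -4.7]
post = posterior_from_log_terms(log_terms)

# Subtracting the same constant from every term leaves the posteriors unchanged.
shifted = posterior_from_log_terms([v - 10.0 for v in log_terms])

best_by_posterior = max(range(len(post)), key=post.__getitem__)
best_by_log_term = max(range(len(log_terms)), key=log_terms.__getitem__)
print(post, best_by_posterior, best_by_log_term)
```

Because group-independent constants drop out, only the group-specific part of ln[p_{i} f_{i}(x)] needs to be computed when the goal is classification or posterior probabilities.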