Two conditions exist that prevent the convergence of the maximum likelihood estimates for the coefficients: complete separation and quasi-complete separation.
Complete separation occurs when a linear combination of the predictors yields a perfect prediction of the response variable. For example, consider a data set in which Y = 0 whenever X ≤ 4 and Y = 1 whenever X > 4.
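To see why complete separation prevents convergence, it can help to evaluate the log-likelihood at increasingly steep slopes. The sketch below is illustrative only: the data values, the fixed cut point of 4.5, and the function names are assumptions chosen to match the rule in the example, not data or code from Minitab.

```python
import math

# Hypothetical data that follow the rule in the text:
# Y = 0 when X <= 4, Y = 1 when X > 4.
x = [1, 2, 3, 4, 4, 5, 6, 7]
y = [0, 0, 0, 0, 0, 1, 1, 1]


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    return min(z, 0.0) - math.log1p(math.exp(-abs(z)))


def log_likelihood(slope, cut=4.5):
    """Log-likelihood of the model logit(p) = slope * (x - cut)."""
    total = 0.0
    for xi, yi in zip(x, y):
        z = slope * (xi - cut)
        # An event contributes log(p); a non-event contributes log(1 - p).
        total += log_sigmoid(z) if yi == 1 else log_sigmoid(-z)
    return total


# The log-likelihood keeps rising toward 0 as the slope grows,
# so no finite slope maximizes it.
for slope in (1, 5, 25):
    print(slope, log_likelihood(slope))
```

Because every steeper slope fits these data slightly better, an iterative fitting routine has no finite estimate to settle on.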
Quasi-complete separation is similar to complete separation. The predictors yield a perfect prediction of the response variable for most values of the predictors, but not all. For example, in the previous data set, for one of the values where X = 4, let Y = 1 instead of 0. Now, if X < 4 then Y = 0 and if X > 4 then Y = 1, but if X = 4 then Y can be either 0 or 1. This overlap at the boundary value makes the separation quasi-complete.
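For a single numeric predictor, the two conditions can be told apart by comparing the ranges of X within each response group. The following is a minimal sketch under that one-predictor assumption. The function name and the data are hypothetical, and it does not cover separation by a linear combination of several predictors.

```python
def classify_separation(x, y):
    """Classify separation for one numeric predictor and a 0/1 response.

    Returns "complete", "quasi-complete", or "none".
    """
    zeros = [xi for xi, yi in zip(x, y) if yi == 0]
    ones = [xi for xi, yi in zip(x, y) if yi == 1]
    if not zeros or not ones:
        raise ValueError("the response must contain both 0 and 1")

    # Check both directions: Y = 1 for large X, or Y = 1 for small X.
    pairs = ((zeros, ones), (ones, zeros))
    # Complete: the two groups do not share or cross any X value.
    if any(max(low) < min(high) for low, high in pairs):
        return "complete"
    # Quasi-complete: the groups touch only at a single boundary value.
    if any(max(low) == min(high) for low, high in pairs):
        return "quasi-complete"
    return "none"


# Hypothetical data matching the examples in the text.
x = [1, 2, 3, 4, 4, 5, 6, 7]
y_complete = [0, 0, 0, 0, 0, 1, 1, 1]  # Y = 0 if X <= 4, Y = 1 if X > 4
y_quasi = [0, 0, 0, 0, 1, 1, 1, 1]     # one X = 4 row switched to Y = 1

print(classify_separation(x, y_complete))
print(classify_separation(x, y_quasi))
```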
Often, separation occurs when the data set is too small to observe events with low probabilities. The more predictors the model contains, the more likely separation becomes, because the individual groups in the data have smaller sample sizes. In Minitab, the model can also fail to converge for very large or very small probabilities that are not strictly 0 or 1, such as less than 1 out of 1 trillion.
Although Minitab prints a warning when it detects separation, the cause of the separation becomes harder to identify as the number of predictors in the model increases. Interaction terms in the model make the cause even harder to identify.
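One way to start looking for the cause is to scan each group of the data for degenerate counts, because a group in which no trial or every trial is an event is one common source of separation. The sketch below assumes data in events/trials form. The function name, group labels, and counts are hypothetical, and this check is not a Minitab feature.

```python
def flag_degenerate_groups(table):
    """Return labels of groups whose events are 0 or equal to the trials."""
    return [
        label
        for label, (events, trials) in table.items()
        if trials > 0 and (events == 0 or events == trials)
    ]


# Hypothetical (events, trials) counts per group; the values are made up.
counts = {"A": (3, 10), "B": (0, 8), "C": (5, 5), "D": (2, 9)}

# Groups B and C are flagged: B has no events, C has only events.
print(flag_degenerate_groups(counts))
```

Running the same scan on the groups formed by each predictor or interaction term narrows down which term is responsible.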
For more information about separation, see A. Albert and J. A. Anderson (1984), "On the existence of maximum likelihood estimates in logistic regression models," Biometrika, 71(1), 1–10.