What are the least squares and the maximum likelihood estimation methods?

Two commonly used approaches to estimate population parameters from a random sample are the maximum likelihood estimation method (default) and the least squares estimation method.
Maximum likelihood estimation method (MLE)
The likelihood function indicates how likely the observed sample is as a function of possible parameter values. Therefore, maximizing the likelihood function determines the parameters that are most likely to produce the observed data. From a statistical point of view, MLE is usually recommended for large samples because it is versatile, applicable to most models and different types of data, and produces the most precise estimates.
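As a minimal illustration of this idea (not Minitab's implementation), the following sketch evaluates the log-likelihood of an exponential model over a grid of candidate means for a small complete sample. The data values and the function name are hypothetical. For the exponential distribution the maximizer is known to be the sample mean, so the grid search should land on or next to it.

```python
import math

def exp_loglik(mean, times):
    """Log-likelihood of an exponential model with the given mean (complete data)."""
    return sum(-math.log(mean) - t / mean for t in times)

# Hypothetical failure times (hours); for illustration only.
times = [12.0, 35.0, 47.0, 80.0, 126.0]

# Candidate means on a grid: 1.0, 1.5, ..., 300.0.
grid = [0.5 * k for k in range(2, 601)]

# The maximum likelihood estimate is the candidate with the largest log-likelihood.
best_mean = max(grid, key=lambda m: exp_loglik(m, times))
sample_mean = sum(times) / len(times)

print(best_mean, sample_mean)
```

A grid search is used here only to make the "maximize the likelihood" step visible; practical software uses a numerical optimizer instead.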
Least squares estimation method (LSE)
Least squares estimates are calculated by fitting a regression line to the points from a data set. The fitted line is the one that has the minimal sum of squared deviations (least squares error). In reliability analysis, the line and the data are plotted on a probability plot.

Why is MLE the default method in Minitab?

For large, complete data sets, both the LSE method and the MLE method provide consistent results. In reliability applications, data sets are typically small or moderate in size. Extensive simulation studies show that in small sample designs where there are only a few failures, the MLE method is better than the LSE method [1]. Thus, the default estimation method in Minitab is MLE.

The advantages of the MLE method over the LSE method are as follows:
  • The distribution parameter estimates are more precise.
  • The estimated variance is smaller.
  • Confidence intervals and tests for model parameters can be reliably calculated.
  • The calculations use more of the information in the data.

    When there are only a few failures because the data are heavily censored, the MLE method uses the information in the entire data set, including the censored values. The LSE method ignores the information in the censored observations [1].
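To make the role of censored values concrete, here is a small sketch for the exponential distribution, where the maximum likelihood estimate of the mean under right censoring has a closed form: total time on test divided by the number of failures. The data and function name are hypothetical, and the comparison value is simply an average that discards the censored units; it is not Minitab's LSE calculation.

```python
def exp_mle_mean(times, failed):
    """MLE of the exponential mean with right censoring.

    times  : observed times (failure time, or censoring time if not failed)
    failed : True for a failure, False for a right-censored unit
    """
    n_failures = sum(failed)
    if n_failures == 0:
        raise ValueError("at least one failure is required")
    # Censored units still contribute their running time to the numerator.
    return sum(times) / n_failures

# Hypothetical test: 7 units, test stopped at 200 hours with 4 units still running.
times = [40.0, 95.0, 150.0, 200.0, 200.0, 200.0, 200.0]
failed = [True, True, True, False, False, False, False]

mle_mean = exp_mle_mean(times, failed)

# Estimate that throws away the censored units (for comparison only).
failure_times = [t for t, f in zip(times, failed) if f]
failures_only_mean = sum(failure_times) / len(failure_times)

print(mle_mean, failures_only_mean)
```

Discarding the four surviving units pulls the estimated mean well below the value that accounts for their 800 hours of failure-free operation.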

Usually, the advantages of the MLE method outweigh the advantages of the LSE method. The LSE method is easier to calculate by hand and easier to program. The LSE method is also traditionally associated with the use of probability plots to assess goodness-of-fit. However, the LSE method can provide misleading results on a probability plot. Examples exist where the points on a Weibull probability plot that uses the LSE method fall along a line when the Weibull model is actually inappropriate [1].

1. Genschel, U. and Meeker, W.Q. (2010). A Comparison of Maximum Likelihood and Median-Rank Regression for Weibull Estimation. Quality Engineering, 22(4): 236–255.

Why aren't confidence intervals and tests for model parameters available with the LSE method?

In earlier releases, Minitab provided calculated results for standard errors, confidence intervals, and tests for model parameters when using the LSE method. These calculated results were based on an ad-hoc method. However, there is no established, accepted statistical method for calculating standard errors for model parameters using the LSE method. Therefore, if you change the default method of estimation and select Least Squares (failure time(X) on rank(Y)), your output will not include calculated results for standard errors, confidence intervals, and tests for the model parameters. If you want to include confidence intervals and tests for model parameters in your results, you must use the MLE (default) method.

Change the estimation method

To change the parameter estimation method from MLE to LSE when using a parametric distribution analysis, a distribution ID plot, or a distribution overview plot, do the following:

  1. Choose Stat > Reliability/Survival > Distribution Analysis (Right Censoring) or Distribution Analysis (Arbitrary Censoring).
  2. Choose one of the following analyses and click the appropriate button:
    Analysis                            Button
    Parametric Distribution Analysis    Estimate
    Distribution ID Plot                Options
    Distribution Overview Plot          Options
  3. Choose Least Squares (failure time(X) on rank(Y)).

    If you use the least squares estimation method, estimates are calculated by fitting a regression line to the points in a probability plot. The line is formed by regressing time to failure or log (time to failure) (X) on the transformed percent (Y).

Because the percentiles of the distribution are based on the estimated distribution parameters, differences in the estimated parameters will cause differences in estimated percentiles.
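The regression of X on Y described above can be sketched as follows for a Weibull distribution and a complete (uncensored) sample. This is an illustration under stated assumptions, not Minitab's code: the plot positions use Benard's median rank approximation, and the function name and data are hypothetical. For the Weibull distribution, ln(time) = ln(scale) + (1/shape) × ln(−ln(1 − p)), so the fitted slope estimates 1/shape and the intercept estimates ln(scale).

```python
import math

def weibull_lse(times):
    """Least squares Weibull fit for complete data, regressing X on Y.

    X = ln(time to failure), Y = ln(-ln(1 - p)) (the transformed percent).
    Returns (shape, scale).
    """
    t = sorted(times)
    n = len(t)
    # Assumed plot positions: Benard's approximation to the median ranks.
    p = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    y = [math.log(-math.log(1.0 - pi)) for pi in p]
    x = [math.log(ti) for ti in t]

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Ordinary least squares with X as the response and Y as the predictor.
    slope = (sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
             / sum((yi - y_bar) ** 2 for yi in y))
    intercept = x_bar - slope * y_bar

    return 1.0 / slope, math.exp(intercept)

# Hypothetical failure times (hours).
shape, scale = weibull_lse([105.0, 180.0, 240.0, 310.0, 420.0])
print(shape, scale)
```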

Enter starting values or change the maximum number of iterations for maximum likelihood estimation

When you estimate the parameters using the maximum likelihood estimation method, you can specify starting values for the algorithm and specify the maximum number of iterations.

  1. In the worksheet, enter parameter estimates for the distribution in a single column.
    The maximum likelihood solution may not converge if the starting estimates are not in the neighborhood of the true solution; therefore, you should enter approximate starting values for parameter estimates. For the different distributions, enter the parameter estimates in the worksheet in the order that this table indicates.
    Distribution                        Parameters
    Weibull                             Enter shape and scale
    Exponential                         Enter mean
    Other 2-parameter distributions     Enter location and scale
    2-parameter exponential             Enter scale and threshold
    3-parameter Weibull                 Enter shape, scale, and threshold
    Other 3-parameter distributions     Enter location, scale, and threshold
  2. Choose Stat > Reliability/Survival > Distribution Analysis (Right Censoring) > Parametric Distribution Analysis or Stat > Reliability/Survival > Distribution Analysis (Arbitrary Censoring) > Parametric Distribution Analysis.
  3. Click Options.
  4. In Use starting estimates, enter the column of starting values for the algorithm.
  5. In Maximum number of iterations, enter the maximum number of iterations for reaching convergence (the default is 20).

    Minitab obtains maximum likelihood estimates through an iterative process. If the maximum number of iterations is reached before convergence, the algorithm stops.
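The roles of the starting value and the iteration limit can be illustrated with a small Newton-Raphson sketch for the Weibull shape parameter with right-censored data. This is not Minitab's algorithm; the function name, the tolerance, the safeguard against non-positive iterates, and the data are assumptions made for illustration. The default limit of 20 iterations mirrors the default mentioned above.

```python
import math

def weibull_mle(times, failed, start_shape=1.0, max_iter=20, tol=1e-8):
    """Newton-Raphson sketch for the Weibull MLE with right censoring.

    Returns (shape, scale, converged, iterations_used).
    """
    r = sum(failed)  # number of failures
    if r == 0:
        raise ValueError("at least one failure is required")
    logs = [math.log(t) for t in times]
    mean_log_fail = sum(x for x, f in zip(logs, failed) if f) / r

    shape = start_shape
    converged = False
    iterations = 0
    for iterations in range(1, max_iter + 1):
        w = [t ** shape for t in times]
        s0 = sum(w)
        s1 = sum(wi * x for wi, x in zip(w, logs))
        s2 = sum(wi * x * x for wi, x in zip(w, logs))
        # Likelihood equation for the shape (scale profiled out) and its derivative.
        score = s1 / s0 - 1.0 / shape - mean_log_fail
        slope = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / shape ** 2
        new_shape = shape - score / slope
        if new_shape <= 0:
            # Crude safeguard (an assumption of this sketch): stay in the valid range.
            new_shape = shape / 2.0
        step = abs(new_shape - shape)
        shape = new_shape
        if step < tol:
            converged = True
            break

    # Given the shape, the scale estimate has a closed form.
    scale = (sum(t ** shape for t in times) / r) ** (1.0 / shape)
    return shape, scale, converged, iterations

# Hypothetical data: 5 failures, 3 units right-censored at 500 hours.
times = [105.0, 180.0, 240.0, 310.0, 420.0, 500.0, 500.0, 500.0]
failed = [True, True, True, True, True, False, False, False]

print(weibull_mle(times, failed))               # default start and limit
print(weibull_mle(times, failed, max_iter=1))   # stops before convergence
```

When the iteration limit is reached first, the sketch returns the last iterate with `converged` set to `False`, which is the situation in which a better starting value or a larger limit is worth trying.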

Specify parameters for a parametric distribution analysis instead of having Minitab estimate the parameters

You can use either of the estimation methods in Parametric Distribution Analysis (Right Censoring) and Parametric Distribution Analysis (Arbitrary Censoring). However, instead of having Minitab estimate the parameters using one of these methods, you can also specify some parameters or all the parameters. If you choose to specify parameters, the calculated results—such as the percentiles—are based on the values of the parameters that you entered for the analysis.

Specify some parameters and estimate others

You can specify some of the parameters for your distribution and have Minitab estimate the others from the data. Usually, you specify some parameters and estimate the others to perform a Bayes Analysis when the data have few or no failures. See How to perform a reliability analysis with few or no failures for more details.

  1. Choose Stat > Reliability/Survival > Distribution Analysis (Right Censoring) > Parametric Distribution Analysis or Stat > Reliability/Survival > Distribution Analysis (Arbitrary Censoring) > Parametric Distribution Analysis.
  2. Click Estimate.
  3. In Bayes Analysis, enter the parameters that you want to specify for your distribution. The parameters that you can specify depend on the distribution that you choose:
    Distribution                               Parameters that you can specify
    Weibull                                    Shape
    3-parameter Weibull                        Shape, threshold, or both
    Exponential                                None
    2-parameter Exponential                    Threshold
    Other distributions without a threshold    Scale
    Other distributions with a threshold       Scale, threshold, or both

    You always estimate the scale parameter for the Weibull distribution. For distributions that have a location parameter, you always estimate the location parameter.
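For the Weibull case, fixing the shape leaves a closed-form maximum likelihood estimate for the scale: (Σ tᵢ^shape / r)^(1/shape), where the sum runs over all units (failed and censored) and r is the number of failures. The sketch below applies that formula; the function name and data are hypothetical, and it requires at least one failure, so it does not cover the no-failure case.

```python
def weibull_scale_given_shape(times, failed, shape):
    """MLE of the Weibull scale when the shape is specified, with right censoring."""
    r = sum(failed)  # number of failures
    if r == 0:
        raise ValueError("this sketch needs at least one failure")
    return (sum(t ** shape for t in times) / r) ** (1.0 / shape)

# Hypothetical test: 10 units, one failure at 320 hours, 9 units censored at 500 hours.
times = [320.0] + [500.0] * 9
failed = [True] + [False] * 9

# Shape assumed known from prior experience (a hypothetical value).
scale = weibull_scale_given_shape(times, failed, shape=2.0)
print(scale)
```

With a single failure there is too little information to estimate both parameters, which is why the shape is supplied rather than estimated in this kind of analysis.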

Specify all parameters

You can specify all of the parameters instead of estimating them from the data. For example, you can specify historical parameters to compare estimates based on the historical parameters with estimates based on the current data, or to see how the current data fit a probability plot based on the historical parameters.

  1. In the worksheet, enter parameter estimates for the distribution in a single column. You can enter more than one column of parameter estimates if you have more than one variable to analyze. For the different distributions, enter the parameter estimates in the column in the order that the table indicates.
    Distribution                        Parameters
    Weibull                             Enter shape and scale
    Exponential                         Enter mean
    Other 2-parameter distributions     Enter location and scale
    2-parameter exponential             Enter scale and threshold
    3-parameter Weibull                 Enter shape, scale, and threshold
    Other 3-parameter distributions     Enter location, scale, and threshold
  2. Choose Stat > Reliability/Survival > Distribution Analysis (Right Censoring) > Parametric Distribution Analysis or Stat > Reliability/Survival > Distribution Analysis (Arbitrary Censoring) > Parametric Distribution Analysis.
  3. Click Options.
  4. Select Use historical estimates.
  5. In Use historical estimates, enter the column of estimated parameters. If you have more than one variable to analyze, enter the columns of estimates in the same order that you entered the variables.
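When all parameters are specified, results such as percentiles follow directly from the entered values rather than from the data. As a hedged sketch for the Weibull distribution, the code below computes a percentile from a column of historical values held in the order shape, then scale. The parameter values and function names are hypothetical.

```python
import math

def weibull_percentile(p, shape, scale):
    """Time by which a fraction p of units is expected to have failed."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

def weibull_cdf(t, shape, scale):
    """Probability of failure by time t."""
    return 1.0 - math.exp(-((t / scale) ** shape))

# Hypothetical historical parameters in worksheet order for Weibull: shape, scale.
historical = [1.8, 400.0]
shape, scale = historical

# 10th percentile implied by the historical parameters.
b10 = weibull_percentile(0.10, shape, scale)
print(b10)
```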

Assume common shape or scale parameters for parametric distribution analysis

When you perform parametric distribution analysis, you can have Minitab assume common shape or scale parameters for the estimates.

  1. Choose Stat > Reliability/Survival > Distribution Analysis (Right Censoring) > Parametric Distribution Analysis or Stat > Reliability/Survival > Distribution Analysis (Arbitrary Censoring) > Parametric Distribution Analysis.
  2. Click Estimate.
  3. Under Estimation Method, check Assume common shape (slope-Weibull) or scale (1/slope-other dists).

    Minitab then assumes common shape or scale parameters when calculating the estimates. For example, suppose that you have 2 (or more generally k>2) independent normally distributed samples with different means but the same variance. To estimate the mean for each sample, Minitab uses a pooled estimate of the variance. This approach is generalized to other distributions as well. The specific result, however, depends on the estimation method that you have selected for the analysis.

MLE method with common shape or scale parameters

For the maximum likelihood method, Minitab uses the log likelihood function. In this case, the log likelihood function of the model is the sum of the individual log likelihood functions, with the same shape parameter assumed in each individual log likelihood function. The resulting overall log likelihood function is maximized to obtain the scale parameters associated with each group and the common shape parameter. For more information, see the following reference: W. Nelson (1982). Applied Life Data Analysis, Chapter 12. John Wiley & Sons.
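The construction above can be sketched for Weibull data with right censoring. This is an illustration, not Minitab's implementation: the function names, the bisection bracket, and the data are assumptions. For a given common shape, each group's scale has a closed-form maximizer, so the sketch searches only over the shape by solving for the point where the derivative of the summed log-likelihood is zero.

```python
import math

def group_loglik(times, failed, shape, scale):
    """Weibull log-likelihood of one group with right censoring."""
    total = 0.0
    for t, f in zip(times, failed):
        z = (t / scale) ** shape
        if f:   # failure: log density
            total += math.log(shape / scale) + (shape - 1.0) * math.log(t / scale) - z
        else:   # right-censored unit: log survival probability
            total -= z
    return total

def profile_scale(times, failed, shape):
    """Scale that maximizes one group's log-likelihood for a given shape."""
    return (sum(t ** shape for t in times) / sum(failed)) ** (1.0 / shape)

def total_loglik(groups, shape):
    """Sum of the group log-likelihoods, all sharing the same shape."""
    return sum(group_loglik(t, f, shape, profile_scale(t, f, shape))
               for t, f in groups)

def common_shape_score(groups, shape):
    """Derivative of total_loglik with respect to the common shape."""
    score = 0.0
    for times, failed in groups:
        r = sum(failed)
        w = [t ** shape for t in times]
        s1 = sum(wi * math.log(t) for wi, t in zip(w, times))
        sum_log_fail = sum(math.log(t) for t, f in zip(times, failed) if f)
        score += r / shape - r * s1 / sum(w) + sum_log_fail
    return score

def fit_common_shape(groups, lo=0.05, hi=20.0, iters=100):
    """Bisection on the score, which decreases as the shape increases.

    The bracket [lo, hi] is an assumption of this sketch.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if common_shape_score(groups, mid) > 0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    return shape, [profile_scale(t, f, shape) for t, f in groups]

# Hypothetical data for two groups: (times, failure indicators).
groups = [
    ([105.0, 180.0, 240.0, 310.0, 420.0, 500.0, 500.0],
     [True, True, True, True, True, False, False]),
    ([60.0, 110.0, 150.0, 190.0, 260.0, 300.0],
     [True, True, True, True, True, False]),
]

shape, scales = fit_common_shape(groups)
print(shape, scales)
```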

LSE method with common shape or scale parameters

Minitab first calculates the y-coordinate and x-coordinate for each group (for details, see the "Plot points" and "Fitted line" topics in Methods and formulas for probability plot in Parametric Distribution Analysis (Right Censoring)). Then, to obtain the LSE estimates, Minitab performs the following steps:

  1. Pools the x-coordinate data.
  2. Pools the y-coordinate data.
  3. Uses an indicator variable (or By variable) to identify the groups.
  4. Regresses the x-coordinates (response) against the predictors defined by all the y-coordinates (continuous predictor) and the indicator variable (categorical predictor).
    Note

    For log-location-scale distributions (for example, Weibull), the x-coordinates must be log-transformed. The groups should have the same slope, which is the inverse of the common shape parameter. The scale parameter of each group is obtained by exponentiation of the intercept for each group.
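The steps above can be sketched for Weibull data with complete (uncensored) samples. Regressing the pooled x-coordinates on the y-coordinates plus a group indicator is equivalent to fitting one common slope with a separate intercept per group, which is what the code computes directly. The plot positions (Benard's median rank approximation), function names, and data are assumptions made for illustration.

```python
import math

def plot_points(times):
    """(x, y) probability plot coordinates for one complete sample."""
    t = sorted(times)
    n = len(t)
    points = []
    for i, ti in enumerate(t, start=1):
        p = (i - 0.3) / (n + 0.4)  # assumed: Benard's median rank approximation
        points.append((math.log(ti), math.log(-math.log(1.0 - p))))
    return points

def weibull_lse_common_shape(groups):
    """Common-slope regression of x on y with one intercept per group.

    Returns (common_shape, [scale for each group]).
    """
    per_group = [plot_points(g) for g in groups]
    sxy = 0.0
    syy = 0.0
    means = []
    for pts in per_group:
        x_bar = sum(x for x, _ in pts) / len(pts)
        y_bar = sum(y for _, y in pts) / len(pts)
        means.append((x_bar, y_bar))
        # Centering within each group absorbs the group indicator terms.
        sxy += sum((y - y_bar) * (x - x_bar) for x, y in pts)
        syy += sum((y - y_bar) ** 2 for _, y in pts)
    slope = sxy / syy
    intercepts = [x_bar - slope * y_bar for x_bar, y_bar in means]
    # Slope is the inverse of the common shape; exp(intercept) is each group's scale.
    return 1.0 / slope, [math.exp(a) for a in intercepts]

# Hypothetical failure times for two groups.
groups = [
    [105.0, 180.0, 240.0, 310.0, 420.0],
    [60.0, 110.0, 150.0, 190.0, 260.0, 300.0],
]
shape, scales = weibull_lse_common_shape(groups)
print(shape, scales)
```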