Weighted regression

Weighted least squares regression is a method for dealing with observations that have nonconstant variances. If the variances are not constant, observations with large variances should be given relatively small weights, and observations with small variances should be given relatively large weights.

What is weighted regression?

Weighted regression is a method that you can use when the least squares assumption of constant variance in the residuals is violated (heteroscedasticity). With the correct weight, this procedure minimizes the sum of weighted squared residuals to produce residuals with a constant variance (homoscedasticity).
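To make the weighted criterion concrete, the following is a minimal sketch in Python (not Minitab code) for a model with one predictor. The function name wls_line and the data are hypothetical. The sketch assumes the weights are nonnegative and that the predictor takes at least two distinct values among the observations with nonzero weight.

```python
def wls_line(x, y, w):
    """Fit y = b0 + b1*x by minimizing sum(w_i * (y_i - b0 - b1*x_i)**2)."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y, and w must have the same length")
    if any(wi < 0 for wi in w):
        raise ValueError("weights must be nonnegative")
    sw = sum(w)
    # Weighted means of the predictor and the response.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted sum of squares of x and weighted cross-product of x and y.
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


# Hypothetical data: the last observation is assumed to be much noisier,
# so it receives a much smaller weight than the others.
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 20.0]
w = [1.0, 1.0, 1.0, 0.1]
print(wls_line(x, y, w))
```

Because the last observation has a small weight, it pulls the fitted line less than it would in an unweighted fit.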

Important

Weighted regression is not an appropriate solution if the heteroscedasticity is caused by an omitted variable.

About choosing the weight to use

Determining the correct weight to use can be a challenging task. The ideal weight is the reciprocal of the variance of the error. However, the error variance is usually unknown, so you must use other approaches. Alternative approaches include using:
  • The reciprocal of a predictor or squared predictor if the variance is proportional to a predictor. Use experience combined with trial and error to determine what works.
  • Values based on theory, the literature, or previous research.

Usually, observations with small variances should have relatively large weights and observations with large variances should have relatively small weights.

Suppose your regression model predicts the annual number of traffic accidents in different cities. Because more populous cities tend to have more accidents, the residuals for larger cities also tend to be larger. One approach for resolving this is to use the reciprocal of each city's population for the weight.
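As a rough illustration of the traffic example, the following Python sketch computes reciprocal weights from a column of populations. The function name reciprocal_weights and the population values are hypothetical. The sketch assumes that every value is positive, because the reciprocal is undefined at zero and a negative weight has no meaning here.

```python
def reciprocal_weights(values):
    """Return 1/value for each entry, for use as regression weights."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return [1.0 / v for v in values]


# Hypothetical city populations, in millions.
populations = [0.5, 2.0, 8.0]
weights = reciprocal_weights(populations)
print(weights)  # [2.0, 0.5, 0.125]
```

The most populous city receives the smallest weight, which matches the idea that its residuals tend to be the largest.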

Weights do not affect the degrees of freedom

Specifying a column of weights does not affect the degrees of freedom, unless you specify a weight of zero for one or more observations. Giving an observation a weight of zero removes it from the analysis and thus decreases your degrees of freedom.
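The following Python sketch illustrates this counting rule. It assumes that the error degrees of freedom equal the number of observations with a nonzero weight minus the number of estimated coefficients, including the constant. The function name and the weights are hypothetical.

```python
def error_degrees_of_freedom(weights, n_coefficients):
    """Count observations with a nonzero weight, then subtract the coefficients."""
    n_used = sum(1 for w in weights if w != 0)
    return n_used - n_coefficients


# Hypothetical weights for five observations; one observation has weight zero.
weights = [1.0, 2.0, 0.0, 3.0, 0.5]

# A line has two coefficients: the constant and the slope.
print(error_degrees_of_freedom(weights, 2))  # 2

# Rescaling the nonzero weights leaves the degrees of freedom unchanged.
print(error_degrees_of_freedom([10 * w for w in weights], 2))  # 2
```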

Specifying a column of weights affects the sums of squares and parameter estimates in the following ways:
  • The sums of squares become weighted sums of squares.
  • A weighted mean is used in the total sum of squares.
  • A weighted least squares criterion is used to estimate the parameters.
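The list above can be sketched in Python as follows. The function name and the data are hypothetical. The sketch assumes that the fits come from a weighted least squares fit that includes a constant, because the regression sum of squares equals the total minus the error sum of squares only in that case.

```python
def weighted_sums_of_squares(y, fits, w):
    """Return the weighted total, error, and regression sums of squares."""
    if not (len(y) == len(fits) == len(w)):
        raise ValueError("y, fits, and w must have the same length")
    # The weighted mean of the response replaces the ordinary mean.
    y_bar_w = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    ss_total = sum(wi * (yi - y_bar_w) ** 2 for wi, yi in zip(w, y))
    ss_error = sum(wi * (yi - fi) ** 2 for wi, yi, fi in zip(w, y, fits))
    return {
        "total": ss_total,
        "error": ss_error,
        # Valid only under the assumption stated above.
        "regression": ss_total - ss_error,
    }


# Hypothetical responses, fits, and weights. The fits equal the responses,
# so the error sum of squares is zero.
print(weighted_sums_of_squares([1.0, 3.0], [1.0, 3.0], [1.0, 3.0]))
```

In this example the weighted mean is 2.5 rather than the ordinary mean of 2, because the second observation has three times the weight of the first.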

Create a fitted line plot for weighted linear regression

The graph that the following steps create does not include the regression equation, S, R-squared, or adjusted R-squared, which the Fitted Line Plot from Stat > Regression > Fitted Line Plot displays. However, Minitab prints this information in the output, and you can copy and paste it onto the graph.

Suppose the responses are in C1, predictors are in C2, and weights are in C3:

  1. Choose Stat > Regression > Regression > Fit Regression Model.
  2. In Responses, enter C1. In Continuous predictors, enter C2.
  3. Click Options.
  4. In Weights, enter C3. Click OK.
  5. Click Storage.
  6. Check Fits.
  7. Click OK in each dialog box.
  8. Choose Graph > Scatterplot.
  9. Click Simple. Click OK.
  10. In Y variables, enter C1.
  11. In X variables, enter C2. Click OK.
  12. Right-click the scatterplot and choose Add > Calculated Line.
  13. In Y column, enter the column of fits (usually named FITS1).
  14. In X column, enter C2. Click OK.

Edit line color and graph title

You can change the color of the line. To make the line blue, double-click it. On the Attributes tab, under Lines, choose Custom, and choose blue from the Color drop-down list. Click OK.

You can also change the title. Double-click the title. On the Font tab, under Text, type the title that you want. Click OK.