Comparison of data means and fitted means

Data means are the raw means of the response variable for each factor level combination, whereas fitted means use least squares to predict the mean response values that a balanced design would produce. Therefore, the two types of means are identical for balanced designs but can differ for unbalanced designs.

Fitted means are useful for assessing response differences that are due to changes in factor levels rather than to the unbalanced experimental conditions. While you can use data means with unbalanced designs to get a general idea of which main effects may be evident, it is generally good practice to use the fitted means to obtain more precise results.

Example of data means and fitted means

Suppose you are investigating how time and temperature affect the yield of a chemical reaction. Each of the two factors has two levels, which produces four experimental conditions. The experiment is deliberately and heavily unbalanced to emphasize the difference between the two types of means: all experimental conditions are measured twice, except for the combination of Time 50 and Temp 200, which is measured four times. The following tables summarize the designed experiment and the results.

Table 1. Number of Observations per Experimental Condition

           Temp 150   Temp 200
Time 20    2          2
Time 50    2          4

Table 2. Means by Factor Level

           Data Means   Fitted Means
Time 20    44.01        44.03
Time 50    47.63        47.02
Temp 150   44.13        44.14
Temp 200   47.55        46.90

The data means and fitted means for "Time 20" and "Temp 150" are virtually identical because every experimental condition that involves either of these factor levels is measured exactly twice (Table 1). However, the combination of "Time 50" and "Temp 200" is measured four times, which overrepresents that condition in the data means for "Time 50" and "Temp 200". The fitted means adjust for this imbalance and predict what a balanced design would yield.
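
The weighting effect can be illustrated with a minimal Python sketch. The yield values below are hypothetical, invented only for illustration; they are not the data behind Table 2, and only the cell counts follow Table 1. The sketch also assumes a model that includes the Time*Temp interaction, in which case the least squares fitted mean for a factor level is the unweighted average of the cell means at that level. A model with different terms (which may be what produced Table 2) would give somewhat different fitted means. The function names data_mean and fitted_mean are illustrative, not part of any statistical package.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical yields (invented for illustration); cell counts follow Table 1.
observations = [
    # (time, temp, yield)
    (20, 150, 42.0), (20, 150, 44.0),
    (20, 200, 44.0), (20, 200, 46.0),
    (50, 150, 45.0), (50, 150, 47.0),
    (50, 200, 48.0), (50, 200, 50.0), (50, 200, 49.0), (50, 200, 49.0),
]

# Group the responses by experimental condition and average each cell.
cells = defaultdict(list)
for time, temp, y in observations:
    cells[(time, temp)].append(y)
cell_means = {cond: mean(ys) for cond, ys in cells.items()}


def data_mean(factor_index, level):
    """Raw mean of every observation at the given factor level.

    Cells with more observations carry more weight.
    """
    return mean(obs[2] for obs in observations if obs[factor_index] == level)


def fitted_mean(factor_index, level):
    """Unweighted mean of the cell means at the given factor level.

    Assumes a model with the interaction term, so every cell counts equally.
    """
    return mean(m for cond, m in cell_means.items() if cond[factor_index] == level)


# factor_index 0 is Time, factor_index 1 is Temp
for name, idx, levels in (("Time", 0, (20, 50)), ("Temp", 1, (150, 200))):
    for level in levels:
        print(f"{name} {level}: data mean = {data_mean(idx, level):.2f}, "
              f"fitted mean = {fitted_mean(idx, level):.2f}")
```

With these hypothetical values, the two means agree for Time 20 and Temp 150, whose cells all have two observations, and differ for Time 50 and Temp 200, where the cell with four observations pulls the data mean toward its own cell mean.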