Data means are the raw means of the response variable for each factor-level combination, whereas fitted means use least squares to predict the mean response values that a balanced design would produce. Therefore, the two types of means are identical for balanced designs but can differ for unbalanced designs.
Fitted means are useful for assessing response differences due to changes in factor levels rather than differences due to unbalanced experimental conditions. While you can use the data means from an unbalanced design to get a general idea of which main effects may be evident, it is generally good practice to use the fitted means, because they are not distorted by unequal numbers of observations.
For example, suppose you are investigating how time and temperature affect the yield of a chemical reaction. Each of the two factors has two levels, producing four experimental conditions. The experiment is deliberately and heavily unbalanced to emphasize the difference between the two types of means: every experimental condition is measured twice, except for the combination of time 50 and temperature 200, which is measured four times. The following tables summarize the designed experiment (number of measurements per condition) and the results.
| | Temp 150 | Temp 200 |
|---|---|---|
| Time 20 | 2 | 2 |
| Time 50 | 2 | 4 |

| | Data Means | Fitted Means |
|---|---|---|
| Time 20 | 44.01 | 44.03 |
| Time 50 | 47.63 | 47.02 |
| Temp 150 | 44.13 | 44.14 |
| Temp 200 | 47.55 | 46.90 |
The "Time 20" and "Temp 150" data means and fitted means are virtually identical because all experimental conditions involving either one or both of these factor levels are measured exactly twice (top table). However, the combination of "Time 50" and "Temp 200" is measured four times, which overrepresents that condition in the raw data means for "Time 50" and "Temp 200". The fitted means adjust for this and predict what a balanced design would yield.
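
The adjustment can be illustrated with a minimal Python sketch. The yield values below are hypothetical, chosen only to mirror the 2, 2, 2, 4 replication pattern; they are not the data behind the tables above. The sketch also assumes a least-squares model containing both main effects and their interaction, which may differ from the model used to produce the tabled fitted means. Under that assumption, the fitted mean for a factor level is the model prediction averaged with equal weight over the levels of the other factor.

```python
import numpy as np

# Hypothetical runs (time, temp, yield); NOT the data behind the tables above.
runs = [
    (20, 150, 42.0), (20, 150, 43.0),
    (20, 200, 45.0), (20, 200, 46.0),
    (50, 150, 45.0), (50, 150, 46.0),
    (50, 200, 49.0), (50, 200, 50.0), (50, 200, 51.0), (50, 200, 50.0),
]
time = np.array([r[0] for r in runs])
temp = np.array([r[1] for r in runs])
y = np.array([r[2] for r in runs])

# Effects coding: low level -> -1, high level -> +1.
a = np.where(time == 20, -1.0, 1.0)
b = np.where(temp == 150, -1.0, 1.0)

# Data means: plain averages of the observations at each factor level,
# so the condition measured four times carries twice the weight.
data = {
    "Time 20": y[time == 20].mean(),
    "Time 50": y[time == 50].mean(),
    "Temp 150": y[temp == 150].mean(),
    "Temp 200": y[temp == 200].mean(),
}

# Least-squares fit of intercept + main effects + interaction (assumed model).
X = np.column_stack([np.ones_like(a), a, b, a * b])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(a0, b0):
    """Model prediction for one coded (time, temp) condition."""
    return coef @ np.array([1.0, a0, b0, a0 * b0])

# Fitted means: average the predictions over the other factor's levels
# with equal weight, as if every condition had the same number of runs.
fitted = {
    "Time 20": np.mean([predict(-1, b0) for b0 in (-1, 1)]),
    "Time 50": np.mean([predict(1, b0) for b0 in (-1, 1)]),
    "Temp 150": np.mean([predict(a0, -1) for a0 in (-1, 1)]),
    "Temp 200": np.mean([predict(a0, 1) for a0 in (-1, 1)]),
}

for name in data:
    print(f"{name}: data mean {data[name]:.2f}, fitted mean {fitted[name]:.2f}")
```

With these hypothetical values, the "Time 50" data mean is 48.50 while its fitted mean is 47.75, because the fitted mean no longer gives extra weight to the condition that was measured four times. The "Time 20" and "Temp 150" values coincide because the conditions they involve are equally replicated.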