The pooled standard deviation is a single estimate of the standard deviation shared by all independent samples or groups in a study, used when the groups are assumed to come from populations with a common standard deviation. It measures the average spread of the data points about their own group mean (not the overall mean). It is a weighted average of the group standard deviations: more precisely, the square root of the average of the group variances, each weighted by its degrees of freedom (n - 1). The weighting gives larger groups a proportionally greater effect on the overall estimate. Pooled standard deviations are used in 2-sample t-tests, ANOVAs, control charts, and capability analysis. For example, suppose a study has the following four groups:
Group | Mean | Standard Deviation | N |
---|---|---|---|
1 | 9.7 | 2.5 | 50 |
2 | 12.1 | 2.9 | 50 |
3 | 14.5 | 3.2 | 50 |
4 | 17.3 | 6.8 | 200 |
The first three groups are equal in size (n=50) with standard deviations around 3. The fourth group is much larger (n=200) and has a higher standard deviation (6.8). Because the pooled standard deviation uses a weighted average, its value (5.486) is closer to the standard deviation of the largest group. A simple average of the four standard deviations (3.85) would instead give every group an equal effect.
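The weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not Minitab's implementation: the function names `pooled_sd` and `simple_average_sd` are hypothetical, and the inputs are the rounded values from the table. Because the table entries are rounded, the result (roughly 5.49) can differ slightly in the third decimal place from the 5.486 quoted above.

```python
import math

# Summary statistics copied from the table above: (standard deviation, group size).
groups = [
    (2.5, 50),
    (2.9, 50),
    (3.2, 50),
    (6.8, 200),
]


def pooled_sd(groups):
    """Pool group standard deviations, weighting each variance by n - 1."""
    weighted_ss = sum((n - 1) * sd ** 2 for sd, n in groups)
    total_df = sum(n - 1 for _, n in groups)  # total observations - number of groups
    return math.sqrt(weighted_ss / total_df)


def simple_average_sd(groups):
    """Unweighted mean of the standard deviations, for comparison only."""
    return sum(sd for sd, _ in groups) / len(groups)


pooled = pooled_sd(groups)
simple = simple_average_sd(groups)
print(f"pooled SD:         {pooled:.3f}")
print(f"simple average SD: {simple:.3f}")
```

The pooled value lands well above the simple average because the fourth group contributes 199 of the 346 degrees of freedom.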
To calculate the pooled standard deviation from raw data in a worksheet, suppose C1 contains the response, C2 contains the factor, and C3 contains the mean for each factor level. For example:
C1 | C2 | C3 |
---|---|---|
Response | Factor | Mean |
18.95 | 1 | 14.5033 |
12.62 | 1 | 14.5033 |
11.94 | 1 | 14.5033 |
14.42 | 2 | 10.5567 |
10.06 | 2 | 10.5567 |
7.19 | 2 | 10.5567 |
Use the Minitab calculator with the following expression:
SQRT((SUM((C1 - C3)^2)) / (total number of observations - number of groups))
For the previous example, the expression for pooled standard deviation would be:
SQRT((SUM(('Response' - 'Mean')^2)) / (6 - 2))
The value that Minitab stores is 3.75489.
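As a cross-check outside Minitab, the same calculation can be sketched in Python. The helper name `pooled_sd_from_raw` is hypothetical. Unlike the worksheet, which stores the group means in C3, this sketch computes them from the response and factor columns. Under those assumptions it should reproduce the stored value of about 3.7549.

```python
import math
from collections import defaultdict

# Response (C1) and factor (C2) columns from the worksheet example above.
response = [18.95, 12.62, 11.94, 14.42, 10.06, 7.19]
factor = [1, 1, 1, 2, 2, 2]


def pooled_sd_from_raw(values, labels):
    """Pooled SD: sqrt(sum of squared deviations from group means / (N - k))."""
    # Collect the observations belonging to each factor level.
    by_group = defaultdict(list)
    for value, label in zip(values, labels):
        by_group[label].append(value)

    # Mean for each factor level (the role played by column C3).
    group_means = {label: sum(vals) / len(vals) for label, vals in by_group.items()}

    # Squared deviations of every observation from its own group mean.
    ss_within = sum(
        (value - group_means[label]) ** 2 for value, label in zip(values, labels)
    )

    # Total number of observations minus number of groups.
    df = len(values) - len(by_group)
    return math.sqrt(ss_within / df)


result = pooled_sd_from_raw(response, factor)
print(f"pooled SD: {result:.5f}")
```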