What is the pooled standard deviation?

The pooled standard deviation is a single estimate of the standard deviation that represents all independent samples or groups in your study, under the assumption that they come from populations with a common standard deviation. It measures the average spread of all data points about their group mean (not the overall mean). It is calculated as the square root of a weighted average of the group variances, where each group is weighted by its degrees of freedom (n - 1). The weighting gives larger groups a proportionally greater effect on the overall estimate. Pooled standard deviations are used in 2-sample t-tests, ANOVAs, control charts, and capability analysis.
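One common way to write this estimate is shown below. The symbols are introduced here for illustration and do not appear in Minitab's output: k is the number of groups, n_i is the size of group i, and s_i is the standard deviation of group i.

```latex
s_p = \sqrt{\frac{\sum_{i=1}^{k} (n_i - 1)\, s_i^{2}}{\sum_{i=1}^{k} n_i - k}}
```

The denominator is the total number of observations minus the number of groups, which is the same divisor used in the manual calculation later in this topic.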

Example of a pooled standard deviation

Suppose your study has the following four groups:

Group	Mean	Standard Deviation	N
1	9.7	2.5	50
2	12.1	2.9	50
3	14.5	3.2	50
4	17.3	6.8	200

The first three groups are equal in size (n = 50) and have standard deviations around 3. The fourth group is much larger (n = 200) and has a higher standard deviation (6.8). Because the pooled standard deviation weights each group by its size, its value (5.486) is closer to the standard deviation of the largest group. A simple average of the four standard deviations (3.85) would instead give every group an equal effect.
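The effect of the weighting can be checked with a short script. The following is a minimal Python sketch, not Minitab code; the function name pooled_sd is hypothetical, and it assumes only the summary statistics from the table (standard deviation and N for each group).

```python
import math

def pooled_sd(sds, sizes):
    """Pooled standard deviation from per-group SDs and sample sizes.

    Hypothetical helper for illustration; not a Minitab function.
    """
    if len(sds) != len(sizes):
        raise ValueError("sds and sizes must have the same length")
    # Weight each group's variance by its degrees of freedom (n - 1).
    weighted_ss = sum((n - 1) * s ** 2 for s, n in zip(sds, sizes))
    # Total observations minus number of groups.
    dof = sum(sizes) - len(sizes)
    return math.sqrt(weighted_ss / dof)

# Summary statistics from the four-group table.
sds = [2.5, 2.9, 3.2, 6.8]
sizes = [50, 50, 50, 200]

pooled = pooled_sd(sds, sizes)
simple = sum(sds) / len(sds)  # unweighted average, for comparison

print(f"pooled SD: {pooled:.2f}")
print(f"simple average of SDs: {simple:.2f}")
```

With the rounded values in the table, the pooled result is about 5.49, which agrees with the value quoted above to two decimal places and sits well above the simple average of 3.85.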

Manually calculating the pooled standard deviation

Suppose C1 contains the response, C2 contains the factor level, and C3 contains the mean of the response for each factor level. For example:

C1	C2	C3
Response	Factor	Mean
18.95	1	14.5033
12.62	1	14.5033
11.94	1	14.5033
14.42	2	10.5567
10.06	2	10.5567
7.19	2	10.5567

Use Calc > Calculator with the following expression:

SQRT((SUM((C1 - C3)^2)) / (total number of observations - number of groups))

For the previous example, the expression for pooled standard deviation would be:

SQRT((SUM(('Response' - 'Mean')^2)) / (6 - 2))

The value that Minitab stores is 3.75489.
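The same calculation can be reproduced outside Minitab as a cross-check. The following is a minimal Python sketch under the assumption that the raw responses and factor levels are available as two lists; the function name pooled_sd_from_raw is hypothetical. It computes each group mean itself rather than reading it from a Mean column.

```python
import math
from collections import defaultdict

def pooled_sd_from_raw(responses, factors):
    """Pooled standard deviation from raw data and group labels.

    Hypothetical helper for illustration; not a Minitab function.
    """
    # Collect the responses that belong to each factor level.
    groups = defaultdict(list)
    for y, g in zip(responses, factors):
        groups[g].append(y)

    # Sum the squared deviations of each point from its own group mean.
    ss = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss += sum((y - mean) ** 2 for y in values)

    # Divide by (total number of observations - number of groups).
    dof = len(responses) - len(groups)
    return math.sqrt(ss / dof)

# Worksheet data from the example (C1 = Response, C2 = Factor).
response = [18.95, 12.62, 11.94, 14.42, 10.06, 7.19]
factor = [1, 1, 1, 2, 2, 2]

result = pooled_sd_from_raw(response, factor)
print(f"{result:.4f}")
```

For this data the script prints 3.7549, in line with the value that Minitab stores.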