Standard deviation
Standard deviation is an important measure of spread or dispersion.
It tells us how far, on average, the results are from the mean.
Therefore, if the standard deviation is small, the results are close to the mean, whereas if the standard deviation is large, the results are more spread out.
For example, the following two data sets are significantly different in nature and yet have the same mean, median and range.
Some sort of numerical measure which distinguishes between them would be useful.
Data set 1 | 1 | 7 | 12 | 15 | 20 | 22 | 28 |
Data set 2 | 1 | 15 | 15 | 15 | 15 | 16 | 28 |
Most of the results in data set 2 are close to the mean, whereas the results in data set 1 are further from the mean in comparison.
This suggests that the standard deviation is smaller in data set 2 than in data set 1.
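As a quick check of this claim, the short Python sketch below summarises both data sets. It assumes the standard library `statistics` module, whose `stdev` function divides by \(n - 1\); the helper name `summarise` is only illustrative.

```python
import statistics

data_set_1 = [1, 7, 12, 15, 20, 22, 28]
data_set_2 = [1, 15, 15, 15, 15, 16, 28]


def summarise(values):
    """Return the mean, median, range and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),  # sample SD: divides by n - 1
    }


summary_1 = summarise(data_set_1)
summary_2 = summarise(data_set_2)

print("Data set 1:", summary_1)
print("Data set 2:", summary_2)
```

Both data sets give a mean of 15, a median of 15 and a range of 27, but the standard deviation is roughly 9.24 for data set 1 and roughly 7.81 for data set 2, so only the standard deviation separates them.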
When comparing distributions, it is better to use a measure of spread or dispersion (such as standard deviation or semi-interquartile range) in addition to a measure of central tendency (such as mean, median or mode).
There are two formulae for calculating the standard deviation. The most commonly used one is:
\(SD = \sqrt {\frac{{\sum {{(X - \bar X)}^2}}}{{n - 1}}}\)
where \(\sum\) means 'sum of'
\(X\) is each value in the data set
\({\bar X}\) is the 'mean'
\(n\) is the number of data values in the sample
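The formula can be translated into code almost term by term. The following is a minimal sketch in Python; the function name `sample_sd` is an assumption made for illustration, and it is applied here to data set 1 from earlier in this section.

```python
import math


def sample_sd(values):
    """Standard deviation using the n - 1 formula."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n                                   # X-bar
    sum_of_squares = sum((x - mean) ** 2 for x in values)    # sum of (X - X-bar)^2
    return math.sqrt(sum_of_squares / (n - 1))


sd_1 = sample_sd([1, 7, 12, 15, 20, 22, 28])
print(round(sd_1, 2))
```

Each line mirrors one part of the formula: find the mean, square each deviation from it, add the squares, divide by \(n - 1\) and take the square root.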
Use this information to try the example questions below.
Question
Find the mean and standard deviation of the following numbers: \(4,\,7,\,9,\,11,\,13,\,15,\,18\)
\(mean = \bar X = \frac{{(4 + 7 + 9 + 11 + 13 + 15 + 18)}}{7} = 11\)
\(SD = \sqrt {\frac{{\sum {{(X - \bar X)}^2}}}{{n - 1}}}\)
In order to get the value of \(\sum {(X - \bar X)^2}\), we use a table as shown below:
\(X\) | \(X - \bar X\) | \({(X - \bar X)^2}\) |
4 | \(4 - 11 = - 7\) | \({( - 7)^2} = 49\) |
7 | \(7 - 11 = - 4\) | \({( - 4)^2} = 16\) |
9 | \(9 - 11 = - 2\) | \({( - 2)^2} = 4\) |
11 | \(11 - 11 = 0\) | \({(0)^2} = 0\) |
13 | \(13 - 11 = 2\) | \({(2)^2} = 4\) |
15 | \(15 - 11 = 4\) | \({(4)^2} = 16\) |
18 | \(18 - 11 = 7\) | \({(7)^2} = 49\) |
Total | | \(\sum {(X - \bar X)^2} = 138\) |
\(SD = \sqrt {\frac{{\sum {{(X - \bar X)}^2}}}{{n - 1}}}\)
\(SD = \sqrt {\frac{{138}}{{7 - 1}}}\)
\(SD = \sqrt {\frac{{138}}{6}}\)
\(SD=\sqrt{23}\)
\(SD = 4.796\) (to 3 d.p.)
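The working for this question can be checked with a short Python sketch that rebuilds the table row by row. The variable names are illustrative only.

```python
import math

values = [4, 7, 9, 11, 13, 15, 18]
mean = sum(values) / len(values)

# One row per value: X, X - mean, (X - mean)^2
rows = [(x, x - mean, (x - mean) ** 2) for x in values]
for x, deviation, squared in rows:
    print(f"{x:>3} {deviation:>6.1f} {squared:>6.1f}")

sum_of_squares = sum(squared for _, _, squared in rows)
sd = math.sqrt(sum_of_squares / (len(values) - 1))

print(f"mean = {mean}")
print(f"sum of squared deviations = {sum_of_squares}")
print(f"SD = {sd:.3f}")
```

The printed totals should agree with the table: a mean of 11, a sum of squared deviations of 138 and a standard deviation of 4.796 to 3 decimal places.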
Question
The weights in kilograms of seven women are shown below:
\(52,\,41,\,58,\,63,\,49,\,50,\,72\)
(a) Calculate the mean and standard deviation of these weights.
(b) The mean and standard deviation of a group of men were found to be 60 and 5.5 respectively.
Make two statements comparing the group of men with the group of women.
(a) \(\bar X = \frac{{52 + 41 + 58 + 63 + 49 + 50 + 72}}{7} = \frac{{385}}{7} = 55\)
\(X\) | \(X - \bar X\) | \({(X - \bar X)^2}\) |
52 | \(52 - 55 = - 3\) | \({( - 3)^2} = 9\) |
41 | \(41 - 55 = - 14\) | \({( - 14)^2} = 196\) |
58 | \(58 - 55 = 3\) | \({( 3)^2} = 9\) |
63 | \(63 - 55 = 8\) | \({(8)^2} = 64\) |
49 | \(49 - 55 = - 6\) | \({(- 6)^2} = 36\) |
50 | \(50 - 55 = - 5\) | \({(- 5)^2} = 25\) |
72 | \(72 - 55 = 17\) | \({(17)^2} = 289\) |
Total | | \(\sum {(X - \bar X)^2} = 628\) |
\(SD = \sqrt {\frac{{\sum {{(X - \bar X)}^2}}}{{n - 1}}}\)
\(SD = \sqrt {\frac{{628}}{{7 - 1}}}\)
\(SD = \sqrt {\frac{{628}}{6}}\)
\(SD=\sqrt{104.67}\)
\(SD = 10.2\) (to 1 d.p.)
(b) The mean for the women is lower than the mean for the men since 55 < 60. This tells us that, on average, the women are lighter than the men.
The standard deviation for the women is higher than that for the men since 10.2 > 5.5. This tells us that there is more variation in the women's weights than in the men's.
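Both parts of this question can be reproduced with a short Python sketch. It assumes the standard library `statistics` module, whose `stdev` function uses the \(n - 1\) formula. The men's figures are taken directly from part (b), and the variable names are illustrative.

```python
import statistics

women_weights_kg = [52, 41, 58, 63, 49, 50, 72]
men_mean, men_sd = 60, 5.5  # given in part (b), not calculated here

women_mean = statistics.mean(women_weights_kg)
women_sd = statistics.stdev(women_weights_kg)  # divides by n - 1

print(f"Women: mean = {women_mean}, SD = {women_sd:.1f}")
print(f"Men:   mean = {men_mean}, SD = {men_sd}")

# Part (b): one statement about the average, one about the spread
if women_mean < men_mean:
    print("On average, the women are lighter than the men.")
if women_sd > men_sd:
    print("The women's weights vary more than the men's.")
```

A comparison like this needs both numbers: the mean compares the typical weight, while the standard deviation compares how consistent the weights are.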