Averages from a grouped table
To collect data together in an efficient way, it is sometimes necessary to group it. Grouping allows fewer classes or categories to be used, which makes the table easier to understand. This is especially true for continuous data (data obtained by measuring, such as the length of a room or the weight of an apple, which can never be exact), where almost every value would otherwise be unique.
Inequalities are often used in grouped tables. For example, the category \(0 \textless m \leq 4\) means that \(m\) is larger than 0 but less than or equal to 4.
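As a minimal sketch of how such an inequality can be checked, the Python function below tests whether a value belongs to a group. The function name `in_group` and the test values are illustrative assumptions, not part of the original example.

```python
def in_group(m, lower, upper):
    """Return True if m is larger than lower but less than or equal to upper."""
    return lower < m <= upper


# The group 0 < m <= 4 includes 4 but does not include 0.
print(in_group(4, 0, 4))  # True
print(in_group(0, 0, 4))  # False
# A value of exactly 4 does not belong to the next group, 4 < m <= 8.
print(in_group(4, 4, 8))  # False
```

Because each group excludes its lower boundary and includes its upper boundary, a value on a boundary belongs to exactly one group.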
If data is organised into groups, we do not know the exact value of each item of data, just which group it belongs to. This means that we cannot find the exact value of the mode (the most commonly occurring value), the median (the middle value) or the mean (the total of the values divided by how many values there are). Instead, we can find the modal group, identify the group that contains the median, and estimate the mean by using the mid-values of the groups.
Example
The table below shows the number of minutes late some trains left a train station.
Number of minutes late (\(m\)) | Frequency (\(f\)) |
---|---|
\(0 \textless m \leq 4\) | 11 |
\(4 \textless m \leq 8\) | 13 |
\(8 \textless m \leq 12\) | 7 |
\(12 \textless m \leq 16\) | 9 |
\(16 \textless m \leq 20\) | 4 |
1. Find the modal group for the number of minutes late.
The modal group is the group with the highest frequency. Here that is \(4 \textless m \leq 8\), which has a frequency of 13.
The modal group for the number of minutes late is \(4 \textless m \leq 8\).
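The same search can be sketched in Python. The representation of the table as a list of `((lower, upper), frequency)` pairs and the name `modal_groups` are assumptions made for illustration. The function returns a list because more than one group can share the highest frequency.

```python
# Grouped frequency table from the example: ((lower, upper), frequency)
table = [
    ((0, 4), 11),
    ((4, 8), 13),
    ((8, 12), 7),
    ((12, 16), 9),
    ((16, 20), 4),
]


def modal_groups(table):
    """Return every group whose frequency equals the highest frequency."""
    highest = max(frequency for _, frequency in table)
    return [group for group, frequency in table if frequency == highest]


print(modal_groups(table))  # [(4, 8)]
```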
2. Find the group that contains the median number of minutes late.
Number of minutes late (\(m\)) | Frequency (\(f\)) |
---|---|
\(0 \textless m \leq 4\) | 11 |
\(4 \textless m \leq 8\) | 13 |
\(8 \textless m \leq 12\) | 7 |
\(12 \textless m \leq 16\) | 9 |
\(16 \textless m \leq 20\) | 4 |
Total | 44 |
To find the median, add up the frequency column to find how many trains there were in total.
There were 44 trains in total in this grouped frequency table, so work out \(\frac{44 + 1}{2} = \frac{45}{2} = 22.5\). The median is therefore halfway between the 22nd and 23rd values.
Work down the frequency column, adding up the frequencies as you go. The first group holds the 1st to 11th values, and the second group holds the 12th to 24th values (because \(11 + 13 = 24\)), so the 22nd and 23rd values are both in the \(4 \textless m \leq 8\) group.
The group that contains the median number of minutes late is \(4 \textless m \leq 8\).
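The running-total method can be sketched in Python as follows. The table layout and the names `group_of_position` and `median_groups` are illustrative assumptions. The sketch locates the group for each middle position, so for an even total it checks two positions.

```python
# Grouped frequency table from the example: ((lower, upper), frequency)
table = [
    ((0, 4), 11),
    ((4, 8), 13),
    ((8, 12), 7),
    ((12, 16), 9),
    ((16, 20), 4),
]


def group_of_position(table, position):
    """Return the group containing the value at a position (counting from 1)."""
    running_total = 0
    for group, frequency in table:
        running_total += frequency
        if position <= running_total:
            return group
    raise ValueError("position is beyond the end of the table")


def median_groups(table):
    """Return the groups holding the lower and upper middle values."""
    total = sum(frequency for _, frequency in table)
    lower_position = (total + 1) // 2  # 22 when the total is 44
    upper_position = (total + 2) // 2  # 23 when the total is 44
    return (
        group_of_position(table, lower_position),
        group_of_position(table, upper_position),
    )


print(median_groups(table))  # ((4, 8), (4, 8))
```

Both middle positions fall in the same group here, so that group contains the median.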
3. Estimate the mean number of minutes late.
Because the data has been grouped, the exact number of minutes late for each train is not known, so only an estimate of the mean can be found.
To find the mean from this frequency table, divide the total number of minutes late by the total number of trains.
To estimate the number of minutes late for each group, create a midpoint column. To find midpoints, add the start and end points and then divide by 2. The midpoint of 0 and 4 is 2, because \(\frac{0+4}{2} = \frac{4}{2} = 2\).
We don't know the exact value of each of the 11 items of data in the group \(0 \textless m \leq 4\), so the best estimate we can make is that each item of data was equal to the midpoint, 2. Repeat this process to find the midpoint of each group.
Number of minutes late (\(m\)) | Frequency (\(f\)) | Midpoint (\(x\)) |
---|---|---|
\(0 \textless m \leq 4\) | 11 | 2 |
\(4 \textless m \leq 8\) | 13 | 6 |
\(8 \textless m \leq 12\) | 7 | 10 |
\(12 \textless m \leq 16\) | 9 | 14 |
\(16 \textless m \leq 20\) | 4 | 18 |
Total | 44 | |
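The midpoint calculation can be sketched in Python. The list `groups` and the function name `midpoint` are illustrative assumptions. The results are printed as decimals because Python's `/` operator always returns a float.

```python
# Start and end points of each group from the example
groups = [(0, 4), (4, 8), (8, 12), (12, 16), (16, 20)]


def midpoint(lower, upper):
    """Add the start and end points, then divide by 2."""
    return (lower + upper) / 2


midpoints = [midpoint(lower, upper) for lower, upper in groups]
print(midpoints)  # [2.0, 6.0, 10.0, 14.0, 18.0]
```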
Now that an estimate for the number of minutes late in each group is known, the total number of minutes late can be estimated by multiplying each frequency by its midpoint and adding up the results.
Number of minutes late (\(m\)) | Frequency (\(f\)) | Midpoint (\(x\)) | Total minutes late (\(fx\)) |
---|---|---|---|
\(0 \textless m \leq 4\) | 11 | 2 | \(11 \times 2 = 22\) |
\(4 \textless m \leq 8\) | 13 | 6 | \(13 \times 6 = 78\) |
\(8 \textless m \leq 12\) | 7 | 10 | \(7 \times 10 = 70\) |
\(12 \textless m \leq 16\) | 9 | 14 | \(9 \times 14 = 126\) |
\(16 \textless m \leq 20\) | 4 | 18 | \(4 \times 18 = 72\) |
Total | 44 | | 368 |
Now find the estimate for the mean by dividing the total minutes late by the total number of trains.
\(\text{mean} \approx \frac{368}{44} = 8.36\) minutes (to 2 dp)
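The whole estimate can be checked with a short Python sketch. The table layout and the name `estimate_mean` are illustrative assumptions. The function repeats the steps above: it finds each midpoint, multiplies it by the frequency, and divides the total by the number of trains.

```python
# Grouped frequency table from the example: ((lower, upper), frequency)
table = [
    ((0, 4), 11),
    ((4, 8), 13),
    ((8, 12), 7),
    ((12, 16), 9),
    ((16, 20), 4),
]


def estimate_mean(table):
    """Estimate the mean by treating every value as its group's midpoint."""
    total_frequency = sum(frequency for _, frequency in table)
    total_fx = sum(
        frequency * (lower + upper) / 2 for (lower, upper), frequency in table
    )
    return total_fx / total_frequency


print(round(estimate_mean(table), 2))  # 8.36
```

The result is only an estimate, because the true values within each group are unlikely to all equal the midpoint.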