5.1.5 Cumulative Frequency Graphs
In this topic we will learn how to:
- draw and interpret cumulative frequency graphs
A cumulative frequency graph is used to represent grouped continuous data. It takes the form of an S-shaped curve.
Cumulative frequency is the total frequency of the previous classes up to and including the present class.
To draw a cumulative frequency graph you need the following information:
- upper bound of a class
- cumulative frequency
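To calculate the cumulative frequency, add the frequencies of the previous classes to that of the current class.
A cumulative frequency curve represents grouped data, so we cannot find the mode; instead we can find the modal class. The modal class is the class with the highest frequency (strictly, the highest frequency density when the class widths differ).
To calculate the $p$-th percentile, we use the formula,
$$\frac{p}{100} \times n$$
Where $p$ represents the percentile and $n$ represents the sample size. The result is the cumulative frequency at which the percentile is read off the graph.
To calculate the lower quartile, we use the formula,
$$Q_1 \text{ position} = \frac{1}{4} \times n$$
Where $Q_1$ represents the lower quartile, read off the graph at this cumulative frequency.
To calculate the upper quartile, we use the formula,
$$Q_3 \text{ position} = \frac{3}{4} \times n$$
Where $Q_3$ represents the upper quartile, read off the graph at this cumulative frequency.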
To calculate the interquartile range, we use the formula,
$$IQR = Q_3 - Q_1$$
Where $IQR$ represents the interquartile range, $Q_3$ represents the upper quartile and $Q_1$ represents the lower quartile.
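The percentile and quartile formulas all give a cumulative frequency at which to read the graph. Here is a minimal Python sketch of that step. The function name `percentile_position` and the sample size of 80 are made up for illustration and do not come from any past paper.

```python
def percentile_position(p, n):
    """Cumulative frequency at which the p-th percentile is read off the graph."""
    return p * n / 100


n = 80  # hypothetical sample size, for illustration only

lower_q_pos = percentile_position(25, n)  # lower quartile is the 25th percentile
median_pos = percentile_position(50, n)   # median is the 50th percentile
upper_q_pos = percentile_position(75, n)  # upper quartile is the 75th percentile

print(lower_q_pos, median_pos, upper_q_pos)
```

The quartiles themselves are not these numbers. They are the values on the horizontal axis that correspond to these cumulative frequencies.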
To calculate the mean when data is displayed in the form of a cumulative frequency curve, we first need to find the mid interval. This is the middle value of each class, halfway between its lower and upper bounds. We then use the formula,
$$\bar{x} = \frac{\sum xf}{\sum f}$$
Where $\bar{x}$ represents the mean, $x$ represents the mid interval and $f$ represents the frequency.
To calculate the variance, we use the formula,
$$\sigma^2 = \frac{\sum x^2 f}{\sum f} - \bar{x}^2$$
Where $\sigma^2$ represents the variance, $x$ represents the mid interval, $f$ represents the frequency and $\bar{x}$ represents the mean.
Standard deviation is the square root of variance. Therefore, the formula for standard deviation is,
$$\sigma = \sqrt{\frac{\sum x^2 f}{\sum f} - \bar{x}^2}$$
Where $\sigma$ represents the standard deviation, $x$ represents the mid interval, $f$ represents the frequency and $\bar{x}$ represents the mean.
Let’s look at some past paper questions.
1. Helen measures the lengths of fish of a certain species in a large pond. These lengths, correct to the nearest centimetre, are summarised in the following table. (9709/52/F/M/20 number 7)
Length (cm) | 0 – 9 | 10 – 14 | 15 – 19 | 20 – 30 |
Frequency | 15 | 48 | 66 | 21 |
(a) Draw a cumulative frequency graph to illustrate the data.
Find the cumulative frequency,
Length (cm) | 0 – 9 | 10 – 14 | 15 – 19 | 20 – 30 |
Cumulative Frequency | 15 | 63 | 129 | 150 |
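As a quick check on the table above, the running total can be reproduced with a short Python sketch. It uses the frequencies from the question; the variable names are illustrative only.

```python
from itertools import accumulate

# Frequencies for the classes 0-9, 10-14, 15-19 and 20-30 cm
frequencies = [15, 48, 66, 21]

# Each entry is the sum of all frequencies up to and including that class
cumulative = list(accumulate(frequencies))

print(cumulative)  # [15, 63, 129, 150]
```

The last cumulative frequency should always equal the total number of observations, here 150 fish.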
You will notice that there are gaps between the classes. To remove those gaps we need to apply a continuity correction. Simply subtract 0.5 from the lower bounds and add 0.5 to the upper bounds,
Length (cm) | 0 – 9.5 | 9.5 – 14.5 | 14.5 – 19.5 | 19.5 – 30.5 |
Cumulative Frequency | 15 | 63 | 129 | 150 |
Note: We do not subtract 0.5 from 0 because we would end up with a negative value for length, which does not exist.
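The continuity correction can be sketched in Python as follows. The helper name `correct_bounds` is hypothetical. Clamping the lower bound at 0 is an assumption that mirrors the note above, since a length cannot be negative.

```python
# Class limits as given in the question (lengths to the nearest cm)
classes = [(0, 9), (10, 14), (15, 19), (20, 30)]


def correct_bounds(lower, upper, half_gap=0.5):
    """Widen a class by half the gap between classes, never going below 0."""
    new_lower = max(lower - half_gap, 0)  # a length cannot be negative
    new_upper = upper + half_gap
    return (new_lower, new_upper)


corrected = [correct_bounds(lower, upper) for lower, upper in classes]

print(corrected)  # [(0, 9.5), (9.5, 14.5), (14.5, 19.5), (19.5, 30.5)]
```

Each upper bound now matches the next lower bound, so the classes cover the whole range without gaps.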
Now that there are no gaps, we can plot the cumulative frequency against the upper bounds. Label the $y$-axis with cumulative frequency. Label the $x$-axis with the class title, Length (cm).
(b) of these fish have a length of cm or more. Use your graph to estimate the value of .
This means of fish have a length less than cm. Let’s find of ,
Draw construction lines at a cumulative frequency of and read off the length,
Therefore, the final answer is,
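Reading a length off the graph can be approximated numerically. The sketch below joins the plotted points with straight lines and interpolates. This is an assumption, since a hand-drawn curve will give a slightly different reading. The function name `read_off_length` is hypothetical, and the target cumulative frequency of 75 (the median position) is only an illustration, not the value asked for in the question.

```python
# (upper bound, cumulative frequency) after continuity correction,
# starting from (0, 0) because no fish has a length below 0 cm
points = [(0, 0), (9.5, 15), (14.5, 63), (19.5, 129), (30.5, 150)]


def read_off_length(target_cf, points):
    """Estimate the length at a given cumulative frequency by linear interpolation."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target_cf <= c1:
            fraction = (target_cf - c0) / (c1 - c0)
            return x0 + fraction * (x1 - x0)
    raise ValueError("target cumulative frequency is outside the plotted range")


# Illustration only: the median sits at 50% of 150 = 75
median_estimate = read_off_length(75, points)
print(round(median_estimate, 1))
```

The same function can be called with any other cumulative frequency between 0 and 150 to mimic drawing a construction line across to the curve and down to the axis.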
The mean length of these fish is 15.295 cm.
(c) Calculate an estimate for the variance of the lengths of the fish.
The formula for variance is,
$$\sigma^2 = \frac{\sum x^2 f}{\sum f} - \bar{x}^2$$
We already have the mean, so we need to find $\sum x^2 f$. $x$ represents the mid intervals, so let’s find $x$,
Mid Interval | 4.75 | 12 | 17 | 25 |
Frequency | 15 | 48 | 66 | 21 |
Note: Use the classes after continuity correction to find the mid interval.
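Now that we have the mid intervals, let’s find $\sum x^2 f$,
$$\sum x^2 f = (4.75^2 \times 15) + (12^2 \times 48) + (17^2 \times 66) + (25^2 \times 21)$$
$$\sum x^2 f = 39449.4375$$
Note: Remember that $f$ represents frequency NOT cumulative frequency.
Substitute into the formula for variance, using $\sum f = 150$ and $\bar{x} = 15.295$,
$$\sigma^2 = \frac{39449.4375}{150} - 15.295^2$$
$$\sigma^2 = 29.059225$$
Therefore, the final answer is,
$$\sigma^2 = 29.1$$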
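The arithmetic for the mean and variance can be checked with a short Python sketch. It uses the mid intervals and frequencies from the table above; the variable names are illustrative only.

```python
from math import sqrt

mid_intervals = [4.75, 12, 17, 25]  # from the continuity-corrected classes
frequencies = [15, 48, 66, 21]      # frequency, NOT cumulative frequency

n = sum(frequencies)
sum_xf = sum(x * f for x, f in zip(mid_intervals, frequencies))
sum_x2f = sum(x ** 2 * f for x, f in zip(mid_intervals, frequencies))

mean = sum_xf / n
variance = sum_x2f / n - mean ** 2
standard_deviation = sqrt(variance)

print(round(mean, 3))      # 15.295
print(round(variance, 1))  # 29.1
```

The computed mean agrees with the 15.295 cm given in the question, which is a useful check that the mid intervals were found correctly.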
2. The heights in cm of 160 sunflower plants were measured. The results are summarised on the following cumulative frequency curve. (9709/53/M/J/21 number 1)
(a) Use the graph to estimate the number of plants with heights less than cm.
Draw construction lines at the height of cm and read off the respective cumulative frequency,
Therefore, the final answer is,
(b) Use the graph to estimate the th percentile of the distribution.
The formula to find the $p$-th percentile is,
$$\frac{p}{100} \times n$$
Substitute the value of $p$ and $n$,
Draw construction lines at a cumulative frequency of and read off the respective height,
Therefore, the final answer is,
(c) Use the graph to estimate the interquartile range of the heights of these plants.
The formula to find the interquartile range is,
$$IQR = Q_3 - Q_1$$
To find the upper quartile, use the formula,
$$Q_3 \text{ position} = \frac{3}{4} \times 160 = 120$$
Draw construction lines at a cumulative frequency of 120 and read off the respective height,
To find the lower quartile, use the formula,
$$Q_1 \text{ position} = \frac{1}{4} \times 160 = 40$$
Draw construction lines at a cumulative frequency of 40 and read off the respective height,
Substitute into the formula for interquartile range,
Therefore, the final answer is,