5.1.5 Cumulative Frequency Graphs

In this topic we will learn how to:

  • draw and interpret cumulative frequency graphs

A cumulative frequency graph is used to represent grouped continuous data. It takes the form of an S-shaped curve.

Cumulative frequency is the total frequency of all classes up to and including the present class.

To draw a cumulative frequency graph you need the following information:

  • upper bound of a class
  • cumulative frequency

To calculate the cumulative frequency, add the frequencies of the previous classes to that of the current class.

Variation and Measures of Central Tendency for Cumulative Frequency Graph

Mode

A cumulative frequency curve represents grouped data, so we cannot find the mode; instead we find the modal class. The modal class is the class with the highest frequency.

Percentiles

To find the $x$-th percentile, we first find its position on the cumulative frequency axis using the formula

$$\frac{xn}{100}$$

where $x$ represents the percentile and $n$ represents the sample size. The percentile is the value read off the graph at that cumulative frequency.

Lower Quartile

To find the lower quartile, $q_{1}$, read off the value at a cumulative frequency of

$$\frac{1}{4}n$$

where $n$ represents the sample size.

Upper Quartile

To find the upper quartile, $q_{3}$, read off the value at a cumulative frequency of

$$\frac{3}{4}n$$

where $n$ represents the sample size.

Interquartile Range

To calculate the interquartile range, we use the formula

$$IQR = q_{3} - q_{1}$$

where $IQR$ represents the interquartile range, $q_{3}$ represents the upper quartile and $q_{1}$ represents the lower quartile.

Mean

To calculate the mean when data is displayed in the form of a cumulative frequency curve, we first need to find the mid interval. This is the middle value of each class. We use the formula

$$\overline{x} = \frac{\Sigma xf}{\Sigma f}$$

where $\overline{x}$ represents the mean, $x$ represents the mid interval and $f$ represents the frequency.

Variance

To calculate the variance, we use the formula

$$\sigma^{2} = \frac{\Sigma x^{2}f}{\Sigma f} - \overline{x}^{2}$$

where $\sigma^{2}$ represents the variance, $x$ represents the mid interval, $f$ represents the frequency and $\overline{x}$ represents the mean.

Standard Deviation

Standard deviation is the square root of the variance. Therefore, the formula for standard deviation is

$$\sigma = \sqrt{\frac{\Sigma x^{2}f}{\Sigma f} - \overline{x}^{2}}$$

where $\sigma$ represents the standard deviation, $x$ represents the mid interval, $f$ represents the frequency and $\overline{x}$ represents the mean.

Let’s look at some past paper questions.

1. Helen measures the lengths of 150 fish of a certain species in a large pond. These lengths, correct to the nearest centimetre, are summarised in the following table. (9709/52/F/M/20 number 7)

Length (cm) | 0 – 9 | 10 – 14 | 15 – 19 | 20 – 30
Frequency   | 15    | 48      | 66      | 21

(a) Draw a cumulative frequency graph to illustrate the data.

Find the cumulative frequency,

Length (cm)          | 0 – 9 | 10 – 14 | 15 – 19 | 20 – 30
Cumulative Frequency | 15    | 63      | 129     | 150
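The running totals in this table can be reproduced with a few lines of Python. This is only a minimal sketch for checking the arithmetic; the variable names are illustrative.

```python
from itertools import accumulate

# Frequencies of the four length classes, in the order given in the question.
frequencies = [15, 48, 66, 21]

# accumulate() yields the running total, i.e. the cumulative frequency.
cumulative = list(accumulate(frequencies))
print(cumulative)
```

The last cumulative frequency should always equal the total number of observations, which is 150 fish here.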

You will notice that there are gaps between the classes. To remove those gaps we need to apply a continuity correction. Simply subtract $0.5$ from the lower bounds and add $0.5$ to the upper bounds,

Length (cm)          | 0 – 9.5 | 9.5 – 14.5 | 14.5 – 19.5 | 19.5 – 30.5
Cumulative Frequency | 15      | 63         | 129         | 150

Note: We do not subtract $0.5$ from $0$ because we would end up with a negative value for length, which is not possible.
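The continuity correction described above can be sketched in Python as follows. The helper name `class_boundaries` and its `half_unit` parameter are hypothetical, chosen only for this illustration; the clamp at zero mirrors the note about lengths not being negative.

```python
# Class limits as recorded to the nearest centimetre.
classes = [(0, 9), (10, 14), (15, 19), (20, 30)]

def class_boundaries(lower, upper, half_unit=0.5):
    """Widen a class by half a unit on each side, never going below zero."""
    return (max(lower - half_unit, 0), upper + half_unit)

boundaries = [class_boundaries(lower, upper) for lower, upper in classes]

# The upper bounds are the x-values plotted against cumulative frequency.
upper_bounds = [upper for _, upper in boundaries]
print(boundaries)
print(upper_bounds)
```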

Now that there are no gaps, we can plot the upper bounds against the cumulative frequency. Label the $y$-axis with cumulative frequency. Label the $x$-axis with the class title, length (cm).

[Figure: cumulative frequency graph, with length (cm) on the x-axis and cumulative frequency on the y-axis]

(b) $40\%$ of these fish have a length of $d$ cm or more. Use your graph to estimate the value of $d$.

This means $60\%$ of the fish have a length less than $d$ cm. Let’s find $60\%$ of $150$,

$$\frac{60}{100} \times 150 = 90$$

Draw construction lines at a cumulative frequency of $90$ and read off the length,

$$d = 16.5 \text{ cm}$$

Therefore, the final answer is,

$$d = 16.5 \text{ cm}$$

The mean length of these $150$ fish is $15.295$ cm.
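As a cross-check on the graph reading of $d$ in part (b), the sketch below interpolates linearly between the plotted points. Linear interpolation and the starting point (0, 0) are assumptions made for illustration, so a reading from a hand-drawn smooth curve may differ slightly. The function name `length_at` is hypothetical.

```python
# Points plotted in part (a): (upper bound, cumulative frequency).
# The starting point (0, 0) is an assumption: no fish has a negative length.
points = [(0, 0), (9.5, 15), (14.5, 63), (19.5, 129), (30.5, 150)]

def length_at(cf, pts):
    """Linearly interpolate the length at cumulative frequency cf."""
    for (x0, c0), (x1, c1) in zip(pts, pts[1:]):
        if c0 <= cf <= c1:
            return x0 + (cf - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("cumulative frequency outside the plotted range")

# 60% of the 150 fish are shorter than d.
target = 60 * 150 / 100
d_estimate = length_at(target, points)
print(target, round(d_estimate, 1))
```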
(c) Calculate an estimate for the variance of the lengths of the fish.


The formula for variance is,
$$\sigma^{2} = \frac{\Sigma x^{2}f}{\Sigma f} - \overline{x}^{2}$$

We already have the mean, so we need to find $\frac{\Sigma x^{2}f}{\Sigma f}$. Here $x$ represents the mid intervals, so let’s find $x$,

Mid Interval | 4.75 | 12 | 17 | 25
Frequency    | 15   | 48 | 66 | 21

Note: Use the classes after continuity correction to find the mid interval.
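The mid intervals can be computed directly from the corrected class boundaries. The sketch below also recomputes the mean as a sanity check against the value of 15.295 cm given in the question; the variable names are illustrative.

```python
# Class boundaries after the continuity correction, with their frequencies.
boundaries = [(0, 9.5), (9.5, 14.5), (14.5, 19.5), (19.5, 30.5)]
frequencies = [15, 48, 66, 21]

# The mid interval is halfway between the lower and upper boundary.
mid_intervals = [(lower + upper) / 2 for lower, upper in boundaries]

# Estimated mean: sum of x*f divided by the sum of f.
mean = sum(x * f for x, f in zip(mid_intervals, frequencies)) / sum(frequencies)
print(mid_intervals)
print(round(mean, 3))
```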

Now that we have the mid intervals, let’s find $\frac{\Sigma x^{2}f}{\Sigma f}$,

$$\frac{\Sigma x^{2}f}{\Sigma f} = \frac{4.75^{2}(15) + 12^{2}(48) + 17^{2}(66) + 25^{2}(21)}{150}$$

$$\frac{\Sigma x^{2}f}{\Sigma f} = 262.99625$$

Note: Remember that $f$ represents frequency, NOT cumulative frequency.
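The arithmetic in this part can be checked with a short script. This is a minimal sketch that takes the mid intervals and frequencies from the table above and the mean given in the question; the variable names are illustrative.

```python
mid_intervals = [4.75, 12, 17, 25]
frequencies = [15, 48, 66, 21]   # class frequencies, not cumulative frequencies
mean = 15.295                    # given in the question

# Sum of x^2 * f divided by the sum of f.
mean_of_squares = (
    sum(x**2 * f for x, f in zip(mid_intervals, frequencies)) / sum(frequencies)
)

# Variance = (sum of x^2 f) / (sum of f) minus the square of the mean.
variance = mean_of_squares - mean**2
print(round(mean_of_squares, 5), round(variance, 1))
```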

Substitute into the formula for variance,

$$\sigma^{2} = \frac{\Sigma x^{2}f}{\Sigma f} - \overline{x}^{2}$$

$$\sigma^{2} = 262.99625 - (15.295)^{2}$$

$$\sigma^{2} = 29.059225$$

$$\sigma^{2} = 29.1$$

Therefore, the final answer is,

$$\sigma^{2} = 29.1$$

2. The heights in cm of 160 sunflower plants were measured. The results are summarised on the following cumulative frequency curve. (9709/53/M/J/21 number 1)

[Figure: cumulative frequency curve of the heights (cm) of the 160 sunflower plants]

(a) Use the graph to estimate the number of plants with heights less than $100$ cm.

Draw construction lines at a height of $100$ cm and read off the respective cumulative frequency,

$$60$$

Therefore, the final answer is,

$$60 \text{ plants}$$

(b) Use the graph to estimate the $65$th percentile of the distribution.

The formula to find the position of the $x$th percentile on the cumulative frequency axis is,

$$\frac{xn}{100}$$

Substitute the values of $x$ and $n$,

$$\frac{(65)(160)}{100} = 104$$

Draw construction lines at a cumulative frequency of $104$ and read off the respective height,

$$136$$

Therefore, the final answer is,

$$136 \text{ cm}$$

(c) Use the graph to estimate the interquartile range of the heights of these plants.

The formula to find the interquartile range is,
$$IQR = q_{3} - q_{1}$$

To find the upper quartile, first find its position on the cumulative frequency axis,

$$\frac{3}{4}n = \frac{3}{4}(160) = 120$$

Draw construction lines at a cumulative frequency of $120$ and read off the respective height,

$$q_{3} = 150$$

To find the lower quartile, first find its position on the cumulative frequency axis,

$$\frac{1}{4}n = \frac{1}{4}(160) = 40$$

Draw construction lines at a cumulative frequency of $40$ and read off the respective height,

$$q_{1} = 76$$

Substitute into the formula for interquartile range,

$$IQR = q_{3} - q_{1}$$

$$IQR = 150 - 76$$

$$IQR = 74$$

Therefore, the final answer is,

$$IQR = 74 \text{ cm}$$