5.5.3 Normal Approximation to the Binomial Distribution

In this topic we will learn how to:

  • recall conditions under which the normal distribution can be used as an approximation to the binomial distribution, and use this approximation, with a continuity correction, in solving problems

Under certain conditions the normal distribution can be used as an approximation to the binomial distribution. As n increases, calculations under the binomial distribution become long and tedious, so we use the normal distribution as an approximation instead. The conditions for the normal approximation to the binomial distribution are:
n\textmd{ is sufficiently large to ensure that both }np > 5 \textmd{ and } nq > 5

Note: Remember that q is 1 - p.

If,
X \sim B(n, p)

and np > 5 and nq > 5, then,
X \sim N(np, npq)

That is, X is approximately normal with mean np and variance npq.

At AS level, any question that requires the ‘use of a suitable approximation’ will always require a normal approximation to the binomial distribution. When asked to justify the normal approximation to the binomial distribution, you must show that the above conditions are satisfied.
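The justification step can be sketched in a few lines of Python. This is only an illustration: the function name `normal_approx_params` is made up for this example, and it simply checks np > 5 and nq > 5 before returning the mean np and variance npq.

```python
def normal_approx_params(n, p):
    """Return (mean, variance) of the approximating normal distribution.

    Hypothetical helper: raises ValueError when the conditions
    np > 5 and nq > 5 are not both satisfied.
    """
    q = 1 - p
    if not (n * p > 5 and n * q > 5):
        raise ValueError(
            f"Conditions not met: np = {n * p:.3f}, nq = {n * q:.3f}"
        )
    return n * p, n * p * q


# Illustrative values: n = 80 trials with p = 0.32
mean, variance = normal_approx_params(80, 0.32)
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```

Raising an error when a condition fails mirrors the exam requirement: the approximation should not be used until both conditions have been shown to hold.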
\textbf{\textcolor{gray}{Continuity Correction}}

Since we will be moving from a discrete distribution (binomial) to a continuous distribution (normal), continuity correction is required. Continuity correction is based on the idea that a discrete distribution only considers exact values, whereas a continuous distribution treats every exact value as a range of values. For example, the discrete value 10 corresponds to the interval 9.5 \le X < 10.5 on the continuous scale. For this reason, we apply continuity correction by adding or subtracting 0.5, depending on the situation. We do this to ensure that all of the required region is evaluated. There are four possible scenarios:


1. P(X < a)

When continuity correction is applied, this becomes,
P(X < a - 0.5)


2. P(X \le a)

When continuity correction is applied, this becomes,
P(X \le a + 0.5)


3. P(X > a)

When continuity correction is applied, this becomes,
P(X > a + 0.5)


4. P(X \ge a)

When continuity correction is applied, this becomes,
P(X \ge a - 0.5)
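The four scenarios can be collected into a small lookup, sketched here in Python. The function name `continuity_correct` and its return format are illustrative choices made for this example only.

```python
def continuity_correct(op, a):
    """Map the discrete event 'X op a' to its continuity-corrected bound.

    Returns ("below", b) meaning P(X < b) on the normal scale,
    or ("above", b) meaning P(X > b) on the normal scale.
    """
    rules = {
        "<": ("below", a - 0.5),   # scenario 1: exclude a itself
        "<=": ("below", a + 0.5),  # scenario 2: include a itself
        ">": ("above", a + 0.5),   # scenario 3: exclude a itself
        ">=": ("above", a - 0.5),  # scenario 4: include a itself
    }
    return rules[op]


print(continuity_correct("<", 20))   # fewer than 20
print(continuity_correct(">=", 12))  # at least 12
```

A quick way to remember the direction: if the value a is included in the event, the bound moves outwards by 0.5 to cover it; if a is excluded, the bound moves inwards by 0.5.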



Let’s look at some past paper questions.

1. In a large college, 32\% of the students have blue eyes. A random sample of 80 students is chosen. Use an approximation to find the probability that fewer than 20 of these students have blue eyes. (9709/53/O/N/22 number 2)

Define the random variable,
X\textmd{ - r.v, number of students that have blue eyes}

State the distribution and parameters of the random variable,
X \sim B(80, 0.32)

Let’s check the values of np and nq,
np = 80 \times 0.32 \ \ \ \ nq = 80 \times (1 - 0.32)
np = 25.6 > 5 \ \ \ \ nq = 54.4 > 5

Since the conditions for a normal approximation to a binomial distribution are satisfied, we can approximate to the normal distribution,
X \sim N(np, npq)

We’ve already found np, so let’s find npq,
npq = 80 \times 0.32 \times (1 - 0.32)
npq = 17.408

Therefore, we can say that,
X \sim N(25.6, 17.408)

Now let’s write out, mathematically, the problem given by the question,
P(X < 20)

The problem is defined under a discrete distribution. Since we have moved to a continuous distribution, we have to apply continuity correction. Using the first scenario outlined earlier, this becomes,
P(X < 20 - 0.5)
P(X < 19.5)

Now we can evaluate this probability, using the normal distribution we defined,
X \sim N(25.6, 17.408)
P(X < 19.5)
P\left(Z < \frac{19.5 - 25.6}{\sqrt{17.408}}\right)
P(Z < -1.462)
1 - \Phi(1.462)
1 - 0.9282
0.0718

Therefore, the final answer is,
0.0718

2. In a large college, 28\% of the students do not play any musical instrument, 52\% play exactly one musical instrument and the remainder play two or more musical instruments. A random sample of 90 students from the college is chosen. Use an approximation to find the probability that fewer than 40 of these students play exactly one musical instrument. (9709/52/M/J/22 number 5)

Define the random variable,
X\textmd{ - r.v, number of students who play exactly one musical instrument}

State the distribution and parameters of the random variable,
X \sim B(90, 0.52)

Let’s check the values of np and nq,
np = 90 \times 0.52 \ \ \ \ nq = 90 \times (1 - 0.52)
np = 46.8 > 5 \ \ \ \ nq = 43.2 > 5

Since the conditions for a normal approximation to a binomial distribution are satisfied, we can approximate to the normal distribution,
X \sim N(np, npq)

We’ve already found np, so let’s find npq,
npq = 90 \times 0.52 \times (1 - 0.52)
npq = 22.464

Therefore, we can say that,
X \sim N(46.8, 22.464)

Now let’s write out, mathematically, the problem given by the question,
P(X < 40)

The problem is defined under a discrete distribution. Since we have moved to a continuous distribution, we have to apply continuity correction. Using the first scenario above, this becomes,
P(X < 40 - 0.5)
P(X < 39.5)

Now we can evaluate this probability, using the normal distribution we defined,
X \sim N(46.8, 22.464)
P(X < 39.5)
P\left(Z < \frac{39.5 - 46.8}{\sqrt{22.464}}\right)
P(Z < -1.540)
1 - \Phi(1.540)
1 - 0.9382
0.0618

Therefore, the final answer is,
0.0618

3. Every day Richard takes a flight between Astan and Bejin. On any day, the probability that the flight arrives early is 0.15, the probability that it arrives on time is 0.55 and the probability that it arrives late is 0.3. 60 days are chosen at random. Use an approximation to find the probability that Richard’s flight arrives early at least 12 times. (9709/52/M/J/22 number 5)

Define the random variable,
X\textmd{ - r.v, number of times flight arrives early}

State the distribution and parameters of the random variable,
X \sim B(60, 0.15)

Let’s check the values of np and nq,
np = 60 \times 0.15 \ \ \ \ nq = 60 \times (1 - 0.15)
np = 9 > 5 \ \ \ \ nq = 51 > 5

Since the conditions for a normal approximation to a binomial distribution are satisfied, we can approximate to the normal distribution,
X \sim N(np, npq)

We’ve already found np, so let’s find npq,
npq = 60 \times 0.15 \times (1 - 0.15)
npq = 7.65

Therefore, we can say that,
X \sim N(9, 7.65)

Now let’s write out, mathematically, the problem given by the question,
P(X \ge 12)

The problem is defined under a discrete distribution. Since we have moved to a continuous distribution, we have to apply continuity correction. Using the fourth scenario above, this becomes,
P(X > 12 - 0.5)
P(X > 11.5)

Note: For a continuous distribution, the < sign is equivalent to the \le sign and the > sign is equivalent to the \ge sign, because the probability of any single exact value is zero.

Now we can evaluate this probability, using the normal distribution we defined,
X \sim N(9, 7.65)
P(X > 11.5)
P\left(Z > \frac{11.5 - 9}{\sqrt{7.65}}\right)
P(Z > 0.904)
1 - \Phi(0.904)
1 - 0.8169
0.1831

Therefore, the final answer is,
0.183
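
As a cross-check, the three answers above can be compared with the exact binomial probabilities. The sketch below uses only the Python standard library (version 3.8 or later is assumed, for `math.comb` and `statistics.NormalDist`). The helper names `binom_cdf` and `approx_below` are made up for this example, and the computed values may differ from the table-based answers in the last decimal place.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def approx_below(bound, n, p):
    """Normal-approximation P(X < bound); bound already includes the 0.5 correction."""
    return NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(bound)


# Question 1: P(X < 20) = P(X <= 19), X ~ B(80, 0.32)
q1_approx = approx_below(19.5, 80, 0.32)
q1_exact = binom_cdf(19, 80, 0.32)

# Question 2: P(X < 40) = P(X <= 39), X ~ B(90, 0.52)
q2_approx = approx_below(39.5, 90, 0.52)
q2_exact = binom_cdf(39, 90, 0.52)

# Question 3: P(X >= 12) = 1 - P(X <= 11), X ~ B(60, 0.15)
q3_approx = 1 - approx_below(11.5, 60, 0.15)
q3_exact = 1 - binom_cdf(11, 60, 0.15)

for label, approx, exact in [
    ("Question 1", q1_approx, q1_exact),
    ("Question 2", q2_approx, q2_exact),
    ("Question 3", q3_approx, q3_exact),
]:
    print(f"{label}: approximation {approx:.4f}, exact binomial {exact:.4f}")
```

Any gap between the two columns is the error introduced by the approximation itself. Leaving out the 0.5 adjustment in `approx_below` shows how much the continuity correction contributes.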