5.4.2 The Binomial Distribution

In this topic we will learn how to:

  • use the formula for probabilities for the binomial distribution, and recognise practical situations where this distribution is a suitable model
  • use formulae for the expectation and variance of the binomial distribution

The binomial distribution is a discrete distribution. This means the random variable takes only distinct, countable values (here, whole numbers). It represents the number of successes of a particular event in a fixed number of trials. For a random variable to follow a binomial distribution, the following conditions have to be satisfied:

  • There are only two possible outcomes: success and failure
  • There is a fixed number of trials
  • Trials are independent of each other
  • The probability of success is constant

If all four of the conditions above are met, then the random variable follows a binomial distribution.

The notation for a binomial distribution is as follows,
$$X \sim B(n, p)$$
which reads ‘$X$ follows a binomial distribution, with the parameters $n$ and $p$’. $X$ represents a random variable, $B$ represents a binomial distribution, $n$ represents the number of trials, and $p$ represents the probability of success. $n$ and $p$ are known as parameters, and these need to be evaluated before we can start calculating binomial probabilities.

The formula for binomial probabilities is,
$$P(X = r) = {^{n}C_{r}}\,p^{r}(1 - p)^{n - r}$$
The above may also be written as,
$$P_{r} = \begin{pmatrix}n \\ r\end{pmatrix}p^{r}(1 - p)^{n - r}$$
In other literature, $1 - p$ may be written as $q$.
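The formula for binomial probabilities can be sketched in a few lines of Python. This is only an illustration: the function name `binomial_pmf` is our own choice, and the values $n = 8$, $p = 0.7$ are picked purely to show that the probabilities over $r = 0, 1, \dots, n$ add up to $1$.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p) using nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Illustrative parameters (an assumption made for this sketch).
n, p = 8, 0.7

# Every possible value of r from 0 to n, with its probability.
probabilities = [binomial_pmf(r, n, p) for r in range(n + 1)]

# The probabilities of all possible outcomes should add up to 1.
total = sum(probabilities)
print(round(total, 10))
```

In an exam these values come from a calculator, so the sketch is best treated as a way of checking working rather than a replacement for it.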

Under binomial conditions we can also calculate mean and variance. The formula for mean is,
$$E(X) = np$$
The formula for variance is,
$$Var(X) = npq$$
where $q$ is $1 - p$.
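As a rough check of the two formulae, the mean and variance can also be computed directly from the definition of expectation and compared with $np$ and $npq$. The parameters $n = 10$ and $p = 0.3$ below are illustrative assumptions, not values from a past paper.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Illustrative parameters (an assumption made for this sketch).
n, p = 10, 0.3
q = 1 - p

# Mean from the definition: sum of r * P(X = r).
mean_by_sum = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))

# Variance from the definition: sum of (r - mean)^2 * P(X = r).
variance_by_sum = sum(
    (r - mean_by_sum) ** 2 * binomial_pmf(r, n, p) for r in range(n + 1)
)

# The shortcut formulae for a binomial distribution.
mean_by_formula = n * p
variance_by_formula = n * p * q

print(mean_by_sum, mean_by_formula)
print(variance_by_sum, variance_by_formula)
```

Both routes should agree up to floating-point rounding, which is why the shortcut formulae are used in practice.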

Let’s look at some past paper questions.

1. In Greenton, $70\%$ of the adults own a car. A random sample of $8$ adults from Greenton is chosen. Find the probability that the number of adults in this sample who own a car is less than $6$. (9709/52/F/M/20 number 5)

Let’s define the random variable,
$$X \text{ - r.v, number of adults who own a car in Greenton}$$
Note: You get the random variable from the question. Since the question is asking about the number of adults who own a car in Greenton, that becomes our random variable.

Let’s check if the random variable satisfies all $4$ conditions of a binomial distribution,

There are only two possible outcomes, success and failure
Our random variable satisfies this, since the adult either owns a car (success) or they don’t (failure).

There is a fixed number of trials
Our random variable satisfies this, since we are sampling a total of $8$ adults.

Trials are independent of each other
This condition is also satisfied because the adults are chosen at random, so whether one adult owns a car does not affect the chances of another adult having or not having a car.

Probability of success is constant
Since they are all from Greenton, they have the same probability of owning a car $(70\%)$.

Since all $4$ conditions are met, we can conclude that our random variable $X$ follows a binomial distribution. Let’s define its distribution,
$$X \sim B(n, p)$$
The number of trials is $8$ and the probability of success is $70\%$,
$$X \sim B(8, 0.70)$$
Note: Do not define the probability of success as a percentage in the distribution; give it as a fraction or a decimal number.

Now that we have defined the distribution, we can start calculating probabilities. The question requires us to find the probability that the number of adults who own a car is less than $6$. This can be written as,
$$P(\text{number of adults who own a car} < 6)$$
$$P(X < 6)$$
Since this is a discrete distribution,
$$P(X < 6) = P(X = 5, 4, 3, 2, 1, 0)$$
Calculating all those probabilities would be too tedious. Our number of trials is $8$. $6$ is very close to $8$, so we can pick a route that allows us to go towards $8$ instead,
$$P(X < 6) = 1 - P(X \ge 6)$$
Note: This works because the probabilities of all the possible values under the distribution add up to $1$.
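The complement route can be checked numerically by computing $P(X < 6)$ both ways for $X \sim B(8, 0.70)$. This is a minimal sketch; the helper name `binomial_pmf` and the variable names are our own.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


n, p = 8, 0.70  # parameters taken from the question

# Direct route: add up P(X = 0), P(X = 1), ..., P(X = 5) (six terms).
direct = sum(binomial_pmf(r, n, p) for r in range(0, 6))

# Complement route: 1 - [P(X = 6) + P(X = 7) + P(X = 8)] (three terms).
via_complement = 1 - sum(binomial_pmf(r, n, p) for r in range(6, n + 1))

print(round(direct, 3), round(via_complement, 3))  # both round to 0.448
```

The two routes give the same value; the complement simply needs fewer terms, which matters when the working is done by hand.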

This can be written as,
$$P(X < 6) = 1 - P(X = 6, 7, 8)$$
Instead of having to evaluate $6$ probabilities, we only have to evaluate $3$. Let’s use the formula for binomial probabilities to evaluate,
$$P(X = r) = {^{n}C_{r}}\,p^{r}(1 - p)^{n - r}$$
Note: You get $n$ and $p$ from the distribution we defined.
$$P(X = 6) = {^{8}C_{6}}(0.70)^{6}(1 - 0.70)^{8 - 6}$$
$$P(X = 6) = 0.29647548$$
$$P(X = 7) = {^{8}C_{7}}(0.70)^{7}(1 - 0.70)^{8 - 7}$$
$$P(X = 7) = 0.19765032$$
$$P(X = 8) = {^{8}C_{8}}(0.70)^{8}(1 - 0.70)^{8 - 8}$$
$$P(X = 8) = 0.05764801$$
Now let’s go back to our equation,
$$P(X < 6) = 1 - P(X = 6, 7, 8)$$
$$P(X < 6) = 1 - [0.29647548 + 0.19765032 + 0.05764801]$$
This simplifies to give,
$$P(X < 6) = 0.448$$
Therefore, the final answer is,
$$P(X < 6) = 0.448$$

2. In a certain college, $22\%$ of students own a car. (9709/53/M/J/20 number 2)

(a) $3$ students from the college are chosen at random. Find the probability that all $3$ students own a car.

Define the random variable,
$$X \text{ - r.v, number of students who own a car}$$
With time you should be able to tell by sight whether all four conditions are met. You can follow the steps in the question above if you’re not comfortable. Define the distribution,
$$X \sim B(3, 0.22)$$
The question is asking us to find the probability that all $3$ students own a car,
$$P(X = 3)$$
Let’s use the formula for binomial probabilities,
$$P(X = r) = {^{n}C_{r}}\,p^{r}(1 - p)^{n - r}$$
$$P(X = 3) = {^{3}C_{3}}(0.22)^{3}(1 - 0.22)^{3 - 3}$$
$$P(X = 3) = 0.010648$$
Therefore, the final answer, to $3$ significant figures, is,
$$P(X = 3) = 0.0106$$

(b) $16$ students from the college are chosen at random. Find the probability that the number of these students who own a car is at least $2$ and at most $4$.

Let’s define the random variable,
$$Y \text{ - r.v, number of students who own a car}$$
Note: The random variable looks similar to the one above; however, they have a different number of trials.

Define the distribution of the random variable,
$$Y \sim B(16, 0.22)$$
The question is asking for the probability that the number of students who own a car is at least $2$ and at most $4$,
$$P(2 \le Y \le 4)$$
This can be written as,
$$P(2 \le Y \le 4) = P(Y = 2, 3, 4)$$
Let’s evaluate those probabilities, using the formula for binomial probabilities,
$$P(Y = r) = {^{n}C_{r}}\,p^{r}(1 - p)^{n - r}$$
$$P(Y = 2) = {^{16}C_{2}}(0.22)^{2}(1 - 0.22)^{16 - 2}$$
$$P(Y = 2) = 0.1792053807$$
$$P(Y = 3) = {^{16}C_{3}}(0.22)^{3}(1 - 0.22)^{16 - 3}$$
$$P(Y = 3) = 0.2358771678$$
$$P(Y = 4) = {^{16}C_{4}}(0.22)^{4}(1 - 0.22)^{16 - 4}$$
$$P(Y = 4) = 0.2162207371$$
Now let’s go back to the equation,
$$P(2 \le Y \le 4) = P(Y = 2, 3, 4)$$
$$P(2 \le Y \le 4) = 0.1792053807 + 0.2358771678 + 0.2162207371$$
$$P(2 \le Y \le 4) = 0.631$$
Therefore, the final answer is,
$$P(2 \le Y \le 4) = 0.631$$

3. Aman has designed a new logo for a sportswear company. A survey of a large number of customers found that $42\%$ of customers rated that logo as good. (9709/61/O/N/19 number 2)

(a) A random sample of $10$ customers is chosen. Find the probability that fewer than $8$ of them rate the logo as good.

Let’s define the random variable,
$$X \text{ - r.v, number of customers that rate the logo as good}$$
State the distribution and parameters of the random variable,
$$X \sim B(10, 0.42)$$
The question is asking us to find the probability that fewer than $8$ customers rate the logo as good,
$$P(X < 8)$$
Since $8$ is close to $10$, we can rewrite this as,
$$P(X < 8) = 1 - P(X \ge 8)$$
which can be written as,
$$P(X < 8) = 1 - P(X = 8, 9, 10)$$
Let’s evaluate those probabilities, using the formula for binomial probabilities,
$$P(X = r) = {^{n}C_{r}}\,p^{r}(1 - p)^{n - r}$$
$$P(X = 8) = {^{10}C_{8}}(0.42)^{8}(1 - 0.42)^{10 - 8}$$
$$P(X = 8) = 0.01465759859$$
$$P(X = 9) = {^{10}C_{9}}(0.42)^{9}(1 - 0.42)^{10 - 9}$$
$$P(X = 9) = 0.00235869402$$
$$P(X = 10) = {^{10}C_{10}}(0.42)^{10}(1 - 0.42)^{10 - 10}$$
$$P(X = 10) = 0.00017080198$$
Let’s go back to our equation,
$$P(X < 8) = 1 - P(X = 8, 9, 10)$$
$$P(X < 8) = 1 - [0.01465759859 + 0.00235869402 + 0.00017080198]$$
This simplifies to give,
$$P(X < 8) = 0.983$$
Therefore, the final answer is,
$$P(X < 8) = 0.983$$

(b) On another occasion, a random sample of $n$ customers of the company is chosen. Find the smallest value of $n$ for which the probability that at least one person rates the logo as good is greater than $0.995$.

Let’s define the random variable,
$$Y \text{ - r.v, number of customers that rate the logo as good}$$
State the distribution and parameters of the random variable,
$$Y \sim B(n, 0.42)$$
The question tells us that the probability that at least one person rates the logo as good is greater than $0.995$,
$$P(Y \ge 1) > 0.995$$
This can be written as,
$$1 - P(Y < 1) > 0.995$$
which can be written as,
$$1 - P(Y = 0) > 0.995$$
Let’s evaluate the probability that $Y$ is $0$,
$$P(Y = r) = {^{n}C_{r}}\,p^{r}(1 - p)^{n - r}$$
$$P(Y = 0) = {^{n}C_{0}}(0.42)^{0}(1 - 0.42)^{n - 0}$$
$$P(Y = 0) = 1 \times 1 \times 0.58^{n}$$
$$P(Y = 0) = 0.58^{n}$$
Note: ${^{n}C_{0}}$ is equal to $1$.

Let’s go back to our inequality,
$$1 - P(Y = 0) > 0.995$$
$$1 - 0.58^{n} > 0.995$$
Solve for $n$,
$$0.58^{n} < 1 - 0.995$$
$$0.58^{n} < 0.005$$
At this point, to evaluate $n$ you can either use trial and error, where you substitute increasing whole-number values of $n$ until you reach the first value that satisfies the inequality, OR you can use logarithms. In this example, we will use logarithms. Take logarithms of both sides,
$$0.58^{n} < 0.005$$
$$\log(0.58^{n}) < \log(0.005)$$
To simplify the left hand side, we will use one of the laws of logarithms,
$$\log(b^{n}) = n\log{b}$$
Using the law above,
$$n\log(0.58) < \log(0.005)$$
Make $n$ the subject of the formula by dividing both sides by $\log(0.58)$. Make sure to check whether $\log(0.58)$ is positive or negative, because dividing by a negative number reverses the direction of the inequality,
$$\frac{n\log(0.58)}{\log(0.58)} > \frac{\log(0.005)}{\log(0.58)}$$
$$n > 9.72655231$$
Note: $\log(0.58)$ is indeed negative, hence the sign change.

Therefore, the smallest value of $n$ is,
$$n = 10$$
Note: This is the only case in which you should use logarithms at AS level. The law of logarithms outlined above is the only one you will need to know to tackle such questions. However, if you’re uncomfortable with using logarithms in such cases, use trial and error. Ensure that you show sufficient trials.
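Both approaches to part (b) can be sketched in Python to confirm that they agree. The variable names below are our own, and the code is only a check on the working, not a substitute for showing the trials or the logarithm steps in an exam.

```python
from math import floor, log

p_good = 0.42    # probability a customer rates the logo as good
target = 0.995   # required probability of at least one "good" rating

# Trial and error: increase n until 1 - 0.58^n first exceeds the target.
n_trial = 1
while 1 - (1 - p_good) ** n_trial <= target:
    n_trial += 1

# Logarithms: n > log(0.005) / log(0.58). Both logs are negative,
# so the quotient is positive and the inequality sign has flipped.
bound = log(1 - target) / log(1 - p_good)

# Smallest whole number strictly greater than the bound.
n_log = floor(bound) + 1

print(n_trial, n_log)  # both give 10
```

Using `floor(bound) + 1` rather than rounding up keeps the inequality strict: if the bound happened to be a whole number, $n$ would still have to be larger than it.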