5.4.2 The Binomial Distribution

In this topic we will learn how to:

  • use the formula for probabilities for the binomial distribution, and recognise practical situations where this distribution is a suitable model
  • use formulae for the expectation and variance of the binomial distribution

The binomial distribution is a discrete distribution. This means the random variable can only take distinct, countable values (here, the whole numbers 0, 1, 2, ..., n). It represents the number of successes of a particular event in a fixed number of trials. For a random variable to follow a binomial distribution, the following conditions have to be satisfied:

  • Each trial has only two possible outcomes: success and failure
  • There is a fixed number of trials
  • Trials are independent of each other
  • The probability of success is constant

If all four of the conditions above are met, then the random variable follows a binomial distribution.

The notation for a binomial distribution is as follows,
X \sim B(n, p)
This reads ‘X follows a binomial distribution with parameters n and p’. X represents the random variable, B represents the binomial distribution, n represents the number of trials, and p represents the probability of success. n and p are known as parameters, and they need to be identified before we can start calculating binomial probabilities.
The formula for binomial probabilities is,
P(X = r) = {^{n}C_{r}}p^{r}(1 - p)^{n - r}
where r is the number of successes, r = 0, 1, 2, ..., n. The above may also be written as,
P_{r} = \begin{pmatrix}n \\ r\end{pmatrix}p^{r}(1 - p)^{n - r}
In other literature, 1 - p may be written as q.
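The formula translates almost directly into code. The following is a minimal Python sketch, not part of the syllabus; the function name binomial_pmf and the example values n = 5, p = 0.5 are assumptions chosen purely for illustration. It relies on math.comb, which is available from Python 3.8.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p), i.e. nCr * p^r * (1 - p)^(n - r)."""
    q = 1 - p  # probability of failure, written as q in some texts
    return comb(n, r) * p**r * q**(n - r)


# Illustrative values only (assumed for this sketch): 5 trials, p = 0.5
probabilities = [binomial_pmf(r, 5, 0.5) for r in range(6)]

print(probabilities)       # one probability for each r = 0, 1, ..., 5
print(sum(probabilities))  # the probabilities over all r add up to 1
```

Summing the probabilities over every possible value of r is a useful sanity check that n and p have been entered correctly.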

Under binomial conditions we can also calculate the mean and variance. The formula for the mean is,
E(X) = np
The formula for the variance is,
Var(X) = npq
where q is 1 - p.
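As a rough check of these two formulae, the sketch below compares them with the mean and variance worked out from the general definitions for a discrete random variable, E(X) = Σ r P(X = r) and Var(X) = Σ r² P(X = r) - [E(X)]². The values n = 8 and p = 0.7 are assumed for illustration, and the helper binomial_pmf is a hypothetical name.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


n, p = 8, 0.7  # assumed example values
q = 1 - p

# Mean and variance from the general definitions for a discrete variable
mean_from_sum = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
var_from_sum = (
    sum(r**2 * binomial_pmf(r, n, p) for r in range(n + 1)) - mean_from_sum**2
)

# Each pair should agree up to floating-point rounding
print(mean_from_sum, n * p)
print(var_from_sum, n * p * q)
```

In an exam you would simply use np and npq; the long-hand sums are only there to show that the shortcuts give the same values.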

Let’s look at some past paper questions.

1. In Greenton, 70\% of the adults own a car. A random sample of 8 adults from Greenton is chosen. Find the probability that the number of adults in this sample who own a car is less than 6. (9709/52/F/M/20 number 5)

Let’s define the random variable,
X\textmd{ - r.v, number of adults in the sample who own a car}
Note: You get the random variable from the question. Since the question is asking about the number of adults in the sample who own a car, that becomes our random variable.

Let’s check if the random variable satisfies all 4 conditions of a binomial distribution,

There are only two possible outcomes, success and failure
Our random variable satisfies this, since the adult either owns a car (success) or they don’t (failure).

There is a fixed number of trials
Our random variable satisfies this, since we are sampling a total of 8 adults.

Trials are independent of each other
This condition is also satisfied because the adults are chosen at random, so whether one adult owns a car does not affect the chances of another adult owning a car.

Probability of success is constant
Since they are all from Greenton, they have the same probability of owning a car (70\%)

Since all 4 conditions are met, we can conclude that our random variable X follows a binomial distribution. Let’s define its distribution,
X \sim B(n, p)
The number of trials is 8 and the probability of success is 70\%,
X \sim B(8, 0.70)
Note: Do not write the probability of success as a percentage in the distribution; give it as a fraction or a decimal.

Now that we have defined the distribution, we can start calculating probabilities. The question requires us to find the probability that the number of adults who own a car is less than 6. This can be written as,
P(\textmd{number of adults who own a car} < 6)
P(X < 6)
Since this is a discrete distribution,
P(X < 6) = P(X = 5, 4, 3, 2, 1, 0)
Calculating all six of those probabilities would be tedious. Since 6 is close to the number of trials, 8, it is quicker to work with the complement,
P(X < 6) = 1 - P(X \ge 6)
Note: This works because the probabilities of all the possible values of X add up to 1.
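The complement rule can be checked numerically. The sketch below is an illustration rather than exam working, and the helper name binomial_pmf is an assumption. It evaluates P(X < 6) for X ~ B(8, 0.70) both ways: by adding the six probabilities for r = 0 to 5, and by subtracting the three probabilities for r = 6, 7, 8 from 1.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


n, p = 8, 0.70

# Long route: add up P(X = 0), P(X = 1), ..., P(X = 5)
direct = sum(binomial_pmf(r, n, p) for r in range(0, 6))

# Short route: 1 minus P(X = 6), P(X = 7), P(X = 8)
complement = 1 - sum(binomial_pmf(r, n, p) for r in range(6, n + 1))

# Both routes give the same probability, up to floating-point rounding
print(round(direct, 3), round(complement, 3))
```

The two routes agree, so the choice between them is purely about how many terms you have to evaluate by hand.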

This can be written as,
P(X < 6) = 1 - P(X = 6, 7, 8)
Instead of having to evaluate 6 probabilities, we only have to evaluate 3. Let’s use the formula for binomial probabilities to evaluate them,
P(X = r) = {^{n}C_{r}}p^{r}(1 - p)^{n - r}
Note: You get n and p from the distribution we defined.
P(X = 6) = {^{8}C_{6}}(0.70)^{6}(1 - 0.70)^{8 - 6}
P(X = 6) = 0.29647548
P(X = 7) = {^{8}C_{7}}(0.70)^{7}(1 - 0.70)^{8 - 7}
P(X = 7) = 0.19765032
P(X = 8) = {^{8}C_{8}}(0.70)^{8}(1 - 0.70)^{8 - 8}
P(X = 8) = 0.05764801
Now let’s go back to our equation,
P(X < 6) = 1 - P(X = 6, 7, 8)
P(X < 6) = 1 - [0.29647548 + 0.19765032 + 0.05764801]
This simplifies to give,
P(X < 6) = 0.448
Therefore, the final answer is,
P(X < 6) = 0.448

2. In a certain college, 22\% of students own a car. (9709/53/M/J/20 number 2)

(a) 3 students from the college are chosen at random. Find the probability that all 3 students own a car.

Define the random variable,
X\textmd{ - r.v, number of students who own a car}
With time you should be able to tell at a glance whether all four conditions are met. If you’re not yet comfortable, follow the steps in the question above. Define the distribution,
X \sim B(3, 0.22)
The question is asking us to find the probability that all 3 students own a car,
P(X = 3)
Let’s use the formula for binomial probabilities,
P(X = r) = {^{n}C_{r}}p^{r}(1 - p)^{n - r}
P(X = 3) = {^{3}C_{3}}(0.22)^{3}(1 - 0.22)^{3 - 3}
P(X = 3) = 0.010648
Rounding to 3 significant figures,
P(X = 3) = 0.0106
Therefore, the final answer is,
P(X = 3) = 0.0106

(b) 16 students from the college are chosen at random. Find the probability that the number of these students who own a car is at least 2 and at most 4.

Let’s define the random variable,
Y\textmd{ - r.v, number of students who own a car}
Note: The random variable looks similar to the one in part (a); however, it has a different number of trials.
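To see why the number of trials matters, the sketch below evaluates the probability of exactly 2 car owners with the same p = 0.22 but with 3 trials and then 16 trials. It also adds up the probabilities for 2, 3 and 4 owners out of 16, which is the kind of sum this part asks for. This is an illustrative check only, and the helper name binomial_pmf is an assumption.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(r successes) for a B(n, p) random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


p = 0.22

# Same probability of success, different number of trials
p_two_of_3 = binomial_pmf(2, 3, p)    # sample of 3 students
p_two_of_16 = binomial_pmf(2, 16, p)  # sample of 16 students
print(p_two_of_3, p_two_of_16)        # the two values differ

# "At least 2 and at most 4" out of 16 means r = 2, 3 or 4
interval = sum(binomial_pmf(r, 16, p) for r in range(2, 5))
print(round(interval, 3))
```

Note that range(2, 5) stops before 5, so it covers exactly the values 2, 3 and 4.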

Define the distribution of the random variable,
Y \sim B(16, 0.22)
The question is asking for the probability that the number of students who own a car is at least 2 and at most 4,
P(2 \le Y \le 4)
This can be written as,
P(2 \le Y \le 4) = P(Y = 2, 3, 4)
Let’s evaluate those probabilities, using the formula for binomial probabilities,
P(Y = r) = {^{n}C_{r}}p^{r}(1 - p)^{n - r}
P(Y = 2) = {^{16}C_{2}}(0.22)^{2}(1 - 0.22)^{16 - 2}
P(Y = 2) = 0.1792053807
P(Y = 3) = {^{16}C_{3}}(0.22)^{3}(1 - 0.22)^{16 - 3}
P(Y = 3) = 0.2358771678
P(Y = 4) = {^{16}C_{4}}(0.22)^{4}(1 - 0.22)^{16 - 4}
P(Y = 4) = 0.2162207371
Now let’s go back to the equation,
P(2 \le Y \le 4) = P(Y = 2, 3, 4)
P(2 \le Y \le 4) = 0.1792053807 + 0.2358771678 + 0.2162207371
P(2 \le Y \le 4) = 0.631
Therefore, the final answer is,
P(2 \le Y \le 4) = 0.631

3. Aman has designed a new logo for a sportswear company. A survey of a large number of customers found that 42\% of customers rated that logo as good. (9709/61/O/N/19 number 2)

(a) A random sample of 10 customers is chosen. Find the probability that fewer than 8 of them rate the logo as good.

Let’s define the random variable,
X\textmd{ - r.v, number of customers that rate the logo as good}
State the distribution and parameters of the random variable,
X \sim B(10, 0.42)
The question is asking us to find the probability that fewer than 8 customers rate the logo as good,
P(X < 8)
Since 8 is close to the number of trials, 10, we can rewrite this as,
P(X < 8) = 1 - P(X \ge 8)
Which can be written as,
P(X < 8) = 1 - P(X = 8, 9, 10)
Let’s evaluate those probabilities, using the formula for binomial probabilities,
P(X = r) = {^{n}C_{r}}p^{r}(1 - p)^{n - r}
P(X = 8) = {^{10}C_{8}}(0.42)^{8}(1 - 0.42)^{10 - 8}
P(X = 8) = 0.01465759859
P(X = 9) = {^{10}C_{9}}(0.42)^{9}(1 - 0.42)^{10 - 9}
P(X = 9) = 0.00235869402
P(X = 10) = {^{10}C_{10}}(0.42)^{10}(1 - 0.42)^{10 - 10}
P(X = 10) = 0.00017080198
Let’s go back to our equation,
P(X < 8) = 1 - P(X = 8, 9, 10)
P(X < 8) = 1 - [0.01465759859 + 0.00235869402 + 0.00017080198]
This simplifies to give,
P(X < 8) = 0.983
Therefore, the final answer is,
P(X < 8) = 0.983

(b) On another occasion, a random sample of n customers of the company is chosen. Find the smallest value of n for which the probability that at least one person rates the logo as good is greater than 0.995.

Let’s define the random variable,
Y\textmd{ - r.v, number of customers that rate the logo as good}
State the distribution and parameters of the random variable,
Y \sim B(n, 0.42)
The question tells us that the probability that at least one person rates the logo as good is greater than 0.995,
P(Y \ge 1) > 0.995
This can be written as,
1 - P(Y < 1) > 0.995
Which can be written as,
1 - P(Y = 0) > 0.995
Let’s evaluate the probability that Y is 0,
P(Y = r) = {^{n}C_{r}}p^{r}(1 - p)^{n - r}
P(Y = 0) = {^{n}C_{0}}(0.42)^{0}(1 - 0.42)^{n - 0}
P(Y = 0) = 1 \times 1 \times 0.58^{n}
P(Y = 0) = 0.58^{n}
Note: {^{n}C_{0}} is equal to 1.

Let’s go back to our inequality,
1 - P(Y = 0) > 0.995
1 - 0.58^{n} > 0.995
Solve for n,
0.58^{n} < 1 - 0.995
0.58^{n} < 0.005
At this point, to evaluate n you can either use trial and error, where you substitute increasing whole-number values of n until you reach the first value that satisfies the inequality, or you can use logarithms. In this example, we will use logarithms. Take logarithms of both sides,
0.58^{n} < 0.005
\log(0.58^{n}) < \log(0.005)
To simplify the left hand side, we will use one of the laws of logarithms,
\log(b^{n}) = n\log{b}
Using the law above,
n\log(0.58) < \log(0.005)
Make n the subject of the formula by dividing both sides by \log(0.58). Make sure to check whether \log(0.58) is positive or negative, because dividing by a negative number reverses the direction of the inequality,
\frac{n\log(0.58)}{\log(0.58)} > \frac{\log(0.005)}{\log(0.58)}
n > 9.72655231
Note: \log(0.58) is indeed negative, hence the sign change.
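The sign of \log(0.58) and the resulting bound on n can be confirmed with a few lines of Python. This is a sketch for checking the working, and the variable names are assumptions. It uses natural logarithms through math.log; any base gives the same ratio, because the base cancels in the division.

```python
from math import log

log_058 = log(0.58)           # logarithm of a number between 0 and 1
bound = log(0.005) / log_058  # right-hand side after dividing through

print(log_058 < 0)  # True: dividing by it reverses the inequality
print(bound)        # n must be greater than this value
```

The logarithm of any number between 0 and 1 is negative, so this sign change will occur whenever the base of the power is a probability.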

Therefore, since n must be a whole number, the smallest value of n is,
n = 10
Note: This is the only case in which you should use logarithms at AS level. The law of logarithms outlined above is the only one you will need to know to tackle such questions. However, if you’re uncomfortable with using logarithms in such cases, use trial and error. Ensure that you show sufficient trials.
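The trial and error route can be sketched in Python as follows. The function name smallest_n is a hypothetical helper introduced for this illustration. It starts at n = 1 and keeps increasing n until the probability of at least one success, 1 - (1 - p)^n, first exceeds the target.

```python
def smallest_n(p: float, target: float) -> int:
    """Smallest whole number n with 1 - (1 - p)^n > target."""
    n = 1
    while 1 - (1 - p) ** n <= target:
        n += 1  # inequality not yet satisfied, so try the next value
    return n


n = smallest_n(0.42, 0.995)
print(n)

# The two trials either side of the answer, as you would show in working
print(1 - 0.58 ** (n - 1))  # not greater than 0.995
print(1 - 0.58 ** n)        # greater than 0.995
```

Showing the trial just below the answer as well as the answer itself is what demonstrates that n is the smallest value, which mirrors the advice above to show sufficient trials.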