6.5.1 Normal Distribution Tests

In this topic we will learn how to:

  • understand the nature of a hypothesis test, the difference between one-tailed and two-tailed tests, and the terms null hypothesis, alternative hypothesis, significance level, rejection region (or critical region), acceptance region and test statistic.
  • formulate hypotheses and carry out a hypothesis test concerning the population mean in cases where the population is normally distributed with known variance or where a large sample is used
  • interpret outcomes of hypothesis tests in the context of the question

\textbf{\textcolor{gray}{Hypothesis Test Terminology}}
Let’s start by going over some key terminology for hypothesis tests. These terms will be used in both normal and discrete tests.

A null hypothesis is a neutral statement that assumes that our data (population mean) is unchanged. It is denoted by,
H_{0}
which is pronounced "H nought".

An alternative hypothesis is a statement that assumes that our data (population mean) has changed. It is denoted by,
H_{1}
which is pronounced "H one".

A one-tailed test is one that tests for a definite increase or a definite decrease in the population mean. If it tests for a definite increase, it is an upper-tailed test. If it tests for a definite decrease, it is a lower-tailed test.

A two-tailed test is one that tests whether the population mean has changed. It does not specify whether we’re testing for a definite decrease or a definite increase.

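The significance level is the probability of rejecting the null hypothesis when it is actually true, that is, the probability that a sample result extreme enough to fall in the rejection region arises purely by chance.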

The rejection region (or critical region) is the region in which we reject the null hypothesis in favour of the alternative hypothesis because the changes to the population mean are significant.

The acceptance region is the region in which we accept the null hypothesis, and assume that any changes to the population mean are not significant and are due to chance.
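To make the rejection and acceptance regions concrete, here is a minimal Python sketch that computes the critical z-values for each kind of test. The function name `critical_values` and the tail labels `"upper"`, `"lower"` and `"two"` are illustrative choices, not part of the syllabus; the sketch assumes Python 3.8 or later for `statistics.NormalDist`.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return (lower, upper) critical z-values for significance level alpha.

    None on one side means there is no rejection region in that tail.
    `tail` is an illustrative label: "upper", "lower" or "two".
    """
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    if tail == "upper":
        # All of alpha sits in the upper tail.
        return (None, std_normal.inv_cdf(1 - alpha))
    if tail == "lower":
        # All of alpha sits in the lower tail.
        return (std_normal.inv_cdf(alpha), None)
    if tail == "two":
        # alpha is split equally between the two tails.
        c = std_normal.inv_cdf(1 - alpha / 2)
        return (-c, c)
    raise ValueError("tail must be 'upper', 'lower' or 'two'")


print(critical_values(0.01, "upper"))
print(critical_values(0.05, "two"))
```

Values of z beyond a critical value lie in the rejection region; everything else is the acceptance region.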

The test statistic is a value calculated from the sample (here, the sample mean) that helps us assess how consistent the sample data is with the null hypothesis. It is usually denoted by,
\overline{x}
\textbf{\textcolor{gray}{z-tests}}
A z-test is a hypothesis test for the mean of a population that follows a normal distribution with known variance. The same steps can also be used for a large sample from a population that is not normally distributed, because the Central Limit Theorem lets us assume that its sample mean follows an approximately normal distribution. The steps are outlined below:
\textbf{Step 1} Define the random variable
\textbf{Step 2} State the distribution of the random variable
\textbf{Step 3} State null and alternative hypotheses
\textbf{Step 4} Define the distribution of the sample mean

Note: If you have a large sample which is not normally distributed, use CLT to define the distribution of the sample mean.
\textbf{Step 5} State the rejection rule
\textbf{Step 6} Use the test statistic to calculate the z-value
\textbf{Step 7} Use the rejection rule to determine whether to reject or accept H_{0}
\textbf{Step 8} Conclude in context
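The numerical part of these steps (Steps 4 to 7) can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed method: the function name `z_test`, its parameter names and the tail labels are hypothetical, the population standard deviation `sigma` is assumed known, and the numbers in the usage lines are made up for demonstration.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha, tail):
    """Return (z, reject) for a z-test of H0: mu = mu0.

    tail is "upper" (H1: mu > mu0), "lower" (H1: mu < mu0)
    or "two" (H1: mu != mu0).
    """
    # Step 4: under H0 the sample mean is N(mu0, sigma^2 / n).
    standard_error = sigma / sqrt(n)
    # Step 6: standardise the observed sample mean.
    z = (sample_mean - mu0) / standard_error
    # Steps 5 and 7: build the rejection rule and apply it.
    std_normal = NormalDist()
    if tail == "upper":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "lower":
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'upper', 'lower' or 'two'")
    return z, reject


# Made-up illustration: sample mean 52 from n = 100, testing mu0 = 50, sigma = 10.
z, reject = z_test(sample_mean=52, mu0=50, sigma=10, n=100, alpha=0.05, tail="upper")
print(z, reject)
```

Steps 1 to 3 and Step 8 (defining the variable, stating the hypotheses and concluding in context) still have to be written out in words.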

Let’s look at some z-tests from past paper questions.

1. The time, in minutes, that John takes to travel to work has a normal distribution. Last year the mean and standard deviation were 26.5 and 4.8 respectively. This year John uses a different route and he finds that the mean time for his first 150 journeys is 27.5 minutes. Stating a necessary assumption, test at the 1\% significance level whether the mean time for his journey to work has increased. (9709/71/M/J/19 number 2)

Define the random variable,
\textmd{X-r.v, time, in minutes, that John takes to travel to work}
State the distribution, assuming that the standard deviation is unchanged at 4.8 (this is the necessary assumption),
X \sim N(\mu, 4.8^{2})
State null and alternative hypotheses,
H_{0}: \mu = 26.5
H_{1}: \mu > 26.5
Note: We’re testing for a definite increase, so this is an upper-tailed test, hence the > sign.

Define the distribution of the sample mean,
\overline{X} \sim N\left(26.5, \frac{4.8^{2}}{150}\right)
State the rejection rule
Since the significance level is 1\%, find the z-value at 99\%,

z = \Phi^{-1}(0.99)
z = 2.326


The rejection rule is,
\textmd{Reject }H_{0}\textmd{ if }z > 2.326
Use the test statistic to calculate the z-value,
z = \frac{\overline{x} - \mu}{\sqrt{\frac{\sigma^{2}}{n}}}
z = \frac{27.5 - 26.5}{\sqrt{\frac{4.8^{2}}{150}}}
z = 2.552
Compare the z-value to the rejection rule,
\textmd{Reject }H_{0}\textmd{ if }z > 2.326
2.552 > 2.326
\textmd{Reject }H_{0}\textmd{ in favour of }H_{1}
Conclude in context,
There is evidence, at the 1\% significance level, that the mean time for John’s journey to work has increased.
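As a check on the arithmetic in this example, the critical value and the z-value can be reproduced with a short Python sketch. The variable names are illustrative, and the code assumes, as the solution does, that the standard deviation is still 4.8.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 26.5          # mean under H0 (last year's mean)
sigma = 4.8         # standard deviation, assumed unchanged
n = 150             # number of journeys in the sample
sample_mean = 27.5  # observed sample mean
alpha = 0.01        # 1% significance level, upper-tailed test

# Critical value: z such that P(Z < z) = 0.99.
critical = NormalDist().inv_cdf(1 - alpha)

# Standardise the sample mean using the standard error sigma / sqrt(n).
z = (sample_mean - mu0) / (sigma / sqrt(n))

reject_h0 = z > critical
print(round(critical, 3), round(z, 3), reject_h0)
```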

2. Harry has a five-sided spinner with sectors coloured blue, green, red, yellow and black. Harry thinks the spinner may be biased. He plans to carry out a hypothesis test with the following hypotheses.
H_{0}: P(\textmd{the spinner lands on blue}) = \frac{1}{5}
H_{1}: P(\textmd{the spinner lands on blue}) \neq \frac{1}{5}
Harry spins the spinner 300 times. It lands on blue on 45 spins. Use a suitable approximation to carry out Harry’s test at the 5\% significance level. (9709/62/F/M/22 number 2)

Define the random variable,
X-\textmd{r.v, number of times the spinner lands on blue}
State the distribution,
X \sim B(300, p)
State null and alternative hypotheses,
H_{0}: P(\textmd{the spinner lands on blue}) = \frac{1}{5}
H_{1}: P(\textmd{the spinner lands on blue}) \neq \frac{1}{5}
Note: We’re not testing for a definite increase or a definite decrease; we’re testing for a change, so it is a two-tailed test. This is shown by the \neq sign in the alternative hypothesis.

Define the distribution of the sample mean,
\textmd{Since }n = 300 > 30\textmd{, using CLT,}
\overline{X} \sim N(np, npq)
\textmd{Assuming }H_{0}\textmd{ is true,}
\overline{X} \sim N\left(300 \times \frac{1}{5}, 300 \times \frac{1}{5} \times \frac{4}{5}\right)
\overline{X} \sim N(60, 48)
State the rejection rule
Since the significance level is 5\%, and it is a two-tailed test, find the z-value at 97.5\%,
z = \Phi^{-1}(0.975)
z = 1.96


Note: Since it is a two-tailed test, divide the value of the significance level by 2. Put half of the significance level on the lower tail and put the other half on the upper tail.

The rejection rule is,
\textmd{Reject }H_{0}\textmd{ if }|z| > 1.96
Note: |z| > 1.96 is the same as z < -1.96 \textmd{ or } z > 1.96.

Since we have approximated a discrete distribution (binomial) by a continuous distribution (normal), we have to apply a continuity correction to the test statistic. In the binomial distribution the test statistic is 45. However, in a continuous distribution 45 represents 44.5 < \overline{x} < 45.5. If the test statistic is less than the mean under H_{0} (in our case 45 is smaller than 60), treat it like a lower-tailed test and take the upper bound; if the test statistic is larger than the mean, treat it like an upper-tailed test and take the lower bound. So in this case, we will take the upper bound, 45.5.
Use the test statistic to calculate the z-value,
z = \frac{\overline{x} - \mu}{\sigma}
z = \frac{45.5 - 60}{\sqrt{48}}
z = -2.093
Compare the z-value to the rejection rule,
\textmd{Reject }H_{0}\textmd{ if }|z| > 1.96
-2.093 < -1.96
\textmd{Reject }H_{0}\textmd{ in favour of }H_{1}
Conclude in context,
There is evidence, at the 5\% significance level, that the spinner is biased.
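The working for this example, including the continuity correction, can be checked with a minimal Python sketch. The variable names are illustrative, and the correction rule coded here is the one described above (move the observed count half a unit towards the mean under H_{0}).

```python
from math import sqrt
from statistics import NormalDist

n = 300        # number of spins
p0 = 1 / 5     # P(blue) under H0
observed = 45  # number of times the spinner landed on blue
alpha = 0.05   # 5% significance level, two-tailed test

# Normal approximation to B(n, p0): mean np and variance npq.
mean = n * p0
variance = n * p0 * (1 - p0)

# Continuity correction: shift the count 0.5 towards the mean.
if observed < mean:
    corrected = observed + 0.5  # lower tail, take the upper bound
else:
    corrected = observed - 0.5  # upper tail, take the lower bound

z = (corrected - mean) / sqrt(variance)

# Two-tailed test: half of alpha in each tail.
critical = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > critical
print(round(z, 3), round(critical, 2), reject_h0)
```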