Type I and Type II errors, power of a test, decision rule, relationship between confidence interval and hypothesis tests

Whether the null hypothesis is rejected depends on the chosen significance level. A hypothesis test ends in one of two actions: reject the null hypothesis, or fail to reject it. Combined with whether the null hypothesis is actually true or false, this gives four possible outcomes:

(1) Reject a false null hypothesis.

(2) Reject a true null hypothesis. This is a Type I error: we reject a null hypothesis that is actually correct.

(3) Fail to reject a false null hypothesis. This is a Type II error: the null hypothesis is actually false, but the test does not reject it.

(4) Fail to reject a true null hypothesis.

	True situation
Decision per the hypothesis testing	H₀ True	H₀ False
Do not reject H₀	Correct decision	Type II Error
Reject H₀	Type I Error	Correct decision

Type I and Type II errors are mutually exclusive: a single test cannot commit both. A Type I error is possible only when we reject the null hypothesis, and a Type II error is possible only when we fail to reject it. The probability of a Type I error equals the level of significance of the hypothesis test, denoted by α. A 5 percent level of significance means that there is a 5 percent probability of rejecting the null hypothesis when it is true.
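The link between α and the Type I error rate can be checked with a small simulation. The sketch below is only an illustration under assumed conditions that are not part of the text above: a two-tailed z-test, a normal population with known standard deviation, and a null hypothesis that is true by construction. The function name, sample size, trial count, and seed are all hypothetical choices.

```python
import random
from statistics import NormalDist, fmean

def simulated_type_i_rate(alpha=0.05, n=25, trials=5000, seed=42):
    """Share of simulated samples in which a TRUE null hypothesis is rejected."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    mu0, sigma = 0.0, 1.0                          # assumed: H0 (mu = mu0) is true
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z_stat = (fmean(sample) - mu0) / std_error
        if abs(z_stat) > z_crit:                   # every rejection here is a Type I error
            rejections += 1
    return rejections / trials

print(simulated_type_i_rate(alpha=0.05))
print(simulated_type_i_rate(alpha=0.10))
```

Because the null hypothesis is true in every simulated sample, the printed rejection rates should land close to the chosen α, up to simulation noise.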

The probability of a Type II error is denoted by β. For a given sample size, there is a tradeoff between the two types of error: increasing the probability of a Type I error decreases the probability of a Type II error, and vice versa. The only way to decrease the probability of both types of error is to increase the sample size, n.

The power of a test is the probability of correctly rejecting the null hypothesis, i.e. the probability of rejecting the null hypothesis when it is false. When the null hypothesis is false, the test either rejects it (a correct decision) or fails to reject it (a Type II error). Hence, the power of a test equals one minus the probability of a Type II error, 1 - β.
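The tradeoff between α and β, the effect of sample size, and the relation power = 1 - β can be made concrete with a short calculation. This is a sketch under stated assumptions rather than a general method: it assumes a right-tailed z-test with known standard deviation, and the values of mu0, mu1 (the assumed true mean) and sigma are hypothetical.

```python
from statistics import NormalDist

def beta_and_power(alpha, n, mu0=0.0, mu1=0.5, sigma=1.0):
    """Return (beta, power) for a right-tailed z-test when the true mean is mu1."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                  # rejection starts above this value
    shift = (mu1 - mu0) / (sigma / n ** 0.5)       # true mean in standard-error units
    beta = z.cdf(z_crit - shift)                   # P(fail to reject | H0 is false)
    return beta, 1 - beta                          # power = 1 - beta

# Same sample size, different significance levels: lower alpha, higher beta.
for alpha in (0.10, 0.05, 0.01):
    beta, power = beta_and_power(alpha, n=25)
    print(f"alpha={alpha:.2f} n=25  beta={beta:.3f} power={power:.3f}")

# Same alpha, larger sample: beta falls without raising alpha.
beta, power = beta_and_power(0.05, n=100)
print(f"alpha=0.05 n=100 beta={beta:.3f} power={power:.3f}")
```

Lowering α at a fixed n pushes the critical value outward and raises β, while a larger n lowers β at the same α.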

When more than one test statistic is available to conduct a hypothesis test, we should prefer the most powerful one. The standard approach to hypothesis testing is to specify a level of significance before calculating the test statistic. If we choose the level of significance after calculating the test statistic, the choice may be influenced by the result.

There are three conventional significance levels for hypothesis tests: 0.10, 0.05, and 0.01. If we can reject a null hypothesis at 0.01, then we can also reject it at 0.05 and 0.10. The reverse does not hold: rejecting at 0.10 does not guarantee rejection at 0.05 or 0.01. Rejecting a null hypothesis at a lower level of significance is stronger evidence that the null hypothesis is false.
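How the three conventional levels relate to one another can be checked by comparing one test statistic against each critical value. The sketch assumes a two-tailed z-test. The function name and the two example statistics are hypothetical.

```python
from statistics import NormalDist

def rejected_levels(z_stat, levels=(0.10, 0.05, 0.01)):
    """Map each significance level to whether a two-tailed z-test rejects H0."""
    z = NormalDist()
    # A lower significance level gives a larger critical value,
    # so the rejection regions are nested inside one another.
    return {alpha: abs(z_stat) > z.inv_cdf(1 - alpha / 2) for alpha in levels}

print(rejected_levels(1.80))  # rejected at 0.10 only
print(rejected_levels(2.80))  # rejected at all three levels
```

A statistic that clears the 0.01 critical value necessarily clears the 0.05 and 0.10 ones, whereas a statistic that only clears the 0.10 value does not imply rejection at the stricter levels.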

The decision rule of a hypothesis test depends on whether the test is two-tailed or one-tailed. In both cases we compare the calculated test statistic to the critical value of the test statistic. For the same level of significance, the critical value is greater in magnitude for a two-tailed test than for a one-tailed test. If the level of significance is 5 percent and the test is one-tailed, the critical value is the point beyond which the most extreme 5 percent of values lie. For a right-tailed test using a statistic that is symmetric about zero (such as z or t), the critical value is positive, and the extreme 5 percent lies to the right of it. For a left-tailed test, the critical value is negative, and the extreme 5 percent lies to the left of it. If the calculated test statistic falls in that extreme area, we reject the null hypothesis.

For a two-tailed test at the 5 percent level of significance, the extreme 5 percent is split between the two tails, i.e. 2.5 percent on the right side and 2.5 percent on the left side. If the calculated test statistic falls in either of those extreme areas, we reject the null hypothesis.
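The one-tailed and two-tailed decision rules above can be sketched as a pair of small functions. The example assumes a standard normal (z) test statistic. The function names and the tail labels "right", "left" and "two" are hypothetical conventions chosen for this illustration.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return (lower, upper) critical z-values; None marks a tail that is not tested."""
    z = NormalDist()
    if tail == "right":
        return None, z.inv_cdf(1 - alpha)          # all of alpha in the right tail
    if tail == "left":
        return z.inv_cdf(alpha), None              # all of alpha in the left tail
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)               # alpha split evenly across tails
        return -c, c
    raise ValueError("tail must be 'right', 'left' or 'two'")

def reject_null(z_stat, alpha=0.05, tail="two"):
    """Reject H0 when the statistic falls in the rejection region."""
    lower, upper = critical_values(alpha, tail)
    below = lower is not None and z_stat < lower
    above = upper is not None and z_stat > upper
    return below or above

for tail in ("right", "left", "two"):
    print(tail, critical_values(0.05, tail), reject_null(1.80, 0.05, tail))
```

With a statistic of 1.80 at the 5 percent level, the right-tailed test rejects while the two-tailed test does not, because the two-tailed critical value is larger in magnitude.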

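For a two-tailed test, the range between the two critical values corresponds to a confidence interval. Expressed in the units of the sample statistic, the (1 - α) confidence interval is built around the sample statistic, and we reject the null hypothesis if the hypothesized value of the parameter does not lie inside that interval. This is equivalent to the calculated test statistic falling outside the two critical values.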
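The correspondence between a two-tailed test and a confidence interval can be checked numerically. The sketch below assumes a z-test for a mean with known standard deviation. The function names and the sample figures (mean 10.4, sigma 1, n = 25) are hypothetical.

```python
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for the mean, centred on the sample mean."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_crit * sigma / n ** 0.5
    return sample_mean - half_width, sample_mean + half_width

def z_test_rejects(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed decision rule: compare |z| with the critical value."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_stat = (sample_mean - mu0) / (sigma / n ** 0.5)
    return abs(z_stat) > z_crit

def ci_rejects(sample_mean, mu0, sigma, n, alpha=0.05):
    """Interval rule: reject H0 when the hypothesized mean lies outside the interval."""
    lower, upper = confidence_interval(sample_mean, sigma, n, alpha)
    return not (lower <= mu0 <= upper)

print(confidence_interval(10.4, sigma=1.0, n=25))
for mu0 in (10.0, 10.1, 10.7, 10.9):
    print(mu0, z_test_rejects(10.4, mu0, 1.0, 25), ci_rejects(10.4, mu0, 1.0, 25))
```

For each hypothesized mean, the two rules return the same decision, which is the sense in which the interval and the two-tailed test are two views of the same comparison.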
