

Question

In hypothesis testing we must balance the two types of errors, Type I and Type II. This is not unlike what happens in a jury trial. This week we will discuss the similarities and differences between hypothesis testing and the trial of a defendant. Here are some suggestions of ideas you might discuss: How difficult is it to balance the two types of errors? How is a null hypothesis similar to the assumption of innocence in a jury trial? Is it possible to prove a hypothesis true or false? Is it possible to prove innocence or guilt? What about the burden of proof?

Explanation / Answer

Type I Error

The first kind of error that is possible involves the rejection of a null hypothesis that is actually true.

This kind of error is called a type I error, and is sometimes called an error of the first kind.

Type I errors can be controlled. The value of alpha, which corresponds to the level of significance we selected, has a direct bearing on the type I error rate.

Alpha is the maximum probability of making a type I error. For a 95% confidence level, the value of alpha is 0.05. This means there is at most a 5% probability that we will reject a true null hypothesis. In the long run, about one out of every twenty hypothesis tests performed at this level on a true null hypothesis will result in a type I error.
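This long-run behavior can be checked with a quick simulation. The z-test helper below and the choice of normally distributed samples are illustrative assumptions, not part of the original discussion; the point is only that when the null hypothesis is true, a test run at alpha = 0.05 rejects about 5% of the time.

```python
import math
import random

def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for a z-test of H0: mean = mu0, with sigma known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # erfc(|z| / sqrt(2)) equals the two-sided tail probability of N(0, 1)
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(42)
alpha = 0.05
trials = 10_000
# The null hypothesis is actually true: every sample comes from N(0, 1)
rejections = sum(
    z_test_p_value([random.gauss(0, 1) for _ in range(30)]) < alpha
    for _ in range(trials)
)
rate = rejections / trials
print(rate)  # hovers around alpha = 0.05
```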

Type II Error

The other kind of error that is possible occurs when we do not reject a null hypothesis that is false. This sort of error is called a type II error, and is also referred to as an error of the second kind.

Type II errors are equivalent to false negatives. If we think back again to the scenario in which we are testing a drug, what would a type II error look like? A type II error would occur if we accepted that the drug had no effect on a disease, but in reality it did.

The probability of a type II error is given by the Greek letter beta. This number is related to the power or sensitivity of the hypothesis test, denoted by 1 – beta.
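The relationship between beta and power can be made concrete for the simplest case, a two-sided z-test with known sigma. The effect size and sample size below are arbitrary, chosen only for illustration; a real power analysis would plug in values from the study at hand.

```python
import math
from statistics import NormalDist

def z_test_power(effect, n, sigma=1.0, alpha=0.05):
    """Power (1 - beta) of a two-sided z-test against a true mean shift `effect`."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)       # rejection threshold for |z|
    shift = effect / (sigma / math.sqrt(n))  # center of the statistic under H1
    # beta: probability the statistic still lands inside the acceptance region
    beta = nd.cdf(z_crit - shift) - nd.cdf(-z_crit - shift)
    return 1 - beta

print(round(z_test_power(effect=0.5, n=30), 3))  # about 0.78, so beta is about 0.22
```

Note how power grows with the sample size: the same effect of 0.5 is detected far more reliably with n = 100 than with n = 30, which is why beta, unlike alpha, is usually controlled by choosing n.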

Given limited resources, optimizing quality doesn't just involve minimizing the number of errors; it also involves balancing different kinds of errors. There are two basic types of errors one can make:

Doing things you should not do.

Not doing things you should do.

For the theologically inclined, these would be sins of commission and sins of omission, respectively. Much of statistics deals with bounding the probability of making errors in judgment. Except in very specialized circumstances, you cannot simultaneously control both type I and type II errors. Therefore, the worse kind of error is generally framed as the type I error, and the machinery of statistics is applied to the problem in that form; a p-value bounds the probability of making a type I error. That does not imply, however, that the other type of error is unimportant or should be ignored.

A well-publicized example of the need to balance both kinds of errors occurred in the FDA's drug approval process. A pharmaceutical company must demonstrate the safety and efficacy of a new drug before it goes to market. The FDA is primarily concerned with preventing the type I error of releasing an unsafe drug to the public. However, the type II error of keeping useful drugs off the market can also be problematic, and was raised as an issue during the early AIDS crisis.

It only takes one good piece of evidence to send a hypothesis down in flames, but an endless amount to prove it correct. If the null hypothesis is rejected, then logically the alternative hypothesis is accepted. This is why both the justice system and statistics concentrate on disproving or rejecting the null hypothesis rather than proving the alternative: it is much easier to do. If a jury rejects the presumption of innocence, the defendant is pronounced guilty.

Type I errors: Unfortunately, neither the legal system nor statistical testing is perfect. A jury sometimes makes an error and an innocent person goes to jail. Statisticians, being highly imaginative, call this a type I error. Civilians call it a travesty.

In the justice system, failure to reject the presumption of innocence gives the defendant a not guilty verdict. This means only that the standard for rejecting innocence was not met. It does not mean the person really is innocent. It would take an endless amount of evidence to actually prove the null hypothesis of innocence.

Type II errors: Sometimes, guilty people are set free. Statisticians have given this error the highly imaginative name, type II error.

Americans find type II errors disturbing but not as horrifying as type I errors. A type I error means not only that an innocent person has been sent to jail but also that the truly guilty person has gone free. In a sense, a type I error in a trial is twice as bad as a type II error. Needless to say, the American justice system puts a lot of emphasis on avoiding type I errors. This emphasis on avoiding type I errors, however, does not hold in all settings where statistical hypothesis testing is done.

In statistical hypothesis testing used for quality control in manufacturing, the type II error is considered worse than a type I. Here the null hypothesis indicates that the product satisfies the customer's specifications. If the null hypothesis is rejected for a batch of product, it cannot be sold to the customer. Rejecting a good batch by mistake (a type I error) is a very expensive error, but not as expensive as failing to reject a bad batch (a type II error) and shipping it to a customer. This can result in losing the customer and tarnishing the company's reputation.
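The asymmetry above can be sketched as a simple expected-cost calculation. Every number here is hypothetical, invented only to illustrate why quality control weights beta heavily: when shipping a bad batch costs far more than scrapping a good one, a rule with a larger alpha but a smaller beta can still be the cheaper choice.

```python
def expected_cost(alpha, beta, p_bad, cost_false_reject, cost_false_accept):
    """Long-run expected cost per batch for an accept/reject QC rule.

    alpha: probability of rejecting a batch that actually meets spec (type I).
    beta:  probability of accepting a batch that violates spec (type II).
    p_bad: fraction of batches that truly violate the spec.
    Cost figures are hypothetical, for illustration only.
    """
    type1_cost = (1 - p_bad) * alpha * cost_false_reject  # scrapped good batches
    type2_cost = p_bad * beta * cost_false_accept         # shipped bad batches
    return type1_cost + type2_cost

# Hypothetical numbers: shipping a bad batch costs 20x scrapping a good one
loose = expected_cost(alpha=0.01, beta=0.20, p_bad=0.1,
                      cost_false_reject=1_000, cost_false_accept=20_000)
strict = expected_cost(alpha=0.10, beta=0.02, p_bad=0.1,
                       cost_false_reject=1_000, cost_false_accept=20_000)
print(loose, strict)  # the stricter rule has the lower expected cost
```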