10 March 2012 Archives

Imagine that there is a rare genetic disease that affects 1 in every 100 people at random. There is a test for this disease that has a 99% accuracy rate: for every 100 people tested, it gives the correct result for 99 of them, whether or not they have the disease.

If you have the test, and the result of the test is positive, what is the chance that you have the disease?

If you think the answer is 99% then you are incorrect; this is because of the base rate fallacy – you have failed to take the base rate (of the disease) into account.

In this situation there are four possible outcomes:

	Affected by disease	Not affected by disease
Test correct	Affected by disease, and test gives correct result. (DC)	Not affected by disease, and test gives correct result. (NC)
Test incorrect	Affected by disease, and test gives incorrect result. (DI)	Not affected by disease, and test gives incorrect result. (NI)

This is easier to understand if we map the contents of the probability space using a tree diagram, as shown below.

In two of these cases (DC and NI) the result of the test is positive, but in only one of them (DC) do you have the disease.

P(DC) = P(Affected) × P(Test correct)
P(DC) = 0.01 × 0.99
P(DC) = 0.0099 ≈ 1 in 101

The other case that results in a positive result, when you don’t have the disease and the test is incorrect, has the same probability of roughly 1 in 101: P(NI) = 0.99 × 0.01 = 0.0099.
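Because the two positive outcomes are equally likely, the question posed at the start can be answered by dividing P(DC) by the total probability of a positive result. The following is a minimal Python sketch of that step, assuming (as the calculation above does) that the test is equally accurate for affected and unaffected people. The variable names are illustrative and not part of the original post.

```python
from fractions import Fraction

# Figures taken from the text; the names are illustrative only.
p_affected = Fraction(1, 100)   # base rate of the disease
p_correct = Fraction(99, 100)   # test accuracy, assumed equal for both groups

# The two outcomes that produce a positive result.
p_dc = p_affected * p_correct              # affected, test correct
p_ni = (1 - p_affected) * (1 - p_correct)  # not affected, test incorrect

# Chance of having the disease, given that the test came back positive.
p_disease_given_positive = p_dc / (p_dc + p_ni)

print(p_dc, p_ni, p_disease_given_positive)  # prints: 99/10000 99/10000 1/2
```

Under these assumptions a positive result means a 50% chance of having the disease, not 99%.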

Of the two remaining cases, not having the disease and getting a correct negative test result takes up the vast majority of the remaining probability space: P(NC) = 0.99 × 0.99 = 0.9801, or about 1 in 1.02. The chance of having the disease and getting an incorrect test result is extremely small: P(DI) = 0.01 × 0.01 = 0.0001, or 1 in 10,000.
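One way to check these four figures is to count people rather than multiply probabilities. The sketch below imagines a hypothetical group of 10,000 people (a number chosen only so that every count is a whole number) and splits it by disease status and test result, using the same 1% base rate and 99% accuracy as the text.

```python
# Hypothetical population, chosen so that every count is a whole number.
population = 10_000
base_rate_pct = 1    # 1 in every 100 people is affected
accuracy_pct = 99    # the test is correct 99 times out of 100

affected = population * base_rate_pct // 100
unaffected = population - affected

# Labels follow the table in the text: D/N = disease or not, C/I = test correct or incorrect.
counts = {
    "DC": affected * accuracy_pct // 100,
    "DI": affected * (100 - accuracy_pct) // 100,
    "NC": unaffected * accuracy_pct // 100,
    "NI": unaffected * (100 - accuracy_pct) // 100,
}

# The four outcomes must account for everyone.
assert sum(counts.values()) == population

for label, n in counts.items():
    print(f"{label}: {n} people, P = {n / population}")

# Positive results come from DC (true positives) and NI (false positives).
positives = counts["DC"] + counts["NI"]
print(f"{counts['DC']} of {positives} positive results belong to affected people")
```

Counted this way, 99 affected people and 99 unaffected people receive a positive result, which is why the two positive branches carry equal weight.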

MrReid.org

Stuff that interests Mr Reid, a physicist and teacher


The base rate fallacy