Thursday, September 26, 2013

surprises, part 2

In my last post, I described a classic problem in probability: suppose a certain disease occurs in 0.5% of a given population, and there is a test that detects it in 99% of those who have it, but also returns a false positive 1% of the time it is used on a healthy person. What is the conditional probability that an individual has the disease, given that they have a positive result from the test? The answer, somewhat surprisingly, turns out to be less than a third.
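(For the record, the computation runs as follows: in a large population, a fraction $0.005 \cdot 0.99 = 0.00495$ are true positives and a fraction $0.995 \cdot 0.01 = 0.00995$ are false positives, so the conditional probability is \[ \frac{0.00495}{0.00495 + 0.00995} = \frac{0.00495}{0.0149} \approx 0.332, \] just under one third.)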

When we discussed this in my probability class, one student asked a very sensible question: What if we test the person twice?

This question seemed worth investigating. As I see it, the question can be interpreted two ways. On one hand, what if we tested everyone twice? How would that affect the conditional probability given above? On the other hand, what if we only gave the test a second time to those who had a positive first test? Would we then be more likely to correctly identify those who are actually ill, having restricted to a population in which the disease is more prevalent? Do these two methods produce different results?

To begin with, let’s return to the original question and analyze it more thoroughly by introducing some variables. Let $r$ be the prevalence of the disease in the total population (which can be interpreted as the probability that any particular individual has the disease). Suppose the test we have returns a true positive (a positive result for someone who is ill) with probability $p$ (called the sensitivity of the test), and it returns a false positive (a positive result for someone who is well) with probability $q$ (the value $1 - q$ is called the test’s specificity). Bayes’ formula then says that the probability of having the illness given a positive test result is \[ P(r) = \frac{r \cdot p}{r \cdot p + (1 - r) \cdot q}. \] If we fix $p$ and $q$ and let $r$ vary, we get a graph like the following:

(drawn here with $p = 0.98$ and $q = 0.05$; you can click on the graph to go to an interactive version). Notice the large derivative for small values of $r$; that low conditional probability we got at the beginning was essentially an artifact of the disease itself being fairly uncommon. (As one student slyly put it, “so the way to make a positive test more likely to mean you’re sick is to give more people the disease?”) Raising the value of $p$ doesn’t change the graph much. The real problem lies in the false positives: if the disease is sufficiently rare, then having any chance at all of false positives ($q > 0$) means that the false positives will outnumber the true positives.
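If you would like to experiment with the formula yourself, here is a minimal sketch in Python (the function name is mine, invented for this post):

    # Probability of illness given a positive test, by Bayes' formula.
    # r: prevalence, p: sensitivity, q: false positive rate.
    def posterior_given_positive(r, p, q):
        return (r * p) / (r * p + (1 - r) * q)

    # The numbers from the opening problem:
    print(posterior_given_positive(0.005, 0.99, 0.01))  # about 0.332

A line of algebra also makes the crossover precise: the false positives outnumber the true positives exactly when $(1 - r) \cdot q > r \cdot p$, that is, when \[ r < \frac{q}{p + q}, \] which for the values in the graph means any prevalence below about $4.9\%$.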

If we change the situation so that every time an individual is tested we administer the test twice, then a few things happen. First, the chance of getting two false positives when testing a healthy individual is $q^2$, which is generally much smaller than $q$. Meanwhile, the chance of getting two positives when testing a sick individual is $p^2$, smaller than $p$ but not by much. The result is a much steeper curve for low-prevalence diseases:

(the red curve is the same as before; the purple curve represents the probability of having the illness given two positive tests). Effectively, we have created a new test with a much reduced chance of false positives.
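In symbols, the purple curve is just the original formula with $p$ and $q$ replaced by their squares: \[ P_2(r) = \frac{r \cdot p^2}{r \cdot p^2 + (1 - r) \cdot q^2}. \]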

But testing everyone twice seems unnecessary. Just as a low prevalence leads to a reduced probability that a positive result means the disease is actually present, so it also reduces the probability that one is ill given a negative result. Here is the graph of this latter conditional probability (that is, the prevalence of the disease among those who have a negative test):
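In symbols, the curve being plotted is \[ \frac{r \cdot (1 - p)}{r \cdot (1 - p) + (1 - r) \cdot (1 - q)}, \] which stays very close to zero for small $r$; with the numbers from the opening problem, a negative test leaves only about one chance in twenty thousand of being ill.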

So we shouldn’t worry too much about those who have a negative test. We can give the test a second time just to those who have a positive first test. In effect, rather than creating a new test as before, we have restricted to a new population, in which the disease is far more prevalent (as given by the original conditional probability $P(r)$). Here is the graph of the original function $P(r)$ (again in red) together with the graph (in orange) of the probability of having the disease given a positive result and being among those who had a first positive test:

Do you notice something about the purple and orange curves in the graphs above? They are the same. I admit, this surprised me at first. I thought that having a second positive result when restricted to those who already had one would make it more likely that one had the disease than if we tested everyone twice indiscriminately. But the algebra bears out this coincidence of graphs. It doesn’t matter whether everyone is tested twice or just those who first have a positive result; the conditional probability of having the disease after two positive tests is the same either way. In the latter case, of course, far fewer total tests are administered.
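Here is the algebra. In the restricted population the prevalence is $P(r)$, so a second positive test gives \[ \frac{P(r) \cdot p}{P(r) \cdot p + (1 - P(r)) \cdot q} = \frac{r \cdot p^2}{r \cdot p^2 + (1 - r) \cdot q^2} = P_2(r), \] where the middle expression comes from multiplying the numerator and denominator by $r \cdot p + (1 - r) \cdot q$.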

Something we haven’t considered yet is what it means to have one positive and one negative test. Here the relative sizes of $p$ and $1-q$ matter. You can check that if $p + q = 1$, then having one positive and one negative test returns one’s likelihood of having the disease back to that of the overall population (because a sick person and a healthy person have the same chance of getting one positive and one negative result). However, if $q$ is greater than $1-p$ (that is, if a healthy person is more likely to have a false positive than a sick person is to have a false negative), then obtaining different results on two tests means one’s chance of having the disease is slightly less than in the overall population. One last graph, in which the red and blue curves from before reappear, together with a green curve representing the probability of having the disease given one positive and one negative test:

Conversely, if $q$ is less than $1 - p$, then the green curve would lie slightly above the diagonal (the line where the conditional probability equals the overall prevalence $r$).
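For the record, the probability plotted in green is \[ \frac{r \cdot p(1-p)}{r \cdot p(1-p) + (1 - r) \cdot q(1-q)}, \] and everything above comes down to comparing $p(1-p)$ with $q(1-q)$: the two are equal when $p + q = 1$, while $q(1-q)$ is the larger whenever $q > 1 - p$ (assuming, as is typical, that $q < 1/2 < p$).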

The ideas we have been exploring are at the heart of Bayesian analysis, in which a certain assumption (called a prior) about how some characteristic is distributed is fed into a conditional probability model, and a new distribution is obtained. The new distribution becomes the new prior, and the process may be repeated. This kind of analysis depends on a Bayesian view of probability, in which the distribution represents a measure of belief (rather than any necessarily objective knowledge), and how that belief changes with the introduction of new knowledge. In our case, our prior was the assumption that the disease had prevalence $r$, and the new knowledge we introduced was the result of a medical test. This is the same kind of analysis—at a much more elementary level—that Nate Silver made famous (or perhaps that made Nate Silver famous) during recent election seasons. I must say, I was pleased that a student’s question led so neatly into this timely topic.
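To make that updating loop concrete, here is a small Python sketch (function name mine, and assuming the test results are independent given one’s true status), in which the posterior from one positive test is fed back in as the prior for the next; two passes reproduce the two-positive-test probability from above:

    def update(prior, p, q, positive=True):
        # One Bayesian update of the prevalence 'prior' on a test result.
        # p: sensitivity, q: false positive rate.
        if positive:
            return (prior * p) / (prior * p + (1 - prior) * q)
        else:
            return (prior * (1 - p)) / (prior * (1 - p) + (1 - prior) * (1 - q))

    r, p, q = 0.005, 0.98, 0.05
    once = update(r, p, q)       # posterior after one positive test
    twice = update(once, p, q)   # the posterior becomes the new prior
    print(twice)                 # matches r*p**2 / (r*p**2 + (1-r)*q**2)

Two negative results, or a positive followed by a negative, can be explored the same way.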
