## Tuesday, November 8, 2011

### Diagnostic and Screening Test Validity

Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are the four primary measures for assessing the validity of a diagnostic or screening test.  You see the words "sensitivity" and "specificity" bandied about in the medical literature and discussed (albeit briefly) in some epidemiology or biostatistics textbooks, but I had yet to encounter a description of these diagnostic tools as concise, well-written, and elegantly explained as the one in chapter eight of Trisha Greenhalgh's "How to Read a Paper:  The Basics of Evidence-Based Medicine".  (I had been considering a blog post on this topic for quite some time but hadn't gotten around to it -- big surprise -- until a recent email exchange with my adviser about my difficulty in developing my statistical analysis plan, in which she clarified that my analysis should include a "predictive value" component.  The "predictive value" I eventually settle on may have little to do with sensitivity, specificity, PPV, and NPV; nevertheless, these concepts form the foundation it will draw upon.  This blog post borrows heavily from Greenhalgh's text.)

Sensitivity and specificity can be introduced in several ways, ranging from conditional probability statements to a "jury verdict versus true criminal status" 2x2 table.  I learned sensitivity/specificity both ways and found that the two approaches complemented each other and enhanced my understanding of a concept that arises in the epi and medical literature with tremendous frequency.  In the "jury verdict versus true criminal status" method, all possible outcomes are presented in the four cells of a 2x2 table.  In an ideal world, all murderers would be rightly convicted and all the innocent would be rightly acquitted.  But the ideal world is rarely realized, so we compute statistics to summarize the quality of a test, establish benchmarks, and choose either to use or to ignore a test depending on its validity.  In this example, the sensitivity is the proportion of murderers who were convicted -- a/(a + c) -- whereas the specificity is the proportion of non-murderers who were acquitted -- d/(b + d).  The PPV is the probability that someone convicted of murder actually did it -- a/(a + b) -- and the NPV is the probability that a person acquitted is actually innocent -- d/(c + d).

| Jury Verdict | True Criminal Status: Murderer (a + c) | True Criminal Status: Not murderer (b + d) |
|---|---|---|
| Guilty (a + b) | rightly convicted (a) | wrongly convicted (b) |
| Innocent (c + d) | wrongly acquitted (c) | rightly acquitted (d) |

More formally, the definitions -- along with their probability statements and mathematical calculations assuming a 2x2 table configuration -- are presented below:

| Test Primary Name | Alternative Name | Central Question the Test Answers | Conditional Probability Statement\* | Test Formula\*\* |
|---|---|---|---|---|
| Sensitivity | True Positive Rate | How good is test at identifying those with the condition? | $P(T^+ \mid D^+) = P(T^+ \cap D^+) / P(D^+)$ | a/(a + c) |
| Specificity | True Negative Rate | How good is test at excluding those without the condition? | $P(T^- \mid D^-) = P(T^- \cap D^-) / P(D^-)$ | d/(b + d) |
| Positive Predictive Value (PPV) | Post-test Probability of Positive Test | What is probability of having condition if test is positive? | $P(D^+ \mid T^+) = P(D^+ \cap T^+) / P(T^+)$ | a/(a + b) |
| Negative Predictive Value (NPV) | Post-test Probability of Negative Test | What is probability of not having condition if test is negative? | $P(D^- \mid T^-) = P(D^- \cap T^-) / P(T^-)$ | d/(c + d) |

\* P denotes probability, T denotes test, D denotes disease, and the + and – indicate positivity or negativity.

\*\* The letters a, b, c, & d correspond to the four cells of a 2x2 table where a is the upper-left, b is the upper-right, c is the lower-left, and d is the lower-right.
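It may help to spell out why each conditional probability statement reduces to its cell formula. The sketch below assumes the probabilities are estimated from the table itself and writes n = a + b + c + d for the table total (n is my notation, not Greenhalgh's). Each joint or marginal probability is then a cell count or margin divided by n, and n cancels:

$$
\begin{aligned}
\text{Sensitivity} &= P(T^+ \mid D^+) = \frac{P(T^+ \cap D^+)}{P(D^+)} = \frac{a/n}{(a + c)/n} = \frac{a}{a + c} \\
\text{Specificity} &= P(T^- \mid D^-) = \frac{P(T^- \cap D^-)}{P(D^-)} = \frac{d/n}{(b + d)/n} = \frac{d}{b + d} \\
\text{PPV} &= P(D^+ \mid T^+) = \frac{P(D^+ \cap T^+)}{P(T^+)} = \frac{a/n}{(a + b)/n} = \frac{a}{a + b} \\
\text{NPV} &= P(D^- \mid T^-) = \frac{P(D^- \cap T^-)}{P(T^-)} = \frac{d/n}{(c + d)/n} = \frac{d}{c + d}
\end{aligned}
$$

Sensitivity and specificity divide by the column totals (true disease status), whereas PPV and NPV divide by the row totals (test result).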

Symbolically, the data can (and ought to) be presented by way of a 2x2 table:

| Test Result | Reference Criterion/Condition/Disease: Diseased (a + c) | Reference Criterion/Condition/Disease: Not Diseased (b + d) |
|---|---|---|
| Positive (a + b) | True Positive (a) | False Positive (b) |
| Negative (c + d) | False Negative (c) | True Negative (d) |

Now consider the example presented by Greenhalgh to illustrate the calculation and interpretation of sensitivity, specificity, PPV, and NPV.  The data are presented below with the calculations following:

| Glucose Test Result | Gold Standard Glucose Test (2h OGTT): Diseased (6 + 21 = 27) | Gold Standard Glucose Test (2h OGTT): Not Diseased (7 + 966 = 973) |
|---|---|---|
| Glucose Present (6 + 7 = 13) | 6 | 7 |
| Glucose Absent (21 + 966 = 987) | 21 | 966 |

Sensitivity:  6/27 = 22.2%
Specificity:  966/973 = 99.3%
Positive Predictive Value (PPV):  6/13 = 46.2%
Negative Predictive Value (NPV):  966/987 = 97.9%
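For anyone who would rather check the arithmetic in code, here is a minimal Python sketch of the four cell formulas applied to the counts above. The function name `validity_measures` and the dictionary keys are my own choices for illustration, not anything from Greenhalgh's text.

```python
from fractions import Fraction


def validity_measures(a, b, c, d):
    """Compute the four validity measures from the cells of a 2x2 table.

    a = true positives  (upper-left),  b = false positives (upper-right),
    c = false negatives (lower-left),  d = true negatives  (lower-right).
    Raises ZeroDivisionError if a row or column total is zero.
    """
    return {
        "sensitivity": Fraction(a, a + c),  # a / (a + c)
        "specificity": Fraction(d, b + d),  # d / (b + d)
        "ppv": Fraction(a, a + b),          # a / (a + b)
        "npv": Fraction(d, c + d),          # d / (c + d)
    }


# Cell counts from the glucose example above.
measures = validity_measures(a=6, b=7, c=21, d=966)
for name, value in measures.items():
    print(f"{name}: {float(value):.1%}")
# sensitivity: 22.2%
# specificity: 99.3%
# ppv: 46.2%
# npv: 97.9%
```

Exact fractions are used only so that rounding happens once, at print time; plain floating-point division would give the same percentages here.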

In this scenario, the sensitivity is lousy but the specificity is quite good.  That is, the test captures only about a fifth (22.2%) of those who are actually diseased, whereas it correctly identifies nearly all (99.3%) of those who are actually disease-free.  The PPV is the probability that a person is actually diseased given that glucose is present; a value this low (46.2%) would warrant a second or follow-up test.  The NPV, although not 100%, indicates that the probability of not being diseased given a negative test result is quite high (97.9%).
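The PPV can also be recovered from the sensitivity, the specificity, and the proportion diseased in the sample by applying Bayes' theorem to the conditional probability statements given earlier. This step is not taken from Greenhalgh's chapter; it is a standard identity, sketched here in Python with a function name (`ppv_from_rates`) and a set of alternative prevalences that are purely hypothetical illustrations.

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """PPV via Bayes' theorem: P(D+|T+) = P(T+|D+) P(D+) / P(T+).

    P(T+) is expanded as true positives plus false positives:
    P(T+|D+) P(D+) + P(T+|D-) P(D-), where P(T+|D-) = 1 - specificity.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


sens = 6 / 27      # sensitivity from the glucose example
spec = 966 / 973   # specificity from the glucose example

# With the proportion diseased in the sample (27 of 1000), this
# reproduces the PPV computed directly from the table, 6/13.
print(f"{ppv_from_rates(sens, spec, 27 / 1000):.1%}")  # 46.2%

# Hypothetical prevalences (not from the source), to show that the same
# test yields a different PPV when the disease is rarer or more common.
for prevalence in (0.01, 0.10, 0.30):
    print(f"prevalence {prevalence:.0%}: PPV {ppv_from_rates(sens, spec, prevalence):.1%}")
```

Holding sensitivity and specificity fixed, the PPV rises as the disease becomes more common among those tested, which is one way to see that a predictive value describes a test result in a particular setting and not the test by itself.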

One final point, and another crucial distinction Greenhalgh makes between sensitivity/specificity and PPV/NPV that is sometimes glossed over in other texts, is this:

"As a rule of thumb, the sensitivity or specificity tells you about the test in general, whereas the predictive value tells you about what a particular test result means for the patient in front of you.  Hence, sensitivity and specificity are generally used more by epidemiologists and public health specialists whose day-to-day work involves making decisions about populations" (pp. 103, emphasis in original text).