### Diagnosis Scenario

You are a nurse practitioner in a primary care setting. You are familiar with a number of validated instruments to detect depression. However, a colleague describes a 2-question instrument that she feels is as effective in detecting probable cases of major depression and would be much quicker to use. You decide to investigate the instrument and its properties further.

You pose the question: In patients with suspected depression, what is the accuracy of a two-question case-finding instrument for depression compared with six previously validated instruments?

Search terms and evidence source:
You search MEDLINE using the terms “depression” or “depressive disorder” and retrieve 10,537 articles. Because you are interested in finding this 2-question tool, you add “sensitivity and specificity” and “questionnaire”. When these terms are combined, you still have 58 articles, too many to find the tool quickly. When you combine the above search terms with “primary care” as a text word, or “primary health care”, the results are narrowed to six, a more manageable number. From the six, you find the following paper: Whooley MA, Avins AL, Miranda J, Browner WS. Case-finding instruments for depression: Two questions are as good as many. J Gen Intern Med 1997;12:439-45.

• Is the evidence from this study valid?
• If valid, is this evidence important?
• If valid and important, can you apply this evidence in caring for your patient?

Whooley MA, Avins AL, Miranda J, Browner WS. Case-finding instruments for depression: Two questions are as good as many. J Gen Intern Med 1997;12:439-45.

#### Are the results of this diagnostic study valid?

Was there an independent, blind comparison with a reference (“gold”) standard of diagnosis?
Yes. The authors used a computerized version (QDIS) of the National Institute of Mental Health Diagnostic Interview Schedule (DIS), which has a sensitivity of 80% and a specificity of 84% compared with DSM-III criteria for depression. It is a 20-minute diagnostic interview that was administered by one of 3 trained psychology students who were blind to the results of the case-finding instruments. The DIS has good test-retest reliability. When a subset of 20 patients was interviewed by all 3 interviewers, the inter-rater reliability was very good (kappa=0.88).
Was the diagnostic test evaluated in an appropriate spectrum of patients (like those in whom it would be used in practice)?
No. The sample consisted of 590 consecutive patients visiting an urgent care clinic. The prevalence of depression in this sample was 18%, which is higher than in other primary care settings. Of the patients, 97% were men, and more than 70% of these were not working even though the mean age was 53. Further testing in a more typical primary care setting, with a more equal distribution of men and women and more employed people, is warranted.
Was the reference standard applied regardless of the diagnostic test result?
Yes. Each patient completed the reference standard interview as well as the 2-question instrument and the 6 case-finding instruments during one sitting lasting approximately 45 minutes. The results of one test had no influence on whether another test was performed.

#### Are the valid results of this diagnostic study important?

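| Diagnostic test result (2-question instrument) | Target disorder (depression) present | Target disorder (depression) absent | Totals |
|---|---|---|---|
| Positive | a = 93 | b = 189 | a + b = 282 |
| Negative | c = 4 | d = 250 | c + d = 254 |
| Totals | a + c = 97 | b + d = 439 | a + b + c + d = 536 |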

\begin{align}
\mathit{Sensitivity} &= a/(a+c)\\
&= 93/97\\
&= 96\%
\end{align}

\begin{align}
\mathit{Specificity} &= d/(b+d)\\
&= 250/439\\
&= 57\%
\end{align}

\begin{align}
\text{Likelihood Ratio for a positive test result ($LR+$)} &= \mathit{sens}/(1-\mathit{spec})\\
&= 0.96/(1-0.57)\\
&= 2.2
\end{align}

\begin{align}
\text{Likelihood Ratio for a negative test result ($LR-$)} &= (1-\mathit{sens})/\mathit{spec}\\
&= (1-0.96)/0.57\\
&= 0.07
\end{align}

\begin{align}
\text{Positive Predictive Value} &= a/(a+b)\\
&= 93/282\\
&= 33\%
\end{align}

\begin{align}
\text{Negative Predictive Value} &= d/(c+d)\\
&= 250/254\\
&= 98\%
\end{align}

\begin{align}
\text{Pre-test Probability (prevalence)} &= (a+c)/(a+b+c+d)\\
&= 97/536\\
&= 18.1\%
\end{align}
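
As a cross-check on the hand calculations above, the same measures can be computed directly from the four cell counts. The following is a minimal Python sketch; the function name `two_by_two_metrics` and the dictionary keys are illustrative choices, not part of the worksheet or the Whooley paper.

```python
def two_by_two_metrics(a, b, c, d):
    """Return accuracy measures for a 2x2 diagnostic table.

    a = true positives,  b = false positives,
    c = false negatives, d = true negatives.
    """
    sens = a / (a + c)
    spec = d / (b + d)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "prevalence": (a + c) / (a + b + c + d),
    }


# Cell counts from the 2x2 table (2-question instrument vs. reference standard).
metrics = two_by_two_metrics(a=93, b=189, c=4, d=250)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the code carries full precision through each step, its results may differ slightly in the last digit from values worked by hand with rounded intermediates.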

\begin{align}
\text{Pre-test odds} &= \mathit{prevalence}/(1-\mathit{prevalence})\\
&= 0.181/0.819\\
&= 0.22
\end{align}

\begin{align}
\text{Post-test odds} &= \text{Pre-test odds} \times \text{Likelihood Ratio}\\
&= 0.22 \times 2.2\\
&= 0.48
\end{align}

\begin{align}
\text{Post-test Probability} &= \text{Post-test odds}/(\text{Post-test odds} + 1)\\
&= 0.48/1.48\\
&= 0.32
\end{align}
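
The odds-to-probability chain above can be wrapped in a small helper, which also makes it easy to repeat the calculation for a negative result. This is a sketch in Python; `post_test_probability` is a hypothetical helper name, and the inputs are the rounded values from this worksheet.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (post_test_odds + 1)


prevalence = 0.181   # pre-test probability in the study sample
lr_positive = 2.2    # rounded LR+ from the worksheet
lr_negative = 0.07   # rounded LR- from the worksheet

after_positive = post_test_probability(prevalence, lr_positive)
after_negative = post_test_probability(prevalence, lr_negative)
print(f"Post-test probability after a positive result: {after_positive:.2f}")
print(f"Post-test probability after a negative result: {after_negative:.3f}")
```

With these inputs, a negative result brings the probability of depression down to roughly 1.5%. The positive-result value can differ from the hand calculation in the second decimal place because the code does not round the intermediate odds.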

#### Can you apply this valid, important evidence about a diagnostic test in caring for your patient?

Is the diagnostic test available, affordable, accurate, and precise in your setting?
The test is clearly affordable in terms of client and practitioner time and tools. In the study, it correctly identified 96% of patients with depression (sensitivity) and correctly ruled out depression in 57% of patients without it (specificity). Predictive values, and therefore post-test probabilities, would vary with the prevalence of depression in another clinical setting.
Can you generate a clinically sensible estimate of your patient’s pre-test probability (from practice data, from personal experience, from the report itself, or from clinical speculation)?
This report would help to generate such an estimate only if your practice population consisted primarily of unemployed men. A chart review in your own practice would give an estimate of pre-test probability.
Will the resulting post-test probabilities affect your management and help your patient? (Could it move you across a test-treatment threshold?; Would your patient be a willing partner in carrying it out?)
After a positive test, the post-test probability is about 0.32 (0.33 without intermediate rounding, which equals the positive predictive value), up from a pre-test probability of 0.18. A patient with a positive result is therefore more likely to be depressed and should be considered for further assessment. Patients would be very likely to answer two questions. If the test were negative, this would virtually rule out depression: the post-test probability of depression would be about 1.5% (pre-test odds 0.22 × LR− 0.07 ≈ 0.015).
The instrument needs more testing for reliability and needs to be combined with additional assessment. It is not yet known whether early detection of depression will improve outcomes.
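
Because post-test probabilities depend on the pre-test probability in your own setting, it can help to tabulate them for a few candidate values. The sketch below does this in Python. The pre-test probabilities of 5% and 10% are purely hypothetical illustrations (not estimates from any study), 18% is the study prevalence, and the likelihood ratios are the rounded worksheet values, assumed here to carry over unchanged to other settings.

```python
LR_POSITIVE = 2.2   # rounded LR+ from the worksheet
LR_NEGATIVE = 0.07  # rounded LR- from the worksheet


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (post_test_odds + 1)


def tabulate(pre_test_probabilities):
    """Return (pre-test, post-test if positive, post-test if negative) rows."""
    return [
        (
            p,
            post_test_probability(p, LR_POSITIVE),
            post_test_probability(p, LR_NEGATIVE),
        )
        for p in pre_test_probabilities
    ]


# 0.05 and 0.10 are hypothetical; 0.18 is the prevalence in the study sample.
rows = tabulate([0.05, 0.10, 0.18])
print("pre-test  after +  after -")
for pre, pos, neg in rows:
    print(f"{pre:8.2f}  {pos:7.3f}  {neg:7.3f}")
```

Under these assumptions, the lower the pre-test probability, the lower the probability of depression after a positive result, which is one way to see why a positive answer calls for further assessment.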

Whooley et al suggest administering the questionnaire only to high-risk patients if it is too time consuming to administer to all patients. However, it was not tested in this way, and would need further testing on high-risk populations. Because of the high false-positive rate, other assessment would need to be done in conjunction with this test, if used for case-finding.

