24, 25, 26, 27, 28, 29). The lack of consensus was clearly evident in a recent survey
of test accuracy reviews found in the Database of Abstracts of Reviews of Effectiveness
(DARE) from 1994 to 2000, which showed that pooled sensitivity or specificity was
used in 58%, summary receiver operating characteristic (sROC) plots in 73%, pooled
predictive values in 18%, pooled likelihood ratios (LRs) in 22%, and pooled
diagnostic odds ratios in 8% of the meta-analyses (29).
A meta-analysis should allow its results to be interpreted in terms of clinical
importance, not just statistical significance. In this respect, the LR (25, 26, 27) is
believed to represent an improvement over sensitivity, specificity, and predictive
values. Many authorities consider pooling of sensitivity, specificity, and predictive
values inappropriate because these measures do not behave independently. On the other
hand, pooled (or summary) LRs can be used within a clinical context, as shown in Table 4.
Table 4. An example of clinical application of pooled likelihood ratios

Population &                     Pretest Probability,   Likelihood Ratio   Posttest Probability,
Outcome Measure                  % (95% CI)             (95% CI)           % (95% CI)

Delivery <34 weeks' gestation
  Positive test result           32.5 (24.2-40.8)       2.6 (1.8-3.7)      55.6 (43.4-67.3)
  Negative test result           32.5 (24.2-40.8)       0.2 (0.1-0.5)      8.2 (3.1-20.1)

Delivery within 1 week of testing
  Positive result                6.6 (4.3-8.9)          5.0 (3.8-6.4)      25.8 (18.0-35.5)
  Negative result                6.6 (4.3-8.9)          0.2 (0.1-0.4)      1.2 (0.4-3.1)

Based on Chien et al (30)
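The step from pretest to posttest probability in Table 4 follows the standard odds form of Bayes' theorem: posttest odds = pretest odds x LR. The following is a minimal sketch of that arithmetic, using the point estimates for a positive test result and delivery <34 weeks' gestation. The function name is illustrative and does not come from the source, and the sketch does not reproduce the confidence intervals.

```python
def posttest_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability (as a fraction, 0-1) into a posttest
    probability using a likelihood ratio, via the odds form of Bayes' theorem."""
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Point estimates from Table 4: pretest probability 32.5%, positive LR 2.6.
p = posttest_probability(0.325, 2.6)
print(f"{100 * p:.1f}%")  # prints 55.6%
```

Applying the same function to the other rows gives values close to, but not always identical with, the tabulated posttest probabilities, presumably because the published LRs are rounded to one decimal place.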
However, potentially misleading summary LRs might be obtained when LRs are pooled
from studies with extreme and diverging prevalence. An alternative way of
summarising the average performance of a dichotomous test from multiple studies
(particularly those with different thresholds) is to produce an sROC plot. This method
takes into account the variation in prevalence and is the preferred meta-analytic
method of many experts. The area under the curve of an sROC plot is a mathematical
representation of the average accuracy of the test. However, unlike summary LRs,
the sROC plot does not lend itself readily to clinical application. Given the lack of
consensus about the most appropriate summary measures, it may be prudent to use both
summary LRs and sROC plots when performing a meta-analysis.
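The text does not say how an sROC curve is fitted. One widely used option is the Moses-Littenberg linear regression approach, sketched below as an illustration only. For each study it computes D = logit(TPR) - logit(FPR) and S = logit(TPR) + logit(FPR), then fits D = a + bS by unweighted least squares. The 2x2 counts are hypothetical and are not taken from any study cited here; the function names and the 0.5 continuity correction are likewise assumptions of this sketch.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def fit_sroc(studies):
    """Fit D = a + b*S by ordinary least squares (Moses-Littenberg style).

    studies: iterable of (tp, fp, fn, tn) counts, one tuple per primary study.
    Returns the intercept a and slope b.
    """
    d_vals, s_vals = [], []
    for tp, fp, fn, tn in studies:
        # Add 0.5 to every cell so that zero cells do not break the logit.
        tpr = (tp + 0.5) / (tp + fn + 1.0)
        fpr = (fp + 0.5) / (fp + tn + 1.0)
        d_vals.append(logit(tpr) - logit(fpr))  # log diagnostic odds ratio
        s_vals.append(logit(tpr) + logit(fpr))  # proxy for test threshold
    n = len(d_vals)
    mean_s = sum(s_vals) / n
    mean_d = sum(d_vals) / n
    sxx = sum((s - mean_s) ** 2 for s in s_vals)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = mean_d - b * mean_s
    return a, b


def sroc_sensitivity(fpr: float, a: float, b: float) -> float:
    """Sensitivity on the fitted curve at a given false positive rate.

    Rearranging D = a + b*S gives
    logit(TPR) = a / (1 - b) + ((1 + b) / (1 - b)) * logit(FPR).
    """
    logit_tpr = a / (1.0 - b) + (1.0 + b) / (1.0 - b) * logit(fpr)
    return 1.0 / (1.0 + math.exp(-logit_tpr))


# Hypothetical (tp, fp, fn, tn) counts for four primary studies.
hypothetical_studies = [(45, 10, 5, 90), (30, 20, 10, 80),
                        (60, 5, 20, 95), (25, 15, 5, 55)]
a, b = fit_sroc(hypothetical_studies)
for fpr in (0.05, 0.10, 0.20):
    print(f"FPR {fpr:.2f} -> sensitivity {sroc_sensitivity(fpr, a, b):.3f}")
```

Plotting the fitted sensitivity against the false positive rate over the range observed in the primary studies gives the sROC curve. Weighted or hierarchical models are alternatives that this sketch does not cover.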
CONCLUSION
Many existing reviews of test accuracy offer limited guidance for practice because
they do not apply a rigorous scientific methodology to limit bias in their assembly,
appraisal, and synthesis of primary studies. In this paper, we have described
methods for conducting a high-quality test accuracy review. By understanding this
process, readers should be able to appraise test accuracy reviews with an informed
mind, thus minimising erroneous inferences.
REFERENCES