24, 25, 26, 27, 28, 29). The lack of consensus was clearly evident in a recent survey

of test accuracy reviews found in the Database of Abstracts of Reviews of Effectiveness

(DARE) from 1994-2000, which showed that pooled sensitivity or specificity was

used in 58%, summary receiver operating characteristic (sROC) plots in 73%, pooled

predictive values in 18%, pooled likelihood ratios (LRs) in 22%, and pooled

diagnostic odds ratios in 8% of the meta-analyses (29).
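To make the summary measures named above concrete, the following minimal sketch computes each of them from a single 2x2 table. The counts are hypothetical and chosen only for illustration; they do not come from any study cited here, and the function name is our own.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Return common test accuracy measures from one 2x2 table.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives, and true negatives. Assumes no cell is zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                            # positive predictive value
        "npv": tn / (tn + fn),                            # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),   # LR for a positive result
        "lr_negative": (1 - sensitivity) / specificity,   # LR for a negative result
        "dor": (tp * tn) / (fp * fn),                     # diagnostic odds ratio
    }


# Hypothetical counts, for illustration only.
measures = accuracy_measures(tp=90, fp=20, fn=10, tn=180)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on the prevalence in the study sample, whereas the LRs and the diagnostic odds ratio are derived from sensitivity and specificity alone.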

From a meta-analysis, it should be possible to interpret the result in terms of clinical

importance (not just statistical significance). In this respect, the LR (25, 26, 27) is

believed to represent an improvement over sensitivity, specificity, and predictive

values. Many authorities consider pooling of sensitivity, specificity, and predictive

values inappropriate because these measures do not behave independently. On the other hand,

pooled (or summary) LRs can be used within a clinical context, as shown in Table 4.

Table 4. An example of clinical application of pooled likelihood ratios

Population &                        Pretest Probability,   Likelihood Ratio   Posttest Probability,
Outcome Measure                     % (95% CI)             (95% CI)           % (95% CI)

Delivery <34 weeks’ gestation
  Positive test result              32.5 (24.2-40.8)       2.6 (1.8-3.7)      55.6 (43.4-67.3)
  Negative test result              32.5 (24.2-40.8)       0.2 (0.1-0.5)      8.2 (3.1-20.1)

Delivery within 1 week of testing
  Positive result                   6.6 (4.3-8.9)          5.0 (3.8-6.4)      25.8 (18.0-35.5)
  Negative result                   6.6 (4.3-8.9)          0.2 (0.1-0.4)      1.2 (0.4-3.1)

Based on Chien et al (30)
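The step from pretest to posttest probability in Table 4 follows the standard odds form of Bayes' theorem: convert the pretest probability to odds, multiply by the LR, and convert back to a probability. The sketch below illustrates this with the point estimates printed in Table 4; the function name and row labels are ours, and the confidence intervals are not reproduced.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability (as a proportion) to a posttest probability."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest_probability must be strictly between 0 and 1")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Point estimates as printed in Table 4 (percentages converted to proportions).
rows = [
    ("Delivery <34 weeks, positive test", 0.325, 2.6),
    ("Delivery <34 weeks, negative test", 0.325, 0.2),
    ("Delivery within 1 week, positive test", 0.066, 5.0),
    ("Delivery within 1 week, negative test", 0.066, 0.2),
]

for label, pretest, lr in rows:
    posttest = posttest_probability(pretest, lr)
    print(f"{label}: {100 * posttest:.1f}%")
```

Using the rounded LRs, the first row gives about 55.6%, in agreement with Table 4. The other rows differ slightly from the tabulated values, most likely because the LRs in the table are rounded to one decimal place.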

However, potentially misleading summary LRs might be obtained from pooling LRs

obtained from studies with extreme and diverging prevalence. An alternative way of

summarising the average performance of a dichotomous test from multiple studies

(particularly those with different thresholds) is to produce an sROC plot. This approach

takes into account the variation in prevalence and is the preferred meta-analytic

method of many experts. The area under the curve of an sROC plot is a mathematical

representation of the average accuracy of the test. However, unlike summary LRs,

an sROC plot does not lend itself readily to clinical application. Given the lack of consensus

about the most appropriate summary measures, it may be prudent to use both

summary LRs and sROC plots when performing a meta-analysis.
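The text does not say how the sROC curve should be fitted. As one possible illustration, the sketch below assumes the simple linear regression approach commonly attributed to Moses and Littenberg: for each study, D = logit(TPR) - logit(FPR) is regressed on S = logit(TPR) + logit(FPR), and the fitted line is transformed back to ROC space. The study counts are hypothetical, and the 0.5 continuity correction and unweighted least squares are our assumptions, not a recommendation drawn from the source.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def fit_sroc(studies):
    """Fit D = a + b*S by unweighted ordinary least squares.

    studies: iterable of (tp, fp, fn, tn) counts, one tuple per study.
    Adds 0.5 to every cell so that the logits are always defined.
    """
    d_values, s_values = [], []
    for tp, fp, fn, tn in studies:
        tpr = (tp + 0.5) / (tp + fn + 1.0)
        fpr = (fp + 0.5) / (fp + tn + 1.0)
        d_values.append(logit(tpr) - logit(fpr))  # log diagnostic odds ratio
        s_values.append(logit(tpr) + logit(fpr))  # proxy for the test threshold
    n = len(d_values)
    mean_s = sum(s_values) / n
    mean_d = sum(d_values) / n
    sxx = sum((s - mean_s) ** 2 for s in s_values)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(s_values, d_values))
    b = sxy / sxx
    a = mean_d - b * mean_s
    return a, b


def sroc_tpr(fpr, a, b):
    """Expected true positive rate at a given false positive rate (needs b != 1)."""
    return expit(a / (1 - b) + ((1 + b) / (1 - b)) * logit(fpr))


def sroc_auc(a, b, steps=2000):
    """Approximate the area under the sROC curve with the trapezoidal rule.

    Assumes |b| < 1, so that the curve runs from (0, 0) to (1, 1).
    """
    xs = [i / steps for i in range(1, steps)]
    points = [(0.0, 0.0)] + [(x, sroc_tpr(x, a, b)) for x in xs] + [(1.0, 1.0)]
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical studies as (tp, fp, fn, tn), for illustration only.
hypothetical_studies = [
    (45, 10, 5, 90),
    (30, 20, 10, 80),
    (60, 5, 20, 95),
    (25, 15, 5, 55),
]

a, b = fit_sroc(hypothetical_studies)
print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
print(f"area under the sROC curve = {sroc_auc(a, b):.3f}")
```

More elaborate approaches, such as weighted regression or hierarchical models, exist; this sketch is meant only to show how per-study sensitivity and specificity pairs can be combined into a single curve and summarised by its area.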

CONCLUSION

Many existing reviews of test accuracy offer limited guidance for practice because

they do not apply a rigorous scientific methodology to limit bias in their assembly,

appraisal, and synthesis of primary studies. In this paper, we have described

methods for conducting a high-quality test accuracy review. By understanding this

process, readers should be able to appraise test accuracy reviews with an informed

mind, thus minimising erroneous inferences.

REFERENCES