4. SYNTHESISING TEST ACCURACY DATA

Selected studies evaluating test accuracy must provide data comparing the test with
the gold standard in sufficient detail to allow generation of 2x2 tables, from which
the accuracy indices can be computed. For example, 2x2 tables of the cervico-vaginal
fibronectin test result (positive or negative) against spontaneous preterm birth
(present or absent) could be produced from each study. Reviewers must obtain
missing information from primary investigators. Once the numerical data have been
obtained from the primary studies, the next steps are exploration of variation in
results from study to study (heterogeneity) followed, if appropriate, by synthesis
of their results (meta-analysis).
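As an illustration of how accuracy indices follow from a single 2x2 table, the sketch below computes the usual ones in Python. The class and function names and the counts are hypothetical and chosen only for this example; the code assumes that no cell used as a divisor is zero.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one study: test result versus gold standard."""
    tp: int  # test positive, condition present
    fp: int  # test positive, condition absent
    fn: int  # test negative, condition present
    tn: int  # test negative, condition absent


def accuracy_indices(t: TwoByTwo) -> dict:
    """Common accuracy indices; assumes no divisor below is zero."""
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
        "dor": (t.tp * t.tn) / (t.fp * t.fn),
    }


# Hypothetical counts, not taken from any study.
example = TwoByTwo(tp=20, fp=30, fn=5, tn=145)
indices = accuracy_indices(example)
```

In a review, one such table would be built for every included study before any exploration of heterogeneity or pooling.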

Any variation in results between studies (heterogeneity) should be investigated.
There is likely to be some heterogeneity in population, test, gold standard and
study quality, and conclusions must be drawn cautiously if heterogeneity is
significant. Many statistical methods (12, 13) exist to detect whether the apparent
differences in test accuracy among studies are due to chance alone. However, it is
recognised that statistical methods tend to have limited power to detect
heterogeneity (14). It has therefore been recommended that graphical methods
(15, 16, 17) should also be used to explore heterogeneity (18). This may involve
exploring the relationship between sensitivities and specificities for the studies
included in the meta-analysis. Examination of the causes of heterogeneity should be
planned a priori; otherwise it may be open to bias.
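One commonly used statistical check for heterogeneity is Cochran's Q, which under homogeneity is compared with a chi-squared distribution on k - 1 degrees of freedom for k studies. The sketch below applies it to log diagnostic odds ratios; this is one possible choice and not necessarily the method of the cited references. The function names, the 0.5 continuity correction and the study counts are assumptions made for illustration.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio and its approximate variance for one study.

    A continuity correction is added to every cell (an assumption made here
    so that zero cells do not break the calculation).
    """
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def cochran_q(studies):
    """Return (Q, degrees of freedom, I^2 in percent) for the log DORs."""
    estimates = [log_dor_and_variance(*s) for s in studies]
    weights = [1 / var for _, var in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    df = len(studies) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical (tp, fp, fn, tn) counts for four studies.
studies = [(20, 30, 5, 145), (12, 10, 8, 70), (40, 25, 20, 215), (15, 12, 10, 90)]
q, df, i2 = cochran_q(studies)
```

Because such tests have limited power when few studies are available, a non-significant Q should not be read as evidence of homogeneity.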

Essentially, there are two practical approaches. First, subgroup analyses can be
conducted to see whether variations in population, test, outcomes and study quality
between studies affect the estimate of diagnostic accuracy (19, 20). Second,
meta-regression analysis may be performed to determine which of the several
variables considered important a priori account for the differences between the
studies (21). Where heterogeneity remains unexplained, data synthesis and
interpretation should be undertaken with caution.
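A minimal sketch of the second approach is given below: a fixed-effect meta-regression of the log diagnostic odds ratio on a single study-level covariate, fitted by inverse-variance weighted least squares. With a binary covariate the slope is simply the difference between the two subgroup means, so the same code also illustrates a subgroup comparison. The covariate (`high_quality`), the counts and the function names are hypothetical, and standard errors are omitted for brevity.

```python
import math


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio and variance, with an assumed 0.5 correction."""
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def weighted_least_squares(x, y, w):
    """Fit y = intercept + slope * x by weighted least squares."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: (tp, fp, fn, tn, high_quality), with high_quality 1 or 0.
studies = [
    (20, 30, 5, 145, 1),
    (12, 10, 8, 70, 0),
    (40, 25, 20, 215, 1),
    (15, 12, 10, 90, 0),
]
effects = [log_dor(*s[:4]) for s in studies]
y = [estimate for estimate, _ in effects]
w = [1 / variance for _, variance in effects]  # inverse-variance weights
x = [s[4] for s in studies]

intercept, slope = weighted_least_squares(x, y, w)
relative_dor = math.exp(slope)  # DOR in high-quality relative to other studies
```

A relative diagnostic odds ratio far from 1 would suggest that the covariate explains part of the variation between studies, subject to the usual caution about few studies and multiple covariates.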

In meta-analysis, results from individual studies are combined mathematically to
generate a summary (pooled) result. The summary measures used to report pooled
results are shown in Table 3.
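To make the idea of pooling concrete, the sketch below combines sensitivities, and separately specificities, across studies using fixed-effect inverse-variance weighting on the logit scale. This is only one simple option and it assumes the studies are homogeneous; the function name, the 0.5 continuity correction and the counts are assumptions made for illustration.

```python
import math


def pool_proportions(events_and_totals, correction=0.5):
    """Fixed-effect pooled proportion via inverse-variance weighting of logits."""
    numerator = denominator = 0.0
    for events, total in events_and_totals:
        e = events + correction
        non_events = total - events + correction
        logit = math.log(e / non_events)
        variance = 1 / e + 1 / non_events
        numerator += logit / variance
        denominator += 1 / variance
    pooled_logit = numerator / denominator
    return 1 / (1 + math.exp(-pooled_logit))  # back-transform to a proportion


# Hypothetical (tp, fp, fn, tn) counts for four studies.
studies = [(20, 30, 5, 145), (12, 10, 8, 70), (40, 25, 20, 215), (15, 12, 10, 90)]

# Sensitivity uses those with the condition; specificity uses those without it.
summary_sensitivity = pool_proportions([(tp, tp + fn) for tp, fp, fn, tn in studies])
summary_specificity = pool_proportions([(tn, tn + fp) for tp, fp, fn, tn in studies])
```

Pooling sensitivity and specificity separately in this way ignores any relationship between the two measures across studies.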

Table 3. Summary measures and their use in meta-analysis of test accuracy studies using dichotomous results

| Summary measure | Proportion* | Description |
| --- | --- | --- |
| Summary sensitivity (true positive rate) | 58% | A method of combining the results from primary studies of the proportion of people with disease that is correctly identified as such, independent of specificities. |
| Summary specificity (true negative rate) | 58% | A method of combining the results from primary studies of the proportion of people without disease that is correctly identified as such, independent of sensitivities. |
| Summary receiver operating characteristics curve (sROC) | 73% | A method of combining sensitivity and specificity results from individual primary studies that takes into account the relationship between these two measures. The result, which is the average accuracy of the test, is usually presented as the area under the curve. This method provides a graphical illustration of the overall accuracy of the test and defines a point where the test is at its most accurate. |
| Summary predictive values | 18% | A method of combining the results from primary studies of the proportions of test positive (or negative) people who truly have (or do not have) disease. |
| Summary likelihood ratios | 22% | A method of combining the results from primary studies of the ratio of the probability of a positive (or negative) test result in patients with disease to the probability of the same test result in patients without disease. |
| Summary diagnostic odds ratio | 8% | A method of combining the results from primary studies of the odds of a positive test result in patients with disease compared to the odds of the same test result in patients without disease. |

*Based on Honest et al (29).
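The sROC entry in the table can be illustrated with one well-known construction, the Moses-Littenberg regression, used here as an assumed example and not necessarily the method behind the figures above. For each study let D = logit(TPR) - logit(FPR) and S = logit(TPR) + logit(FPR); fitting D = a + bS by ordinary least squares gives a curve in ROC space, and the point on it where sensitivity equals specificity is often reported as Q*. The function names, the 0.5 continuity correction and the counts are hypothetical, and the code assumes b is not equal to 1.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def fit_sroc(studies, correction=0.5):
    """Ordinary least squares fit of D = a + b * S across studies."""
    d_vals, s_vals = [], []
    for tp, fp, fn, tn in studies:
        tpr = (tp + correction) / (tp + fn + 2 * correction)
        fpr = (fp + correction) / (fp + tn + 2 * correction)
        d_vals.append(logit(tpr) - logit(fpr))
        s_vals.append(logit(tpr) + logit(fpr))
    n = len(studies)
    s_bar, d_bar = sum(s_vals) / n, sum(d_vals) / n
    sxx = sum((s - s_bar) ** 2 for s in s_vals)
    b = sum((s - s_bar) * (d - d_bar) for s, d in zip(s_vals, d_vals)) / sxx
    return d_bar - b * s_bar, b


def sroc_tpr(fpr, a, b):
    """Sensitivity on the fitted curve at a given false positive rate."""
    return expit(a / (1 - b) + (1 + b) / (1 - b) * logit(fpr))


def q_star(a):
    """Point on the curve where sensitivity equals specificity."""
    return expit(a / 2)


def sroc_auc(a, b, steps=1000):
    """Area under the fitted curve by the midpoint rule over FPR in (0, 1)."""
    return sum(sroc_tpr((i + 0.5) / steps, a, b) for i in range(steps)) / steps


# Hypothetical (tp, fp, fn, tn) counts for four studies.
studies = [(20, 30, 5, 145), (12, 10, 8, 70), (40, 25, 20, 215), (15, 12, 10, 90)]
a, b = fit_sroc(studies)
auc = sroc_auc(a, b)
q = q_star(a)
```

The area here is taken over the whole range of false positive rates; restricting it to the range actually observed in the included studies is a common alternative.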

Whilst conceptually straightforward, in practice, there is debate about how best to

statistically summarise results from several primary test accuracy studies. (2, 22, 23,