Department of Information Engineering, Aston University, Birmingham B4 7ET, UK
Stevinson and her co-workers [1] deal briefly with the issue of statistical power, stating that the lack of evidence from previous studies prevented them from carrying out a formal power calculation. They then argue that, since statistically significant effects have previously been observed in groups of 11-30 patients, a trial size of around 60, divided over three arms, should suffice for a preliminary study. What they signally fail to do is to follow the implications of their choice. A straightforward power calculation shows that so small a trial would have only a 1 in 4 chance of confirming the efficacy of such well-established conventional post-operative treatments as the use of tramadol for pain relief and ondansetron for nausea. Had the authors performed such a reality check, it would have put their negative finding in its proper context.
Furthermore, and contrary to the impression given by the authors, estimates of suitable trial sizes can be obtained even in the absence of prior insight into likely effect sizes. This is made possible by considering what constitutes a worthwhile effect. A therapy that outperforms placebo in a high proportion of patients is clearly more worthwhile than one that does not. In quantitative terms, a worthwhile therapy is thus one requiring a relatively low number needed to treat (NNT), i.e. the number of patients who need to receive the therapy in order for it to benefit one patient. In the case of pain relief, for example, therapies with NNTs as high as 5 are still considered effective.
Taking an NNT of 5 to be a reasonable upper limit in the case of arnica leads directly to an estimate of appropriate trial size. Using standard statistical power theory, one can show that a randomized placebo-controlled trial needs around 100 patients per arm, i.e. a total of around 200 patients, in order to detect a clinically worthwhile effect with the standard 80% power. Smaller trials, all too common in complementary medicine, face a substantial risk of failing to detect worthwhile effects. One can show that a 50-50 rule applies, in which placebo-controlled trials with fewer than 50 patients per arm face a greater than 50% chance of failing to detect a worthwhile effect. This is not to say that small studies are worthless; when combined in a meta-analysis, they can provide useful insights. The fact remains, however, that the size of individual trials capable of detecting worthwhile effects is considerably larger than many seem to believe.
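The arithmetic behind these figures can be sketched as follows. This is an illustration under stated assumptions, not necessarily the calculation used for the letter: it uses the textbook normal approximation for comparing two proportions with a two-sided 5% significance level, it takes an NNT of 5 to mean a 20-percentage-point difference in response rates, and it assumes a placebo response rate of 40% (a hypothetical value, chosen so that the two rates straddle 50%, where the variance is largest). The function names are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def response_rates(nnt, placebo_rate):
    """Return (placebo, treatment) response rates implied by an NNT.

    By definition NNT = 1 / (treatment rate - placebo rate).
    """
    return placebo_rate, placebo_rate + 1.0 / nnt


def power_two_arm(n_per_arm, nnt, placebo_rate=0.4, alpha=0.05):
    """Approximate power of a two-arm placebo-controlled trial.

    ASSUMPTIONS (not from the letter): placebo_rate of 0.4, two-sided
    alpha, normal approximation with unpooled variance. The negligible
    chance of a significant result in the wrong direction is ignored.
    """
    p_c, p_t = response_rates(nnt, placebo_rate)
    std_err = sqrt((p_c * (1 - p_c) + p_t * (1 - p_t)) / n_per_arm)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf((p_t - p_c) / std_err - z_crit)


def patients_per_arm(nnt, placebo_rate=0.4, alpha=0.05, power=0.8):
    """Smallest number of patients per arm reaching the requested power."""
    p_c, p_t = response_rates(nnt, placebo_rate)
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    variance = p_c * (1 - p_c) + p_t * (1 - p_t)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_t - p_c) ** 2)


if __name__ == "__main__":
    print("patients per arm for 80% power:", patients_per_arm(5))
    for n in (20, 50, 100):
        print(f"power with {n} per arm: {power_two_arm(n, 5):.2f}")
```

Under these assumptions the sketch gives a requirement of just under 100 patients per arm for 80% power, a power of roughly one in four with 20 patients per arm (a 60-patient trial split over three arms), and a power close to one half at about 50 patients per arm. A different assumed placebo response rate shifts these numbers somewhat.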
Certainly, if trials as small as those typically adopted in studies of complementary therapies had been used to assess the value of well-established conventional therapies, the shelves of hospital pharmacies would look decidedly bare.
REFERENCES