Post Hoc Analysis Does Not Establish Effectiveness of rTMS for Depression
June 15, 2009
Jonas Z Hines, Peter Lurie and Sidney M Wolfe
This letter was published in Neuropsychopharmacology.
The Food and Drug Administration (FDA) recently cleared the first transcranial magnetic stimulation (TMS) device to treat depression in patients who have failed one antidepressant (Neuronetics, 2008). That decision seems to be based on an analysis very similar to the one reported by Lisanby et al (2009).
Lisanby characterizes her study as ‘an exploratory statistical approach’, but the more accurate description is the FDA's: ‘a post hoc evaluation’ (Food and Drug Administration, 2007a). Moreover, it is a post hoc subgroup evaluation of a published negative randomized controlled trial (O'Reardon et al, 2007). As FDA clinical trial expert Dr Robert Temple stated at another FDA advisory committee meeting, ‘after-the-fact subset analyses in a study that did not win… is different from subset analyses in a study that did win’ (Food and Drug Administration, 2005).
Lisanby, however, states that the published trial showed ‘TMS to be safe and efficacious’ (Lisanby et al, 2009). This is misleading. In the trial (O'Reardon et al, 2007), the difference between treatment arms was both statistically and clinically non-significant (p=0.057, 1.7 points on the 60-point Montgomery Asberg Depression Rating Scale) for the primary outcome (change in Montgomery Asberg Depression Rating Scale at 4 weeks). This finding became statistically significant (p=0.038) only after the post hoc exclusion of six patients who had met the a priori inclusion criteria, an obviously inappropriate statistical maneuver.
Even if the trial had been positive, several intrinsic aspects of post hoc analyses make them particularly subject to bias.
First, findings that arise from a post hoc analysis are often used to guide future investigation, but should not themselves be interpreted as conclusive (Rothwell, 2005; Furberg and Furberg, 2007). Such analyses can be motivated by an earlier inspection of the data, which is a potential source of bias (Wang et al, 2007; Hayward, 2002). As Dr Thomas Brott, chairperson of the advisory committee that considered TMS, concluded, ‘trying to stratify in this fashion and draw conclusions when the overall P-value is in the range that it is, is done with great peril’ (Food and Drug Administration, 2007b).
Second, post hoc explorations can conceal unfavorable findings. The company presented data to the FDA advisory committee showing treatment efficacy separately for those with one to four earlier treatment failures. However, Lisanby combines those with two to four such failures, comparing that group with patients with only a single treatment failure. Dichotomizing treatment failure obscures the fact that the small group (n=12) with four earlier treatment failures also seemed to respond to TMS (p=0.022), a finding that undermines Lisanby's conclusion about the effect of treatment resistance on the response to TMS (Food and Drug Administration, 2007a).
Third, Lisanby did not explain why she made no adjustment to the significance level for multiple hypothesis testing when, according to her methods, at least 10 variables were tested.
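To illustrate why testing many variables without adjustment matters, the following minimal sketch computes the chance of at least one false-positive finding across a set of tests, together with a Bonferroni-adjusted threshold. The nominal significance level of 0.05, the count of exactly 10 tests, and the independence of the tests are illustrative assumptions, not figures taken from Lisanby's analysis; Bonferroni is shown only as one common correction.

```python
# Illustrative arithmetic only: ALPHA, N_TESTS, and test independence
# are assumptions made for this sketch, not values from the trial.

def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


ALPHA = 0.05   # assumed nominal significance level
N_TESTS = 10   # assumed number of variables tested

fwer = familywise_error_rate(ALPHA, N_TESTS)
threshold = bonferroni_threshold(ALPHA, N_TESTS)

print(f"Familywise error rate without adjustment: {fwer:.3f}")
print(f"Bonferroni-adjusted per-test threshold: {threshold:.4f}")
```

Under these assumptions, roughly 40% of such analyses would yield at least one nominally significant result by chance alone, and each individual test would need p below 0.005 to be declared significant after a Bonferroni correction.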
In sum, Lisanby did not clearly identify her analysis as post hoc, mischaracterized the full trial as positive, and reached a conclusion of efficacy based on a post hoc evaluation that obscured important treatment variability while neglecting to account for multiple comparisons. Indeed, when presented with the same data, the FDA advisory committee concluded that TMS's ‘clinical effect was perhaps marginal, borderline, questionable, and perhaps a reasonable person could ask whether there was an effect at all’ and rejected the device (Food and Drug Administration, 2007b). It is concerning that the FDA has cleared this device, particularly if patients are diverted from effective therapies such as antidepressant medications.
- Food and Drug Administration (2005). Transcript of the Oncologic Drugs Advisory Committee. (published online 4 March 2005, at www.fda.gov/ohrms/dockets/ac/05/transcripts/2005-4095T2.htm).
- Food and Drug Administration (2007a). Executive Summary for the Neurostar TMS Therapy. (published online 26 January 2007, at www.fda.gov/ohrms/dockets/ac/07/briefing/2007-
- Food and Drug Administration (2007b). Transcript of the Neurological System Devices Panel. (published online 26 January 2007, at www.fda.gov/ohrms/dockets/ac/07/transcripts/2007-4273t1.rtf).
- Furberg BD, Furberg CD (2007). Evaluating Clinical Research; All that Glitters is not Golden. Springer: New York.
- Hayward R (2002). Users' Guides Interactive. JAMA Publishing Group: Chicago. (available online at http://www.usersguides.org).
- Lisanby SH, Husain MM, Rosenquist PB, Maixner D, Gutierrez R, Krystal A et al (2009). Daily left prefrontal repetitive transcranial magnetic stimulation in the acute treatment of major depression: clinical predictors of outcome in a multisite, randomized controlled clinical trial. Neuropsychopharmacology 34: 522–534 (originally published online 13 August 2008, at www.nature.com/npp/journal/v34/n2/full/npp2008118a.html).
- Neuronetics (2008). Press Release- FDA Clears Neurostar TMS Therapy for the Treatment of Depression. Neuronetics: Malvern, PA. (published online 8 October 2008, at http://www.neuronetics.com/pdf/FDA_Clears_NeuroStar
- O'Reardon JP, Solvason HB, Janicak PG, Sampson S, Isenberg KE, Nahas Z et al (2007). Efficacy and safety of transcranial magnetic stimulation in the acute treatment of major depression: a multisite randomized controlled trial. Biol Psychiatry 62: 1208–1216.
- Rothwell PM (2005). Treating individuals 2. Subgroup analysis in randomised controlled trials: importance, indications, and interpretation. Lancet 365: 176–186.
- Wang R, Lagakos SW, Ware JH, Hunter DJ, Drazen JM (2007). Statistics in medicine; reporting of subgroup analyses in clinical trials. N Engl J Med 357: 2189–2194.