Letter Urging Against Approval of Vagus Nerve Stimulator for Treatment of Depression
May 11, 2005
Daniel Schultz, M.D.
Director, Center for Devices and Radiological Health
Food and Drug Administration
9200 Corporate Blvd.
Rockville, MD 20850
Dear Dr. Schultz:
We strongly urge you to reject the pending Premarket Approval (PMA) application for Cyberonics’ NeuroCybernetic Prosthesis (NCP) System, more commonly known as Vagus Nerve Stimulation (VNS) Therapy, for the new indication of treatment-resistant depression (TRD), because the safety and efficacy of this medical device have not been established. The only randomized trial conducted (study D02) failed to demonstrate the efficacy of the VNS device for TRD, and the nonrandomized efficacy analysis submitted by the sponsor was deemed “highly questionable” by the FDA statistician. The sponsor also has not demonstrated long-term safety in the TRD patient population, nor have concerns over numerous reports in clinical trials of worsening depression, suicides and sudden deaths with VNS treatment been adequately investigated. The FDA issued a non-approvable letter to Cyberonics in August 2004, but, in a highly unusual reversal of position, the FDA sent the sponsor an approvable letter in February 2005.
The VNS device consists of a pulse generator implanted in the base of the neck that delivers electrical signals to the left cervical vagus nerve, a major nerve with connections to the heart, the gastrointestinal tract, the brain, and many other parts of the body. An external device used by the physician adjusts the settings of the pulse generator, but the device is typically programmed to be on for 30 seconds and off for 5 minutes. No definitive mechanism of action has been established for the depression indication. The manufacturer estimates that about 30,000 epilepsy patients are currently using the device, and have accumulated about 79,000 patient-years of experience. By expanding into the TRD market, Cyberonics hopes to gain access to over 4 million new customers with depression.
In July 1997, VNS therapy was approved by the FDA for the treatment of refractory epilepsy in adults and adolescents 12 years and older. During follow-up visits in the early epilepsy trials, many of the participants stayed in the same Florida hotel. The hotel clerk supposedly observed that the mood of the VNS patients seemed to improve over time. Eyeing a potentially huge expansion of their market, Cyberonics began to conduct a series of studies to investigate VNS therapy for depression.
On October 27, 2003, Cyberonics submitted a PMA requesting approval for patients 18 years or over with chronic or recurrent depression who had failed to respond to two or more antidepressant treatments. Much of the correspondence between FDA and the sponsor concerning VNS treatment for depression is not accessible to the public, but the available FDA documents and company press releases provide an outline.
On March 1, 2004, the FDA completed an initial review of the PMA Supplement for the depression indication and responded to the sponsor with a Major Deficiency letter, citing serious flaws in the data submitted. Cyberonics was permitted to respond to these concerns, and a meeting of the Neurological Devices Panel of the Medical Devices Advisory Committee was called for June 15, 2004.
The advisory committee panel had very mixed feelings about the VNS data. After the panel discussed the risk-benefit ratio of the device, Chairperson Dr. Kyra Becker concluded prior to the vote:
In summary … it sounds like the panel believes that the device is generally safe but based on what is questionable efficacy, it’s unclear whether the safety benefit ratio rises to the point that make it something that we should achieve to use.
Two panelists voted against approval. “I voted against because I thought it should not have been approved, because I didn’t think they demonstrated efficacy,” commented Dr. Richard Malone. Similarly, Dr. Jonas Ellenberg concluded that “I don’t believe that a standard for efficacy has been met.”
Five panelists voted to recommend conditional approval (conditions are explained below). However, in their explanations of their votes, only two panelists expressed, somewhat cautiously, the belief that the data demonstrated efficacy. “Although I would have liked to have seen a more rigorous study,” said Dr. Laura Fochtmann, “I think there has been evidence shown that this is efficacious.” Dr. Phillip Wang based his vote on the D02 randomized controlled trial primary endpoint, claiming a trend toward efficacy despite the actual lack of a statistically significant difference between VNS and control: “I’m voting for this on the basis of mainly the acute phase D02 data which supports that there is some efficacy, although albeit not particularly robustly .... (T)his is probably an improvement over, say, fourth line, fifth line sort of treatments.”
The other three panel members recommending approval expressed discomfort with the quality of the efficacy data, but were willing to look past this because of the few treatment options available for patients who don’t respond to other depression treatments. “Although it would be nice to have randomized controlled data for the efficacy,” remarked Dr. Mary Lee Jensen, “I believe this is a difficult patient population … And I think [VNS] should at least be available to this group of patients.”
Dr. Annapurni Jayam-Trouth concluded:
The reason I’m voting for approval with conditions is that I mean this is a very tough group of patients, and it’s difficult to treat them. The death rate is very high, it’s almost a terminal type of condition, more or less, and I think that there’s very little that we can offer at this time, and I think we have shown that this is relatively safe. There have been studies on epilepsy showing that it is efficacious in this group of patients and the efficacy seems to improve over time. So I think that for the reasons I mentioned, I’m voting for approval.
Similarly, Dr. Irene Ortiz explained that “I’m voting in favor because I feel that treatment-resistant depression does have a very high incidence of suicide. The data was not ideal, but safety I think was established.” In essence these three panel members implied that it didn’t matter so much whether the device worked, since these patients were suffering so much already and VNS didn’t appear to be unsafe.
The final vote was thus 5-2 in favor of recommending approval with the following conditions:
- Patients must fail four or more adequate trials of antidepressant therapy.
- Surgeons implanting the device must have skills operating within the carotid sheath.
- Clinicians caring for VNS patients must receive training in setting device parameters.
- Patients must be educated about device complications.
- The sponsor must establish a registry to collect clinical data on safety and efficacy, as well as stimulation levels and prognostic factors.
- The labeling must reflect the questionable efficacy data, including that the “positive” efficacy study was an open-label trial and not a randomized, controlled trial.
In an unusual step, on August 12, 2004, the FDA went against these Advisory Committee recommendations and sent a letter to Cyberonics stating that its PMA was not approvable. The letter is not available to the public, but according to the company it cited reasons for this decision that included 1) concerns over data showing worsening depression in many VNS patients, 2) potential biases stemming from a nonrandomized control, and 3) an inability to distinguish VNS effects from the placebo effect and concomitant treatment effects in the nonrandomized data. Cyberonics responded that it was “shocked and bewildered” by this decision, announcing that it was “in the process of arranging a meeting with senior FDA management to discuss their letter.” The sponsor claims to have then submitted more data to the FDA, but as best can be determined from Cyberonics’ press releases, this adds little more than a one-year extension of the nonrandomized data already submitted and described below.
In the meantime, the FDA has also reprimanded Cyberonics for not adequately investigating deaths and other adverse events in VNS patients with epilepsy. In a Warning Letter sent on December 22, 2004, the FDA cites Cyberonics for a long list of violations including “failure to completely investigate and evaluate the cause of each medical adverse event.” Cyberonics has since responded to this Warning Letter and claims to have satisfied the FDA’s concerns.
In a mysterious reversal of the FDA’s earlier non-approvable letter, the FDA sent another letter to the sponsor on February 2, 2005, declaring the device approvable. This letter is also not public but according to Cyberonics it states that approval would be conditional upon the development of a postmarketing study to evaluate proper voltage levels for the device, new labeling revisions, and the establishment of a 1,000-patient registry. Cyberonics expects full approval by May 2005.
Studies submitted to the FDA
D01 Pilot Study
This was a non-randomized, open-label trial of 60 TRD patients acting as their own controls. After a baseline assessment period, patients were surgically implanted with the VNS device and then entered a 12-week acute phase. With two weeks to recover and two more weeks to adjust stimulation, this really amounted to eight weeks of sustained stimulation. Patients maintained a stable concomitant antidepressant/mood disorder medication regimen. Responders were defined as those with a greater than 50% improvement in their Hamilton Rating Scale for Depression (HRSD, also known as HAM-D) scores from baseline to the end of the acute phase.
Eighteen of 59 patients (31%) who completed the acute phase were classified as responders. The researchers followed the patients for two more years, finding that 25/55 patients (45%) were responders at one year, and 18/42 (43%) were responders at two years. Seventeen of 59 patients (29%) were lost to follow-up at two years, meaning that the response rate may actually have been as low as 18/59 patients (31%) at two years. Given the lack of a control group for these data, it is impossible to separate these response rates from the placebo effect, regression to the mean, or any secular effects.
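The arithmetic behind these figures can be sketched as follows. This is an illustrative calculation only, not part of the sponsor's or the FDA's analysis. The function name `response_rates` is hypothetical, and the worst case rests on the assumption that every patient lost to follow-up was a nonresponder.

```python
def response_rates(responders, assessed, enrolled):
    """Return (observed, worst_case) response rates as percentages.

    observed   = responders / patients actually assessed
    worst_case = responders / all patients enrolled, i.e. assuming every
                 patient lost to follow-up was a nonresponder (an assumption)
    """
    if not 0 <= responders <= assessed <= enrolled or assessed == 0:
        raise ValueError("need 0 <= responders <= assessed <= enrolled, assessed > 0")
    observed = 100.0 * responders / assessed
    worst_case = 100.0 * responders / enrolled
    return observed, worst_case


# D01 at two years: 18 responders, 42 patients assessed, 59 acute-phase completers
observed, worst_case = response_rates(18, 42, 59)
print(f"observed {observed:.0f}%, worst case {worst_case:.0f}%")
# prints: observed 43%, worst case 31%
```

The gap between the two numbers is the share of the apparent response rate that depends on what happened to patients who were no longer being assessed.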
Over the course of the up-to-two years for which patients were monitored, a total of 77 “serious” adverse events (a category including worsening depression, suicide attempts, and mania) were reported among the 60 patients, and every implanted patient reported at least one treatment-emergent adverse event. In total, 34 patients (57%) reported adverse mood alteration, including worsening depression. Twelve patients (20%) attempted suicide. These rates are cause for concern, but impossible to definitively attribute to VNS because of the lack of a control group.
D02 Pivotal Study
This was a 12-week double-blind, randomized, placebo-controlled phase III study of TRD patients. Two-hundred and thirty-five patients were implanted with the VNS device, 119 of whom were randomized to therapy and 116 to sham treatment (no stimulation). As in the pilot study, after two weeks to recover and two more weeks to adjust stimulation, the patients underwent eight weeks of sustained stimulation in combination with a constant concurrent antidepressant/mood disorder medication regimen. Responders were again defined as those with at least a 50% improvement in HRSD scores from baseline to the end of the acute phase. When the acute phase was unblinded, 17/111 treatment patients (15%) and 11/110 placebo patients (10%) had responded according to the HRSD criteria, a difference that was not statistically significant (p=0.238). Of four depression assessment test endpoints designated by the sponsor as secondary prior to the study, three were negative and only one (Inventory of Depressive Symptomatology Self-Report, or IDS-SR, which is not physician-administered) showed a statistically significantly better rate of response for VNS (19/109) over placebo (8/106, p=0.032). There was one suicide in the treatment group and none in the placebo group.
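To give a sense of how such p-values arise from the responder counts, here is a minimal sketch using a pooled two-proportion z-test. The choice of test is an assumption: the documents cited here do not say which test produced the reported p-values, so the results of this sketch should be close to, but need not match, p=0.238 and p=0.032. The function name is hypothetical.

```python
import math


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


# HRSD responders (primary endpoint): 17/111 VNS vs. 11/110 sham
p_hrsd = two_proportion_p_value(17, 111, 11, 110)
# IDS-SR responders (secondary endpoint): 19/109 VNS vs. 8/106 sham
p_ids = two_proportion_p_value(19, 109, 8, 106)
print(f"HRSD p = {p_hrsd:.3f}; IDS-SR p = {p_ids:.3f}")
```

Under this approximation the primary endpoint remains well above the conventional 0.05 threshold, while the self-reported secondary endpoint falls below it.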
It is also interesting to note here the difference between the HRSD response rates of VNS-treated patients at 12 weeks in the unblinded D01 study (31%) and the blinded D02 study (15%). (Even this randomized, controlled phase of the D02 study does not achieve the level of blinding that would be ideal. Many patients receiving VNS stimulation can sense that the device is active, thereby exaggerating efficacy.)
These data suggest that at least some of the apparent effectiveness of VNS is due to patients’ knowing if they were receiving treatment.
The long-term phase of D02 was a 12-month, open-label follow-up to the acute phase, in which all implanted patients from the acute phase could elect to receive stimulation. Of 205 patients participating at the onset of the long-term phase, a total of 177 completed the entire 12 months of treatment. Adjustments in patients’ medication regimens were allowed, as was concurrent electroconvulsive therapy (ECT; 14 patients). After 12 months of therapy, 52/174 patients (30%) met the response criteria of at least a 50% improvement in HRSD scores. Thirty-one of 205 patients (15%) were lost to follow-up, meaning that the response rate may actually have been as low as 52/205 (25%). A total of 96 serious adverse events were reported during the long-term phase, including 62 patients reporting worsening depression (30%) and 11 suicide attempts. As in the D01 trial, the lack of a proper control group limits the meaning of these data, because of the difficulty of separating response rates from the placebo effect, regression to the mean, and other secular effects.
Comparison of D02 Long-Term Outcomes to an Observational Study (D04)
Despite the failure of VNS to demonstrate efficacy with respect to the primary outcome in the randomized, controlled D02 Acute Phase trial, Cyberonics asserted that there might be a pattern of increasing response over time, as demonstrated, they claimed, by the long-term follow-up phase of D02. Therefore efficacy might only be detectable in longer-term data. But instead of conducting a longer-term randomized controlled trial, the logical response to this assertion, the sponsor elected to generate a new control.
It took the 12-month D02 Long-Term data of VNS-treated patients it already had and decided to collect new “usual care” data (i.e., medication, ECT, etc. administered at the physician’s discretion to unimplanted patients). Thus, instead of attempting to prove its claim of efficacy over the long-term by collecting data on implanted patients in a randomized controlled trial over the long-term, the sponsor collected more data on unimplanted patients.[*]
In its protocol for this new comparison, the sponsor also elected to redefine the primary endpoint. In the D02 acute phase, the only one of the five primary and secondary endpoints to suggest efficacy was the self-administered IDS-SR. Cyberonics then used IDS-SR as the primary endpoint in its revised D02-D04 analysis, instead of HRSD. While the HRSD is one of the standard assessment tools used in depression trials, IDS-SR is obscure. A review of the labels of the ten most recent antidepressant drugs approved by the FDA reveals no mention of IDS-SR as an outcome variable whatsoever. In contrast, the HRSD was cited eight times. The FDA has raised serious questions about the validity of using IDS-SR as a predictor of HRSD scores. The FDA statistical review’s analysis found that there was “questionable concordance” between IDS-SR and HRSD in the D02 study. The FDA review added that “we do not agree that concordance studies reported in the published literature are sufficient to support IDS-SR as a ‘good predictor’ of HRSD-24.” In addition to this switch to a more favorable assessment tool, the sponsor switched from a dichotomous outcome variable to a continuous one. All other things being equal, analyses of continuous outcome variables are more likely to reach statistical significance; the differences thus detected may not be clinically meaningful.
In the D04 observational study, 112 patients completed 12 months of usual care. In the D02-D04 primary endpoint comparison, linear regression analyses found a statistically significant difference (p<0.001) in the monthly change in IDS-SR scores in favor of D02 patients at 12 months (estimated average difference of -0.397 per month; the IDS-SR scale runs from 0 to 84). However, if these data are censored for changes in concomitant antidepressant treatment (i.e., the patient’s last IDS-SR score before the concomitant antidepressant treatment was changed is used for all subsequent assessment points, a last-observation-carried-forward approach), there is no statistically significant difference between usual care and VNS plus usual care. For the HRSD secondary endpoint response analysis, 13/104 patients (13%) in the D04 study were responders, defined as those with a greater than 50% improvement in HRSD scores between baseline and 12 months. This was statistically significantly lower than the HRSD response rate for VNS-treated patients in D02 (30%). Again, however, when the data were censored for changes in concomitant antidepressant treatment, statistical significance was lost.
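The censoring rule described above can be illustrated with a short sketch. The function name, the visit-index convention, and the patient data are all hypothetical; the numbers are made up for illustration and are not trial data, and the actual FDA analysis may have differed in its details.

```python
def censor_at_treatment_change(scores, change_visit=None):
    """Censor a patient's score series at a concomitant-treatment change.

    scores       : scores in visit order; scores[0] is the baseline
    change_visit : index of the first visit after the treatment changed,
                   or None if the regimen never changed
    Scores from change_visit onward are replaced by the last score recorded
    before the change (last observation carried forward).
    """
    if change_visit is None or change_visit >= len(scores):
        return list(scores)
    if change_visit < 1:
        raise ValueError("no pre-change score to carry forward")
    last_before_change = scores[change_visit - 1]
    carried = [last_before_change] * (len(scores) - change_visit)
    return list(scores[:change_visit]) + carried


# Hypothetical patient (made-up numbers): IDS-SR at baseline and four later
# visits, with the medication regimen changed before visit 3.
raw = [48, 46, 45, 38, 33]
print(censor_at_treatment_change(raw, change_visit=3))
# prints: [48, 46, 45, 45, 45]
```

In this made-up series, the improvement from 45 to 33 follows the medication change, so after censoring it is no longer counted toward the apparent effect of VNS.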
This long-term D02-D04 analysis is extremely problematic for a number of reasons. The purpose of a randomized controlled trial is to ensure that the groups to be compared are as close to identical as possible, to eliminate confounding. But there were very significant differences between the two groups in this non-randomized comparison.
First, the illness and treatment histories of the two groups were not the same. As the FDA’s statistical review points out: “There were significant differences in baseline demographics between D02 and D04, including those patients who received ECT during their lifetimes, patients who received ECT during the current major depressive episode, and patients in the control population with greater than 10 lifetime episodes of depression.” These factors limit the ability to isolate the effect of the VNS treatment on these patients.
Second, the studies were not conducted on an identical timeline, meaning there may be differences in seasonal effects on depression, or in responses to news events.
Third, only 12 of the 22 study sites in D02 participated in D04 (“overlapping sites”), meaning the “usual care” may not always have meant the same thing. The FDA’s statistical reviewer was very concerned about this, concluding that “the validity of statistical inferences from comparison of two proportions of Responses pooled over all non-overlapping and overlapping sites, without any appropriate statistical modeling approach, such as meta-analysis, is highly questionable.”
Fourth, both the long-term phase of D02 and D04 allowed for changes in concomitant antidepressant therapy and ECT, which, themselves, could cause improved depression symptoms. As mentioned, when the primary endpoint data were censored for concomitant treatment changes, statistical significance was lost.
Fifth, the long-term D02-D04 comparison was by its nature unblinded. D02 VNS-treated patients were surgically implanted with a device they knew was being investigated as a novel depression treatment. By the long-term phase, they all knew the device was turned on. This could have an enormous psychological impact, increasing patients’ expectations of clinical improvement. By contrast, D04 patients did not undergo surgery. In depression, the placebo effect and regression to the mean are known to be unusually large. The response rate for long-term D02 VNS-treated patients was about 30% according to the initial endpoint criteria of at least a 50% improvement in HRSD scores. During the past 30 years, antidepressant drug trials have shown placebo response rates of around 30%. Granted, the D02 and D04 populations were by design treatment-resistant and may have experienced smaller placebo-response rates than otherwise expected. But one might also suspect that an implant would have a greater placebo effect than a drug. In any case, there is reason to suspect a substantial role for a placebo effect and regression to the mean in the implanted patients. Particularly given the modest magnitude of efficacy differences measured, these are crucial flaws in the study design.
The FDA statistical review of the D02-D04 efficacy data concludes with skepticism:
Due to above statistical issues, such as questionable concordance between HRSD-24 and IDS-SR, questionable pooling of multi-center data for comparison of proportions of responses, statistically insignificant findings from censored and overlapping sites … for IDS-SR primary effectiveness endpoint (Slope) and HRSD-24 secondary effectiveness endpoint (Response proportions), it is unclear whether the effectiveness claim of [VNS-treated] D-02 over [standard-of-care] D-04 group patients has been demonstrated.
Cyberonics did not systematically collect safety data in the D04 “usual care” study, meaning a D02-D04 safety comparison is not possible. The result of this, and the lack of long-term randomized controlled data, is that there are a lot of unanswered questions about the safety of the VNS device in TRD patients.
Of course, there is a risk of complications for any surgical procedure in a delicate area such as that around the left cervical vagus nerve in the neck, with the immediate vicinity including the carotid artery and jugular vein. Beyond this, after the point of stimulation, the vagus nerve goes on to innervate the larynx, heart, lungs, abdominal viscera, and brain, meaning all of these areas are potential sites for adverse effects from VNS. A review of the literature reveals case reports of vocal cord paralysis, loss of sensation in the pharynx, and breathing problems during sleep induced by VNS. Several cases of complete heart block with the initiation of stimulation have been reported, with Ali et al. hypothesizing a mechanism of VNS-induced ventricular asystole (lack of contraction of the heart).
Cyberonics asked the FDA to consider long-term safety data from clinical trials for epilepsy instead of providing controlled data specific to the TRD patient population to be targeted. Dr. Michael Schlosser, the FDA medical officer who reviewed the safety of the device, was dissatisfied with this submission. “The safety of the device in another population doesn’t necessarily mean that that device is safe in this different population,” he noted. Even Cyberonics’ own scientist, Dr. John Rush, acknowledged that the epilepsy data did not determine safety for the TRD population when he defended the sponsor’s decision not to conduct a longer-term randomized controlled trial. “At the time that we started [the D02 trial, i.e. long after VNS was approved for epilepsy], where we were going we had no evidence of long-term safety or efficacy of VNS,” said Dr. Rush. “…We couldn't do a long-term. We didn't even have an idea of safety in the short run.” Patients with severe epilepsy and severe depression are very different populations. Especially when some of the most common serious adverse events in the depression studies were psychiatric (worsening depression, suicide attempts, etc.), it is irresponsible not to conduct long-term, controlled safety studies for this group.
Even without long-term randomized controlled safety data, there were red flags identified by the FDA in the depression data submitted. The occurrence of worsening depression in VNS-treated patients (the most common serious adverse event in the D02 study, reported in 30% of long-term patients) is very troubling, especially in light of questionable efficacy data. An FDA analysis that combined safety data from D01, D02, and a postmarketing depression registry from Europe found an incidence of 24 suicide attempts in 689 patient-years (3.5% per year). The FDA summary and analysis cites as a comparator Khan et al., who report an incidence of suicide attempts of 2.7% per patient-year in patients receiving placebo in depression drug trials. At the Advisory Committee meeting, the FDA’s Dr. Schlosser commented that the data are “enough to make us concerned that there might be something to precipitation of suicide by this device, or at least to look at it more carefully.”
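The suicide-attempt figures above reduce to a crude incidence calculation, sketched below. This is an illustration only, with a hypothetical function name. It is not a test of statistical significance, and it makes no adjustment for differences between the VNS population and the drug-trial placebo population.

```python
def events_per_100_patient_years(events, patient_years):
    """Crude incidence rate: events per 100 patient-years of follow-up."""
    if patient_years <= 0:
        raise ValueError("patient_years must be positive")
    return 100.0 * events / patient_years


# FDA pooled analysis cited above: 24 suicide attempts in 689 patient-years
vns_rate = events_per_100_patient_years(24, 689)

# Comparator cited above (Khan et al.): 2.7 attempts per 100 patient-years on placebo
comparator_rate = 2.7
# Attempts that would be expected in 689 patient-years at the comparator rate
expected_attempts = comparator_rate / 100.0 * 689

print(f"VNS: {vns_rate:.1f} per 100 patient-years")
print(f"Expected at comparator rate: {expected_attempts:.1f} attempts vs. 24 observed")
```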
The FDA also raised questions about the role of the VNS device in sudden death. “There is a concern that this might be due to cardiac events due to the direct vagal nerve stimulation,” said the FDA’s Dr. Schlosser at the Advisory Committee meeting. “Could this be causing a cardiac event that led to sudden death?” he asked. At best, the safety data for VNS in depression are incomplete.
The approval of VNS for this indication would represent a dramatic expansion of the market for VNS, from the current pool of roughly 30,000 epilepsy patients to the 4.4 million TRD patients Cyberonics often cites as desperately needing this new treatment. We strongly oppose this approval because there are no randomized controlled data demonstrating efficacy for the primary endpoint. The nonrandomized efficacy analysis is riddled with the potential for bias and confounding. The FDA statistical review repeatedly called Cyberonics’ analysis “questionable,” and concluded that it was not clear that efficacy had been established. The FDA Advisory Committee members agreed that a safety-benefit ratio could not be established with the data available. The FDA has raised questions about increased suicides, worsening depression, and sudden death, all of which deserve further investigation. The FDA would never approve a drug under these conditions. With so many uncertainties and red flags, it is a serious mistake for the FDA to be prepared to approve this device for use in millions more people for whom it has not been proved to work. Do not let justified empathy for this patient population lead to the unjustified approval of a device that does not come close to meeting FDA’s approval standards, and may well do more harm than good.
Peter Lurie, M.D., M.P.H.
Sidney Wolfe, M.D.
Public Citizen’s Health Research Group
[*]The company initially claimed at the Advisory Committee meeting that to conduct a longer-term randomized controlled trial with fixed medications would have been unethical. However, as pointed out by several Advisory Committee panelists, there are many potential ways around this problem. These include allowing medication changes according to a fixed protocol; censoring the data if a medication is changed and using last-observation-carried-forward; using multivariate analysis to adjust for concomitant treatment effects; analyzing the need for additional therapy as an outcome; and not allowing concomitant treatment changes, but allowing for an escape if symptoms worsen. When Advisory Committee panelists suggested such alternative study designs, the company acknowledged that such a trial was possible.
 Carpenter LL, Friehs GM, Price LH. Cervical Vagus Nerve Stimulation for Treatment-Resistant Depression. Neurosurgery Clinics of North America. 2003 April;14(2):275-82.
 Theodore WH, Fisher RS. Brain Stimulation for Epilepsy. The Lancet – Neurology. 2004 February;3:111-118.
 Khan A, Warner HA, Brown WA. Symptom Reduction and Suicide Risk in Patients Treated with Placebo in Antidepressant Clinical Trials: An Analysis of the Food and Drug Administration Database. Archives of General Psychiatry. 2000 April;57(4):311-7.
 Vassilyadi M, Strawsburg RH. Delayed Onset of Vocal Cord Paralysis After Explantation of a Vagus Nerve Stimulator in a Child. Child’s Nervous System. 2003 April;19(4):261-3.
 Akman C, Riviello JJ, Madsen JR, Bergin AM. Pharyngeal Dysesthesia in Refractory Complex Partial Epilepsy: New Seizure or Adverse Effect of Vagal Nerve Stimulation? Epilepsia. 2003 June;44(6):855-8.
 Holmes MD, Chang M, Kapur V. Sleep Apnea and Excessive Daytime Somnolence Induced by Vagal Nerve Stimulation. Neurology. 2003 October 28;61(8):1126-9.
 Ali II, Pirzada NA, Kanjwal Y, Wannamaker B, Medhkour A, Koltz MT, Vaughn BV. Complete Heart Block with Ventricular Asystole During Left Vagus Nerve Stimulation for Epilepsy. Epilepsy and Behavior. 2004 October;5(5):768-71.
 Tatum WO 4th, Moore DB, Stecker MM, Baltuch GH, French JA, Ferreira JA, Carney PM, Labar DR, Vale FL. Ventricular Asystole During Vagus Nerve Stimulation for Epilepsy in Humans. Neurology. 1999 April 12;52(6):1267-9.