New England Journal of Medicine Violates Its Own Policy, Publishes Results of Unethical Clinical Trial That Put Resident Doctors at Risk

Statement of Dr. Michael Carome, Director, Public Citizen’s Health Research Group

Note: Today, The New England Journal of Medicine (NEJM) published an online article presenting results of the Flexibility in Duty Hour Requirements for Surgical Trainees (FIRST) trial. The article’s release coincides with the presentation of the trial’s results by the lead researcher, Dr. Karl Bilimoria from Northwestern University, at the 11th Annual Academic Surgical Congress (Association for Academic Surgery and Society of University Surgeons joint meeting) in Jacksonville, Fla.

In November 2015, Public Citizen and the American Medical Student Association (AMSA) filed formal complaint letters with the U.S. Office for Human Research Protections (OHRP) and the Accreditation Council for Graduate Medical Education (ACGME) about the FIRST trial — which involved thousands of general surgery residents — as well as the related ongoing unethical Individualized Comparative Effectiveness of Models Optimizing Patient Safety and Resident Education (iCOMPARE) trial — which involves thousands of internal medicine residents and is funded in part by the National Institutes of Health. The complaint letters detailed serious violations of core ethical principles and federal regulations related to the protection of human research subjects. 

The NEJM editors’ decision to publish the results of the unethical, seriously flawed FIRST trial violates the journal’s own policy requiring authors to provide assurances related to the protection of human subjects. Furthermore, as Public Citizen and AMSA predicted in their November complaint letter to OHRP, the trial yielded the self-serving results sought by the trial’s researchers, whose stated goal before the trial began was to roll back the ACGME’s 2011 mandatory limits on physician resident work hours that were adopted to protect both the residents and their patients from serious harm.

The NEJM’s longstanding policies for manuscripts reporting data from human subjects research require that authors include, “[i]f applicable, a statement that the research protocol was approved by the relevant institutional review boards [IRBs] or ethics committees and that all human participants gave written informed consent.” In an apparent attempt to satisfy this requirement, Dr. Bilimoria and his colleagues state in their article that the “FIRST Trial protocol was reviewed by the Northwestern University Institutional Review Board office and determined to be non-human-subjects research.” Importantly, it was only the manager of the Northwestern University IRB who made this determination; the IRB itself did not review or approve the trial protocol.

However, even a cursory review of the FIRST trial’s methods described in the NEJM paper makes it immediately obvious that thousands of general surgery residents and their patients were human subjects of the research: Residents across the U.S. were randomly assigned, in groups by hospital, to either a “usual care” control group that complied with all current ACGME requirements, which include a work-shift cap of 16 consecutive hours for first-year residents; or to an experimental group with a less restrictive, “flexible” work-hour schedule, where first-year residents could have been forced to work significantly longer shifts (up to 28 or more hours) than permitted under current ACGME rules. The researchers then measured the rates of death and serious complications in patients cared for by the residents, as well as the residents’ perceptions of their work schedules, education, quality of life and well-being. Significantly, the researchers convened a data safety monitoring board to monitor their trial, a procedure that would have been unnecessary if the research truly had not involved risks to human subjects.

The failures of the researchers to characterize their trial as one that involved human subjects research, to ensure that it had appropriate IRB review and approval, and to obtain the informed consent of the trial subjects represented egregious ethical lapses. But equally alarming is the NEJM editors’ and reviewers’ inexplicable acceptance of the researchers’ assertion that their trial did not involve human subjects research, despite incontrovertible evidence to the contrary. By violating their own policy, which is intended to ensure that all human subjects research published by the journal meets high ethical standards, the NEJM editors have signaled to the research community that the ethical lapses made by the FIRST trial researchers were acceptable, effectively encouraging similar lapses by future researchers.

Furthermore, because of serious flaws in the design of the FIRST trial, the results reported today offer little, if any, useful, valid information for assessing the effects of different resident work-hour schedules on the health and well-being of either the general surgery residents or their patients.

Before the FIRST trial began, the researchers openly declared that they expected to find no difference in patient deaths and serious complications between the two trial groups and would use such findings to advocate rescinding the ACGME’s 2011 more restrictive work-hour limits for all residents, but particularly those affecting first-year residents. As we explained in detail in our November letter to OHRP, the FIRST trial used a biased design that was highly likely to provide the researchers with their desired, predetermined outcome. For example, the work hours of only a minority of the members of the patient care teams in each group (the first-year general surgery residents) could have differed significantly between the two trial groups. Also, the flexibility allowed under the protocol for residents in the experimental group, combined with the likely frequent noncompliance with the 2011 work-hour restrictions in the control group, was another factor that minimized the differences between the control and experimental groups. The trial was further undermined by the researchers’ failure to measure how often residents in each study group actually adhered to their assigned work-hour schedules. Thus, today’s reported finding of no difference in patient deaths and serious complications between the two trial groups is both unsurprising and uninformative.

The FIRST trial researchers also failed to collect any meaningful, objective data on important resident health outcomes. For example, as discussed in our November letter, substantial evidence shows that sleep deprivation due to excessively long work shifts increases the risk of motor vehicle accidents, needle-stick and other injuries that can result in exposure to bloodborne pathogens, and depression in residents. But rather than systematically collecting data for each of these outcomes throughout the course of the trial — which would have been essential for monitoring resident safety during the trial — the researchers only asked residents at the approximately halfway point of the one-year trial to complete a single brief survey that included vague questions about their personal well-being and safety. As a result, the FIRST trial fails to add any useful information to what is already known about the adverse effects of sleep deprivation on resident health.

Likewise, the FIRST trial researchers failed to collect meaningful, objective data on resident education and training outcomes, but once again relied instead on questions from the same single survey to assess resident satisfaction and perceptions regarding the quality of their education and training.

Finally, the results of these resident surveys were susceptible to significant bias for several reasons. First, the survey questions were subjective. Second, the trial itself was unblinded. Third, the general surgery residents are indoctrinated into a culture that expects and encourages them to work excessively long hours. Fourth, the residents’ supervising physician mentors often openly oppose the ACGME’s 2011 mandatory limits on physician resident work hours. All of these factors made it more likely that the residents in the experimental group would express greater satisfaction with their assigned work hours than those in the control group.

See Public Citizen’s prior work regarding the FIRST trial.