Those arguing for and against work hours restrictions theoretically have access to the same body of literature, yet arrive at opposite conclusions. The basic logic of restricting hours is: experimental sleep deprivation impairs performance; long shifts increase the risk of sleep deprivation; sleep deprivation in residents leads to medical errors; shortening shifts will mitigate said deprivation; thus fewer medical errors will occur. As with many topics in medicine, what seems straightforward in principle may prove far less so in practice. Even in controlled experimental settings, the literature is strikingly mixed in evaluating the impact of sleep deprivation in the healthcare setting[2,3]. This not only reminds us of the complexity of studying this topic, but also sets a precarious stage for selective citation. Looking back at the landmark intensive care unit (ICU) experiments of 2004[4,5], widely cited as evidence supporting restricted work hours, offers clues to anchor ongoing discussions as we grapple with increasing concerns that the restrictions, seen by many as reasonable if not imperative, have in many ways failed to bear convincing fruit. Drilling deeper into this early work, not in criticism but rather in inferential reflection, may provide context for reconciling the mixed literature and soften the polarizing rhetoric, so that ongoing efforts to improve safety and performance will see measured discussion (e.g. ) that avoids over-simplifying a complex issue.
First, the study carefully quantified what is at stake in a high-risk ICU environment: 1.3 serious medical errors per 10 patient-days in usual care, which was reduced to 1.0 in the intervention group. Placing this difference into context is a challenging task because, where policy is concerned, the absolute risk is more important than the relative risk. For medical error rates, the statistically pertinent value is arguably not errors per patient but rather errors per decision. However, the required denominator for per-decision risk is not known: how many decisions are at risk for serious error per patient-day in an ICU? If it is 100 (it could easily be so, across categories of exam, labs, medications, and diagnostics), then 10 patient-days encompass about 1,000 decisions, and 1.3 serious errors among them corresponds to a base error rate of ~0.1% per decision. Similar values (and implications) exist for multi-center handoff interventions to reduce errors. Whether one thinks a 0.1% error rate is unacceptably high is inferentially orthogonal to the question of rationalizing interventions to reduce already-small base error rates. We would be challenged to demonstrate a reduction from a base risk even as high as 1% in almost any medical field. Framed this way, it is perhaps not surprising that mixed results have emerged from an indirect intervention (work hours restrictions) to address a heterogeneous problem (sleep deprivation) in a complex system (residency) applied uniformly across diverse practice settings, coupled with individual heterogeneity of vulnerability to deprivation, and placed within a broader context that includes unintended consequences (handoffs, shift culture, education, etc.).
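The per-decision arithmetic above can be made explicit with a short sketch. The error counts (1.3 and 1.0 serious errors per 10 patient-days) come from the text; the number of at-risk decisions per patient-day is unknown, so the values tried here (50, 100, 200) are purely illustrative assumptions, and the function names are hypothetical.

```python
# Error counts per 10 patient-days, as quoted in the text.
USUAL_ERRORS = 1.3
INTERVENTION_ERRORS = 1.0


def per_decision_error_rate(errors_per_10_patient_days, decisions_per_patient_day):
    """Serious errors per at-risk decision, for an ASSUMED decision count."""
    decisions_in_10_patient_days = 10 * decisions_per_patient_day
    return errors_per_10_patient_days / decisions_in_10_patient_days


def risk_reductions(usual_rate, intervention_rate):
    """Return (absolute, relative) risk reduction between two error rates."""
    absolute = usual_rate - intervention_rate
    relative = absolute / usual_rate
    return absolute, relative


if __name__ == "__main__":
    # The denominators below are hypothetical; the true value is not known.
    for decisions in (50, 100, 200):
        usual = per_decision_error_rate(USUAL_ERRORS, decisions)
        intervention = per_decision_error_rate(INTERVENTION_ERRORS, decisions)
        absolute, relative = risk_reductions(usual, intervention)
        print(
            f"{decisions} decisions/patient-day: "
            f"usual {usual:.3%}, intervention {intervention:.3%}, "
            f"absolute reduction {absolute:.3%}, relative reduction {relative:.0%}"
        )
```

Under these assumptions the relative reduction stays at roughly 23% whatever denominator is chosen, while the absolute reduction per decision shrinks as the assumed number of decisions grows (about 0.03 percentage points at 100 decisions per patient-day). That is the sense in which the absolute and relative framings can point in different directions for policy.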
Second, the randomization of that pioneering ICU study did not evenly distribute workload: the longer shift group happened to have ~30% more admissions and patient-days. This is a reminder that the inpatient world is hard to predict, and even prospective trials are vulnerable to ineffective randomization (as seen more recently, for example, in the controversial SERVE-HF trial). Workload heterogeneity, even within one center, reminds us of the challenges inherent in mandating hours limits equally across specialties and hospitals with distinct operational ecosystems.
Third, while the study reported increased EEG-defined attention lapses in the usual work-hours group, over one-third of the intern participants did not show this effect, and no within-individual prediction was reported to allow association between EEG lapses and medical errors. This is a reminder of the long-recognized reality of inter-individual variability in sleep deprivation vulnerability, and raises the question of whether EEG lapses are in fact biomarkers for medical errors.
Fourth, crediting sleep for the reported lower rate of serious medical errors under the shorter shift intervention is confounded by several other factors that differed between the groups (workloads, staffing, handoffs). Interestingly, on this point, the intervention group did not use much of their extra time for sleep (~45 minutes more per day), though this is consistent with other work. This reminds us of how challenging traditional evidence-based causal attributions remain in this arena of trainee performance.
One interpretation of the evolving work hours saga is that we need even stricter schedules, more rigorous tracking, and improved compliance[11,12]. An alternative approach is to query whether work hours restrictions are the best avenue to improve patient safety, even if we could solve the entrenched issues of increased cost, handoff risk, culture shift, and education paradigms. The mixed literature and results of the FIRST trial should not be viewed as a criticism of sleep research, or as proof that sleep is not important. A reasonable interpretation is that sleep is one of many factors impacting resident performance, which is arguably near its statistical ceiling, such that prioritizing among these factors for pragmatic interventions that deliver measurable results is far from straightforward. That many factors are involved in patient safety during residency training was recognized in the Libby Zion tragedy (1984), which is widely cited as triggering residency training reform: work hours represented one of five contributing factors noted at trial. Bertrand Bell himself, of the Bell Commission, lamented that trainee regulations following the Zion case emphasized work hours over the stated key factor of supervision.
For any complex issue, it is inevitable that published data and stakeholder opinions will be mixed. For this issue in particular, recognizing uncertainties is a key step in grounding a discussion that struggles to reconcile experimental and real-world validation in the wake of the FIRST trial and the ACGME decision to relax duty hour requirements[14,15]. Ultimately, one can believe that sleep deprivation impacts performance, and that patient and physician safety are important, and yet still conclude that work hours restrictions may not be the best use of resources to reliably mitigate the attributable risk.
Contributed by: Dr Matt Bianchi
1. Bilimoria KY, Chung JW, Hedges LV, et al. National Cluster-Randomized Trial of Duty-Hour Flexibility in Surgical Training. The New England Journal of Medicine. Feb 25 2016;374(8):713-727.
2. Friedman WA. Resident duty hours in American neurosurgery. Neurosurgery. Apr 2004;54(4):925-931; discussion 931-933.
3. Philibert I, Nasca T, Brigham T, Shapiro J. Duty-hour limits and patient care and resident outcomes: can high-quality studies offer insight into complex relationships? Annual Review of Medicine. 2013;64:467-483.
4. Landrigan CP, Rothschild JM, Cronin JW, et al. Effect of reducing interns' work hours on serious medical errors in intensive care units. The New England Journal of Medicine. Oct 28 2004;351(18):1838-1848.
5. Lockley SW, Cronin JW, Evans EE, et al. Effect of reducing interns' weekly work hours on sleep and attentional failures. The New England Journal of Medicine. Oct 28 2004;351(18):1829-1837.
6. Shea JA, Willett LL, Borman KR, et al. Anticipated consequences of the 2011 duty hours standards: views of internal medicine and surgery program directors. Academic Medicine. Jul 2012;87(7):895-903.
7. Starmer AJ, Spector ND, Srivastava R, et al. Changes in medical errors after implementation of a handoff program. The New England Journal of Medicine. Nov 06 2014;371(19):1803-1812.
8. Cowie MR, Woehrle H, Wegscheider K, et al. Adaptive Servo-Ventilation for Central Sleep Apnea in Systolic Heart Failure. The New England Journal of Medicine. Sep 17 2015;373(12):1095-1105.
9. Van Dongen HP, Vitellaro KM, Dinges DF. Individual differences in adult human sleep and wakefulness: Leitmotif for a research agenda. Sleep. Apr 2005;28(4):479-496.
10. Baldwin DC, Jr., Daugherty SR. Sleep deprivation and fatigue in residency training: results of a national survey of first- and second-year residents. Sleep. Mar 15 2004;27(2):217-223.
11. AASM urges ACGME to limit resident work periods to 16 hours. 2016; http://www.aasmnet.org/articles.aspx?id=6647, 2017.
12. Volpp KG, Landrigan CP. Building physician work hour regulations from first principles and best evidence. JAMA: The Journal of the American Medical Association. Sep 10 2008;300(10):1197-1199.
13. Bell BM. Resident duty hour reform and mortality in hospitalized patients. JAMA: The Journal of the American Medical Association. Dec 26 2007;298(24):2865-2866; author reply 2866-2867.
14. AMSA and Public Citizen Send Complaint Letters Concerning FIRST and iCompare Trials. 2016; http://www.amsa.org/about/amsa-press-room/first-icompare-complaint-letters/, 2017.
15. Landrigan CP, Czeisler CA. A health-care change that could prove catastrophic. 2017; https://www.washingtonpost.com/opinions/a-health-care-change-that-could-prove-catastrophic/2017/02/22/2a4970d2-f30e-11e6-a9b0-ecee7ce475fc_story.html?utm_term=.aab3f9a29c45, 2017.
Disclosure: A version of this article was rejected from two major US medical journals in 2017.