

The Skeptics Guide to Emergency Medicine
Dr. Ken Milne
Meet ’em, greet ’em, treat ’em and street ’em

Mar 7, 2020 • 25min
SGEM#286: Behind the Mask – Does it need to be an N95 mask?
Date: March 4th, 2020
Reference: Radonovich et al. N95 Respirators vs Medical Masks for Preventing Influenza Among Health Care Personnel. A Randomized Clinical Trial. JAMA 2019 The Respiratory Protection Effectiveness Clinical Trial (ResPECT)
Guest Skeptics: Dr. Christopher Patey is an Assistant Professor with Memorial University Medical School in St. John’s, Newfoundland Canada. Over the past seventeen years he has practiced as a rural emergency and family physician and Clinical Chief of Emergency at Carbonear Hospital.
Paul Norman is a registered nurse working as a frontline emergency nurse in Eastern Health, Newfoundland, Canada. Paul has greater than ten years of experience working in Emergency Nursing and Critical Care. His focus is implementation of LEAN strategies, quality and process improvement. Paul's work has been extended to reach emergency services throughout Canada and he has contributed on many platforms including local, regional, provincial and national speaking engagements.
Disclaimers: This episode is about influenza not coronavirus (Covid-19)
Dr. Patey's Disclaimer: I am not an expert on PPE (Personal Protective Equipment), Influenza/H1N1/Coronavirus, Journal Reviews or Emergency Department management of pandemics.
Paul Norman's Disclaimer: We (Dr. Patey and I) are experts on asking questions on the frontline of a Rural Emergency Department to ensure quality, and most importantly, effective patient care.
Dr. Ken Milne's Disclaimer: I am an expert on critical appraisal but do not know what mask (if any) is best for preventing the Covid-19 virus.
I think we can all agree on a few general recommendations: Get a flu shot if possible, wash your hands well (at least 20 seconds with soap and water), try not to touch your face, avoid people who are sick, stay home if you are feeling ill, cough into a tissue and throw it out immediately or cough into your elbow, disinfect objects or surfaces with a regular household cleaning wipe or spray, people who are well do not need to wear a facemask, people who are feeling ill should wear a facemask, and reach out to your local health authority if you think you might have COVID-19.
Covid-19 Information:
This story is evolving quickly, and people should go to official websites to get the latest update on the COVID-19 situation:
Centers for Disease Control and Prevention
Health Canada
Public Health Ontario
World Health Organization
Food and Drug Administration
Case: With the potential global impact of the coronavirus (COVID-19) and our rural emergency departments (ED) having an extremely low compliance rate for N95 mask fit testing, our ED administration sends an urgent request for everyone to have N95 mask fit testing as soon as possible (ASAP). The urgent email also requests shaving facial hair. You wonder about the evidence supporting the initiative and whether there is any recent evidence on N95 mask use for preventing health care workers from getting acute respiratory illnesses.
Background: Many hospitals had their health care workers fitted with N95 masks in response to the 2009 H1N1 pandemic. N95 masks were known to filter small particles and were therefore thought to be more effective. What was not known was whether this better filtration would translate into fewer hospital-acquired viral respiratory infections compared with regular disposable medical (surgical) masks. In other words, would N95 masks improve a healthcare provider-oriented outcome?
When it appeared that the transmission of the pandemic H1N1 was not different from seasonal influenza the recommendation for medical masks in most settings was reinstated.
With the potential for an epidemic/pandemic outbreak of coronavirus, there is a demand for increased vigilance in measures to prevent and contain the outbreak of this communicable disease.
There have been a number of other studies discussing masks in preventing influenza spread:
Loeb et al 2009 did a non-inferiority trial of surgical masks vs. N95 respirator masks for preventing flu in Ontario nurses working at tertiary care hospitals. They concluded surgical masks were non-inferior.
MacIntyre et al 2009 did a cluster RCT on the use of face masks to control for respiratory virus transmission in households. They found face masks were unlikely to be an effective policy for seasonal respiratory diseases. This was in part because <50% of participants had mask adherence. Those who wore the mask did have a statistically significant reduction in clinical infection.
MacIntyre et al 2011 published another study comparing the efficacy of medical masks to fit-tested and non-fit-tested N95 respirators in preventing respiratory infections in hospital workers in China. The results showed a significant decrease in respiratory illnesses including influenza. The authors did caution readers that the trial may have been underpowered.
Smith et al CMAJ 2016 did a systematic review and meta-analysis on this topic. The authors concluded: “Although N95 respirators appeared to have a protective advantage over surgical masks in laboratory settings, our meta-analysis showed that there were insufficient data to determine definitively whether N95 respirators are superior to surgical masks in protecting health care workers against transmissible acute respiratory infections in clinical settings.”
Clinical Question: Are N95 masks superior to medical masks in preventing flu or flu-like illnesses in hospital workers?
Reference: Radonovich et al. N95 Respirators vs Medical Masks for Preventing Influenza Among Health Care Personnel. A Randomized Clinical Trial. JAMA 2019 The Respiratory Protection Effectiveness Clinical Trial (ResPECT)
Population: Full-time hospital employees, defined as those providing at least 24 hours of direct patient care a week. Participants were instructed to wear their assigned protective devices during a 12-week period (intervention period) in which the incidence of viral respiratory illness was expected to be highest that year, as identified by the ALERT algorithm. This amounted to 48 weeks of intervention spanning four consecutive viral respiratory seasons.
Intervention: N95 respirator mask. Employees were told to wear their masks when within six feet (two meters) of a person with a suspected or confirmed respiratory illness.
Control: Medical mask
Outcomes:
Primary Outcome: Incidence of laboratory-confirmed influenza.
Secondary Outcomes: Incidence of acute respiratory illness, laboratory-detected respiratory infections, laboratory-confirmed respiratory illness, and influenza like illness. Adherence to interventions was also assessed.
Authors’ Conclusions: “Among outpatient healthcare personnel, N95 respirators vs medical masks as worn by participants in this trial resulted in no significant difference in the incidence of laboratory-confirmed influenza.”
Quality Checklist for Randomized Clinical Trials:
The study population included or focused on those in the emergency department. No
The patients were adequately randomized. Yes
The randomization process was concealed. Yes
The patients were analyzed in the groups to which they were randomized. Yes
The study patients were recruited consecutively (i.e. no selection bias). Yes
The patients in both groups were similar with respect to prognostic factors. Yes
All participants (patients, clinicians, outcome assessors) were unaware of group allocation. No
All groups were treated equally except for the intervention. Yes
Follow-up was complete (i.e. at least 80% for both groups). Yes
All patient-important outcomes were considered. Unsure
The treatment effect was large enough and precise enough to be clinically significant. No
Key Results: This study was conducted at seven medical centers and 137 outpatient sites over four years (2011-2015) during the 3-month flu season. They enrolled 2,862 full time employees with a mean age of 43 years and 84% female. Nurses made up 41% of the cohort and less than 10% were physicians.
No statistically significant difference in laboratory-confirmed influenza between an N95 mask and a medical mask.
Primary Outcome: Laboratory-confirmed influenza
8.2% N95 respirator group and 7.2% medical mask group (difference, 1.0%, [95% CI: −0.5% to 2.5%]; P = 0.18)
Adjusted odds ratio (OR) was 1.18 (95% CI: 0.95 to 1.45)
Secondary Outcomes: No statistical difference in any of the secondary outcomes using an intention-to-treat (ITT) or per-protocol (PP) analysis.
Self-reported wearing of the mask “always” or “sometimes” was about 90% in both groups.
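For listeners who like to check the arithmetic, here is a minimal sketch of how the primary outcome numbers above hang together. It uses only the rounded percentages and confidence intervals quoted in these notes, not the trial's raw counts, so the crude odds ratio it prints is an unadjusted approximation and will not match the reported adjusted value exactly. The function and variable names are ours, for illustration only.

```python
# Figures as quoted in the show notes for the ResPECT primary outcome.
n95_rate = 0.082            # laboratory-confirmed influenza, N95 group
mask_rate = 0.072           # laboratory-confirmed influenza, medical mask group
diff_ci = (-0.005, 0.025)   # reported 95% CI for the risk difference
adj_or_ci = (0.95, 1.45)    # reported 95% CI for the adjusted odds ratio


def includes_null(ci, null_value):
    """Return True when the interval contains the no-effect value."""
    low, high = ci
    return low <= null_value <= high


def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)


risk_difference = n95_rate - mask_rate
# Unadjusted odds ratio built from rounded percentages (approximation only).
crude_or = odds(n95_rate) / odds(mask_rate)

print(f"Risk difference: {risk_difference:.1%}")
print(f"Crude odds ratio: {crude_or:.2f}")
print("Difference CI includes 0:", includes_null(diff_ci, 0.0))
print("Odds ratio CI includes 1:", includes_null(adj_or_ci, 1.0))
```

Both intervals contain their no-effect value (0 for a difference, 1 for a ratio), which is what "no statistical difference" means here.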
Self-Reporting: Health care workers self-reported any illness. This could have resulted in under or over reporting of being sick. Adherence to mask use was also self-reported. Of those reporting, 90% said they wore the mask always or sometimes. However, almost one-third in each group did not even report adherence. This further limits the interpretation of the results.
Lack of Physicians: Less than 10% of the cohort were physicians. This means we have much less data on this group of individuals. I also suspect physicians were less likely to follow mask recommendations. Unfortunately, the supplemental material did not break down how many physicians were in the physician, physician trainees or advanced practitioners’ cohort.
Outside of Work: Participants were not required to use the masks outside of their work setting. Employees had to have at least 24 hours/week of direct patient care to be included in the study. However, more time would have been spent out of the hospital/clinic setting. These outside influences/exposures could have an impact on the results.
Patient-Oriented Outcome: This study was focused on the employees.

Feb 29, 2020 • 36min
SGEM#285: And I See Your True Colours Calming You – From your Anxiety
Date: February 28th, 2020
Reference: Rajendran et al. Randomised control trial of adult therapeutic colouring for the management of significant anxiety in the Emergency Department. AEM February 2020
Guest Skeptic: Dr. Corey Heitz is an emergency physician in Roanoke, Virginia. He is also the CME editor for Academic Emergency Medicine.
Case: One night during an overnight shift, you are taking care of a patient who presented to the emergency department (ED) due to anxiety and vague suicidal ideation. The process for medical clearance and psychiatric evaluation can take quite a while, and you notice that this patient seems stressed and anxious. You wonder if there’s a way to assist them during the prolonged wait without resorting to sedative medication.
Background: Psychological disorders are a common reason for presenting to the ED. Anxiety disorders are the most common (Marchesi et al EMJ 2004). However, we have only covered mental health issues a few times on the SGEM:
SGEM#45: Vitamin H (Haloperidol for Psychosis)
SGEM#178: Mindfulness – It’s not Better to Burnout than it is to Rust
SGEM#218: Excited Delirium Syndrome
SGEM#237: Screening Tool for Child Sex Trafficking
SGEM#252: Blue Monday- Screening Adult ED Patients for Risk of Future Suicidality
Patients with psychological disorders are often kept in the ED for a prolonged period of time. The ED itself can be a stressful environment and exacerbate anxiety.
Emergency physicians have pharmaceutical options to treat anxiety. One of the most common medications to use is a benzodiazepine like lorazepam or diazepam.
There is a need for non-pharmacological therapies to treat anxiety, and in some settings, art therapy has been studied. Specifically, adult coloring books have been used in the community and seem to function through cognitive easing (Rigby et al BMJ 2016 and Curry et al Art Ther 2005).
Clinical Question: Can colouring decrease anxiety in adult patients presenting to the emergency department?
Reference: Rajendran et al. Randomised control trial of adult therapeutic colouring for the management of significant anxiety in the Emergency Department. AEM February 2020
Population: Patients >15 years old with a score of >6 on the Hospital Anxiety and Depression Scale Anxiety (HADS-A). A score of >6 is considered moderate to severe anxiety.
Intervention: Colouring pack (10 adult colouring pages and 36 pencil colours)
Comparison: Placebo pack (10 plain sheets of paper, a Bic pen and instructions to draw or write freely)
Outcome:
Primary Outcome: Within-patient change in HADS-A score from baseline after two hours of therapy.
Secondary Outcomes: Survey questions regarding value of therapy and level of engagement with treatment packs (length of time)
This is an SGEMHOP episode which means we have the lead author on the show. Dr. Naveen Rajendran is an intern at the Westmead Hospital in Sydney with a keen interest in emergency medicine and the investigation of novel therapies that could aid in alleviating the growing stress on modern emergency departments. This study was conducted when he was a medical student at the University of Sydney with Dr. Coggins (@coggi33) who was his research supervisor.
Authors’ Conclusions: “Among ED patients, exposure to adult colouring books resulted in lower self-reported levels of anxiety at 2-hours compared to placebo.”
Quality Checklist for Randomized Clinical Trials:
The study population included or focused on those in the emergency department. Yes
The patients were adequately randomized. Unsure
The randomization process was concealed. Yes
The patients were analyzed in the groups to which they were randomized. Yes
The study patients were recruited consecutively (i.e. no selection bias). Unsure
The patients in both groups were similar with respect to prognostic factors. Yes
All participants (patients, clinicians, outcome assessors) were unaware of group allocation. No
All groups were treated equally except for the intervention. Yes
Follow-up was complete (i.e. at least 80% for both groups). Yes
All patient-important outcomes were considered. Yes
The treatment effect was large enough and precise enough to be clinically significant. Yes
Key Results: They screened 179 patients that were flagged as being anxious. The cohort included 53 participants with a mean age of 33 years and 73% were female.
HADS-A decreased significantly more in the adult colouring group
Primary Outcome:
Intervention Group: Mean HADS-A decrease at two hours was 3.7 (95%CI 2.4 to 5.1, p<0.001)
Control Group: Mean HADS-A decrease at two hours: 0.3 (95%CI -0.6 to 1.2, p=0.51)
Secondary Outcomes:
For the question "would you recommend colouring" on a Likert Scale (1-5) the average satisfaction score was 4.2.
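As a quick check on the primary outcome, the sketch below subtracts the two mean HADS-A decreases quoted above and looks at whether each within-group confidence interval excludes zero. It is only a point-estimate comparison using the figures in these notes. It does not reproduce the authors' between-group test, and the dictionary layout and helper name are our own.

```python
# Mean HADS-A decrease at two hours, as quoted in the show notes.
intervention = {"mean_decrease": 3.7, "ci": (2.4, 5.1)}   # colouring pack
control = {"mean_decrease": 0.3, "ci": (-0.6, 1.2)}       # placebo pack


def ci_excludes_zero(ci):
    """Return True when the interval lies entirely on one side of zero."""
    low, high = ci
    return low > 0 or high < 0


# Point estimate of how much more the colouring group improved.
between_group_gap = round(
    intervention["mean_decrease"] - control["mean_decrease"], 1
)

print("Extra decrease with colouring:", between_group_gap)
print("Intervention CI excludes 0:", ci_excludes_zero(intervention["ci"]))
print("Control CI excludes 0:", ci_excludes_zero(control["ci"]))
```

A within-group interval that excludes zero in one arm and not the other is suggestive, but the between-group comparison is the one that answers the clinical question.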
We asked Naveen ten questions to get a greater understanding of his publication. Listen to the SGEMHOP podcast to hear all of his answers.
Single Centre: This was a relatively small sample size of 53 patients. However, you did recruit enough to meet your power calculation of 48 participants to find a 2.5-point decrease with 80% power. We were more concerned that this was conducted in a single centre, which raises the question of external validity to other populations.
Consecutive Patients: We are unsure if this was a consecutive sample. The methods section says: “all patients in the ED were potentially eligible for the study.” However, patients needed to be flagged by residents, consultants, triage nurses or social workers as being “anxious”. People have unconscious biases and this method could have introduced some selection bias. Why not just ask patients if they were feeling anxious and then ask them to be included in the trial?
Exclusions: A significant number of patients were excluded after initial screening. Can you discuss how this might affect real-world utility of something like this?
Lack of Blinding: The patients would know if they were in the colouring pack vs. placebo pack. Could this have impacted the results?
Blinding to Hypothesis: Were the patients, clinicians, and outcome assessors blinded to the research hypothesis?
HADS-A Scoring: The HADS-A has been validated in various languages and groups of patients. You say this anxiety scoring system has been validated in the ED setting. We pulled that study and it was done in Saudi Arabia (Al Aseri et al BMC Emerg Med 2015). Has it been validated in any other countries like the USA or Canada?
Placebo Control: There is a difference between a placebo control and an active control. Can you discuss how your placebo control group is a true placebo? It seemed to us more like an active control group. How is the activity such as coloring so different from having a pen and paper and being told to occupy yourself with them?
Medication: You compared the colouring activity to the placebo pack (Bic pen, plain paper and encouragement to draw). Why not compare it to usual care such as a benzodiazepine?
Magnitude of Effect: The intervention decreased the HADS-A score by 3.4 more than the control. While it was statistically significant, is this observed decrease clinically significant?
Duration of Effect: Your primary outcome was at two hours. Did you measure any anxiety outcomes after the activity has ended? Do we know how long it takes someone to return to a high anxiety level once art therapy is removed?
Conflicts of Interest: Did you receive any funding or support from the adult colouring book industry?
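The power calculation mentioned under Single Centre (48 participants, a 2.5-point decrease, 80% power) can be sketched with the standard normal-approximation formula for comparing two means. This is not the authors' calculation. We are assuming a two-arm design with equal allocation and a two-sided alpha of 0.05, and the standard deviation is not given in these notes, so the values passed in below are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the critical values for a two-sided alpha and the given power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return z_alpha + z_beta


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants per arm needed to detect a mean difference of `delta`."""
    return ceil(2 * (_z_sum(alpha, power) * sd / delta) ** 2)


def implied_sd(delta, n_group, alpha=0.05, power=0.80):
    """Back-solve the SD that a given per-arm sample size would correspond to."""
    return delta * (n_group / 2) ** 0.5 / _z_sum(alpha, power)


# Hypothetical SD of 3.0 HADS-A points (an assumption, not from the paper).
print("Per arm with SD 3.0:", n_per_group(delta=2.5, sd=3.0))
# If 48 participants means 24 per arm (also an assumption), the implied SD is:
print("Implied SD for 24 per arm:", round(implied_sd(delta=2.5, n_group=24), 2))
```

Under these assumptions a target of 48 corresponds to a standard deviation of roughly three points. The required sample size climbs quickly if the true variability is larger.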
Comment on Authors’ Conclusion Compared to SGEM Conclusion: We agree with the authors’ conclusions.
SGEM Bottom Line: Art therapy in the form of coloring may be a useful non-pharmacologic alternative treatment for ED patients with anxiety.
Case Resolution: You provide your patient with an adult coloring book and coloring pencils. Two hours later, they seem calmer, and their ED visit is almost over. They thank you for providing them something to ease their mind during their stay.
Clinical Application: Adult coloring books are a low risk and potentially rewarding non-pharmacologic way to treat anxiety in the ED.
What Do I Tell the Patient? You seem anxious, and this visit may take some time. Some people have found that being able to spend some time colouring can help them cope with the stress of an ED visit. Would you like some supplies and try doing some colouring?
Keener Kontest: Last week’s winner was Jonathan Godfrey. He knew PARACHUTE stood for: "PArticipation in RAndomized trials Compromised by widely Held beliefs aboUt lack of Treatment Equipoise".
Listen to the SGEM podcast to hear this week’s question. Send your answer to TheSGEM@gmail.com with “keener” in the subject line. The first correct answer will receive a cool skeptical prize.
SGEMHOP: Now it is your turn SGEMers. What do you think about using adult colouring books to deal with patients’ anxiety in the ED? Tweet your comments using #SGEMHOP. What questions do you have for Naveen and Andrew and his team? Ask them on the SGEM blog. The best social media feedback will be published in AEM.
Also, don’t forget those of you who are subscribers to Academic Emergency Medicine can head over to the AEM home page to get CME credit for this podcast and article. We will put the process on the SGEM blog:
Go to the Wiley Health Learning website
Register and create a log in
Search for Academic Emergency Medicine – “February”
Complete the five questions and submit your answers
Please email Corey (coreyheitzmd@gmail.com) with any questions or difficulties.
Remember to be skeptical of anything you learn, even if you heard it on the Skeptics’ Guide to Emergency Medicine.

Feb 22, 2020 • 39min
SGEM Xtra: Right, You’re Bloody Well Right, You’ve got the Bloody Right to Care
Date: January 27th, 2020
Guest Skeptics: Dr. Richelle Cooper is a Professor of Emergency Medicine at the UCLA Department of Emergency Medicine. Dr. Maia Dorsett is an Emergency and EMS Physician at the University of Rochester Medical Center.
Reference: Dorsett et al. Bringing value, balance and humanity to the emergency department: The Right Care Top 10 for emergency medicine. Emerg Med J 2019
This is an SGEM Xtra based on a recent publication by Dr. Dorsett and her team. It is an article of ten recommendations on how we might provide a more balanced approach to healthcare tailored to the needs of the patients we see in the emergency department. One of the authors of the article was the Legend of Emergency Medicine, Dr. J. Hoffman.
SGEMers have heard about over-testing, over-diagnosing and over-treating. These authors have some concerns about what they call the unmentioned "elephant in the room".
"While specialty societies do undertake advocacy work to address the health needs of the public, they also have a fundamental duty to advocate for and protect the interests of their specialty. Furthermore, healthcare dollars that are ‘wasted’ are of course not actually thrown away but rather end up in someone’s pocket; thus, there is clearly a conflict of interest when specialty societies address the overuse of extremely lucrative medical procedures that provide substantial income to their members."
Choosing Wisely is an initiative trying to address the issue of over-testing, over-diagnosing and over-treating. To be clear, these authors are not against Choosing Wisely.
"Important to note that we are not against choosing wisely, however the issue is larger and more nuanced. It is not just about “low value” care and costs but about harms, harms from overuse of diagnostic tests and treatment and also from underuse in other cases. The right care alliance is concerned about the right care for the right patients at the right time, thus not just overused tests."
The organization this group of authors is associated with is called the Right Care Alliance (RCA). How is it different from the Choosing Wisely Campaign?
"The Right Care Alliance was formed in 2015 by the Lown Institute, a healthcare think tank. Many of us, such as myself, became involved with the work of the Lown because of our interest in reducing the harms of overtesting and overdiagnosis. But we quickly realized that talking about Right Care was actually a conversation about the Right amount of care and that this was more than just about too much care, it was also about underuse, health care access and a focus on treating the whole patient. It was this realization – that we cannot address overuse without talking about underuse - that led to the formation of the RCA. The powerful part of the RCA is that it is a grassroots coalition of not just healthcare practitioners, but also patients and community members."
Where does emergency medicine fit into the RCA initiative?
"Nowhere in healthcare is the unfortunate dichotomy between overuse and underuse as apparent as in our emergency departments, which function simultaneously as centers of high acuity healthcare and healthcare safety nets. Organizationally, the RCA has a number of subcommittees or “councils”. The Emergency Medicine (EM) Council is one of these subgroups and is composed primarily of emergency physicians and nurses."
"In May 2016, the RCA asked its specialty councils to create their own ‘top 10’ lists. The goal was to identify not merely interventions that are overused but also others that need to be used more widely, if we are to achieve both better and more equitable health outcomes and financial savings."
What were the guiding principles put forward by the RCA to generate the top 10 list?
Guiding Principles for Top 10 List:
Patient-centred
Holistic in approach
Understandable to both healthcare professionals and non-health care professionals
Meaningful to everyone who participates in the healthcare system
Criteria Used to Select the Top 10 Items:
Matter to patients
Have high potential to harm or to benefit
Be common (overuse) or rare (underuse) enough that avoiding or doing the item routinely would move the needle towards the right care
Examine or illustrate how it ties to system failures.
The committee was predominantly made up of emergency physicians, including residents, faculty and community physicians, and emergency medicine nurses.
Patients were invited to participate on all the committees, and it was required that members of the Patient council review and provide input to all lists.
The Emergency Medicine (EM) members of the RCA were all invited to participate; ultimately, 125 gave input on potential items. They participated in each part of the scoring and ranking and in a smaller group for the discussion of the items. Similarly, Maia presented and received input from patients/patient advocates at a Lown conference.
Two Overriding Principles of the EM Right Care Top 10 List:
"‘The quixotic search for certainty’ describes the all too common attempt by clinicians to find the last few patients who may be in danger even though an evaluation has shown that risk is minimal. Along with this fear of missing even a single patient with a serious problem, most clinicians have been taught to believe (incorrectly) that ‘tests’ are more ‘objective’ than clinical judgement and, thus, that doing more is ‘safer’ and more ‘evidence based’."
"Medical care is not the sole, or even the most important, determinant of health outcomes. Social determinants—including, but not limited to, food insecurity, homelessness and addiction—are profoundly important to the health of a great many patients. These issues must be addressed as part of the larger healthcare system, but it is also critical that ED clinicians pay attention to and address social factors in their patients, individual by individual".
EM Right Care Top 10 List:
Listen to the SGEM podcast to hear Drs. Dorsett and Cooper expand on each of these items.
Avoid further testing beyond history, physical exam, clinical gestalt and ECG in patients who are at minimal risk of an acute coronary syndrome (ACS).
Avoid further testing beyond history, physical exam and clinical gestalt in patients who are at minimal risk of pulmonary embolus (PE).
Be judicious with the use of imaging, especially advanced imaging, in trauma patients.
Avoid routine laboratory testing.
Consider non-medical reasons for a patient’s presentation to the ED.
Tailor the intensity of care to the goals of the patient.
Employ shared decision-making (SDM) where appropriate.
When prescribing an intervention, make an effort to ensure that the patient is capable of accomplishing what is recommended.
Tailor discharge instructions and follow-up recommendations to the individual patient.
Be an advocate.
Dr. Cooper
Conclusion: "The RCA is working to change the conversation about American healthcare, advocating for access for all individuals to high-quality care without financial hardship, eliminating overuse and underuse, and championing the partnership between the patient and clinician. The EM Council’s top 10 list seeks to serve as a starting point to focus ED clinicians in achieving the goals of the RCA. While other lists exist, and we agree with many Choosing Wisely areas of focus, we seek to move the needle even further. In what is ultimately an impossible attempt never to miss a single case with a life-threatening diagnosis, we paradoxically cause a great deal of harm to the overall population through over-testing and contribute to the untenable rising cost of healthcare."
Dr. Dorsett
"When we fail to spend the time needed to understand the context of our patients’ lives outside of the ED, we miss the opportunity to improve the patient’s health. While some problems are big and may take decades to fix, micro-changes in our daily practice— listening more, ordering more thoughtfully—are possible today. One patient at a time, one shift at a time, one ED, one hospital and one community at a time, we as clinicians need to help drive the change. We do not need more research to show unnecessary testing is occurring; we need effective means to implement change and support clinicians in putting the best interests of their patients first."
The SGEM will be back next episode doing a structured critical appraisal of a recent publication. Trying to cut the knowledge translation window down from over 10 years to less than 1 year using the power of social media. So, patients get the best care, based on the best evidence.
Remember to be skeptical of anything you learn, even if you heard it on the Skeptics’ Guide to Emergency Medicine.

Feb 15, 2020 • 20min
SGEM#284: Might as Well Jump, but We would Recommend a Parachute
Date: February 11th, 2020
Reference: Yeh et al. Parachute use to prevent death and major trauma when jumping from aircraft: randomized controlled trial. BMJ 2018.
Guest Skeptic: Marcus Prescott is a nurse in Norway. He is also now a third-year medical student.
Case: A 32-year-old woman with no previous medical history calls you while a passenger on a crashing plane. She has been offered a parachute by the flight attendant but is unsure whether jumping from the plane is wise. You quickly scour the literature for evidence to inform her decision.
Background: The parachute, an umbrella term for devices that slow the motion of an object through an atmosphere by creating drag, was first deployed in China roughly 4,000 years ago. The modern versions reached widespread use with the invention of heavier-than-air flight early last century.
Different variants of parachutes have been used both for recreational and safety purposes; in either case aiming to avoid death in people falling from heights presumed to be lethal. Despite the near universal application, a systematic review from 2003 (Smith and Pell, BMJ) found no RCTs of parachute intervention.
That systematic review published in the BMJ is a classic paper and part of their annual holiday edition. It stated that there was observational data showing parachutes failed at times to prevent morbidity and mortality. There are also case reports of free falls that did not result in 100% mortality.
The authors suggested taking evidence-based medicine advocates up in a plane for a double blinded randomized control trial. The intervention would be a parachute and the control arm would be a sham parachute (backpack). To make it more rigorous, anyone who survived the first jump would cross over into the other arm of the study and jump again. Only then would we have definitive evidence that a parachute was effective in preventing death and major trauma related to gravitational challenges.
After years of trying to organize a trial, researchers were finally able to recruit some volunteers to jump out of a plane with a parachute or backpack.
Clinical Question: Do parachutes reduce death or major injury when jumping from aircraft?
Reference: Yeh et al. Parachute use to prevent death and major trauma when jumping from aircraft: randomized controlled trial. BMJ 2018.
Population: Adults 18 years of age and older, seated on aircraft and deemed rational decision makers.
Intervention: Jumping from aircraft with parachute
Comparison: Jumping from aircraft with backpack
Outcome:
Primary Outcome: Composite of death and major traumatic injury (ISS>15) within five minutes of impact or at 30 days.
Secondary Outcomes: Health status and subgroup analysis based on type of aircraft or previous parachute use.
Authors’ Conclusions: “Parachute use did not significantly reduce death or major injury when jumping from aircraft in the first randomized evaluation of this intervention. However, the trial was only able to enroll participants on small stationary aircraft on the ground, suggesting cautious extrapolation to high altitude jumps. When beliefs regarding the effectiveness of an intervention exist in the community, randomized trials might selectively enroll individuals with a lower perceived likelihood of benefit, thus diminishing the applicability of the results to clinical practice.”
Quality Checklist for Randomized Clinical Trials:
The study population included or focused on those in the emergency department. No
The patients were adequately randomized. Yes
The randomization process was concealed. Yes
The patients were analyzed in the groups to which they were randomized. Yes
The study patients were recruited consecutively (i.e. no selection bias). No
The patients in both groups were similar with respect to prognostic factors. Unsure
All participants (patients, clinicians, outcome assessors) were unaware of group allocation. No
All groups were treated equally except for the intervention. Yes
Follow-up was complete (i.e. at least 80% for both groups). Yes
All patient-important outcomes were considered. Yes
The treatment effect was large enough and precise enough to be clinically significant. No
Key Results: They screened 92 adults with only 23 agreeing to be in the trial. The median age was 38 years and 43% were female.
Parachutes did not reduce death or major injury
Primary Outcome:
Composite of death and major traumatic injury (ISS>15) within five minutes of impact was 0% vs. 0% with p>0.9
Composite of death and major traumatic injury (ISS>15) within 30 days was 0% vs. 0% with p>0.9
Secondary Outcomes:
No statistical difference in health status
No statistical differences when stratified by type of aircraft or previous parachute use.
Talk Nerdy: There were many limitations to this study including a composite outcome for the primary outcome. However, we will only discuss five things that threaten the validity and interpretation of this trial.
Convenience Sample: These were not consecutive adults sitting on an airplane. Participants were selected from those seated next to the recruiter. This could have introduced some selection bias into the study population. When we use the term “bias” we are not talking about random noise in the data but rather something that systematically moves us away from the true point estimate.
Lack of Blinding: Allocation to parachute or backpack was not concealed to the investigator who assigned the treatment. This too could have led to some selection bias. The groups were unbalanced with more frequent fliers in the control (backpack) group. This may or may not have impacted the results.
Ikea Bias: Most of the participants who were randomized were study investigators. They would be unblinded to the study hypothesis and could be more invested in the results because they helped design the study. Whether or not this would have a significant impact on the results is unclear.
Lack of Deployment: In the intervention arm, none of the 12 participants had their parachute open. This makes the trial very difficult to interpret. If the parachute did deploy properly would it have provided a benefit? However, none of the 12 participants died or were injured because the parachute did not open during the jump.
Fatal Flaw: There was a difference between participants and non-participants. Participants jumped from a mean altitude of 0.6m traveling at a velocity of 0km/hr. This is in comparison to the non-participants who were at a mean altitude of 9,000m and traveling at a velocity of 800km/hr.
Comment on Authors’ Conclusion Compared to SGEM Conclusion: We generally agree with the authors’ conclusions.
SGEM Bottom Line: Wear a parachute if jumping out of a moving aircraft in the air to prevent morbidity and mortality.
Case Resolution: Despite the lack of high-quality evidence demonstrating the efficacy of parachutes, you advise your friend to use the parachute being offered by the flight attendant.
Clinical Application: Based on your understanding of physics and reality, you would recommend people use parachutes if jumping out of an aircraft that is flying. While it does not guarantee you will not be injured or die it is the best evidence we have on the topic. In addition, more research is not needed to determine if parachutes prevent morbidity or mortality due to gravitational challenges.
What Do I Tell the Passenger? Accept the parachute being provided by the flight attendant.
Keener Kontest: Last week’s winner was Jonathan Carter. He knew Kingston was the first capital of Canada.
Listen to the podcast to hear this week’s question. Send your answer to TheSGEM@gmail.com with “keener” in the subject line. The first correct answer will receive a cool skeptical prize.
Other FOAMed:
Hayes et al. Most medical practices are not parachutes: a citation analysis of practices felt by biomedical authors to be analogous to parachutes. CMAJ 2018
Potts and Grossman. Parachute approach to evidence based medicine. BMJ 2006
Mamas. What a Parachute Study Tells Us About RCTs. Medscape 2018
First10EM: Finally, an RCT of parachutes
Remember to be skeptical of anything you learn, even if you heard it on the Skeptics’ Guide to Emergency Medicine.

Feb 8, 2020 • 24min
SGEM#283: Can You Be Absolutely Right in Diagnosing a SAH Using a Clinical Decision Instrument?
Date: January 29th, 2020
Reference: Perry et al. Prospective Implementation of the Ottawa Subarachnoid Hemorrhage Rule and 6-Hour Computed Tomography Rule. Stroke 2019
Guest Skeptic: Dr. Rory Spiegel is an EM/CC doctor who splits his time in the Emergency Department and Critical Care department. He also has this amazing #FOAMed blog called EM Nerd.
Case: A 48-year-old male presents to your emergency department with a sudden onset headache, which started about one hour prior to arrival. The headache is severe in quality and the patient does not have a history of similar headaches in the past. It is associated with nausea, vomiting and photophobia.
Background: Headaches are a common complaint presenting to the emergency department. Subarachnoid hemorrhage represents one of the most serious underlying causes of headaches and we have covered it a number of times on the SGEM:
SGEM#48: Thunderstruck – Subarachnoid Hemorrhage
SGEM#134: Listen, to what the British Doctors Say about LPs post CT for SAH
SGEM#140: CT Scans to Rule Out Subarachnoid Hemorrhages in A Non-Academic Setting
SGEM#201: It’s in the Way That You Use It – Ottawa SAH Tool
In patients who present neurologically intact, making the diagnosis early is key to preventing subsequent, more life-threatening bleeding. A number of controversies surround the diagnosis of SAH in the emergency department. Two of the more provocative are the use of the Ottawa SAH Rule and whether a lumbar puncture (LP) is required following a negative CT if the scan is performed within 6-hours of symptom onset.
The Ottawa SAH Rule (tool) was covered on SGEM#201. The bottom line from that study was that the clinical decision instrument needed external validation, a meaningful impact analysis, and an assessment of patient acceptability of incorporating the rule into a shared decision-making instrument before being widely adopted.
We were surprised that in their background/introduction material they did not include the excellent SRMA on this topic by Carpenter et al. AEM 2016.
Clinical Question: What is the clinical impact of the Ottawa SAH Rule and the 6-hour CT Rule compared to standard care when implemented in six emergency departments across Canada?
Reference: Perry et al. Prospective Implementation of the Ottawa Subarachnoid Hemorrhage Rule and 6-Hour Computed Tomography Rule. Stroke 2019
The senior author on this publication was the legend of emergency medicine, Dr. Ian Stiell from Ottawa.
Population: Neurologically intact adult presenting to the ED with a chief complaint of a nontraumatic, acute headache, or syncope associated with a headache.
Exclusions: Patients with any of the following:
3 or more previous similar headaches (ie, same intensity/character as their current headache) over a period of >6 months (eg, established migraines)
confirmed SAH before arrival at study ED
previously investigated with CT and LP for the same headache
papilledema
new focal neurological deficit
previous diagnosis of intracranial aneurysm or SAH
known brain neoplasm
cerebroventricular shunt
headache within 72 hours following a LP
headache described as gradual or peak intensity beyond 1 hour.
Intervention: Physicians were actively encouraged to use the Ottawa SAH Rule and the 6-hour-CT Rule to determine when to undertake a diagnostic workup for SAH and when a CT alone was an appropriate workup. Clinicians had the option to override the proposed rules.
Comparison: The control phase was standard care. Clinicians were encouraged to not use any clinical decision instrument and make the decision to pursue diagnostic studies based on their own clinical discretion.
Outcome: The primary outcome was the clinical impact of the Ottawa SAH Rule and 6-hr CT Rule for making the diagnosis of a SAH compared to usual care. SAH was defined as:
Subarachnoid blood on CT
Xanthochromia in the cerebrospinal fluid
Red blood cells in the final tube of cerebrospinal fluid with an aneurysm demonstrated on cerebral angiography, CTA, or magnetic resonance imaging angiography.
Dr. Jeff Perry
Authors’ Conclusions: “This implementation study validates the accuracy of the Ottawa SAH rule and 6-hour-CT rule for SAH. Both the Ottawa SAH rule and the 6-hour-CT rule are now fully validated and ready to use clinically. Using the Ottawa SAH rule did not increase or decrease the number of investigations performed. The 6-hour-CT rule resulted in a modest decrease in testing following a normal early CT. Utilizing the Ottawa SAH rule and the 6-hour-CT rule allows clinicians in ED to safely standardize care for alert, patients with acute headache.”
Quality Checklist for A Diagnostic Study:
The clinical problem is well defined. Yes
The study population represents the target population that would normally be tested for the condition (ie no spectrum bias). Yes
The study population included or focused on those in the emergency department. Yes
The study patients were recruited consecutively (ie no selection bias). Yes
The diagnostic evaluation was sufficiently comprehensive and applied equally to all patients (ie no evidence of verification bias). No
All diagnostic criteria were explicit, valid and reproducible (ie no incorporation bias) No
The reference standard was appropriate (ie no imperfect gold-standard bias). No
All undiagnosed patients underwent sufficiently long and comprehensive follow-up (ie no double gold-standard bias). Unsure
The likelihood ratio(s) of the test(s) in question is presented or can be calculated from the information provided. Yes
The precision of the measure of diagnostic performance is satisfactory. Yes
Key Results: They had 3,672 patients who met inclusion criteria. There were 1,743 patients in the control phase of the study and 1,929 patients in the implementation phase. The mean age was 45 years and 60% were female. A SAH was identified in 188 patients (5.1%).
Ottawa SAH Rule:
Sensitivity 100% (95% CI 98.1% to 100%)
Specificity 12.7% (95% CI: 11.7% to 13.9%)
6hr CT Rule:
Sensitivity 95% (95% CI 89.8% to 98.5%)
Specificity 100% (95% CI: 99.7% to 100%)
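The quality checklist notes that likelihood ratios can be calculated from the information provided. As a rough sketch of that arithmetic, the snippet below derives them from the point estimates reported above and then converts a pre-test probability into a post-test probability. The function names are illustrative, the confidence intervals are ignored, and using the overall 5.1% prevalence as the pre-test probability for an early CT is an assumption made only for the example.

```python
import math

def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    false_positive_rate = 1.0 - specificity
    # With no false positives the positive likelihood ratio is unbounded.
    lr_pos = math.inf if false_positive_rate == 0 else sensitivity / false_positive_rate
    # With zero specificity the negative likelihood ratio is unbounded.
    lr_neg = math.inf if specificity == 0 else (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    if math.isinf(post_odds):
        return 1.0
    return post_odds / (1.0 + post_odds)

# Point estimates as reported in the key results.
ottawa_lr_pos, ottawa_lr_neg = likelihood_ratios(1.00, 0.127)
ct_lr_pos, ct_lr_neg = likelihood_ratios(0.95, 1.00)

# Assumption: the overall cohort prevalence (5.1%) stands in for pre-test probability.
after_negative_ct = post_test_probability(0.051, ct_lr_neg)

print(f"Ottawa SAH Rule: LR+ {ottawa_lr_pos:.2f}, LR- {ottawa_lr_neg:.2f}")
print(f"6-hr CT Rule: LR+ {ct_lr_pos}, LR- {ct_lr_neg:.2f}")
print(f"Probability of SAH after a negative early CT: {after_negative_ct:.2%}")
```

A positive likelihood ratio close to 1 for the Ottawa SAH Rule is the numerical face of its low specificity: a positive rule barely shifts the probability of disease.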
1. Patient Population: A fairly wide group of patients was considered for this study. With a rule like the Ottawa SAH Rule, where the specificity is so low, you would ideally apply it in a population at high risk for the disease. So, in patients in whom I am already considering a workup for SAH, if the Ottawa SAH Rule is negative, I can stop the workup. This would be similar to the PERC rule. Applying the Ottawa SAH Rule in a more generalized group of patients may lead to an increase in downstream testing.
In contrast, this may have helped the 6-hr CT Rule as not a lot of these patients (5%) ended up having a SAH. The prevalence did go up to 9% when only the subset of patients presenting within 6-hrs of symptom onset was included.
2. Gold Standard: The gold standard here is a bit complicated. Ideally, you would like a measure that accurately diagnoses SAH, and it would be preferable if the investigators used this same measure on all patients included in the study. But that is not always practical in real world studies. So, in this case, you would ideally like everyone to receive an LP and then some form of angiography to assess for aneurysm if the LP was positive. Obviously, it’s impractical and ethically questionable to perform an LP and angiography on all the patients in this study, so the authors had to use different gold standards depending on what was found on the initial CT scan. This can lead to a number of forms of bias.
Incorporation bias occurs when results of the test under study are actually used to make the final diagnosis. This makes the test appear more powerful by falsely raising the sensitivity and specificity.
In this case, subarachnoid blood seen on the CT scan was included in the gold standard definition of SAH. Obviously, this will make the specificity of the CT scan appear really good and, in this case, it was 100%
Partial verification bias is a type of measurement bias in which the results of a diagnostic test affect whether the gold standard procedure is used to verify the test result. This type of bias is also known as “work-up bias” or “referral bias”.
In this case, patients with a negative CT did not always undergo an LP. Since not all patients underwent the gold standard testing, this can influence the diagnostic accuracy of the test in question. Here the 6-hr CT Rule may appear more accurate than it is in reality: if some SAHs are missed on CT and those patients do not undergo an LP, they may be counted as true negative results.
3. Proxy Outcome Measure: In cases when a consistent gold standard cannot be used on all subjects, a proxy measure can be used in its place. In this case the authors used the proxy outcome of alive and well at 6-months as a surrogate for not having a SAH. This seems like a reasonable surrogate. If you had a headache, did not receive any intervention for an aneurysm and did not have a SAH, the likelihood that your initial headache was a herald bleed is minimal.
This is known as differential verification bias (double gold standard). This occurs when the test results influence the choice of the reference standard. So, a positive index test gets an immediate/gold standard test whereas the patients with a negative index test get clinical follow-up for disease. This can raise or lower sensitivity/specificity.
The question is what is an adequate definition of not having a SAH on 6-month follow up? The authors reviewed the medical records of the hospital to which patients initially presented as well as every hospital with neurosurgical capacity in the same city as the index ED visit. Is this adequate follow up?
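To make the partial verification bias described above more concrete, here is a small worked sketch. All counts are hypothetical and are not taken from the trial. It assumes a CT that truly detects 95 of 100 SAHs and shows how the apparent sensitivity climbs when only some CT-negative patients go on to an LP, so that unverified misses never get counted as false negatives. It ignores the 6-month follow-up, which would be expected to catch some of those misses.

```python
def apparent_sensitivity(true_positives, missed_cases, fraction_verified):
    """Sensitivity as a study would report it when only a fraction of
    index-test negatives receive the reference standard.

    Missed cases that are never verified are silently counted as true
    negatives, so they drop out of the denominator.
    """
    detected_misses = missed_cases * fraction_verified
    return true_positives / (true_positives + detected_misses)

# Hypothetical cohort: 100 true SAHs, CT picks up 95 and misses 5.
full_verification = apparent_sensitivity(95, 5, fraction_verified=1.0)
partial_verification = apparent_sensitivity(95, 5, fraction_verified=0.4)
no_verification = apparent_sensitivity(95, 5, fraction_verified=0.0)

print(f"Every CT-negative patient gets an LP: {full_verification:.3f}")
print(f"40% of CT-negative patients get an LP: {partial_verification:.3f}")
print(f"No CT-negative patient gets an LP: {no_verification:.3f}")
```

The fewer CT-negative patients who are verified, the closer the apparent sensitivity drifts toward 100%, regardless of how the test truly performs.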

Jan 29, 2020 • 3min
SGEM#281ss: Balance of Prognostic Factors in Randomized Controlled Trials
Date: January 25th, 2020
SGEM#281: EM Docs Got an AmbuBag
Statistically Significant: Dan Lane
We want to make the SGEM even better and address some of the criticisms from the ClinEpi world about clinicians trying to do critical appraisal. In order to do that we now have Dr. Dan Lane, who has a PhD in Clinical Epidemiology. He will be commenting on each of the SGEM episodes.
Dr. Dan Lane
On this episode of Statistically Significant we are going to discuss the importance of balance of prognostic factors in randomized controlled trials, using the PreVent trial as an example.
Characteristics that indicate when a patient is more likely to have an outcome, what we call prognostic factors, need to be accounted for when assessing the effectiveness of a treatment. Without accounting for prognostic factors, the measures of treatment effect can be biased due to observed or unobserved factors amongst patients in each group. Consider if this same study had been conducted as a non-randomized design. Clinicians may have decided to ventilate select patients between induction and intubation because they perceived them as more unstable prior to induction. These patients may also be at higher risk for hypoxia during this period for the same reasons the clinicians chose to ventilate them, and therefore they would look worse when compared to patients not receiving ventilation if you did not account for these reasons. This is what epidemiologists call an indication bias.
The goal of randomization in clinical trials is to balance patient characteristics between the different groups being investigated in the study. By randomly assigning patients to groups, the sole indication for receiving the treatment is the randomization process. As long as there are enough patients randomized, all known and unknown prognostic factors will be mathematically balanced between the groups. Therefore when talking about the balance of prognostic factors as part of critical appraisal, the key point to realize is there are both known and unknown factors. Although in this study they found some statistical differences between measured prognostic factors at baseline, these are just the prognostic factors that happen to be reported by the investigators. If we trust their randomization process then we can assume that the overall risk of the primary outcome, which includes measured and unmeasured prognostic factors, is mathematically balanced between the groups.
One final point - the use of statistical hypothesis testing to compare prognostic factors is actually inappropriate here because by definition the null hypothesis that the two groups are the same is assumed to be true when the two groups are selected based on randomization. Therefore, any differences between the groups would be due to chance alone and considering them different would be a type 1 error.
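The idea that randomization balances prognostic factors on average, with chance imbalances shrinking as more patients are randomized, can be illustrated with a small simulation. This is a sketch only: the 30% prevalence of the prognostic factor, the trial sizes, and the function name are all hypothetical choices, not values from the PreVent trial.

```python
import random
import statistics

def mean_imbalance(n_patients, prevalence=0.3, n_trials=1000, seed=0):
    """Average absolute between-arm difference in the proportion of patients
    carrying a binary prognostic factor, over many simulated 1:1 trials."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        arm_a, arm_b = [], []
        for _ in range(n_patients):
            has_factor = rng.random() < prevalence
            # Simple randomization: a coin flip decides the arm.
            (arm_a if rng.random() < 0.5 else arm_b).append(has_factor)
        if arm_a and arm_b:  # skip the rare trial where one arm is empty
            gap = abs(sum(arm_a) / len(arm_a) - sum(arm_b) / len(arm_b))
            gaps.append(gap)
    return statistics.mean(gaps)

small_trial = mean_imbalance(40)
large_trial = mean_imbalance(400)

print(f"Mean imbalance with 40 patients: {small_trial:.3f}")
print(f"Mean imbalance with 400 patients: {large_trial:.3f}")
```

Any single trial can still show a baseline difference, but the difference arises from chance alone and gets smaller as the sample grows, which is the point made above about not testing baseline characteristics for significance.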
Additional Reading:
Altman and Bland. Treatment allocation in controlled trials: why randomise? BMJ May 1999
Sander Greenland. Randomization, statistics, and causal inference. Epidemiology Nov 1990
Stephen Senn. Baseline Balance and Valid Statistical Analyses: Common Misunderstandings. Applied Clinical Trials. May 2005.
REMEMBER TO BE SKEPTICAL OF ANYTHING YOU LEARN, EVEN IF YOU HEARD IT ON THE SKEPTICS’ GUIDE TO EMERGENCY MEDICINE.

Jan 25, 2020 • 15min
SGEM#281: EM Docs Got an AmbuBag – The PreVent Trial
Date: January 9th, 2020
Reference: Casey et al. Bag-Mask Ventilation during Tracheal Intubation of Critically Ill Adults. NEJM February 2019
Guest Skeptic: Andrew Merelman is a critical care paramedic and second year medical student at Rocky Vista University in Colorado. His primary interests are resuscitation, critical care, airway management, and point-of-care ultrasound.
Case: A 60-year-old male is in your emergency department with sepsis from pneumonia. He has worsening work of breathing and a decreasing level of consciousness. You decide based on his clinical presentation that he needs to be intubated. Due to his already poor oxygenation, you are concerned about him desaturating during intubation and wonder if there is anything you can do to help prevent it.
Background: Emergency medicine is often referred to as the ABC (Airway, Breathing and Circulation) specialty. We have covered airway a few times on the SGEM:
SGEM#75: Video Killed Direct Laryngoscopy?
SGEM#96: Machine Head – NIPPV for Out of Hospital Respiratory Distress
SGEM#247: Supraglottic Airways Gonna Save You for an OHCA?
SGEM#249: Ace in the Hole – Confirming Endotracheal Tube Placement with POCUS
SGEM#271: Bougie Wonderland for First Pass Success
Rapid Sequence Intubation (RSI) has been a mainstay of emergency airway management for years. However, there are aspects of the procedure that have been debated, one of which is how best to oxygenate the patient during the apneic period while not increasing rates of aspiration.
Clinical Question: Is bag-mask ventilation (BMV) performed during the apneic period of RSI (defined as the time between administration of RSI medications and intubation) in critically ill adults safe and effective?
Reference: Casey et al. Bag-Mask Ventilation during Tracheal Intubation of Critically Ill Adults. NEJM February 2019
Population: Adult patients (older than 17 years of age) undergoing induction and tracheal intubation in the intensive care unit.
Exclusions: Patients who were pregnant, incarcerated, had immediate need for intubation or if the treating clinicians felt that ventilation was indicated or contraindicated between induction and laryngoscopy.
Intervention: Bag-mask ventilation (BMV) during the time between administration of sedation/paralysis and insertion of the laryngoscope into the mouth for intubation.
Comparison: Apnea with or without nasal cannula oxygen during the time between administration of sedation/paralysis and insertion of the laryngoscope into the mouth for intubation.
Outcome:
Primary Outcome: The lowest oxygen saturation observed during the interval between induction and two minutes after tracheal intubation.
Secondary Outcome: The incidence of severe hypoxemia (oxygen saturation of less than 80%).
Authors’ Conclusions: “Among critically ill adults undergoing tracheal intubation, patients receiving bag-mask ventilation had higher oxygen saturations and a lower incidence of severe hypoxemia than those receiving no ventilation.”
Quality Checklist for Randomized Clinical Trials:
The study population included or focused on those in the emergency department. No
The patients were adequately randomized. Yes
The randomization process was concealed. Yes
The patients were analyzed in the groups to which they were randomized. Yes
The study patients were recruited consecutively (i.e. no selection bias). Unsure
The patients in both groups were similar with respect to prognostic factors. No
All participants (patients, clinicians, outcome assessors) were unaware of group allocation. No
All groups were treated equally except for the intervention. No
Follow-up was complete (i.e. at least 80% for both groups). Yes
All patient-important outcomes were considered. No
The treatment effect was large enough and precise enough to be clinically significant. Unsure
Key Results: They screened 667 patients and enrolled 401. The median age was 60 years, 56% were male and half the patients had sepsis or septic shock.
Bag-mask ventilation group had higher oxygen saturations and less severe hypoxemia compared to the control group.
Primary Outcome: Lowest oxygen saturation
96% (interquartile range, 87% to 99%) in the BMV group vs. 93% (interquartile range, 81% to 99%) in the no-ventilation group (P = 0.01).
Secondary Outcome:
21 patients (11%) in the BMV group had severe hypoxemia vs. 45 patients (23%) in the no-ventilation group (relative risk, 0.48; 95% CI: 0.30 to 0.77).
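Because a relative risk on its own can overstate or understate the size of an effect, it can help to put the absolute numbers beside it. The sketch below works only from the rounded percentages reported above (11% vs. 23%), not the trial's raw counts, so the results are approximate and no confidence interval is attempted. The function name is illustrative.

```python
import math

def risk_summary(risk_treated, risk_control):
    """Return (relative risk, absolute risk reduction, number needed to treat)."""
    relative_risk = risk_treated / risk_control
    absolute_risk_reduction = risk_control - risk_treated
    # NNT is conventionally rounded up; undefined when there is no reduction.
    nnt = (
        math.ceil(1.0 / absolute_risk_reduction)
        if absolute_risk_reduction > 0
        else None
    )
    return relative_risk, absolute_risk_reduction, nnt

# Severe hypoxemia: 11% with BMV vs. 23% with no ventilation (rounded values).
rr, arr, nnt = risk_summary(0.11, 0.23)

print(f"Relative risk: {rr:.2f}")
print(f"Absolute risk reduction: {arr:.0%}")
print(f"Number needed to treat: {nnt}")
```

The number needed to treat here refers to preventing one episode of severe hypoxemia, which is itself a surrogate measure rather than a patient-oriented outcome.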
1. Patients: Patients in this study were recruited from seven academic intensive care units (ICUs) in the United States. Eighty percent of the patients were intubated for respiratory failure. While many adult patients in the emergency department are intubated for the same reason, many others are intubated for cardiac arrest or trauma, depending on your place of practice. It is unclear if this study population has external validity outside the ICU and to the emergency department.
Another point concerns the patients who were excluded. The study did not enroll patients who were judged to be at very high risk of desaturation or aspiration, had hypoxemia, or had acidemia. These are patients we potentially care more about when it comes to peri-intubation oxygenation and ventilation, so it is difficult to say if these results are generalizable to this population.
2. Consecutive Patients: They claim that patients were recruited consecutively. However, selection bias could have been introduced. Patients could be excluded if they required immediate intubation or if the treating clinicians felt that ventilation was indicated or contraindicated between induction and laryngoscopy.
This is pragmatic but it does introduce subjectivity into the process and could have resulted in bias. It is unclear if this would have any meaningful impact on the results.
3. Prognostic Factors: A quality indicator for an RCT is that both the intervention group and control group are similar with regard to prognostic factors. There were statistical differences between the two groups, with 10% more patients having pneumonia and 6% fewer having gastrointestinal bleeding in the control group.
4. Treated Equally: Another quality indicator is that both groups are treated equally except for the intervention. That was not the case in this trial. The BMV group was more likely to be preoxygenated with a BMV (40% vs 11%) while the no ventilation group was more likely to be preoxygenated with NiPPV (24% vs 16%). Preoxygenation can have an impact on likelihood of desaturation during intubation.
Note: The BMV ventilation in this trial was extremely well done. The providers in the trial were trained to provide appropriate rates, volumes, and adequate mask seal. This is not typical in most emergency departments.
5. DOOs, MOO and POO: Their primary and secondary outcomes were disease-oriented outcomes (DOOs) or monitor-oriented outcomes (MOOs). The median lowest oxygen saturation and incidence of severe hypoxia are surrogate markers and do not represent a patient-oriented outcome (POO).
They did look at a number of exploratory-oriented outcomes (EOO) for safety (ex. aspiration, new opacity on chest x-ray and cardiac arrest) and efficacy (ex. mortality, days in ICU and ventilator-free days). However, they did not include what could be considered the most important POO, survival with good neurologic outcome.
Comment on Authors’ Conclusion Compared to SGEM Conclusion: We generally agree with the authors’ conclusions but would also add that a statistical difference in a DOO does not necessarily translate into a clinically important POO.
SGEM Bottom Line: It is unclear if bag-mask ventilation in critically ill adult patients requiring intubation provides a clinically important benefit or is safe.
Case Resolution: Because the patient is at high risk of desaturation during intubation, you make a plan that optimizes preoxygenation. You use your clinical judgment and provide gentle, controlled bag-mask ventilation during the apneic period to prevent desaturation.
Clinical Application: Due to the multiple limitations identified in this trial it is difficult to know how to clinically apply this data. This is a common problem faced by clinicians practicing evidence-based medicine. The literature informs and guides our care but should not dictate our care. When we do not have definitive literature for efficacy or safety we must rely more upon our clinical judgement. In addition, we do not know if BMV will result in a clinically important outcome (survival with good neurologic outcome). This does not mean we should not perform very good preoxygenation prior to intubation.
What Do I Tell My Patient? You have pneumonia and it is making it difficult for you to breathe. We can help by putting a tube in your throat. This will make it easier to breathe and give time for the antibiotics to work. This can be scary. Before we put the tube down your throat you would get some extra oxygen. Then, if you say OK to the tube, you will get some medicine to relax you and so you will not remember the experience. We will do everything possible to make sure this is successful and there are no complications.
Keener Kontest: There was no winner last week. The correct answer is Michigan is a Native American word meaning Great Water.
Listen to the podcast this week. If you know the answer to the trivia question then send me an email to TheSGEM@gmail.com with “keener” in the subject line. The first correct answer will receive a cool skeptical prize.
Other FOAMed:
First10EM: PreVent Trial
EM Nerd: The Case of the Conspicuous Conclusion
REBEL EM: PreVent BMV Prior to Intubation
The Resus Room: Managing the Apneic Period - The PreVent Trial
St. Emlyn's: Ventilation During RS

Jan 22, 2020 • 10min
SGEM Xtra: It’s All About the Bayes, ‘Bout the Bayes, No Fisher
Guest Skeptic: Dr. Dan Lane has a Masters in Health Services Research at the University of Calgary, a Doctor of Philosophy in Clinical Epidemiology from the University of Toronto and is currently a medical student at the University of Calgary.
Dan is naturally a contrarian. He strives to understand first principles of conventions in medical research in order to identify and challenge poor practices that have become dogma. He is passionate about statistics and epidemiology and wants to share that passion by making these topics more practical and approachable for clinicians. Believing the key to proper interpretation of medical research does not begin with memorizing some arbitrary threshold for statistical significance, Dan hopes to contribute to the SGEM through sharing an understanding of what story the numbers are actually telling about the data. Dan has no funding whatsoever, and no associations with industry. He is currently a medical student at the University of Calgary.
Dan has some pet peeves when it comes to how statistics are used in critical appraisals. We will do some more in-depth SGEM Xtras on each of these issues.
Absolute vs. Relative Estimates
Effect Estimates and Not P-Values
All Models are Wrong
Prediction vs. Classification
Bayes No Frequentists
The purpose of this SGEM Xtra, besides introducing a new SGEM faculty member, is to announce that we are adding a new segment to the SGEM. It is going to be called Statistically Significant.
We want to make the SGEM even better and address some of the criticisms from the ClinEpi world about clinicians trying to do critical appraisal. In order to do that we now have Dr. Dan Lane, PhD, who will be commenting on each of the SGEM episodes.
The first instalment of the Statistically Significant segment will be on this week’s SGEMHOP looking at troponin testing in elderly patients presenting with non-specific complaints (SGEM#280). Let me know what you think of this idea. We have a few more lined up and feedback is always appreciated. Send me an email at TheSGEM@gmail.com
Statistically Significant #280: Sensitivity and Specificity
Despite their dogmatic use in the literature, sensitivity and specificity have a number of limitations that are rarely considered or addressed in diagnostic test studies.
Sensitivity and Specificity are crude metrics, meaning they only look at the effect of a single measure and a single outcome. As crude measures they fail to incorporate any other information into their estimates, including potential confounders for the relationship between the test result and the outcome.
In this particular study, age is part of the primary objective for the study (geriatric patients) but is also a confounder of the relationship between troponin level (which may increase with age) and acute coronary syndrome risk (also increases with age). When confounders like age are present, crude measures will be influenced based on the prevalence of confounders in each of the groups – for example, if there were more older patients in the troponin positive group, the estimates for sensitivity may be inflated.
Another limitation of sensitivity and specificity is they require a test result be classified as positive or negative. This is problematic when the real measure is a continuous measure, such as troponin. In the current study the test was considered “positive” if the troponin level was above the 99th percentile for that enzyme. But this arbitrarily treats patients above or below the 99th percentile as homogeneous groups, meaning the statistics consider everyone above the threshold to be the same, and everyone below the threshold to be the same.
Consider a patient with a troponin right below the threshold and another patient right above the threshold – surely these patients are almost identical in terms of their risk for having ACS. But by inserting an arbitrary break into the measure, the statistics will treat them as different resulting in more misclassifications simply because a threshold for positive or negative was selected.
Instead of these binary classifications, researchers could focus directly on the patient’s risk of the outcome. This can be represented using probabilities and a smooth curve that shows the probability of ACS based on the exact troponin value. Using simple statistical models, these probability estimates can be adjusted for confounders, like age, and provide easily interpretable probability estimates for the entire range of troponins – no classification required!
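The contrast between a threshold and a smooth risk curve can be sketched with a toy logistic model. Everything numeric here is invented for illustration: the coefficients, the 14 ng/L cut-off, and the patient values are hypothetical and do not come from the study or from any validated model.

```python
import math

# Hypothetical 99th percentile cut-off in ng/L, for illustration only.
THRESHOLD = 14.0

def classify(troponin):
    """Binary 'positive/negative' call based on the threshold."""
    return troponin > THRESHOLD

def acs_probability(troponin, age,
                    intercept=-6.0, beta_troponin=0.04, beta_age=0.03):
    """Toy logistic model giving an age-adjusted probability of ACS.

    The coefficients are made up; a real model would be fitted to data.
    """
    linear = intercept + beta_troponin * troponin + beta_age * age
    return 1.0 / (1.0 + math.exp(-linear))

# Two 78-year-old patients on either side of the cut-off.
just_below = acs_probability(13.9, age=78)
just_above = acs_probability(14.1, age=78)

print(f"13.9 ng/L: positive={classify(13.9)}, risk={just_below:.4f}")
print(f"14.1 ng/L: positive={classify(14.1)}, risk={just_above:.4f}")
```

The two patients receive opposite labels from the threshold, while the modelled risks are almost indistinguishable, which is the misclassification problem described above.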
References:
Amrhein, Greenland and McShane. Scientists rise up against statistical significance. Nature 2019
Nuzzo. Statistical errors: P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume. Nature 2014
Fatovich and Phillips. The probability of probability and research truths. AEM 2017
Greenland et al. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. EJE 2016
Guggenmoos-Holzmann and van Houwelingen. The (In)Validity of sensitivity and specificity. Statistics in Medicine 2000
Remember to be skeptical of anything you learn, even if you heard it on the Skeptics' Guide to Emergency Medicine.

Jan 18, 2020 • 30min
SGEM#280: This Old Heart of Mine and Troponin Testing
Date: January 16th, 2020
Reference: Troponin Testing and Coronary Syndrome in Geriatric Patients With Nonspecific Complaints: Are We Overtesting? AEM January 2020
Guest Skeptics:
Dr. James VandenBerg: James has a master’s degree in clinical investigation from Washington University in St. Louis, and is currently the Chief Resident at Detroit Receiving Hospital.
Dr. Andrew Huang: Andy is the Chief Resident at Sinai-Grace Hospital.
Case: As the resident, you have just finished seeing a 78-year-old male who has been brought in by his family over the holidays. The triage nurse has put the reason for the visit as “multiple complaints”. Despite spending 30 minutes in the room, you still are not sure exactly why the patient is here.
Your attending says that if you take a good geriatric history that you can always determine what’s going on. However, 15 minutes later your attending leaves the room defeated. The patient’s complaints are just so nonspecific.
The attending ends up ordering the “geriatrogram” – ticking off every blood test on the form, including the troponin. You turn to the attending and ask, “do you really think this could be acute coronary syndrome (ACS)?”
Background: Patients 65 years and older account for about 15% of emergency department visits in the United States. Their presentations are often complicated as they present with nonspecific symptoms, and there are often obscuring co-morbid conditions, polypharmacy, and cognitive/functional impairment.
Nonspecific symptoms in the elderly usually yield a broad differential and there are no recommended diagnostic algorithms, leading to extensive testing. ACS is usually amongst this differential, as cardiovascular disease is a leading cause of morbidity and mortality in this population.
Additionally, the elderly population with ACS more commonly presents without chest pain compared to younger patients (up to 20% of elderly patients with MI present with “weakness” as part of their chief complaint). While cardiovascular disease is the leading cause of mortality and morbidity in the elderly, the frequency of ACS amongst this population presenting with nonspecific symptoms is unknown.
Clinical Question: What is the frequency of ACS in elderly patients presenting to the ED with nonspecific complaints, and what is the utility of troponin testing in this population?
Reference: Wang et al. Troponin Testing and Coronary Syndrome in Geriatric Patients With Nonspecific Complaints: Are We Overtesting? AEM January 2020
Population: Patients aged 65 years and older presenting to the emergency department with nonspecific chief complaints who underwent troponin testing. “Nonspecific” was defined a priori as including weak or weakness, dizzy or dizziness, fatigue, lethargy, altered mental status, light-headedness, medical problem, examination requested, failure to thrive, or “multiple complaints.”
Exclusions: Patients with a focal chief complaint (e.g., focal pain, injury complaint, shortness of breath, vomiting, diaphoresis, syncope, fever, cough, focal neurologic deficit) or a fever of at least 38°C at triage.
Investigation: Troponin testing
Comparison: None
Outcomes: There were multiple outcomes of interest:
The proportion of patients with nonspecific complaints who underwent troponin testing.
The proportion of such patients who had elevated troponin.
The proportion of patients with ACS at the index visit or within 30 days.
The utility of troponin testing to diagnose or exclude ACS.
The frequency of other causes of troponin elevation in this population.
Dr. Alfred Wang
This is a LIVE episode of an SGEMHOP, which means we have the lead author on the show. Dr. Alfred Wang is an emergency medicine physician at Indiana University in Indianapolis, IN. With help from a dedicated team of physician peers and a mentor, Dr. Wang was able to complete this research project.
Authors’ Conclusions: “While consideration for ACS is prudent in selected elderly patients with nonspecific complaints, ACS was rare and no patients received reperfusion therapy. Given the false-positive rate in our study, our results may not support routine troponin testing for ACS in this population.”
Quality Checklist for A Chart Review: There is a quality check list for ED studies that was published by Gilbert et al in Annals of EM 1996. It had eight items. The list was updated and expanded by Dr. Andrew Worster from BEEM to include 12 items.
The authors of this retrospective chart review did a great job and 11 out of 12 answers were yes. The only “no” was that they did not have a management plan described for missing data in the publication.
Abstractor Training: Were the abstractors trained before the data collection? Yes
Case Selection Criteria: Were the inclusion and exclusion criteria for case selection defined? Yes
Variable Definition: Were the variables defined? Yes
Abstraction Forms: Did the abstractors use data abstraction forms? Yes
Performance Monitored: Was the abstractors’ performance monitored? Yes
Blinding to Hypothesis: Were the abstractors aware of the hypothesis/study objectives? Yes
Inter Rater Reliability (IRR) Mentioned: Was the interobserver reliability discussed? Yes
IRR Tested: Was the interobserver reliability tested or measured? Yes
Medical Record Identified: Was the medical record database identified or described? Yes
Sampling Method: Was the method of sampling described? Yes
Missing Data Management Plan: Was the statistical management of missing data described? No
Institutional Review Board Approved: Was the study approved by the institutional or ethics review board? Yes
A chart review is a type of observational study. We do have an SGEM quality check list for observational studies.
Quality Checklist for Observational Study:
Did the study address a clearly focused issue? Yes
Did the authors use an appropriate method to answer their question? Yes
Was the cohort recruited in an acceptable way? Yes
Was the exposure accurately measured to minimize bias? Unsure
Was the outcome accurately measured to minimize bias? Yes
Have the authors identified all important confounding factors? Unsure
Was the follow up of subjects complete enough? No
How precise are the results? Precision was poor. The 95% confidence interval for sensitivity was 48-100%. Specificity was better at 77-85%, but we must remember these measures are correlated, and therefore the poor sensitivity estimate also reflects on specificity. Had they picked a different cut-off for troponin, they could have improved the sensitivity (at a cost to the specificity).
Do you believe the results? Yes
Can the results be applied to the local population? Unsure
Do the results of this study fit with other available evidence? Unsure
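The 48% lower limit quoted in the precision item can be reproduced with a short calculation. This is a sketch that assumes an exact (Clopper-Pearson) binomial interval was used, which these notes do not state. When every case tests positive, the exact lower limit has the closed form (alpha/2)^(1/n), and the upper limit is 100%. The function name is ours.

```python
def exact_lower_limit_all_positive(n_cases, alpha=0.05):
    """Clopper-Pearson lower confidence limit for a proportion when all
    n_cases are positive (observed sensitivity of 100%).

    In this special case the exact limit reduces to (alpha/2) ** (1/n).
    """
    if n_cases <= 0:
        raise ValueError("n_cases must be a positive integer")
    return (alpha / 2) ** (1.0 / n_cases)


# Five ACS cases, all detected: the situation described in the study.
print(f"n = 5:   lower limit {exact_lower_limit_all_positive(5):.0%}")

# The same 100% sensitivity would be far more convincing with more cases.
for n in (50, 500):
    print(f"n = {n}: lower limit {exact_lower_limit_all_positive(n):.1%}")
```

With only five ACS cases, a perfect observed sensitivity is still compatible with a true sensitivity under 50%, which is why the precision was rated as poor.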
Key Results: They initially identified 1,146 potentially eligible patients. After excluding the patients who had a specific complaint listed and those with documented fever, they were left with a total of 594 patients. Of those, 69% had troponins ordered. The average age of the cohort was 78 years old, 58% were female, and 75% were admitted. The most common chief complaints were altered mental status (43%), weakness/fatigue (33%), and dizziness (21%).
The proportion of patients with nonspecific complaints who underwent troponin testing: 412/594 (69%)
The proportion who had an elevated troponin in the ED: 52/412 (12.6%) (Another 30 patients had an elevated troponin at some point during their hospital stay)
The proportion of patients with ACS at the index visit or within 30 days: 5/412 (1.2%). All occurred during the index admission.
The utility of troponin testing to diagnose or exclude ACS. Looking only at the first troponin in the ED, it was 80% sensitive and 88% specific (NPV = 99.7%, PPV = 7.7%) for ACS. The LR+ was 6.67, and LR– was 0.23. Considering all troponins, the sensitivity was 100% (95% CI = 48%–100%), the specificity was 81% (95% CI = 77%–85%), the NPV was 100%, and the PPV was 6.1%.
The frequency of other causes of troponin elevation in this population. There was a long list of non-ACS causes of troponin elevation. The top 3 causes were: dehydration, heart failure, and atrial fibrillation.
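The first-troponin accuracy figures above can be checked from a 2x2 table. The cell counts below are our reconstruction from the reported numbers (412 tested, 5 with ACS, 52 elevated in the ED, 80% sensitivity) and are an assumption. They are not counts taken directly from the paper.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Reconstructed (assumed) counts for the first ED troponin:
# 5 ACS cases at 80% sensitivity -> 4 true positives, 1 false negative;
# 52 elevated troponins -> 48 false positives; 412 - 53 = 359 true negatives.
first_troponin = diagnostic_metrics(tp=4, fn=1, fp=48, tn=359)

for name, value in first_troponin.items():
    print(f"{name}: {value:.3f}")
```

Likelihood ratios computed from the unrounded counts can differ slightly from those computed from the rounded percentages (0.80 / 0.12 gives about 6.67), so small discrepancies with the published values are expected.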
We asked Dr. Wang ten questions to get a greater understanding of his publication. Listen to the SGEMHOP podcast to hear all of Dr. Wang's answers.
Dr. James VandenBerg
Defining “Non-Specific”: The definition of “non-specific” symptoms is problematic while at the same time being pragmatic. For instance, “dizzy” could be construed as non-specific, but what if the patient had supporting focalized neurologic complaints? Additionally, some physicians list the chief complaint as the leading sentence a patient provides. This is problematic if a patient initially cites a “non-specific” complaint, but then describes suggestive ACS symptoms in their HPI. Conversely, “focal” chief complaints such as “shortness of breath” can be construed as non-specific in real practice based on the patient’s HPI, but due to the paper’s inclusion criteria, if any triage nurse or physician labeled a chief complaint as “focal” they would be excluded.
Chief Complaints Not Equal: Definitions of nonspecific included a spectrum of complaints, from altered mental status to failure to thrive. I imagine the yield of testing is much higher in altered mental status than it is in failure to thrive. Would there be a benefit of considering these chief complaints separately?
Retrospective Charting: You excluded patients who had nonspecific complaints at triage, but had a focal complaint listed in the ED physician note. The ED physician note might have been written after the troponin result was known. In the presence of a positive troponin, focal complaints might have been emphasized, despite being originally nonspecific.

Jan 11, 2020 • 33min
SGEM#279: Do You Really Want to Hurt Me and Use a Placebo Control for a Migraine Trial?
Date: January 10th, 2020
Reference: Dodick DW et al. Ubrogepant for the Treatment of Migraine. NEJM 2019
Guest Skeptic: Dr. Anand Swaminathan is an Assistant Professor of Emergency Medicine at St. Joseph’s Hospital in Paterson, NJ. He is also the managing editor of EM:RAP and associate editor at REBEL EM.
Case: A 23-year-old man with a history of migraines presents with two days of headache, nausea and photophobia typical of his prior migraines. He’s tried a number of medications at home including ibuprofen, acetaminophen, aspirin and sumatriptan without any considerable improvement in symptoms. You start to offer him your standard medications like metoclopramide and haloperidol when he asks about a new drug he heard about called ubrogepant.
Background: Migraine is a chronic neurologic disease characterized by throbbing, often unilateral headaches that are frequently associated with nausea, vomiting, photophobia and phonophobia. It is a common disease and can be severe enough to interfere with people’s lives.
Headaches themselves are not only a common emergency department presentation but one that is filled with potential dangers. There are a number of causes of headache that are life and limb threatening, including subarachnoid hemorrhage (SGEM#201), meningitis, encephalitis, cerebral venous thrombosis and vertebral artery dissection, but most headaches are benign in nature.
There is an international classification system of headaches (IHS 2018). The current system classifies them into primary and secondary headaches. An important part of our job as emergency physicians is to differentiate the lethal headache from the benign headache.
Though we rarely make a de novo diagnosis of migraine in the emergency department, many patients with migraines present to us for symptom management. The pathophysiology of migraines is both complicated and poorly understood but there are a number of potential treatments including NSAIDs, acetaminophen, aspirin, neuroleptics, triptans and even propofol.
More recently, calcitonin gene-related peptide antagonists (CGRPs) have emerged as a new potential treatment. The first big study that came out on these drugs was published in the NEJM in 2019 and was entitled Rimegepant, an Oral Calcitonin Gene-Related Peptide Receptor Antagonist for Migraine (Lipton et al).
Now, we have a second study published in the NEJM on a related drug, ubrogepant.
Clinical Question: Does ubrogepant increase the percentage of patients who were free from pain and absent of the most bothersome migraine-associated symptom at two hours from initial dose in comparison to placebo?
Reference: Dodick DW et al. Ubrogepant for the Treatment of Migraine. NEJM 2019
Population: Adult patients (18-75 years of age) with at least a one-year history of migraine with or without aura that met criteria from the International Classification of Headache Disorders and had migraine onset before the age of 50. Patients had to have a history of migraines lasting between 4 and 72 hours and a history of migraine attacks separated by at least 48 hours of freedom from headache. Additionally, they had to have suffered from two to eight migraines per month over the last three months.
Exclusions: Patients with 15 or more headaches/month on average in the previous six months. Difficulty distinguishing migraine from other types of headache. Use of acute migraine treatment on ten or more days in the previous three months. Participation in a previous trial involving CGRP. Clinically significant cardiovascular or cerebrovascular disease. History of hepatitis in the last six months or laboratory findings of liver disease (elevated ALT, AST or bilirubin, or low serum albumin).
Additional Exclusions from ClinicalTrials.gov
Has a history of migraine aura with diplopia or impairment of level of consciousness, hemiplegic migraine, or retinal migraine
Has a current diagnosis of new persistent daily headache, trigeminal autonomic cephalgia (eg, cluster headache), or painful cranial neuropathy
Required hospital treatment of a migraine attack 3 or more times in the previous 6 months
Has a chronic non-headache pain condition requiring daily pain medication
Has a history of malignancy in the prior 5 years, except for adequately treated basal cell or squamous cell skin cancer, or in situ cervical cancer
Has a history of any prior gastrointestinal conditions (eg, diarrhea syndromes, inflammatory bowel disease) that may affect the absorption or metabolism of investigational product; participants with prior gastric bariatric interventions which have been reversed are not excluded
Intervention: Ubrogepant 50 mg or 100 mg
Comparison: Placebo
Outcomes:
Co-Primary Outcome: Freedom from pain at two hours from initial dose of medication. Absence of the most bothersome symptom associated with migraine two hours from initial dose of medication.
Secondary Outcomes: Change in severity of headache at two hours, sustained pain relief, sustained freedom from pain, absence of photophobia, absence of phonophobia and absence of nausea at two hours from initial dose. Adverse events were also collected.
Authors’ Conclusions: “A higher percentage of participants who received ubrogepant than of those who received placebo had freedom from pain and absence of the most bothersome symptom at 2 hours after the dose. The most commonly reported adverse events were nausea, somnolence, and dry mouth. Further trials are needed to determine the durability and safety of ubrogepant for acute migraine treatment and to compare it with other drugs for migraine.”
Quality Checklist for Randomized Clinical Trials:
The study population included or focused on those in the emergency department. No
The patients were adequately randomized. Yes
The randomization process was concealed. Yes
The patients were analyzed in the groups to which they were randomized. No
The study patients were recruited consecutively (i.e. no selection bias). Unsure
The patients in both groups were similar with respect to prognostic factors. Unsure
All participants (patients, clinicians, outcome assessors) were unaware of group allocation. Yes
All groups were treated equally except for the intervention. Yes
Follow-up was complete (i.e. at least 80% for both groups). No
All patient-important outcomes were considered. No
The treatment effect was large enough and precise enough to be clinically significant. Unsure
Key Results: They enrolled 1,672 patients with roughly equal numbers allocated to each of the three groups. The mean age was around 40 years and almost 90% were female. The modified ITT analysis excluded 345 (21%) of participants.
Ubrogepant was superior to placebo in treating migraine headaches.
Primary Outcomes: (100mg/50mg/placebo)
Freedom from pain at two hours: 21%/19%/12% (both doses were statistically better than placebo, but neither dose was better than the other). That gives an absolute difference of about 8% and a Number Needed to Treat for Benefit (NNTB) of 13.
Absence of most bothersome symptom at two hours: 38%/39%/28%. This is an absolute difference of 10% with a NNTB of 10.
Secondary Outcomes:
Pain relief at two hours (61%/61%/49%) and sustained pain relief (38%/36%/21%) was better with ubrogepant compared to placebo.
Serious Adverse Events: There were five SAEs, all of them in the intervention group (two appendicitis, pericardial effusion, spontaneous abortion and seizure). Only the seizure was considered related to the trial drug. Six patients had ALT levels three times the upper limit of normal (one in the placebo group and five in the treatment group). Only one case in the treatment group was considered possibly related to the trial regimen. Details are in the supplemental appendix.
1. Patients: We had a few issues with the patients included in this study. First, these were not emergency department patients but rather those recruited from outpatient clinics. Whether or not these are the same patients that present to the emergency department is unknown.
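The NNTB values quoted for the primary outcomes follow directly from the absolute differences. Below is a small sketch of that arithmetic. It assumes the common convention of rounding the NNTB up to the next whole patient, and it works in whole percentage points to avoid floating-point surprises. The function name is ours.

```python
import math


def nntb(absolute_difference_pct):
    """Number needed to treat for benefit from an absolute risk
    difference given in percentage points, rounded up."""
    if absolute_difference_pct <= 0:
        raise ValueError("no benefit: absolute difference must be positive")
    return math.ceil(100 / absolute_difference_pct)


# Absolute differences as summarized in the show notes.
print("Pain freedom, about 8 points:", nntb(8))
print("Most bothersome symptom, 10 points:", nntb(10))

# Dose-specific differences for pain freedom, from the reported percentages.
print("100 mg vs placebo (21% vs 12%):", nntb(21 - 12))
print("50 mg vs placebo (19% vs 12%):", nntb(19 - 12))
```

The "about 8%" figure sits between the two dose-specific differences, so the dose-specific NNTB values fall on either side of the summary value.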
We are also unsure if the patients were recruited consecutively. This is an important aspect to avoid potential selection bias. Remember that when we use the term “bias” we are not talking about random noise in the data but something that systematically moves us away from the “truth”.
The third question we had about the included patients was whether or not both groups were similar with respect to prognostic factors. Baseline demographics are reported in Table 1. However, things like number of headaches/month, refractory headaches in the past, and other things are not reported. This could impact the results and therefore the conclusions.
2. Comparison to Placebo: Randomized controlled trials (RCTs) are considered an ideal study design to establish causality and the effect of a medication. Drug intervention RCT design requires that the intervention be compared to something (active drug, standard treatment, no treatment or placebo).
It is widely agreed upon that comparison to placebo is acceptable when no proven intervention exists (Millum and Grady 2013). In contrast, placebo comparison is not considered acceptable in life-threatening conditions if there is an available treatment that is known to prolong life. The use of placebo for comparison in non-life-threatening conditions has been hotly debated for decades, particularly when an accepted treatment exists.
The argument against the use of placebos in these circumstances is guided by the Declaration of Helsinki. This document states:
“In any medical study, every patient — including those of a control group, if any — should be assured of the best proven diagnostic and therapeutic methods.”
Thus, if an effective treatment exists, it should be prescribed to patients (Simon 2000).


