The Skeptics’ Guide to Emergency Medicine

Dr. Ken Milne

Meet ’em, greet ’em, treat ’em and street ’em

Episodes


Mar 28, 2026 • 33min

SGEM Xtra: You say you want a revolution – well you know – Against the Grain: Defiant Giants Who Changed the World

Date: February 26, 2026 Guest Skeptic: Terry O’Reilly is the host of the long-running and popular podcast Under the Influence. He is also an acclaimed storyteller and author. However, Terry is not just some radio host talking about marketing; he was an adman on the front lines, working in the trenches of the advertising industry for 35 years. I’ve been a listener of Under the Influence for a long time, and it’s helped me think about how we communicate with emergency clinicians and how we make ideas memorable without overselling them. I see many similarities between Terry and me. I’m not just some podcaster talking about emergency medicine. I’ve been working in the emergency department (ED), on the front line, for 31 years. I’m not an academic sitting in an ivory tower opining on how to practice emergency medicine based on the literature. I worked 17 ED shifts in February. I’m walking the walk while I talk the talk. I think that brings a perspective and credibility to the SGEM, similar to the credibility Terry brings to Under the Influence. Terry and I met in person, along with my wife, Barb, and our 11-year-old son, Ethan, around 2009. Terry was promoting his book The Age of Persuasion: How Marketing Ate Our Culture. We pulled Ethan out of school to go to Sarnia for the day and watch Terry give a talk. Terry even signed a copy of his book for Ethan. Our son was so inspired by the event that he went on to pursue an academic career in marketing. Ethan will be defending his PhD in Marketing at the Ivey School of Business this spring. Today, we are going to talk about Terry's latest book: Against the Grain: Defiant Giants Who Changed the World. It is a collection of stories about people who challenged the status quo and changed what the rest of us thought was possible. It reminded me of Apple's famous commercial, "Think Different." I made a parody video about rural physicians titled “Here’s to the Crazy Ones”. People may be wondering why this matters to emergency physicians.
I think the “against the grain” ethos is common in emergency medicine. We have a healthy skepticism and often challenge dogma, based on the evidence, when discussing management with other specialties. We also must be good at persuading patients, families, learners, consultants, and administrators that what we are doing is the right thing. Five Questions for Terry O'Reilly 1) What inspired you to write Against the Grain? Was there a single person/story that sparked the project? What’s your definition of defiant? Did you notice a pattern in how these defiant giants resisted the herd/groupthink? 2) What was one of the most surprising stories you uncovered while researching the book? What surprised you: the person’s personality, the risk they took, or how others reacted? Was there a moment while researching a story when you thought, “No way this is true”, and then it was? 3) There are four medical stories in the book (Chapter 4). Most SGEMers probably know about Ignaz Philipp Semmelweis. Can you briefly tell us the story of Dr. Katalin Karikó? Katalin Karikó and Drew Weissman were awarded the 2023 Nobel Prize in Physiology or Medicine for their discoveries concerning nucleoside base modifications that enabled the development of effective mRNA vaccines against COVID-19. Do you think healthcare messaging has unique challenges compared with marketing products? In your view, what’s the difference between educating and persuading in healthcare? We do need to be careful in science and medicine not to commit the Galileo Fallacy. This is when someone asserts that a claim is true, or should be given more credibility, because the person making it has been persecuted or otherwise mocked. The fallacy takes its name from Galileo Galilei's famous persecution by the Roman Catholic Church for his defence of heliocentrism, when the commonly accepted belief at the time was an earth-centred universe. The truth of a claim is independent of whether the person making it is mocked or persecuted, as Semmelweis was.
What matters is the objective, verifiable evidence and logical arguments. 4) What has the feedback been like on the book tour so far? Which types of readers are connecting most with it? Have any audience questions surprised you? Has anyone pushed back on the idea of celebrating “defiance”? 5) What do you hope the audience learns after reading the book? If you had to boil it down, what should we be more skeptical of? How do we encourage against-the-grain thinking without sliding into cynicism? The SGEM will be back next episode with a structured critical appraisal of a recent publication. Our goal is to reduce the knowledge translation (KT) window from over 10 years to less than 1 year using the power of social media. So, patients get the best care, based on the best evidence. Remember to be skeptical of anything you learn, even if you heard it on The Skeptics’ Guide to Emergency Medicine. Previous SGEM Xtra Book Interviews SGEM Xtra – Brian Goldman: The Power of Kindness SGEM Xtra – Tim Caulfield: Illusion- What you Don't Know and Why It Matters SGEM Xtra – Steven Novella: The Skeptics Guide to the Universe SGEM Xtra - Tim Caulfield: Relax – Damm It! SGEM Xtra - Mel Herbert: The Extraordinary Power of Being Average SGEM Xtra - Brian Goldman: Casino Shift - Stories from an ER on the Edge (coming soon) SGEM Xtra - Darren McKee: Uncontrollable - The Threat of Artificial Superintelligence and the Race to Save the World

Feb 21, 2026 • 25min

SGEM#504: Home Where I Wanted to Go After Anaphylaxis

Reference: . Timing of repeat epinephrine to inform paediatric anaphylaxis observation periods: a retrospective cohort study. Lancet Child & Adolescent Health. July 2025 Guest Skeptic: Dr. Kammeron Brissett is a pediatric emergency medicine fellow at Children’s National Hospital in Washington, DC. She completed her pediatrics residency and a chief year at Rainbow Babies and Children’s Hospital in Cleveland, Ohio. Her interests include injury prevention, social determinants of health, and advocacy. Case: A 7-year-old boy with a peanut allergy presents to the emergency department (ED) after eating a cookie at a birthday party. Shortly afterwards, he developed hives and wheezing. His parents administered his epinephrine auto-injector to treat his symptoms. In the ED, he feels much better. His vital signs are normal, and his lungs are clear. He has no gastrointestinal or cardiovascular symptoms. The parents tell you, “Unfortunately, we’ve been through this before. It’s not the first time he has accidentally eaten something that may have had some peanuts in it. Last time, we sat in the ED for a few hours before going home. It’s been a long day. Can we just go home now?” Background: Anaphylaxis is a serious, potentially life-threatening systemic allergic reaction with a fast onset. It is a clinical diagnosis that should be considered when any one of the following is present: (1) acute illness with skin/mucosal involvement and either respiratory compromise or reduced blood pressure/end-organ symptoms; (2) two or more of the following occurring rapidly after exposure: skin/mucosal involvement, respiratory compromise, reduced blood pressure, or persistent gastrointestinal symptoms; or (3) reduced blood pressure after exposure to a known allergen for the patient. Early recognition and treatment with intramuscular (IM) epinephrine is crucial. Sometimes, even after initial symptom improvement with IM epinephrine, anaphylaxis symptoms can recur without re-exposure to the known trigger.
This is called a biphasic reaction and can happen up to 72 hours later. The SGEM discussed anaphylaxis and biphasic reactions 13 years ago on SGEM#57. The bottom line was that prolonged observation is likely unnecessary in patients whose symptoms resolve with therapy in the ED. Biphasic reactions are rare and can occur anywhere from 10 minutes up to 6 days. We already have problems with boarding and overcrowding. We can’t keep all patients with anaphylaxis for 6 days. So, when can we send them home? Traditionally, ED observation after anaphylaxis has been around 4 to 6 hours to monitor for biphasic reactions. The Resuscitation Council UK recommends a risk-stratified approach: A patient can be discharged after 2 hours when there has been a good response to a single dose of epinephrine, the symptoms have resolved, the child and family have another epinephrine auto-injector and know how to use it, and there is adequate supervision after discharge. They recommend at least 6 hours of observation if two IM doses of epinephrine were needed or there was a prior biphasic reaction. Finally, they recommend at least 12 hours of observation if there was severe respiratory compromise, >2 doses of epinephrine, ongoing allergen absorption, late-night presentation/limited access to care, or difficult access to emergency services. The National Institute for Health and Care Excellence (NICE) is even a bit more conservative, recommending that any child under the age of 16 with suspected anaphylaxis be admitted. What about the United States? The 2023 AAAAI/ACAAI Joint Task Force Practice Parameter (JTFPP) emphasizes individualized, risk-based observation and shared decision-making, noting that the risk of biphasic reactions is higher with more severe initial reactions and when >1 dose of epinephrine is required.
It also highlights that patients with a prompt, complete, and durable response to epinephrine may not always require activation of EMS or prolonged monitoring, underscoring tailored disposition planning. Clinical Question: Among children treated with epinephrine for anaphylaxis, what is the timing and incidence of repeat epinephrine that could inform safe observation periods? Reference: . Timing of repeat epinephrine to inform paediatric anaphylaxis observation periods: a retrospective cohort study. Lancet Child & Adolescent Health. July 2025 Population: Children 6 months to 17 years presenting to 31 EDs (30 US, 1 Canada) with an acute allergic reaction treated with epinephrine from 2016 to 2019. Excluded: Transfers from outside facilities, ED medication-induced reactions, missing pre-ED symptom documentation; comorbidities requiring tailored management Intervention: ED observation following the first epinephrine dose and need for additional epinephrine Comparison: Comparisons were made across severity strata (no respiratory/cardiovascular involvement vs respiratory involvement only vs cardiovascular involvement). Outcome: Primary Outcome: Time from first to last epinephrine dose (repeat epinephrine as a proxy for clinically significant ongoing/recurrent reaction). Secondary Outcomes: Biphasic anaphylaxis and non-anaphylaxis, persistent anaphylaxis and non-anaphylaxis, refractory anaphylaxis, other return-care outcomes Trial: Multicenter retrospective cohort Authors’ Conclusions: “A 2-h observation period is probably safe for most children who present to an emergency department with an acute allergic reaction requiring epinephrine. A 4-h observation period might be enough for patients with cardiovascular involvement who appear well.” Quality Checklist for Observational Study: Did the study address a clearly focused issue? Yes Did the authors use an appropriate method to answer their question? Yes Was the cohort recruited in an acceptable way? 
Yes Was the exposure accurately measured to minimize bias? Unsure Was the outcome accurately measured to minimize bias? Unsure Have the authors identified all important confounding factors? Unsure Was the follow-up of subjects complete enough? Unsure How precise are the results? Unsure Do you believe the results? Yes Can the results be applied to the local population? Yes Do the results of this study fit with other available evidence? Yes Funding of the Study: National Center for Advancing Translational Sciences and The National Institute of Allergy and Infectious Diseases of the National Institutes of Health. The funders had no role in study design, data collection, data analysis, interpretation, or writing of the paper. Two of the authors report receiving consultant fees. One is on the advisory board and gets stock options from biotech companies and royalty fees from the publisher. Results: They included 5,641 eligible children with a median age of 7.9 years, with slightly more males (56%). Of these, 4,956 (88%) fulfilled the National Institute of Allergy and Infectious Diseases and Food Allergy and Anaphylaxis Network criteria for anaphylaxis. In that group, 1.5% met criteria for biphasic anaphylaxis and 10.7% had persistent anaphylaxis. Repeat epinephrine was given more than 2 hours after the initial dose in 4.7% and more than 4 hours after the initial dose in 1.9%. Patients with cardiovascular involvement had higher rates of biphasic anaphylaxis. Key Results: Around 95% of children can be safely discharged after 2 hours of observation without the need for additional epinephrine. Among all patients, 5% received a repeat dose of epinephrine after 115 minutes. There were differences between patients with and without respiratory or cardiovascular involvement. Primary Outcome: In the entire cohort, 4.7% received epinephrine more than 2 hours after the initial dose, 1.9% after 4 hours, 1.1% after 6 hours, and 0.8% after 8 hours.
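To put the primary outcome percentages in absolute terms, the short Python sketch below converts them into approximate numbers of children, both within the 5,641-child cohort and per 1,000 children observed. It is an illustration only: the counts are back-calculated from the rounded percentages quoted above, so they may differ by a few patients from the figures in the paper, and the names COHORT_SIZE, LATE_REPEAT_EPI, and approx_count are made up for this example.

```python
# Back-of-envelope conversion of the reported primary-outcome percentages
# into approximate head counts. Inputs are the rounded figures quoted in the
# show notes, so the outputs are estimates, not the paper's exact counts.

COHORT_SIZE = 5641  # eligible children in the cohort (from the show notes)

# Hours since the first epinephrine dose -> proportion of the cohort that
# received a repeat dose after that time point (from the show notes).
LATE_REPEAT_EPI = {2: 0.047, 4: 0.019, 6: 0.011, 8: 0.008}


def approx_count(proportion: float, n: int) -> int:
    """Approximate number of children out of n, rounded to a whole child."""
    return round(proportion * n)


if __name__ == "__main__":
    for hours, proportion in LATE_REPEAT_EPI.items():
        in_cohort = approx_count(proportion, COHORT_SIZE)
        per_thousand = approx_count(proportion, 1000)
        print(
            f"Repeat epinephrine after {hours} h: "
            f"~{in_cohort} of {COHORT_SIZE} children "
            f"(~{per_thousand} per 1,000 observed)"
        )
```

Read this way, roughly 47 of every 1,000 children in this cohort received a further dose after the 2-hour mark, falling to about 19 per 1,000 after 4 hours. Repeat epinephrine is a proxy outcome, so these figures should not be read as confirmed biphasic reactions.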
Secondary Outcomes: 86 (1.5%) had biphasic anaphylaxis 236 (4.2%) had biphasic non-anaphylactic allergic reactions 605 (10.7%) had persistent anaphylaxis 1400 (24.8%) had persistent non-anaphylactic allergic reactions 118 (2.1%) had refractory anaphylaxis Diagnosis of Anaphylaxis We mentioned that anaphylaxis is a clinical diagnosis, but it’s not always clear-cut. In this retrospective review, the authors used ICD-10 codes and chart reviews to determine whether patients experienced anaphylaxis. They included patients who were treated with intramuscular, subcutaneous, or intravenous epinephrine. Potential biases include selection bias, information bias, and misclassification bias. Not all the patients included in this study actually met criteria for anaphylaxis, which is acknowledged by the authors. Anaphylaxis Practice Guideline update in 2023 states, “treatment with epinephrine or clinical response to epinephrine should also not be used as a surrogate marker to establish a diagnosis of anaphylaxis because there are many cases in which patients receive epinephrine for milder reactions.” Some of these patients were included because authors reported that “the administration of epinephrine might have mitigated reaction progression.” Appendix Table 3, which examines interrater reliability for agreement on anaphylaxis identification, reports kappa values ranging from 0.68 to 0.76, indicating substantial agreement but not perfect agreement. Repeat Epinephrine The primary outcome for this study was the time from first to last administration of epinephrine. We must be careful and state that this is not the equivalent of a biphasic reaction. The decision to administer a repeat dose of epinephrine is also not always clear-cut. It is pragmatic. The clinician may have decided to administer another dose of epinephrine despite the patient not meeting the exact definition of anaphylaxis or a biphasic reaction. 
Epinephrine may have been administered because the child exhibited concerning signs or symptoms. For example,

Feb 14, 2026 • 56min

SGEM#503: Waiting is the Hardest Part – Factors Associated with ED LOS

Date: February 13, 2026 Reference: Lang et al. Factors associated with emergency department length of stay in Alberta: a study of patient-, visit-, and facility-level factors using administrative health data. CJEM. 2026 Jan 29. Guest Skeptic: Dr. Paul Parks is an emergency physician from Medicine Hat, Alberta. He has been the President of the Alberta Medical Association (AMA) Section of Emergency Medicine for many years, served on the AMA Board of Directors for 9 years, and is a past President of the Alberta Medical Association. Paul has won the Canadian Association of Emergency Physicians (CAEP) National Teacher of the Year Award and the CAEP Alan Drummond National Advocacy Award. Case: A 78-year-old man with congestive heart failure (CHF) and chronic obstructive pulmonary disease (COPD) arrives at the emergency department (ED) by ground emergency medical services (EMS) at 15:30 with dyspnea and hypoxia. He is triaged as Canadian Triage and Acuity Scale (CTAS) level 2 and needs non-invasive ventilation (NIV), diuresis, labs, a chest x-ray, and likely admission. The department is packed; multiple admitted patients are boarded in hallway spaces because inpatient beds are unavailable, and nursing assignments are stretched. The patient is placed in the “EMS-PARK” area, which is an extension of the waiting room and part of a mandatory EMS offload policy. The workup is done while the patient is still technically in the waiting room. The workup and disposition decision happen within a few hours, but transfer to an inpatient bed doesn’t occur until 2-3 days later. Background: ED length of stay (LOS) can be considered a vital sign of ED operations and the broader acute-care system. When LOS rises, it often signals that the ED is no longer functioning as a short-stay diagnostic and stabilization unit but is serving as a buffer for upstream demand and downstream capacity issues.
The consequences are not just operational (hallway beds, delayed assessments, delayed analgesia, delayed imaging), but also human. We covered a study showing that, for older patients, one overnight stay in the ED waiting for an inpatient bed was associated with a 4% absolute increase in mortality (SGEM#424). In addition, increasing LOS can lead to clinician burnout and moral injury. LOS is also tricky because ED crowding is rarely a single-point failure within the ED. Modern crowding frameworks (often summarized as input-throughput-output) remind us that while ED processes matter, some of the most powerful determinants are output constraints. This is especially true when there is access block and inpatient bed scarcity. In other words, you can run an efficient front-end, but if admitted patients cannot be moved to inpatient beds, the system backs up, and ED LOS climbs. As one concrete example of the output challenges many provinces struggle with, in Alberta, about 30% of our acute hospital capacity, nearly one-third, can be occupied by Alternate Level of Care (ALC) patients. These ALC patients have had their acute care needs met, but they cannot be safely discharged from the hospital without specific continuing care resources: home care, assisted living, or long-term care. We’ve talked about ED crowding on an SGEM Xtra. It covered some of the Zombie Ideas that have been circulating for decades. The classic one is to blame non-urgent patients for using the ED. They are not responsible for ED crowding. Diverting non-urgent patients away can be dangerous and won’t solve the underlying problem. CAEP published a position statement on emergency department overcrowding in 2013. CAEP argued for nationally standardized performance benchmarks. The statement also called for system-level solutions to improve flow while recognizing that ED optimization alone cannot solve crowding without hospital-wide and community-wide action.
While CAEP’s advocacy has influenced awareness, policy discussion, and accountability framing, significant problems continue into 2026. Clinical Question: Across Alberta ED visits, what patient-, visit-, and facility-level factors are associated with longer ED length of stay? Reference: Lang et al. Factors associated with emergency department length of stay in Alberta: a study of patient-, visit-, and facility-level factors using administrative health data. CJEM. 2026 Jan 29. Population: ED visits drawn from linked Alberta Health Services administrative data for 14 ED facilities in Alberta, covering May 2022 to March 2023. Exposures: Factors such as age, deprivation measures, EMS arrival, triage acuity (CTAS), primary care continuity, time/day patterns, and facility-level constraints, including emergency inpatient pressure and hospital occupancy; staffing signals (hours worked per nurse) were also examined. Comparison: Between levels of each exposure, typically relative to a reference category or per-unit change (hospital occupancy, EMS vs non-EMS arrival, different facility types, weekday vs weekend, etc.). Outcomes: Primary Outcome: ED total length of stay (LOS). Secondary Outcomes: There were no clearly prespecified secondary outcomes; however, the analysis was stratified by disposition (admitted vs discharged vs other = LWBS, Left AMA, transferred, or died), which functions like a planned subgroup/stratified analysis rather than a distinct secondary endpoint. Type of Study: This is an observational cross-sectional study using population-based administrative data. Authors’ Conclusions: “ED length of stay is associated with modifiable factors, including hospital capacity constraints, hours worked per nurse, and healthcare access inequities.
Addressing hospital occupancy, optimizing staffing, and improving care coordination across the patient trajectory—such as between the ED, inpatient units, and post-discharge services—may enhance ED efficiency and reduce prolonged stays. Our findings align with established frameworks describing ED overcrowding and support targeted, system-level interventions to improve the efficiency of emergency care.” Quality Checklist for Observational Studies (Yes/No/Unsure) Did the study address a clearly focused issue? Yes Did the authors use an appropriate method to answer their question? Yes Was the cohort recruited in an acceptable way? Unsure Was the exposure accurately measured to minimize bias? Unsure Was the outcome accurately measured to minimize bias? Unsure Have the authors identified all-important confounding factors? No Was the follow-up of subjects complete enough? N/A How precise are the results? Very precise due to a large sample size, resulting in narrow confidence intervals for several of the point estimates. Do you believe the results? Yes Can the results be applied to the local population? Unsure Do the results fit with other available evidence? Yes Who funded the trial? The authors acknowledge support under the Alberta Atlas of Healthcare Variation initiative. Did the authors declare any conflicts of interest? Brian R. Holroyd was the Senior Medical Director of the Emergency Strategic Clinical Network of Alberta Health Services at the start of this work. Matthew Pietrosanu was employed by Alberta Health Services for statistical consulting, technical writing, and general advising in the Alberta Atlas of Healthcare Variation initiative, which was expanded to include the preparation of this manuscript. Results: The dataset included 587,419 ED visits. The median age was 38 years, and 52% were female. Most patients were discharged (68%), with 18% being admitted and 14% left without being seen, left AMA, transferred, or died. 
The median ED LOS was 3.1 hours overall, and LOS differed substantially by disposition (admitted patients had a much longer median LOS than discharged patients). Key Result: Facility- and system-level constraints were strongly associated with ED LOS, especially among admitted patients. More emergency inpatient hours and higher hospital occupancy were associated with longer stays. Primary Outcome: Across all disposition categories, several patient-level factors were consistently associated with longer ED LOS, including older age, higher material or social deprivation, and arrival by EMS (ground or air). At the visit level, higher triage acuity and certain temporal factors (weekend admissions) were also associated with prolonged LOS, particularly among admitted patients. However, the largest and most clinically meaningful associations were at the facility level. Measures of hospital capacity strain dominated the results. Higher hospital inpatient occupancy and a greater number of emergency inpatients boarding in the ED were strongly associated with longer LOS, especially for admitted patients. For admitted patients, a one-standard-deviation increase in hospital occupancy (approximately 0.11) was associated with a 17% increase in ED LOS, an effect size that dwarfed most patient- and visit-level predictors. This finding strongly supports the concept of access block (outflow from the ED) as the primary driver of prolonged ED stays. Higher hours worked per nurse were associated with shorter ED LOS in initial models, suggesting a potential staffing effect. However, this association disappeared after accounting for facility-level clustering, indicating that staffing effects may reflect broader organizational or structural differences between hospitals rather than a simple linear relationship with nursing hours. 1) Cross-Sectional Design & Temporality: The biggest design constraint is that this is a cross-sectional observational analysis.
Exposures and outcomes are assessed within the same time frame, so the direction of an association can be difficult to determine. 2) Selection Bias: Although the dataset is large, it does not include all Alberta EDs.
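To give a feel for the size of the occupancy association reported in the results for admitted patients (a 17% increase in ED LOS per one-standard-deviation rise in hospital occupancy, with one SD being approximately 0.11), here is a small Python sketch. It assumes the association is multiplicative, so that each additional SD multiplies LOS by 1.17. That is a common reading of a percentage change from a regression on log-transformed LOS, but the show notes do not describe the model, so treat it as an assumption. The function name los_ratio and the 12-hour baseline are hypothetical, chosen only for illustration, and an association from cross-sectional data should not be read as a causal prediction.

```python
# Illustrative arithmetic only. The two constants come from the show notes;
# the multiplicative (log-linear) form is an ASSUMPTION, not the authors'
# stated model.

SD_OCCUPANCY = 0.11  # one standard deviation of hospital occupancy (reported)
RATIO_PER_SD = 1.17  # 17% longer ED LOS per one-SD increase (reported)


def los_ratio(delta_occupancy: float) -> float:
    """Relative ED LOS for a change in occupancy, assuming a multiplicative effect."""
    return RATIO_PER_SD ** (delta_occupancy / SD_OCCUPANCY)


if __name__ == "__main__":
    baseline_hours = 12.0  # hypothetical baseline LOS for an admitted patient
    for delta in (0.05, 0.11, 0.22):
        ratio = los_ratio(delta)
        print(
            f"Occupancy +{delta:.2f}: LOS x{ratio:.2f} "
            f"(~{baseline_hours * ratio:.1f} h from a {baseline_hours:.0f} h baseline)"
        )
```

Under this assumption, a two-SD rise in occupancy (about 0.22) corresponds to 1.17 squared, or roughly 37% longer ED LOS, rather than a simple doubling of the 17% figure.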

Feb 7, 2026 • 33min

SGEM#502: Playing with the Queen of Hearts – AI, Is It Very Smart (for ECG Interpretation)?

Date: January 3, 2026 Reference: Shroyer et al. Accuracy of cath lab activation decisions for STEMI-equivalent and mimic ECGs: Physicians vs. AI (Queen of Hearts by PMcardio). Am J Emerg Med. 2025 Nov. Guest Skeptic: Dr. Amal Mattu has been on the faculty at the University of Maryland since 1996. He has developed an academic niche in emergency cardiology and electrocardiography, and he also enjoys teaching and writing on other topics, including emergency geriatrics, faculty development, and risk management. Amal is currently a tenured professor and Vice Chair of Emergency Medicine at the University of Maryland School of Medicine, and a Distinguished Professor of the University of Maryland-Baltimore. Case: A 58-year-old man with diabetes and hypertension arrives at the emergency department (ED) 30 minutes after the sudden onset of substernal chest pressure radiating to the left arm, now improved to 3/10. His vital signs are BP 146/88, HR 92, RR 18, O2 sat 98% on room air. The initial 12-lead ECG shows RBBB with left anterior fascicular block and subtle anterior ST-depression with proportionally tall, broad T waves in V2 to V4. This appearance can be seen with Hyper-Acute T Wave Occlusive Myocardial Infarction (HATW-OMI) or with an ST-Elevation Myocardial Infarction (STEMI)-mimic in conduction disease. A debate ensues between emergency medicine and cardiology on whether to activate the cath lab now or get troponins plus serial ECGs. Background: Emergency physicians need to be experts at interpreting ECGs. For decades, we’ve been taught STEMI criteria, only to learn repeatedly that important exceptions exist (posterior OMI, de Winter, hyperacute T waves, modified Sgarbossa in LBBB, etc.). Those exceptions have evolved into two distinct categories: STEMI-equivalents (OMI without classic ST-elevation) and STEMI-mimics (ST-elevation without OMI). That expanding exception list increases diagnostic complexity and uncertainty.
This is the area where artificial intelligence (AI), utilizing computer vision and machine learning, could provide a benefit. ECG-specific AI models now aim squarely at this problem. The study we are reviewing today evaluated the Queen of Hearts (QoH) AI. It is a deep neural network trained to detect occlusive myocardial infarction (OMI) on 12-lead ECGs. The model is described as “91% accurate” in prior work and was undergoing FDA review as of March 24, 2025, but whether it outperforms practicing clinicians on the hardest cases (STEMI-equivalents and mimics) remained unclear. ECG diagnostic accuracy is important in emergency medicine because misclassification cuts both ways. Missed OMI delays reperfusion, while overcalls send patients and teams to the cath lab unnecessarily, putting patients at risk and using up valuable resources. A diagnostic aid that catches true positive OMIs while reducing false activations could improve outcomes and team throughput. Clinical Question: How accurate are EM physicians and cardiologists at interpreting STEMI-equivalent and STEMI-mimic ECGs compared with a machine-learning ECG algorithm? Reference: Shroyer et al. Accuracy of cath lab activation decisions for STEMI-equivalent and mimic ECGs: Physicians vs. AI (Queen of Hearts by PMcardio). Am J Emerg Med. 2025 Nov. Population: 53 emergency physicians and 42 cardiologists from a community system. Intervention: Human interpretation and QoH AI algorithm classifying each ECG as OMI requiring immediate cath lab activation (CLA) vs not Comparison (Reference Standard): OMI Present: Angiographic culprit with ≤TIMI II flow and elevated troponin, or culprit with TIMI III flow and significantly elevated troponin. OMI Absent: No culprit ≥50% stenosis on angiography or, when no angiography, negative serial troponins, no new echo wall-motion abnormality, and negative clinical follow-up Outcome: Diagnostic accuracy of ECG-based CLA decisions.
CLA-positive was defined a priori for STEMI/STEMI-equivalents and for “reperfused OMI” (Wellens, transient STEMI). Type of Study: A cross-sectional diagnostic accuracy study using a fixed case-set, with comparisons to a reference standard. Authors’ Conclusions: “Physicians frequently misinterpret STEMI-equivalent and STEMI-mimic ECGs, potentially impacting CLA decisions. QoH AI demonstrated superior accuracy, suggesting a potential to reduce missed OMIs and unnecessary catheterization laboratory activations. Prospective studies are needed to validate these findings in clinical practice.” Quality Checklist for a Diagnostic Study: The clinical problem is well-defined. Yes The study population represents the target population that would normally be tested for the condition (i.e. no spectrum bias). No The study population included or focused on those in the ED. No The study participants were recruited consecutively (i.e. no selection bias). No The diagnostic evaluation was sufficiently comprehensive and applied equally to all patients (i.e. no evidence of verification bias). No All diagnostic criteria were explicit, valid and reproducible (i.e. no incorporation bias). Unsure The reference standard was appropriate (i.e. no imperfect gold-standard bias). Yes/No All undiagnosed patients underwent sufficiently long and comprehensive follow-up (i.e. no double gold-standard bias). No The likelihood ratio(s) of the test(s) in question are presented or can be calculated from the information provided. Yes The precision of the measure of diagnostic performance is satisfactory. Reasonable Funding and Conflicts of Interest: No external funding. Several authors report stock ownership/consulting with Powerful Medical (QoH developer), and other authors reported no conflicts. Results: They recruited 95 physicians to interpret the ECGs. There were 53 EM physicians and 42 cardiologists (23 general, 15 interventional, 4 electrophysiology).
Experience: Median experience was 7 years (IQR 3 to 15) for emergency physicians (EPs) vs 15 years (IQR 9.2 to 21) for cardiologists. Key Result: QoH AI had significantly higher accuracy than humans, and there was no significant difference between EM physicians and cardiologists. Primary Outcome: Diagnostic accuracy was 65.6% (95% CI ~51 to 78) for EM physicians, 65.5% (95% CI ~51 to 77) for cardiologists, and 88.9% (95% CI 82 to 93) for QoH AI. The ECG types most frequently misclassified by humans were LBBB (±OMI), transient STEMI, HATW-OMI, and de Winter. QoH AI missed LBBB-OMI and LV aneurysm. RBBB + fascicular block and HATW-OMI produced the largest EP-cardiologist disagreement. 1) Spectrum Bias: The investigators intentionally selected “ambiguous” STEMI-equivalent and STEMI-mimic ECGs and fixed the OMI prevalence at 50% for the reader study. That design improves efficiency in comparing readers and the AI, but it does not reflect the spectrum or prevalence we see in day-to-day ED practice and therefore threatens external validity. In diagnostic accuracy research, spectrum bias occurs when the distribution of disease/non-disease, disease severity, or look-alikes in the sample differs from that in the clinical population in which the test will be used. It can change sensitivity and specificity in either direction. Selecting borderline cases may deflate both compared with routine practice, and it will certainly distort PPV/NPV because predictive values are prevalence-dependent. The authors acknowledge this by noting the 50% OMI prevalence and the deliberate use of ambiguous ECGs “may not accurately reflect predictive values observed in real-world settings.” 2) Differential Verification & Imperfect Gold Standard: Not every patient had the same reference standard. While most OMI determinations used angiography, some mimic cases without angiography were adjudicated by serial troponins, echocardiography, and clinical follow-up.
Using different reference standards in different subgroups constitutes differential verification (double gold‑standard) bias and can bias sensitivity and specificity up or down, depending on whether the disease can resolve or only become detectable over time. In addition, any composite or clinical adjudication process is an imperfect gold standard, which can either inflate or deflate the index test’s performance depending on how errors correlate across tests. The authors explicitly note these issues in their discussion.

3) Incorporation/Review Bias: The paper reports that cardiologists performing angiography were not masked to the ECG. When the result of (or information from) the index test helps determine the reference diagnosis, that is incorporation (review) bias. This typically inflates both sensitivity and specificity of the index test because the gold standard classification is partially “contaminated” by the test under study. In this context, seeing a concerning ECG may tilt the invasive assessment and adjudication toward “culprit” lesion labelling or influence borderline calls, making ECG-based classification look better than it truly is.

4) Unit‑of‑analysis & Precision Limitations: This was a reader study with 95 clinicians classifying the same small set of 18 ECGs. Even with appropriate statistics, the small number of cases means performance estimates can be fragile, and the 95% confidence intervals reflect that imprecision. To their credit, the authors modelled accuracy with multi-level robust variance to account for clustering (multiple readers rating the same cases), but the design still limits precision and generalizability across the full morphology spectrum of each category. The authors themselves state that “one representative ECG per type…cannot represent all ST‑T variants”, and that asking physicians to read far more than 18 tracings was impractical.
These imprecision concerns should raise our skeptical radar, and we should factor them into our interpretation of the study.

5) External Validity: The study is single-center and uses an online survey without the interruptions, time pressure, serial ECGs,
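
To make the prevalence point under Spectrum Bias concrete, here is a minimal Python sketch of how predictive values shift when the same test is applied at different prevalences. It is an illustration only: the sensitivity and specificity of 0.90 and the 5% prevalence are assumed values chosen for the example, not results from the study. Only the 50% prevalence comes from the reader study design described above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test using Bayes' theorem.

    All three inputs are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics, assumed for illustration only.
ASSUMED_SENSITIVITY = 0.90
ASSUMED_SPECIFICITY = 0.90

# 0.50 mirrors the fixed case mix in the reader study;
# 0.05 is an assumed lower prevalence, not a measured ED figure.
for prevalence in (0.50, 0.05):
    ppv, npv = predictive_values(
        ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, prevalence
    )
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumed inputs, the PPV is 90% at a 50% prevalence and drops to roughly 32% at a 5% prevalence, while the NPV rises. Sensitivity and specificity are held constant here, which is itself a simplification, since spectrum bias can move those too.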

Jan 31, 2026 • 53min

SGEM Xtra: Machines – Or Back to Human

Date: January 6, 2026

Guest Skeptic: Darren McKee is an author and speaker. He has served as a senior policy advisor and policy analyst for over 17 years. Darren hosts the international award-winning podcast, The Reality Check. He is also the author of an excellent, thought-provoking book called Uncontrollable: The Threat of Artificial Superintelligence and the Race to Save the World (2023). The book lays out what AI is, why advanced systems could pose real risks, and what individuals and institutions can do to increase AI safety.

We have discussed AI on the SGEM a few times:
SGEM Xtra: Rock, Robot Rock – AI for Clinical Research
SGEM#459: Domo Arigato Misuta Roboto – Using AI to Assess the Quality of the Medical Literature
SGEM#460: Why Do I Feel Like, Somebody’s Watching Me – CHARTWatch to Predict Clinical Deterioration
SGEM#472: Together In Electric Dreams – Or Is It Reality?

AI already touches the emergency medicine world through triage, documentation (AI scribes), imaging, and patient communications. You argue in the book that we’re in exponential times, that AI capabilities may accelerate, and that simple rules won’t reliably constrain advanced systems. All of this has implications for safety, bias, reliability, and public trust in healthcare.

The book is divided into three sections. I expanded on that so I could ask Darren questions about five different areas. Listen to the SGEM Xtra podcast to hear his responses:

Five Questions for Darren

Origin Story & Stakes: The book's introduction contrasts the confident historical skepticism about nuclear power with the speed with which reality overtook it. Give us a brief history of nuclear power. Then the book pivots to today’s AI and uses an analogy of humanity’s "smoke detector" moment. Explain what that is and why you decided now was the time to write this book.

Part I: What is Happening? In the first part of the book, you build a narrative from AI to AGI to ASuperI. Can you provide some definitions of those terms and explain why they matter? Can you walk us through how current systems (large language models and image models) work at a high level? Why did emergent capabilities surprise even their builders, and why don’t we fully understand what’s happening under the hood of these machines?

Part II: What are the Problems? You outline six core challenges: exponential progress, uncertain timelines (and expert disagreement), the alignment problem, why simple rules (à la “Three Laws”) fail, how control erodes as tech integrates into our lives, and how all this aggregates into societal risk. We are not going to go through all six, but could you explain the alignment problem? The other topic I wanted to expand on was the Three Laws.

Part III: What Can We Do? The last two chapters get practical and discuss what institutions can do for safe AI innovation and what individuals can do to increase AI safety. Give us your top 2 or 3 institutional moves (transparency, evaluation, guardrails). How about your top 2 or 3 personal moves that listeners can make?

AI in the Emergency Department: Bring it home for us in the emergency department if you can. When an AI-enabled tool is proposed for triage, documentation, or image support, what are the three questions every emergency clinician or leader should ask before adoption?

The SGEM will be back next episode with a structured critical appraisal of a recent publication. Our goal is to reduce the knowledge translation (KT) window from over 10 years to less than 1 year using the power of social media. So, patients get the best care, based on the best evidence.

Remember to be skeptical of anything you learn, even if you heard it on the Skeptics’ Guide to Emergency Medicine.

Jan 24, 2026 • 51min

SGEM#501: Here it Goes Again – Another Clinical Decision Rule for Febrile Infants 61-90 Days

SGEM Xtra: This One Goes to 11 – ATLS 11th Edition

SGEM#506: Aww I’m Itchy…and I need a Second Generation Antihistamine

SGEM#505: Close Enough for (ARF) Acute Respiratory Failure (HFNO vs NIV)

SGEM Xtra: It’s My Life – DPhil in Oxford

SGEM Xtra: You say you want a revolution – well you know – Against the Grain: Defiant Giants Who Changed the World

SGEM#504: Home Where I Wanted to Go After Anaphylaxis

SGEM#503: Waiting is the Hardest Part – Factors Associated with ED LOS

SGEM#502: Playing with the Queen of Hearts – AI, Is It Very Smart (for ECG Interpretation)?

SGEM Xtra: Machines – Or Back to Human

SGEM#501: Here it Goes Again – Another Clinical Decision Rule for Febrile Infants 61-90 Days
