

Into AI Safety
Jacob Haimes
The Into AI Safety podcast aims to make it easier for everyone, regardless of background, to get meaningfully involved in the conversations surrounding the rules and regulations that should govern the research, development, deployment, and use of the technologies encompassed by the term "artificial intelligence," or "AI."
For better formatted show notes, additional resources, and more, go to https://kairos.fm/intoaisafety/
Episodes
Mar 10, 2026 • 1h 11min
Thinking Through "Digital Minds" w/ Jacy Reese Anthis
Jacy Reese Anthis, founder of the Sentience Institute and researcher at Stanford, began his journey working in animal welfare, but is now finishing up his PhD with research spanning many AI subfields at the intersection of neuroscience, philosophy, social science, and machine learning. While this may seem like an odd jump at first, Jacy shares how his work has all been centered on the idea of moral circle expansion. In this episode, we dig into what sentience actually means (or at least how we can begin to think about it), why anthropomorphization is more complicated than it sounds, and how language models might be leveraged as an effective tool for social science research.
Jacy also shares his median AGI estimate somewhere in there, so stay tuned if you want to catch it.
As part of my effort to make this whole podcasting thing more sustainable, I have created a Kairos.fm Patreon which includes an extended version of this episode. Supporting gets you access to these extended cuts, as well as other perks in development.
Chapters
(00:00) - Introduction
(05:41) - From Animal Welfare to Digital Minds
(09:00) - Founding Sentience Institute
(22:00) - Defining Sentience
(27:13) - The Anthropomorphization Problem
(47:51) - Why "Digital Minds" (Not "Artificial Intelligence")
(51:05) - LLMs as Social Science Tools
(01:07:03) - Jacy’s AGI Timeline & The Singularity
(01:09:23) - Final Thoughts & Outro
Critical Links
Below are the most important links for this episode. For more, visit the episode page on Kairos.fm.
Jacy's website
Wikipedia article - Jacy Reese Anthis
Sentience Institute website
CHI paper - Digital Companionship: Overlapping Uses of AI Companions and AI Assistants
ICML paper - LLM Social Simulations Are a Promising Research Method
ACL paper - The Impossibility of Fair LLMs
Wikipedia article - ELIZA effect
The Atlantic article - How a Google Employee Fell for the Eliza Effect

Feb 2, 2026 • 1h 32min
Scaling AI Safety Through Mentorship w/ Dr. Ryan Kidd
Dr. Ryan Kidd, co-executive director of MATS and a physics PhD, shares how he scaled a premier AI safety talent pipeline. He defines the high-impact "amplifier" archetype and explains why it is under-served. He describes rigorous mentor selection, balancing funder priorities with research independence, and the value of geographic hubs plus remote access. Practical field-building strategies and program design are front and center.

Dec 29, 2025 • 1h 14min
Sobering Up on AI Progress w/ Dr. Sean McGregor
Dr. Sean McGregor, a machine learning safety researcher and founder of several initiatives like the AI Incident Database, delves into the complexities of AI evaluation. He critiques the flaws in current benchmarking practices, emphasizing their vulnerability to training-data leakage and real-world misalignment. Sean introduces BenchRisk, a new framework aimed at improving benchmark trustworthiness. He also discusses the founding of AVERI, a nonprofit focused on frontier model auditing to ensure responsible AI deployment and navigate the tension between market and regulatory safety.

Nov 24, 2025 • 1h 9min
Against 'The Singularity' w/ Dr. David Thorstad
Dr. David Thorstad, a philosopher and assistant professor at Vanderbilt University, critiques the singularity hypothesis and its implications for AI safety funding. He argues that the idea of recursive self-improvement leading to superintelligence is fundamentally flawed. Instead of chasing speculative futures, Thorstad advocates for prioritizing immediate issues like poverty, disease, and authoritarianism. He warns that misallocated funds could detract from addressing pressing global problems, and he emphasizes the need for rigorous, peer-reviewed critiques in the field.

Oct 20, 2025 • 1h 12min
Getting Agentic w/ Alistair Lowe-Norris
Alistair Lowe-Norris, Chief Responsible AI Officer at Iridius, dives into the practical side of building safe AI systems. He addresses the crucial need for compliance standards and the potential of procurement practices to ensure responsible AI adoption. Alistair highlights gaps between company promises and actual safety measures, discussing models like robot avatars and the risks associated with AI expansion. He also emphasizes the importance of transparency and continuous oversight to maintain safety in AI practices.

Sep 15, 2025 • 1h 8min
Growing BlueDot's Impact w/ Li-Lian Ang
I'm joined by my good friend, Li-Lian Ang, first hire and product manager at BlueDot Impact. We discuss how BlueDot has evolved from their original course offerings to a new "defense-in-depth" approach, which focuses on three core threat models: reduced oversight in high-risk scenarios (e.g. accelerated warfare), catastrophic terrorism (e.g. rogue actors with bioweapons), and the concentration of wealth and power (e.g. supercharged surveillance states). On top of that, we cover how BlueDot's strategies account for and reduce the negative impacts of common issues in AI safety, including exclusionary tendencies, elitism, and echo chambers.
2025.09.15: Learn more about how to design effective interventions to make AI go well, and potentially even get funded for it, on BlueDot Impact's AGI Strategy course! BlueDot is also hiring, so if you think you'd be a good fit, I definitely recommend applying; I had a great experience when I contracted as a course facilitator. If you do end up applying, let them know you found out about the opportunity from the podcast!
Follow Li-Lian on LinkedIn, and look at more of her work on her blog!
As part of my effort to make this whole podcasting thing more sustainable, I have created a Kairos.fm Patreon which includes an extended version of this episode. Supporting gets you access to these extended cuts, as well as other perks in development.
Chapters
(03:23) - Meeting Through the Course
(05:46) - Eating Your Own Dog Food
(13:13) - Impact Acceleration
(22:13) - Breaking Out of the AI Safety Mold
(26:06) - BlueDot's Risk Framework
(41:38) - Dangers of "Frontier" Models
(54:06) - The Need for AI Safety Advocates
(01:00:11) - Hot Takes and Pet Peeves
Links
BlueDot Impact website
Defense-in-Depth
BlueDot Impact blogpost - Our vision for comprehensive AI safety training
Engineering for Humans blogpost - The Swiss cheese model: Designing to reduce catastrophic losses
Open Journal of Safety Science and Technology article - The Evolution of Defense in Depth Approach: A Cross Sectorial Analysis
X-clusion and X-risk
Nature article - AI Safety for Everyone
Ben Kuhn blogpost - On being welcoming
Reflective Altruism blogpost - Belonging (Part 1: That Bostrom email)
AIxBio
RAND report - The Operational Risks of AI in Large-Scale Biological Attacks
OpenAI "publication" (press release) - Building an early warning system for LLM-aided biological threat creation
Anthropic Frontier AI Red Team blogpost - Why do we take LLMs seriously as a potential source of biorisk?
Kevin Esvelt preprint - Foundation models may exhibit staged progression in novel CBRN threat disclosure
Anthropic press release - Activating AI Safety Level 3 protections
Persuasive AI
Preprint - Lies, Damned Lies, and Distributional Language Statistics: Persuasion and Deception with Large Language Models
Nature Human Behavior article - On the conversational persuasiveness of GPT-4
Preprint - Large Language Models Are More Persuasive Than Incentivized Human Persuaders
AI, Anthropomorphization, and Mental Health
Western News article - Expert insight: Humanlike chatbots detract from developing AI for the human good
AI & Society article - Anthropomorphization and beyond: conceptualizing humanwashing of AI-enabled machines
Artificial Ignorance article - The Chatbot Trap
Making Noise and Hearing Things blogpost - Large language models cannot replace mental health professionals
Idealogo blogpost - 4 reasons not to turn ChatGPT into your therapist
Journal of Medical Society editorial - Importance of informed consent in medical practice
Indian Journal of Medical Research article - Consent in psychiatry - concept, application & implications
MediaNama article - The Risk of Humanising AI Chatbots: Why ChatGPT Mimicking Feelings Can Backfire
Becker's Behavioral Health blogpost - OpenAI's mental health roadmap: 5 things to know
Miscellaneous References
Carnegie Council blogpost - What Do We Mean When We Talk About "AI Democratization"?
Collective Intelligence Project policy brief - Four Approaches to Democratizing AI
BlueDot Impact blogpost - How Does AI Learn? A Beginner's Guide with Examples
BlueDot Impact blogpost - AI safety needs more public-facing advocacy
More Li-Lian Links
Humans of Minerva podcast website
Li-Lian's book - Purple is the Noblest Shroud
Relevant Podcasts from Kairos.fm
Scaling Democracy w/ Dr. Igor Krawczuk, for AI safety exclusion and echo chambers
Getting Into PauseAI w/ Will Petillo, for AI in warfare and exclusion in AI safety

Aug 4, 2025 • 1h 40min
Layoffs to Leadership w/ Andres Sepulveda Morales
Andres Sepulveda Morales joins me to discuss his journey from three tech layoffs to founding Red Mage Creative and leading the Fort Collins chapter of the Rocky Mountain AI Interest Group (RMAIIG). We explore the current tech job market, AI anxiety in nonprofits, dark patterns in AI systems, and building inclusive tech communities that welcome diverse perspectives.
Reach out to Andres on his LinkedIn, or check out the Red Mage Creative website!
For any listeners in Colorado, consider attending an RMAIIG event: Boulder; Fort Collins
Chapters
(00:00) - Intro
(01:04) - Andres' Journey
(05:15) - Tech Layoff Cycle
(26:12) - Why AI?
(30:58) - What is Red Mage?
(36:12) - AI as a Tool
(41:55) - AInxiety
(47:26) - Dark Patterns and Critical Perspectives
(01:01:35) - RMAIIG
(01:10:09) - Inclusive Tech Education
(01:18:05) - Colorado AI Governance
(01:23:46) - Building Your Own Tech Community
Links
Tech Job Market
Layoff tracker website
The Big Newsletter article - Why Are We Pretending AI Is Going to Take All the Jobs?
METR preprint - Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
AI Business blogpost - https://aibusiness.com/responsible-ai/debunking-the-ai-job-crisis
Crunchbase article - Data: Tech Layoffs Remain Stubbornly High, With Big Tech Leading The Way
Computerworld article - Tech layoffs surge even as US unemployment remains stable
Apollo Technical blogpost - Ghost jobs in tech: Why companies are posting roles they don't plan to fill
The HR Digest article - The Rise of Ghost Jobs Is Leaving Job Seekers Frustrated and Disappointed
A Life After Layoff video - The Tech Job Market Is Hot Trash Right Now
Economy Media video - Will The Tech Job Market Ever Recover?
Soleyman Shahir video - Tech CEO Explains: The Real Reason Behind AI Layoffs
Dark Patterns
Deceptive Design website
Journal of Legal Analysis article - Shining a Light on Dark Patterns
ICLR paper - DarkBench: Benchmarking Dark Patterns in Large Language Models
Computing Within Limits paper - Imposing AI: Deceptive design patterns against sustainability
Communications of the ACM blogpost - Dark Patterns
Preprint - A Comprehensive Study on Dark Patterns
Colorado AI Regulation
Senate Bill 24-205 (Colorado AI Act) bill and webpage
NAAG article - A Deep Dive into Colorado's Artificial Intelligence Act
Colorado Sun article - Why Colorado's artificial intelligence law is a big deal for the whole country
CFO Dive blogpost - 'Heavy lift': Colorado AI law sets high bar, analysts say
Denver 7 article - Colorado could lose federal funding as Trump administration targets AI regulations
America's AI Action Plan document
Other Sources
Concordia Framework report and repo
80,000 Hours website
AI Incident Database website

Jun 23, 2025 • 1h 48min
Getting Into PauseAI w/ Will Petillo
Will Petillo, onboarding team lead at PauseAI, joins me to discuss the grassroots movement advocating for a pause on frontier AI model development. We explore PauseAI's strategy, talk about common misconceptions Will hears, and dig into how diverse perspectives still converge on the need to slow down AI development.
Will's Links
Personal blog on AI
His mindmap of the AI x-risk debate
Game demos
AI focused YouTube channel
Chapters
(00:00) - Intro
(03:36) - What is PauseAI
(10:10) - Will Petillo's journey into AI safety advocacy
(21:13) - Understanding PauseAI
(31:35) - Pursuing a pause
(40:06) - Balancing advocacy in a complex world
(45:54) - Why a pause on frontier models?
(54:48) - Diverse perspectives within PauseAI
(59:55) - PauseAI misconceptions
(01:16:40) - Ongoing AI governance efforts (SB1047)
(01:28:52) - The role of incremental progress
(01:35:16) - Safety-washing and corporate responsibility
(01:37:23) - Lessons from environmentalism
(01:41:59) - Will's superlatives
Links
PauseAI
PauseAI-US
Related Kairos.fm Episodes
Into AI Safety episode with Dr. Igor Krawczuk
muckrAIkers episode on SB1047
Exclusionary Tendencies
Jacobin article - Elite Universities Gave Us Effective Altruism, the Dumbest Idea of the Century
SSIR article - The Elitist Philanthropy of So-Called Effective Altruism
Persuasion blogpost - The Problem with Effective Altruism
Dark Markets blogpost - What's So Bad About Rationalism?
FEE blogpost - What's Wrong With the Rationality Community?
AI in Warfare
Master's thesis - The Evolution of Artificial Intelligence and Expert Computer Systems in the Army
International Journal of Intelligent Systems article - Artificial Intelligence in the Military: An Overview of the Capabilities, Applications, and Challenges
Preprint - Basic Research, Lethal Effects: Military AI Research Funding as Enlistment
AOAV article - 'Military Age Males' in US Drone Strikes
The Conversation article - Gaza war: Israel using AI to identify human targets raising fears that innocents are being caught in the net
972 article - 'Lavender': The AI machine directing Israel's bombing spree in Gaza
IDF press release - The IDF's Use of Data Technologies in Intelligence Processing
Lieber Institute West Point article - Israel–Hamas 2024 Symposium
Verfassungsblog article - Gaza, Artificial Intelligence, and Kill Lists
RAND research report - Dr. Li Bicheng, or How China Learned to Stop Worrying and Love Social Media Manipulation
The Intercept article collection - The Drone Papers
AFIT faculty publication - On Large Language Models in National Security Applications
Nature article - Death by Metadata: The Bioinformationalisation of Life and the Transliteration of Algorithms to Flesh
Legislation
LegiScan page on SB1047
NY State Senate page on the RAISE Act
Congress page on the TAKE IT DOWN Act
The Gavernor
FastCompany article - Big Tech may be focusing its lobbying push on the California AI safety bill's last stop: Gavin Newsom
POLITICO article - How California politics killed a nationally important AI bill
Newsom's veto message
Additional relevant lobbying documentation - [1], [2]
Jacobin article - With Newsom's Veto, Big Tech Beats Democracy
Misc. Links
FLI Open Letter on an AI pause
Wikipedia article - Overton window
Daniel Schmachtenberger YouTube video - An Introduction to the Metacrisis
VAISU website (looks broken as of 2025.06.19)
AI Impacts report - Why Did Environmentalism Become Partisan?

May 19, 2025 • 1h 33min
Making Your Voice Heard w/ Tristan & Felix de Simone
I am joined by Tristan Williams and Felix de Simone to discuss their work on the potential of constituent communication, specifically in the context of AI legislation. These two worked as part of an AI Safety Camp team to understand whether or not it would be useful for more people to be sharing their experiences, concerns, and opinions with their government representative (hint, it is).
Check out the blogpost on their findings, "Talking to Congress: Can constituents contacting their legislator influence policy?", and the tool they created!
Chapters
(01:53) - Introductions
(04:04) - Starting the project
(13:30) - Project overview
(16:36) - Understanding constituent communication
(28:50) - Literature review
(35:52) - Phase 2
(43:26) - Creating a tool for citizen engagement
(50:16) - Crafting your message
(59:40) - The game of advocacy
(01:15:19) - Difficulties on the project
(01:22:33) - Call to action
(01:32:30) - Outro
Links
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
AI Safety Camp
PauseAI
BlueDot Impact
TIME article - There's an AI Lobbying Frenzy in Washington. Big Tech Is Dominating
Congressional Management Foundation study - Communicating with Congress: Perceptions of Citizen Advocacy on Capitol Hill
Congressional Management Foundation study - The Future of Citizen Engagement: Rebuilding the Democratic Dialogue
Tristan and Felix's blogpost - Talking to Congress: Can constituents contacting their legislator influence policy?
Wired article - What It Takes to Make Congress Actually Listen
American Journal of Political Science article - Congressional Representation: Accountability from the Constituent's Perspective
Political Behavior article - Call Your Legislator: A Field Experimental Study of the Impact of a Constituency Mobilization Campaign on Legislative Voting
Guided Track website
The Tool
Holistic AI global regulatory tracker
White & Case global regulatory tracker
Steptoe US AI legislation tracker
Manatt US AIxHealth legislation tracker
Issue One article - Big Tech Cozies Up to New Administration After Spending Record Sums on Lobbying Last Year
Verfassungsblog article - BigTech's Efforts to Derail the AI Act
MIT Technology Review article - OpenAI has upped its lobbying efforts nearly sevenfold
OpenSecrets webpage - Issue Profile: Science & Technology
Statista data - Leading lobbying spenders in the United States in 2024
Global Justice Now report - Democracy at risk in Davos: new report exposes big tech lobbying and political interference
Ipsos article - Where Americans stand on AI
AP-NORC report - There Is Bipartisan Concern About the Use of AI in the 2024 Elections
AI Action Summit report - International AI Safety Report
YouGov article - Do Americans think AI will have a positive or negative impact on society?

Jun 3, 2024 • 2h 59min
INTERVIEW: Scaling Democracy w/ (Dr.) Igor Krawczuk
The almost Dr. Igor Krawczuk joins me for what is the equivalent of 4 of my previous episodes. We get into all the classics: eugenics, capitalism, philosophical toads... Need I say more?
If you're interested in connecting with Igor, head on over to his website, or check out placeholder for thesis (it isn't published yet).
Because the full show notes have a whopping 115 additional links, I'll highlight some that I think are particularly worthwhile here:
The best article you'll ever read on Open Source AI
The best article you'll ever read on emergence in ML
Kate Crawford's Atlas of AI (Wikipedia)
On the Measure of Intelligence
Thomas Piketty's Capital in the Twenty-First Century (Wikipedia)
Yurii Nesterov's Introductory Lectures on Convex Optimization
Chapters
(02:32) - Introducing Igor
(10:11) - Aside on EY, LW, EA, etc., a.k.a. lettersoup
(18:30) - Igor on AI alignment
(33:06) - "Open Source" in AI
(41:20) - The story of infinite riches and suffering
(59:11) - On AI threat models
(01:09:25) - Representation in AI
(01:15:00) - Hazard fishing
(01:18:52) - Intelligence and eugenics
(01:34:38) - Emergence
(01:48:19) - Considering externalities
(01:53:33) - The shape of an argument
(02:01:39) - More eugenics
(02:06:09) - I'm convinced, what now?
(02:18:03) - AIxBio (round ??)
(02:29:09) - On open release of models
(02:40:28) - Data and copyright
(02:44:09) - Scientific accessibility and bullshit
(02:53:04) - Igor's point of view
(02:57:20) - Outro
Links
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. All references, including those only mentioned in the extended version of this episode, are included.
Suspicious Machines methodology, referred to as the "Rotterdam Lighthouse Report" in the episode
LIONS Lab at EPFL
The meme that Igor references
On the Hardness of Learning Under Symmetries
Course on the concept of equivariant deep learning
Aside on EY/EA/etc.
Sources on Eliezer Yudkowsky
Scholarly Community Encyclopedia
TIME100 AI
Yudkowsky's personal website
EY Wikipedia
A Very Literary Wiki
TIME article: Pausing AI Developments Isn't Enough. We Need to Shut it All Down, documenting EY's ruminations on bombing datacenters; this comes up later in the episode but is included here because it is about EY.
LessWrong
LW Wikipedia
MIRI
Coverage on Nick Bostrom (being a racist)
The Guardian article: 'Eugenics on steroids': the toxic and contested legacy of Oxford's Future of Humanity Institute
The Guardian article: Oxford shuts down institute run by Elon Musk-backed philosopher
Investigative piece on Émile Torres
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜
NY Times article: We Teach A.I. Systems Everything, Including Our Biases
NY Times article: Google Researcher Says She Was Fired Over Paper Highlighting Bias in A.I.
Timnit Gebru's Wikipedia
The TESCREAL Bundle: Eugenics and the Promise of Utopia through Artificial General Intelligence
Sources on the environmental impact of LLMs
The Environmental Impact of LLMs
The Cost of Inference: Running the Models
Energy and Policy Considerations for Deep Learning in NLP
The Carbon Impact of AI vs Search Engines
Filling Gaps in Trustworthy Development of AI (Igor is an author on this one)
A Computational Turn in Policy Process Studies: Coevolving Network Dynamics of Policy Change
The Smoothed Possibility of Social Choice, an intro to social choice theory and how it overlaps with ML
Relating to Dan Hendrycks
Natural Selection Favors AIs over Humans
"One easy-to-digest source to highlight what he gets wrong [is] Social and Biopolitical Dimensions of Evolutionary Thinking" -Igor
Introduction to AI Safety, Ethics, and Society, recently published textbook
"Source to the section [of this paper] that makes Dan one of my favs from that crowd." -Igor
Twitter post referenced in the episode
...


