Guardrails: refusal, unlearning, and filtration

Richard reviews refusal classifiers, unlearning, and data filtration, noting limits and evolving compute economics.

Episode notes

Last September, scientists used an AI model to design genomes for entirely new bacteriophages (viruses that infect bacteria). They then built them in a lab. Many were viable. And despite being entirely novel, some even outperformed existing viruses from that family.

That alone is remarkable. But as today's guest — Dr Richard Moulange, one of the world's top experts on 'AI–Biosecurity' — explains, it's just one of many data points showing how AI is dissolving the barriers that have historically kept biological weapons out of reach.

For years, experts have reassured us that 'tacit knowledge' — the hands-on, hard-to-Google lab skills needed to work with dangerous pathogens — would prevent bad actors from weaponising biology. So far, they've been right.

But as of 2025 that reassurance is crumbling. The Virology Capabilities Test measures exactly this kind of troubleshooting expertise, and it finds that modern AI models crush top human virologists, 45% to 22%, even in their self-declared area of greatest specialisation and expertise.

Meanwhile, Anthropic’s research shows PhD-level biologists getting meaningfully better at weapons-relevant tasks with AI assistance — with the effect growing with each new model generation.

Richard joins host Rob Wiblin to discuss all that plus:

What AI biology tools already exist
Why mid-tier actors (not amateurs) are the ones getting the most dangerous boost
The three main categories of defence we can pursue
Whether there’s a plausible path to a world where engineered pandemics become a thing of the past

This episode was recorded on January 16, 2026. Since recording, Richard has been seconded to the UK Government. Please note that the views he expresses here are entirely his own.

Links to learn more, video, and full transcript: https://80k.info/rm

Announcements:

Our new book is available to preorder: '80,000 Hours: How to have a fulfilling career that does good', written by our cofounder Benjamin Todd. It’s a completely revised and updated edition of our existing career guide, with a big new section on AI that covers both the risks and the potential to steer it in a better direction, as well as how AI automation should affect your career planning and which skills you choose to specialise in. Preorder now: https://geni.us/80000Hours
We're hiring contract video editors for the podcast! For more information, check out the expression of interest page on the 80,000 Hours website: https://80k.info/video-editor

Chapters:

Cold open (00:00:00)
Who's Richard Moulange? (00:00:31)
AI can now design novel genomes (00:01:11)
The end of the 'tacit knowledge' barrier (00:04:34)
Are risks from bioterrorists overstated? (00:18:20)
The 3 key disasters AI makes more likely (00:22:41)
Which bad actors does AI help the most? (00:30:03)
Experts are more scary than amateurs (00:41:17)
Barriers to bioterrorists using AI (00:46:43)
AI biorisks are sometimes dismissed (and that's a huge mistake) (00:48:54)
Advanced AI biology tools we already have or will soon (01:04:10)
Rob argues that the situation is hopeless (01:09:49)
Intervention #1: Limit access (01:18:16)
Intervention #2: Get AIs to refuse to help (01:32:58)
Intervention #3: Surveillance and attribution (01:42:38)
Intervention #4: Universal vaccines and antivirals (01:56:38)
Intervention #5: Screen all orders for DNA (02:10:00)
AI companies talk about def/acc more than they fund it (02:19:52)
Can you build a profitable business solving this problem? (02:26:32)
This doesn't have to interfere with useful science (much) (02:30:56)
What are the best low-tech interventions? (02:33:01)
Richard's top request for AI companies (02:37:59)
Grok shows governments lack many legal levers (02:53:17)
Best ways listeners can help fix AI-Bio (02:56:24)
We might end all contagious disease in 20 years (03:03:37)

Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
Music: CORBIT
Camera operator: Jeremy Chevillotte
Transcripts and web: Elizabeth Cox and Katy Moore
