The more AI can do, the more we need to ask what it should and shouldn’t do. In this episode, OpenAI researcher Jason Wolfe joins host Andrew Mayne to talk about the Model Spec, OpenAI’s public framework that defines intended model behavior. They discuss how the Model Spec works in practice, including how the chain of command resolves conflicts between instructions, and how OpenAI evolves the spec based on feedback, real-world use, and new model capabilities.
Chapters
00:00 Introduction
01:10 What is the Model Spec?
03:55 How does the Model Spec work in practice?
06:26 Transparency: Where to read the Model Spec & give feedback
07:51 How did the Model Spec originate?
10:02 How does the spec translate into model behavior?
11:26 What is the hierarchy / chain of command?
13:35 Handling edge cases like Santa Claus
17:41 How does the Model Spec evolve over time?
19:59 What happens when models disagree with the spec?
22:05 How do smaller models follow the spec?
23:16 Is chain-of-thought useful for alignment?
24:16 Model Spec vs Anthropic’s Constitution
26:28 What surprised you most?
26:56 How do you define the scope of the spec?
27:44 What is the future of the Model Spec?
31:16 How should developers think about the spec?
34:44 Asimov’s laws vs Model Spec
37:16 Could AI write a Human Spec?
Hosted on Acast. See acast.com/privacy for more information.