The a16z Show

How OpenAI Builds for 800 Million Weekly Users: Model Specialization and Fine-Tuning

Nov 28, 2025
Sherwin Wu, Head of Engineering at OpenAI, shares insights on AI model specialization and fine-tuning. He discusses the shift from monolithic models to tailored systems, emphasizing why developers gravitate towards trusted models. Sherwin explains the evolution of context design over prompt engineering and how OpenAI optimizes its platform for user engagement. He also dives into the intricacies of usage-based pricing, the impact of recent acquisitions, and how OpenAI's new deterministic agent builder enhances workflows across products.

Why Developers Stay With One Model

  • Stickiness comes from both user familiarity and deep technical integration by developers.
  • Startups build product-specific harnesses that optimize behavior for a chosen model.

Fine-Tuning Unlocks Company Data

  • Fine-tuning exists because companies have rich proprietary data that can improve model behavior.
  • OpenAI built fine-tuning to let customers leverage that data beyond simple prompt tweaks.
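To make the "beyond simple prompt tweaks" point concrete, here is a minimal sketch of how a company's proprietary data is packaged for OpenAI's supervised fine-tuning API: each training example is one line of JSONL containing a chat conversation. The support-ticket examples and file names below are hypothetical illustrations, not from the episode.

```python
import json

# Hypothetical proprietary data: a few labeled support tickets,
# formatted as chat examples in the shape the fine-tuning API expects.
examples = [
    {
        "messages": [
            {"role": "system", "content": "Classify the support ticket."},
            {"role": "user", "content": "My invoice total looks wrong."},
            {"role": "assistant", "content": "billing"},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "Classify the support ticket."},
            {"role": "user", "content": "The app crashes on login."},
            {"role": "assistant", "content": "bug"},
        ]
    },
]

# Write one JSON object per line (JSONL), the upload format for fine-tuning.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Uploading the file and starting a job would then look roughly like this;
# it requires the `openai` package and an API key, so it is left commented:
#
# from openai import OpenAI
# client = OpenAI()
# training_file = client.files.create(
#     file=open("train.jsonl", "rb"), purpose="fine-tune"
# )
# job = client.fine_tuning.jobs.create(
#     training_file=training_file.id,
#     model="gpt-4o-mini-2024-07-18",  # assumed base model; any tunable model works
# )

print(sum(1 for _ in open("train.jsonl")), "training examples written")
```

The idea the snip gestures at is that this dataset encodes behavior (labels, tone, domain knowledge) that would be awkward or impossible to cram into a prompt.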

Reinforcement Fine-Tuning Is The Big Unlock

  • Supervised fine-tuning mainly changed style; reinforcement fine-tuning (RFT) enables substantive performance gains.
  • RFT lets customers push models toward state-of-the-art (SOTA) performance on specific tasks using their own data.