Episode 34: The AI Revolution Will Not Be Monopolized

Aug 22, 2024

Guests Ines Montani and Matthew Honnibal, founders of Explosion AI and creators of the widely used spaCy library, discuss the evolution of natural language processing (NLP) in industry. They share insights on balancing large and small AI models, challenges in modularity and privacy, and the impact of regulation on innovation. Their transition to a smaller company highlights lessons learned in the AI startup world. The conversation touches on the importance of data quality and open-source tools while celebrating the practical applications of AI for data scientists and enthusiasts alike.

ADVICE

Use LLMs For Development, Distill For Runtime

Use large generative models in development only and distill to smaller models for runtime to save cost and protect privacy.
Treat models as components you can replace rather than a single runtime dependency for everything.
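One loose reading of this advice can be sketched in plain Python: a large model labels raw text during development, and a small model trained on those labels is the only thing that runs in production. This is a minimal illustration, not the guests' method. The function `llm_label` is a hypothetical stand-in (a keyword rule here, so the sketch runs offline), and `TinyNaiveBayes` and the example texts are invented for this sketch.

```python
import math
from collections import Counter, defaultdict


def llm_label(text: str) -> str:
    """Hypothetical stand-in for a large-model call used only in development."""
    # Placeholder rule so the sketch runs offline; a real setup would prompt an LLM.
    keywords = ("broken", "refund", "late")
    return "complaint" if any(k in text.lower() for k in keywords) else "other"


class TinyNaiveBayes:
    """Small bag-of-words classifier standing in for the runtime component."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus add-one smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Development time: the large model produces labels for raw text.
unlabeled = [
    "my order arrived broken",
    "please send a refund",
    "the delivery was late again",
    "what are your opening hours",
    "do you ship to canada",
    "thanks for the quick answer",
]
silver_labels = [llm_label(t) for t in unlabeled]

# Runtime: only the small model is called, so no text leaves the machine.
runtime_model = TinyNaiveBayes().fit(unlabeled, silver_labels)
print(runtime_model.predict("my item is broken and i want a refund"))
```

Because the runtime code only depends on an object with a `predict` method, the small model can be swapped for another component without touching the rest of the pipeline, which is the "replaceable component" idea in the snip.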

INSIGHT

Focus On Non–AI-Complete Problems

Many practical NLP problems are not 'AI-complete' and can be solved using surface linguistic structure.
Framing tasks to avoid AI-complete requirements lets you build accurate, small CPU-friendly models.

ADVICE

Few Hundred Examples Often Suffice

You often need only a few hundred labeled examples to beat zero-shot or few-shot LLM classifiers on many tasks.
Prioritize small, focused data collection and stable evaluation over chasing larger models.
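What "stable evaluation" might look like in practice can be sketched as a held-out set that is split once with a fixed seed, so that every candidate (an LLM prompt, a small trained model, a rule) is scored on the same examples. The helper names `split_fixed` and `accuracy`, the toy data, and the always-"other" baseline are all assumptions made for this sketch.

```python
import random


def split_fixed(examples, eval_fraction=0.3, seed=0):
    """Shuffle once with a fixed seed so the evaluation set is identical across runs."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def accuracy(predict, examples):
    """Fraction of (text, label) pairs for which predict(text) equals the label."""
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)


# Toy labeled data; a real project would collect a few hundred of these.
labeled = [
    ("my order arrived broken", "complaint"),
    ("please send a refund", "complaint"),
    ("the delivery was late again", "complaint"),
    ("the box was damaged", "complaint"),
    ("i was charged twice", "complaint"),
    ("what are your opening hours", "other"),
    ("do you ship to canada", "other"),
    ("thanks for the quick answer", "other"),
    ("where is your store", "other"),
    ("can i change my email", "other"),
]

train_set, eval_set = split_fixed(labeled)


def always_other(text):
    """Trivial baseline that any real candidate should beat."""
    return "other"


print(f"baseline accuracy: {accuracy(always_other, eval_set):.2f}")
```

Any candidate that exposes a `predict(text)` callable can be passed to `accuracy` with the same `eval_set`, which keeps comparisons between a prompted LLM and a small trained model on equal footing.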
