Custom Evals for Small Models and RAG Benchmarks

Maxime details building narrow internal benchmarks, repurposing frontier evaluations, and designing tests focused on function calling and web search.

Episode notes

The transformer architecture has dominated AI since 2017, but it’s not the only approach to building LLMs, and new architectures are bringing LLMs to edge devices.

Maxime Labonne, Head of Post-Training at Liquid AI and creator of the 67,000+ star LLM Course, joins Conor Bronsdon to challenge the AI architecture status quo. Liquid AI’s hybrid architecture, combining transformers with convolutional layers, delivers faster inference, lower latency, and dramatically smaller footprints without sacrificing capability.

This alternative architectural philosophy creates models that run effectively on phones and laptops without compromise.

But reimagined architecture is only half the story. Maxime unpacks the post-training reality most teams struggle with: the challenges and opportunities of synthetic data, how to balance helpfulness against safety, Liquid AI’s approach to evals, RAG architectural approaches, how he sees AI on edge devices evolving, hard-won lessons from shipping LFM1 through LFM2, and much more.

If you're tired of surface-level AI takes and want to understand the architectural and engineering decisions behind production LLMs from someone building them in the trenches, this is your episode.

Connect with Maxime Labonne:

LinkedIn – https://www.linkedin.com/in/maxime-labonne/

X (Twitter) – @maximelabonne

About Maxime – https://mlabonne.github.io/blog/about.html

HuggingFace – https://huggingface.co/mlabonne

The LLM Course – https://github.com/mlabonne/llm-course

Liquid AI – https://liquid.ai

Connect with Conor Bronsdon:

X (Twitter) – @conorbronsdon

Substack – https://conorbronsdon.substack.com/

LinkedIn – https://www.linkedin.com/in/conorbronsdon/

00:00 Intro — Welcome to Chain of Thought

00:27 Guest Intro — Maxime Labonne of Liquid AI

02:21 The Hybrid LLM Architecture Explained

06:30 Why Bigger Models Aren’t Always Better

11:10 Convolution + Transformers: A New Approach to Efficiency

18:00 Running LLMs on Laptops and Wearables

22:20 Post-Training as the Real Moat

25:45 Synthetic Data and Reliability in Model Refinement

32:30 Evaluating AI in the Real World

38:11 Benchmarks vs Functional Evals

43:05 The Future of Edge-Native Intelligence

48:10 Closing Thoughts & Where to Find Maxime Online
