Beyond The Pilot: Enterprise AI in Action

Inside LinkedIn’s AI Engineering Playbook

Jan 21, 2026
Erran Berger, VP of Product Engineering at LinkedIn, led the effort to distill large LLMs into ultra-efficient production models. He reveals how LinkedIn distilled 7B models down to 600M-parameter students, the multi-teacher split for policy vs. clicks, synthetic GPT-4 golden datasets, and the 10x latency savings from pruning, quantization, and context compression. He also explains the organizational shift to eval-first product design.
INSIGHT

Why Off The Shelf LLMs Failed For Search

  • LinkedIn found that off-the-shelf LLMs with prompting couldn't meet its recommender quality or latency requirements for a search product serving tens of millions of daily users.
  • Erran Berger says search required fine-tuning and distillation because large models were too compute-intensive and slow for production at LinkedIn scale.
ANECDOTE

From Product Policy To Synthetic Data Cookbook

  • LinkedIn turned a 20–30 page product policy and a small human-labeled golden dataset into a large synthetic dataset using GPT to teach scoring rules.
  • They trained a ~7B teacher on that synthetic set, then distilled further for production.
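The recipe above amounts to pairing policy text with a few golden examples to prompt an LLM labeler at scale. A minimal sketch of that prompt-construction step, with entirely hypothetical function and field names (the episode does not describe LinkedIn's actual prompt format):

```python
# Hypothetical sketch: turn a policy excerpt plus human-labeled "golden"
# examples into a few-shot labeling prompt for an LLM such as GPT-4.
# All names and fields here are illustrative assumptions, not LinkedIn's.

def build_labeling_prompt(policy_excerpt, golden_examples, candidate):
    """Compose a few-shot prompt asking the model to score a candidate
    result the way the written product policy prescribes."""
    shots = "\n\n".join(
        f"Query: {ex['query']}\nResult: {ex['result']}\nScore: {ex['score']}"
        for ex in golden_examples
    )
    return (
        f"Scoring policy:\n{policy_excerpt}\n\n"
        f"Scored examples:\n{shots}\n\n"
        f"Query: {candidate['query']}\nResult: {candidate['result']}\nScore:"
    )

prompt = build_labeling_prompt(
    "Prefer results whose current title matches the query intent.",
    [{"query": "ml engineer", "result": "Senior ML Engineer at Acme", "score": 3}],
    {"query": "data scientist", "result": "Barista at Beanline"},
)
```

Running prompts like this over unlabeled candidates yields the large synthetic training set; the small human-labeled golden set doubles as a held-out check on the labeler's agreement with humans.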
INSIGHT

Multi Stage Distillation For Efficiency

  • Distillation used staged compression: 7B teacher → 1.7B intermediate → 0.6B student to balance training efficiency and quality.
  • Erran explains intermediate models speed iterative student training while minimizing quality loss.
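Each stage of that 7B → 1.7B → 0.6B chain can use the standard soft-label distillation objective, where the previous stage's model plays teacher. A minimal dependency-free sketch of that loss (the episode doesn't specify LinkedIn's exact objective or temperature):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened probability distribution over raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions --
    the classic soft-label objective applied at each distillation stage.
    A higher temperature exposes more of the teacher's ranking over
    non-top classes, which is where much of the signal lives."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

In the staged setup, the 7B teacher's logits supervise the 1.7B intermediate, and the 1.7B model's logits then supervise the 0.6B student, so each student trains against a teacher only a few times its size.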