The Information Bottleneck

EP27: Medical Foundation Models - with Tanishq Abraham (Sophont.AI)

Mar 2, 2026
Tanishq Abraham, CEO and co-founder of Sophont.ai, builds multimodal foundation models for pathology, neuroimaging, and clinical text. He discusses how training on high-quality, curated public data can rival models trained on massive private datasets. The conversation covers finding signals doctors can’t see, fusing strong single-modality encoders into multimodal systems, regulatory paths, and practical near-term impacts like pharma partnerships.
ADVICE

Train Encoders Then Align Latent Spaces

  • Build strong single-modality encoders first, then perform late-fusion alignment to create multimodal medical models.
  • Train each encoder with self-supervised methods and later align latent spaces with modest multimodal data.
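The late-fusion recipe above can be sketched as follows. This is a minimal illustration, not Sophont's actual method: it assumes frozen single-modality embeddings (stand-ins for pretrained encoders), small trainable projection heads, and a CLIP-style symmetric contrastive loss as the alignment objective. All names, dimensions, and the choice of loss are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen single-modality embeddings (stand-ins for pretrained encoder
# outputs on a batch of 8 paired samples).
img_emb = rng.standard_normal((8, 64))   # e.g. pathology-slide encoder
txt_emb = rng.standard_normal((8, 96))   # e.g. clinical-text encoder

# Trainable projection heads map both modalities into one shared latent space.
W_img = rng.standard_normal((64, 32)) * 0.1
W_txt = rng.standard_normal((96, 32)) * 0.1

def normalize(z):
    """L2-normalize each row so dot products are cosine similarities."""
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def clip_loss(za, zb, temperature=0.07):
    """Symmetric InfoNCE: row i of za is paired with row i of zb;
    all other rows in the batch serve as negatives."""
    logits = (za @ zb.T) / temperature
    n = len(za)

    def ce(l):
        # Cross-entropy of each row against its matching index,
        # computed with a stable log-softmax.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    return 0.5 * (ce(logits) + ce(logits.T))

za = normalize(img_emb @ W_img)
zb = normalize(txt_emb @ W_txt)
loss = clip_loss(za, zb)  # minimize w.r.t. W_img, W_txt with modest paired data
```

In practice the projection heads (and optionally the encoders) would be trained by gradient descent on this loss over paired multimodal data; the point of the recipe is that the expensive self-supervised pretraining happens per modality, while alignment needs comparatively little paired data.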
ANECDOTE

12K Public Slides Matched Million-Slide Models

  • Sophont trained a pathology foundation model on 12,000 public slides and matched performance of models trained on millions of private slides.
  • Tanishq uses this to argue data quality and curation can beat raw private quantity in medical imaging.
INSIGHT

Balance Domain Expertise With Scaling

  • Effective medical models need both domain expertise and compute; pure heuristics from clinicians or pure compute from ML engineers both fail.
  • Sophont positions itself between domain knowledge and scaling to build useful systems.