
The Information Bottleneck EP27: Medical Foundation Models - with Tanishq Abraham (Sophont.AI)
Mar 2, 2026
Tanishq Abraham, CEO and co-founder of Sophont.ai, builds multimodal foundation models for pathology, neuroimaging, and clinical text. He discusses training on high-quality public data that rivals massive private datasets. The conversation covers finding signals doctors can’t see, fusing strong single-modality encoders into multimodal systems, regulatory paths, and practical near-term impacts such as pharma partnerships.
AI Snips
Train Encoders Then Align Latent Spaces
- Build strong single-modality encoders first, then perform late-fusion alignment to create multimodal medical models.
- Train each encoder with self-supervised methods and later align latent spaces with modest multimodal data.
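The second stage of this recipe (aligning latent spaces with modest paired data) can be sketched in a minimal, illustrative way. The example below is not Sophont's actual method: it uses synthetic stand-ins for the outputs of two already-trained, frozen single-modality encoders, and aligns them with an orthogonal Procrustes fit — one simple choice among many alignment techniques (contrastive objectives are another common option).

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 is assumed done: each modality has its own frozen encoder.
# Here we fabricate paired embeddings from two latent spaces.
n_pairs, dim = 200, 32
z_image = rng.normal(size=(n_pairs, dim))  # e.g. pathology-slide embeddings
true_map = np.linalg.qr(rng.normal(size=(dim, dim)))[0]  # hidden orthogonal map
# Paired clinical-text embeddings: rotated image embeddings plus small noise.
z_text = z_image @ true_map + 0.01 * rng.normal(size=(n_pairs, dim))

# Stage 2: align latent spaces using the modest set of paired samples.
# Orthogonal Procrustes: W = argmin_W ||z_image @ W - z_text||_F, W orthogonal,
# solved in closed form via the SVD of the cross-covariance matrix.
u, _, vt = np.linalg.svd(z_image.T @ z_text)
w = u @ vt

aligned = z_image @ w
err = np.linalg.norm(aligned - z_text) / np.linalg.norm(z_text)
print(f"relative alignment error: {err:.3f}")
```

With only a few hundred paired samples the closed-form map recovers the shared structure well, which is the point of the snip: strong frozen encoders make the multimodal step cheap in paired data.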
12K Public Slides Matched Million-Slide Models
- Sophont trained a pathology foundation model on 12,000 public slides and matched performance of models trained on millions of private slides.
- Tanishq uses this result to argue that data quality and curation can beat raw quantity of private data in medical imaging.
Balance Domain Expertise With Scaling
- Effective medical models need both domain expertise and compute; pure heuristics from clinicians or pure compute from ML engineers both fail.
- Sophont positions itself at the intersection of domain knowledge and scaling to build useful systems.
