
The Bioinformatics CRO Podcast
Manos Metzakopian - CellCodex and AI-ready datasets
Sep 30, 2025
In this engaging discussion, Manos Metzakopian, Co-founder and CEO of CellCodex, delves into creating AI-ready biological datasets. He highlights the critical need for reproducible perturbation data to advance drug discovery and causal modeling in biology. Manos explains how AI will accelerate decision-making in target selection and disease modeling. He also addresses the importance of single-cell multi-omics and emphasizes the balance between biological plausibility and predictive accuracy in data generation. His insights into fostering collaborations and ensuring strict quality control throughout the process are invaluable.
AI Snips
Three Core Design Decisions
- The first design decisions are the cell models, the perturbation types, and the readouts, chosen to balance scale against utility.
- These choices determine experiment scale and whether datasets suit foundation or task-specific models.
Both Foundation And Task Models Matter
- Both foundation and task-specific models matter and require different dataset breadth and curation.
- Foundation models need vast, diverse data; task-specific models need precise, context-specific datasets.
Benchmark By Replication And Validation
- Benchmark models by replication and external validation rather than only by predictive metrics.
- Require that perturbation effects replicate and that model outputs validate experimentally.
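As a rough illustration of the replication requirement, the sketch below compares per-gene effect sizes from two replicates of the same perturbation and flags whether they agree. It is a minimal example under stated assumptions, not the method described in the episode: the function names, the gene labels, the effect values, and the 0.7 correlation threshold are all hypothetical, and Pearson correlation is only one of several reasonable agreement measures.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no information about agreement
    return cov / sqrt(var_x * var_y)


def effect_replicates(effects_a, effects_b, min_r=0.7):
    """Check whether two replicates show the same perturbation effect.

    effects_a / effects_b map gene name -> effect size (e.g. log fold change).
    min_r is an assumed, illustrative threshold, not a published standard.
    Returns (passed, r), computed over genes measured in both replicates.
    """
    shared = sorted(set(effects_a) & set(effects_b))
    r = pearson([effects_a[g] for g in shared], [effects_b[g] for g in shared])
    return r >= min_r, r


# Hypothetical effect sizes for one perturbation measured in two replicates.
replicate_1 = {"GENE_A": -1.2, "GENE_B": 0.8, "GENE_C": 0.1, "GENE_D": -0.4}
replicate_2 = {"GENE_A": -1.0, "GENE_B": 0.9, "GENE_C": 0.0, "GENE_D": -0.5}

passed, r = effect_replicates(replicate_1, replicate_2)
print(f"replicates agree: {passed} (r = {r:.2f})")
```

The same check could be pointed at a model's predicted effects versus an independent experimental measurement, which is the external-validation half of the snip.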
