
Data Engineering Podcast
Beyond the PDF: Rowan Cockett on Reproducible, Composable Science
Mar 22, 2026
Rowan Cockett, co-founder and CEO of Curvenote and the Continuous Science Foundation, builds tools for reproducible, composable scientific research. He talks about fixing PDF-bound workflows, cloud-optimized formats like Zarr, Jupyter-based interactive articles, graceful degradation of interactive elements, storage partnerships that avoid hosting huge datasets, and the Open Exchange Architecture push for interoperable scientific components.
Reproducibility Means Integrity And Reuse
- Reproducibility requires both data integrity and practical reuse, not just access to files.
- Rowan Cockett frames trust in datasets and hands-on reuse as twin levers that speed scientific progress when combined.
PDFs Break For Large Computational Data
- Modern science often deals with terabytes and complex pipelines while publication formats remain PDF-bound.
- Cockett highlights the resulting mismatch: static screenshots stand in for interactive exploration of large microscopy and geoscience datasets.
Adopt Cloud Optimized Data Formats
- Use open data standards and cloud-optimized formats like Zarr to make large scientific datasets accessible.
- Cockett gives the example of zooming into 1.5 TB of microscopy data, Google Maps style, by storing the data as tiles in cloud buckets alongside metadata.
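
The tiled, zoomable access pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not Zarr's actual API and not Cockett's implementation: the function names, the 512 x 512 tile size, the power-of-two resolution pyramid, and the "level/row.col" key layout are all assumptions made for the example (the key layout is only loosely modeled on chunk keys in cloud-optimized array stores). It shows how a viewer could decide which tiles to request for a given viewport instead of downloading the whole dataset.

```python
from math import ceil


def pick_level(view_width_px, screen_width_px, max_level):
    """Pick the coarsest pyramid level that still fills the screen.

    Level 0 is full resolution; each higher level halves the width.
    (Hypothetical helper; the power-of-two pyramid is an assumption.)
    """
    level = 0
    # Step to a coarser level while the downsampled view still has
    # at least as many pixels across as the screen can display.
    while level < max_level and view_width_px / 2 ** (level + 1) >= screen_width_px:
        level += 1
    return level


def tiles_for_viewport(y0, y1, x0, x1, chunk=(512, 512), level=0):
    """Return tile keys covering the half-open window [y0, y1) x [x0, x1).

    Coordinates are full-resolution pixels. The "level/row.col" key
    format is an assumed layout for this sketch.
    """
    scale = 2 ** level
    # Map the full-resolution window onto this level's pixel grid.
    ly0, lx0 = y0 // scale, x0 // scale
    ly1, lx1 = ceil(y1 / scale), ceil(x1 / scale)
    cy, cx = chunk
    # Tile indices that intersect the window, inclusive of the last one.
    rows = range(ly0 // cy, (ly1 - 1) // cy + 1)
    cols = range(lx0 // cx, (lx1 - 1) // cx + 1)
    return [f"{level}/{r}.{c}" for r in rows for c in cols]


# Zoomed out: a 65536-pixel-wide view shown on a 1024-pixel-wide screen.
overview_level = pick_level(65536, 1024, max_level=8)
overview_tiles = tiles_for_viewport(0, 65536, 0, 65536, level=overview_level)

# Zoomed in: a 1024-pixel-wide view at full resolution.
detail_level = pick_level(1024, 1024, max_level=8)
detail_tiles = tiles_for_viewport(0, 1024, 0, 1024, level=detail_level)

print(overview_level, overview_tiles)
print(detail_level, detail_tiles)
```

In this sketch both the overview and the detail view resolve to four tiles, which is the point of the design: the amount of data fetched depends on the screen size, not on the total size of the dataset.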
