Data Engineering Podcast

Beyond the PDF: Rowan Cockett on Reproducible, Composable Science

Mar 22, 2026
Rowan Cockett, co-founder and CEO of CurveNote and Continuous Science Foundation, builds tools for reproducible, composable scientific research. He talks about fixing PDF-bound workflows, cloud-optimized formats like Zarr, Jupyter-based interactive articles, graceful degradation of interactives, storage partnerships that avoid hosting huge datasets, and the Open Exchange Architecture push for interoperable scientific components.
INSIGHT

Reproducibility Means Integrity And Reuse

  • Reproducibility requires both data integrity and practical reuse, not just access to files.
  • Rowan Cockett frames trust in datasets and hands-on reuse as twin levers that speed scientific progress when combined.
INSIGHT

PDFs Break For Large Computational Data

  • Modern science often deals with terabytes and complex pipelines while publication formats remain PDF-bound.
  • Cockett highlights the mismatch: static screenshots stand in for interactive exploration of large microscopy and geoscience datasets.
ADVICE

Adopt Cloud-Optimized Data Formats

  • Use open data standards and cloud-optimized formats like Zarr to make large scientific datasets accessible.
  • Cockett gives the example of zooming into 1.5 TB of microscopy data the way you would pan and zoom in Google Maps, made possible by storing the data as tiles in cloud buckets alongside metadata.
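
To make the Google Maps comparison concrete, here is a minimal sketch of the arithmetic behind tiled (chunked) storage: a viewer only needs the chunks that overlap the region on screen, not the whole dataset. This is not Zarr's API and not a description of Curvenote's implementation. The function name `chunks_for_region`, the image size, and the chunk size are all hypothetical values chosen for illustration.

```python
from itertools import product


def chunks_for_region(region, chunk_shape):
    """Return the grid indices of every chunk that overlaps a region.

    region: one (start, stop) pair per axis, with stop exclusive.
    chunk_shape: the chunk size along each axis.
    """
    per_axis = []
    for (start, stop), size in zip(region, chunk_shape):
        if stop <= start:
            return []  # an empty region touches no chunks
        first = start // size        # chunk holding the first pixel
        last = (stop - 1) // size    # chunk holding the last pixel
        per_axis.append(range(first, last + 1))
    return list(product(*per_axis))


# Hypothetical 2-D image of 100,000 x 100,000 pixels in 1024 x 1024 chunks.
image_shape = (100_000, 100_000)
chunk_shape = (1024, 1024)

# A 2000 x 2000 pixel viewport somewhere in the middle of the image.
viewport = ((50_000, 52_000), (50_000, 52_000))
needed = chunks_for_region(viewport, chunk_shape)

# Total number of chunks in the image (ceiling division per axis).
total_chunks = 1
for dim, size in zip(image_shape, chunk_shape):
    total_chunks *= -(-dim // size)

print(f"fetch {len(needed)} of {total_chunks} chunks")
```

Under these assumed sizes the viewport touches only a handful of chunks out of thousands, which is why each chunk can be requested individually from a cloud bucket as the user pans and zooms. The metadata stored next to the chunks is what tells the reader the array shape and chunk shape needed for this calculation.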