
267: Achieving Data Science Maturity
Super Data Science: ML & AI Podcast with Jon Krohn
How to Track Variations in Python Models
GitHub is great for code. But once you start trying to track the data, you need to augment the Git-like versioning system with data. If you're doing a hyperparameter search over, say, 20 combinations of hyperparameters, they're all going to be in the same file that's checked in to Git. So that's where you need a way to extract the pieces from your code that lead to variations in your model and track them separately. And the final one is also the environment, particularly because a lot of data science is in Python: were you using pandas version A.B.C or, you know, D.E.F? That will end up making a big difference on whether you can reproduce
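
As a rough illustration of the idea in this clip, here is a minimal sketch of recording the three things a code commit alone does not capture: the hyperparameters of each run, a fingerprint of the data, and the environment. It is not a tool mentioned in the episode. The names `run_record`, `data_fingerprint`, and `package_version`, the two-parameter grid, and the in-memory stand-in dataset are all hypothetical, chosen only so the example yields 20 combinations and runs without extra dependencies.

```python
import hashlib
import itertools
import json
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def data_fingerprint(raw_bytes):
    """Hash the training data so a run can be tied to an exact dataset."""
    return hashlib.sha256(raw_bytes).hexdigest()


def run_record(params, raw_bytes, packages=("pandas",)):
    """Bundle hyperparameters, data hash, and environment for one run."""
    return {
        "params": params,
        "data_sha256": data_fingerprint(raw_bytes),
        "environment": {
            "python": platform.python_version(),
            "packages": {name: package_version(name) for name in packages},
        },
    }


# Hypothetical search space: 5 x 4 = 20 combinations.
grid = {
    "learning_rate": [0.001, 0.003, 0.01, 0.03, 0.1],
    "max_depth": [2, 4, 6, 8],
}
# Stand-in for the bytes of a real dataset file (assumption for the sketch).
data = b"feature,label\n1.0,0\n2.0,1\n"

# One record per combination, kept separate from the code that Git tracks.
records = [
    run_record(dict(zip(grid, combo)), data)
    for combo in itertools.product(*grid.values())
]

# One JSON line per run; these could be appended to a log file or
# sent to an experiment tracker instead of being held in memory.
lines = [json.dumps(record, sort_keys=True) for record in records]
print(len(lines))
```

Each record then identifies a model variation by its parameters, its data hash, and the library versions in use, so two runs from the same commit can still be told apart.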
Play episode from 17:56


