Super Data Science: ML & AI Podcast with Jon Krohn

826: In Case You Missed It in September 2024

Oct 11, 2024
Julia Silge, Engineering Manager at Posit, shares insights on the development of Positron, an IDE designed for data scientists' coding needs. Luca Antiga offers tips on improving machine learning models in PyTorch, stressing the balance between model and data. Marco Gorelli discusses Polars, an open-source library that significantly speeds up data manipulation compared to pandas. Marck Vaisman highlights essential traits to look for when hiring data scientists, advocating for practical skills over traditional qualifications.
ADVICE

Model Optimization

  • Optimize models by considering both model-centric and data-centric approaches.
  • Don't just focus on model adjustments; evaluate and improve data quality.

ADVICE

Model Complexity

  • Start with the simplest model that could work, and add complexity only when testing shows it is needed.
  • Don't overcomplicate models initially; establish a baseline for comparison.
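
One way to establish such a baseline is to measure how well a trivial rule performs before training anything more elaborate. The sketch below is only illustrative: the labels, the predictions, and the helper names (`majority_baseline_accuracy`, `accuracy`) are hypothetical and do not come from the episode. It compares a stand-in model's accuracy against always predicting the most common training label.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for y in test_labels if y == majority_label)
    return majority_label, hits / len(test_labels)


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical toy data, invented for illustration only.
train_labels = ["cat", "cat", "cat", "dog", "cat", "dog"]
test_labels = ["cat", "dog", "cat", "cat"]
# Stand-in for the output of a real (possibly complex) model.
model_predictions = ["cat", "dog", "dog", "cat"]

baseline_label, baseline_acc = majority_baseline_accuracy(train_labels, test_labels)
model_acc = accuracy(model_predictions, test_labels)

print(f"baseline (always '{baseline_label}'): {baseline_acc:.2f}")
print(f"model: {model_acc:.2f}")
```

On this toy data the stand-in model only matches the trivial baseline, which is exactly the kind of result that is invisible without a baseline to compare against.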

ANECDOTE

Mislabeled Data

  • Luca's team achieved high accuracy with a custom vision model but hit a ceiling.
  • They found 10% of their data was mislabeled, highlighting the impact of data quality.
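
The snip does not say how the mislabeled examples were found. As one possible approach, and purely as an assumption, the sketch below flags examples where a model assigns very low probability to the label the example was given, so that a person can review them. The function name `flag_suspect_labels`, the threshold, and the data are all hypothetical; in practice the probabilities would ideally come from a model that did not train on the example being scored (for instance via cross-validation).

```python
def flag_suspect_labels(given_labels, predicted_probs, threshold=0.2):
    """Return indices where the model gives the assigned label a
    probability below `threshold` (candidates for manual review)."""
    suspects = []
    for i, (label, probs) in enumerate(zip(given_labels, predicted_probs)):
        if probs.get(label, 0.0) < threshold:
            suspects.append(i)
    return suspects


# Hypothetical labels and per-class probabilities, invented for illustration.
given_labels = ["cat", "dog", "cat", "dog", "cat"]
predicted_probs = [
    {"cat": 0.95, "dog": 0.05},
    {"cat": 0.10, "dog": 0.90},
    {"cat": 0.08, "dog": 0.92},  # labeled "cat", but the model strongly says "dog"
    {"cat": 0.40, "dog": 0.60},
    {"cat": 0.85, "dog": 0.15},
]

suspects = flag_suspect_labels(given_labels, predicted_probs)
suspect_fraction = len(suspects) / len(given_labels)

print("indices to review:", suspects)
print(f"fraction flagged: {suspect_fraction:.0%}")
```

A flagged example is not necessarily mislabeled; the model itself may be wrong, so the output is a review queue rather than an automatic correction.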