MLOps.community

Demetrios
Jul 29, 2022 • 52min

Just Fetch the Data and then... // David Bayliss // Coffee Sessions #110

MLOps Coffee Sessions #110 with David Bayliss, Chief Data Scientist of LexisNexis Risk Solutions, Just Fetch the Data and then... co-hosted by Vishnu Rachakonda.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Composing data to extract features can be a significant problem. Key factors are the data size, compliance restrictions, and real-time data. Ethics (and law) can drive extremely complex audit requirements. In the cloud, you can do anything - at a price.

// Bio
One of the creators of the world's first big data platform (HPCC), David has been tackling big data problems for two decades. A mathematician, compiler writer, and data sponge with more than five dozen patents spanning platforms, linking, and search.
Most inventors think outside the box; David can't even remember where the box is. He leads the team that creates their core Data Science methods used by hundreds of data scientists.

// MLOps Jobs board
// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Interesting insight in this post. It would be cool to learn from David about his view on things:
https://www.google.com/url?q=https://www.linkedin.com/posts/david-bayliss-426556a_datascience-platform-portability-activity-6913448643303759872-2dqq?utm_source%3Dlinkedin_share%26utm_medium%3Dmember_desktop_web&sa=D&source=calendar&ust=1649078059106132&usg=AOvVaw26wAevExeEfW_AdZSA8UhF

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with David on LinkedIn: https://www.linkedin.com/in/david-bayliss-426556a/

Timestamps:
[00:00] Introduction to David Bayliss
[01:03] Takeaways
[04:56] LexisNexis and David's role
[07:15] Evolution of LexisNexis in 20 years with so many use cases
[08:51] Role of David in structuring data for working with data change
[14:32] Data management and data access
[17:45] Unique challenges of scale, use case, and diversity at LexisNexis
[24:47] Tardis Iron Box
[30:05] Iron Box translation
[32:56] JVM for data science
[34:24] Iron Box meaning
[36:52] Metadata with PII
[39:08] Detrimental privacy / Hairy Kneecap Theory
[40:57] Speeding things up and Anonymized linking
[46:47] What kept David working at LexisNexis?
[50:30] Wrap up
Jul 23, 2022 • 1h 12min

Why You Need More Than Airflow // Ketan Umare // Coffee Sessions #109

Ketan Umare, Co-founder and CEO of Union.ai, shares insights from his extensive experience at Lyft, Oracle, and Amazon. He discusses the limitations of Airflow in machine learning, emphasizing the need for ML-specific orchestration tools. The conversation covers the complexities of data pipelines, the importance of effective feature management, and the challenges of model drift. Ketan also highlights cloud-native solutions, security in modern engineering, and innovative programming collaborations, all while offering book recommendations that tie historical lessons to today's tech landscape.
Jul 19, 2022 • 1h 6min

ML Flow vs Kubeflow 2022 // Byron Allen // Coffee Sessions #108

MLOps Coffee Sessions #108 with Byron Allen, AI & ML Practice Lead at Contino, ML Flow vs Kubeflow 2022 co-hosted by George Pearse.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
The amazing Byron Allen talks to us about why MLflow and Kubeflow are not playing the same game! MLflow vs Kubeflow is more like comparing apples to oranges or, as he likes to put it, they are both cheese, but one is an all-rounder and the other a high-class delicacy. This can be quite deceiving when analyzing the two. We do a deep dive into the functionalities of both and the pros/cons they have to offer.

// Bio
Byron wears several hats: AI & ML practice lead, solutions architect, ML engineer, data engineer, data scientist, Google Cloud Authorized Trainer, and scrum master. He has a track record of successfully advising on and delivering data science platforms and projects. Byron has a mix of technical capability, business acumen, and communication skills that make him an effective leader, team player, and technology advocate.
Read Byron's writing at https://medium.com/@byron.allen

// MLOps Jobs board
jobs.mlops.community
// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with George on LinkedIn: https://www.linkedin.com/in/george-pearse-b7a76a157/?originalSubdomain=uk
Connect with Byron on LinkedIn: https://www.linkedin.com/in/byronaallen/

Timestamps:
[00:00] Introduction to Byron Allen
[01:10] Introduction to the new co-host, George Pearse
[01:41] ML Flow vs Kubeflow
[05:40] George's take on ML Flow and Kubeflow
[07:28] Writing in YAML
[09:47] Developer experience
[13:38] Changes in ML Flow and Kubeflow
[17:58] Messing around ML Flow Serving
[20:00] A taste of Kubeflow through K-Serve
[23:18] Managed service of Kubeflow
[25:15] How George used Kubeflow
[27:45] Getting the Managed Service
[31:30] Getting Authentication
[32:41] ML Flow docs vs Kubeflow docs
[36:59] Kubeflow community incentives
[42:25] MLOps Search term
[42:52] Organizational problem
[43:50] Final thoughts on ML Flow and Kubeflow
[49:19] Bonus
[49:35] Entity-Centric Modeling
[52:11] Semantic Layer options
[57:27] Semantic Layer with Machine Learning
[58:40] Satellite Infra Images demo
[1:00:49] Motivation to move away from SQL
[1:03:00] Managing SQL
[1:05:24] Wrap up
Jul 11, 2022 • 59min

Why and When to Use Kubeflow for MLOps // Ryan Russon // Coffee Sessions #107

MLOps Coffee Sessions #107 with Ryan Russon, Manager, MLOps and Data Science of Maven Wave Partners, Why and When to Use Kubeflow for MLOps, co-hosted by Mihail Eric.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Kubeflow is an excellent platform if your team is already leveraging Kubernetes, and it allows for a truly collaborative experience.
Let’s take a deep dive into the pros and cons of using Kubeflow in your MLOps.

// Bio
From serving as an officer in the US Navy to consulting for some of America's largest corporations, Ryan has found his passion in the enablement of Data Science workloads for companies and teams.
Having spent years as a data scientist, Ryan understands the types of challenges that DS teams face in scaling, tracking, and efficiently running their workloads.

// MLOps Jobs board
jobs.mlops.community
// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
https://www.mavenwave.com/
https://go.mlops.community/hFApDb

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Ryan on LinkedIn: https://www.linkedin.com/in/ryanrusson/

Timestamps:
[00:00] Introduction to Ryan Russon
[01:13] Takeaways
[04:17] Bullish on KubeFlow!
[06:23] KubeFlow in ML tooling
[11:47] Kubeflow is having its velocity
[14:16] To Kubeflow or not to Kubeflow
[18:25] KubeFlow ecosystem maturity
[20:51] Alternatively, starting from scratch?
[23:11] Argo workflow vs KubeFlow pipelines
[25:08] KubeFlow as an end-state for citizen data scientists
[28:24] End-to-end workflow key players
[31:17] K-serve
[33:41] KubeFlow on orchestrators
[36:24] Natural transition to KubeFlow maturity
[41:33] "Don't forget about the engineer cost."
[42:21] KubeFlow to other "Flow brothers" trade-offs
[46:12] Biggest MLOps challenge
[49:52] Best practices around file structure
[52:15] KubeFlow changes over the years and what to expect moving forward
[55:52] Best-of-breed vision
[57:54] Wrap up
Jul 5, 2022 • 54min

Building a Culture of Experimentation to Speed Up Data-Driven Value // Delina Ivanova // MLOps Coffee Sessions #106

MLOps Coffee Sessions #106 with Delina Ivanova, Associate Director, Data of HelloFresh, Building a Culture of Experimentation to Speed Up Data-Driven Value, co-hosted by Vishnu Rachakonda.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Supply chain/manufacturing is a prime area where the use of data science/analytics/ML is underdeveloped, and experimentation is required to collect data and enable data-driven solutions.
This talk encourages companies to conduct experiments and collect data over time in order to build accurate/scalable data-driven solutions.

// Bio
Delina has over 10 years of experience across data and analytics, consulting, and strategy with roles spanning financial services, public sector, and CPG industries. She is currently the Associate Director, Data & Insights at HelloFresh Canada, where she leads a full-service data team, including data engineering, data science, business intelligence, and automation.
She is also a Data Science and Machine Learning instructor in the professional development programs at the University of Toronto and the University of Waterloo.

// MLOps Jobs board
jobs.mlops.community
// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
The Discourses of Epictetus book: https://www.amazon.com/Discourses-Epictetus/dp/1537427180
The Pyramid Principle: Logic in Writing and Thinking book by Barbara Minto: https://www.amazon.com/Pyramid-Principle-Logic-Writing-Thinking/dp/0273710516

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Delina on LinkedIn: https://www.linkedin.com/in/delina-ivanova/

Timestamps:
[00:00] Introduction to Delina Ivanova
[00:35] Takeaways
[03:46] Looking for People to organize local Meetups!
[04:30] Delina's career trajectories and growth in the corporate schema
[10:02] Telling stories with data
[13:23] Tricks for being a translator from the business side to data teams
[15:32] Technical engineering management and Delina's day-to-day role
[20:40] Giving up day-to-day individual contributing work and coding
[23:33] Good leadership for technical work
[31:05] Growing team growing productivity
[32:55] Pressured to grow
[35:23] HelloFresh
[39:39] Challenges of e-commerce, CPG, Logistics, and grocery combined
[41:08] Cultural differences
[46:04] Rapid-fire session
[52:20] Wrap up
Jul 1, 2022 • 1h 6min

Cleanlab: Labeled Datasets that Correct Themselves Automatically // Curtis Northcutt // MLOps Coffee Sessions #105

In this episode, Curtis Northcutt, CEO & Co-Founder of Cleanlab, discusses the importance of data-centric AI and the challenges of addressing noisy data. They also delve into the journey of Cleanlab in improving data labeling accuracy, the success of the startup in finding and correcting bad data, and the frustrations of bug smashing. Additionally, they explore the challenges of understanding the value and capabilities of AI tools and companies, as well as the hiring opportunities in DevRel and front-end engineering.
Jun 24, 2022 • 52min

MLOps + BI? // Maxime Beauchemin // MLOps Coffee Sessions #104

MLOps Coffee Sessions #104 with the creator of Apache Airflow and Apache Superset, Maxime Beauchemin, Future of BI co-hosted by Vishnu Rachakonda.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Maxime is one of the most influential thought leaders on the future of data engineering and how to drive business value from data with better data infrastructure and tooling. He’s worked in a number of the most cutting-edge data engineering environments. His articles have been read by thousands, and his creations (Superset and Airflow) power billions of market value. It’s no exaggeration to say that Maxime is one of the most essential contributors to the data and machine learning revolution underway.

// Bio
Maxime Beauchemin is the founder and CEO of Preset and the original creator of Apache Superset.
Max has worked at the leading edge of data and analytics his entire career, helping shape the discipline in influential roles at data-dependent companies like Yahoo!, Lyft, Airbnb, Facebook, and Ubisoft.

// MLOps Jobs board
jobs.mlops.community
// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Website: https://www.rungalileo.io/
Trade-Off: Why Some Things Catch On, and Others Don't book by Kevin Maney: https://www.amazon.com/Trade-Off-Some-Things-Catch-Others/dp/0385525958

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Max on LinkedIn: https://www.linkedin.com/in/maximebeauchemin/

Timestamps:
[00:00] Introduction to Maxime Beauchemin
[01:28] Takeaways
[03:42] Paradigm of data warehouse
[06:38] Entity-centric data modeling
[11:33] Metadata for metadata
[14:24] Problem of data organization for a rapidly scaling organization
[18:36] Machine Learning tooling as a subset or of its own
[22:28] Airflow: The unsung hero of the data scientists
[27:15] Analyzing Airflow
[30:44] Disrupting the field
[34:45] Solutions to the ladder problem of empowering exploratory work and mortals' superpowers with data
[38:04] What to watch out for when building for data scientists
[41:47] Rapid-fire questions
[51:12] Wrap up
Jun 17, 2022 • 1h 5min

Making MLFlow // Lead MLFlow Maintainer Corey Zumar // MLOps Coffee Sessions #103

MLOps Coffee Sessions #103 with Corey Zumar, MLOps Podcast on Making MLflow co-hosted by Mihail Eric.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Because MLOps is a broad ecosystem of rapidly evolving tools and techniques, it creates several requirements and challenges for platform developers:
- To serve the needs of many practitioners and organizations, it's important for MLOps platforms to support a variety of tools in the ecosystem. This necessitates extra scrutiny when designing APIs, as well as rigorous testing strategies to ensure compatibility.
- Extensibility to new tools and frameworks is a must, but it's important not to sacrifice maintainability. MLflow Plugins (https://www.mlflow.org/docs/latest/plugins.html) is a great example of striking this balance.
- Open source is a great space for MLOps platforms to flourish. MLflow's growth has been heavily aided by: 1. meaningful feedback from a community of ML practitioners with a wide range of use cases and workflows, and 2. collaboration with industry experts from a variety of organizations to co-develop APIs that are becoming standards in the MLOps space.

// Bio
Corey Zumar is a software engineer at Databricks, where he’s spent the last four years working on machine learning infrastructure and APIs for the machine learning lifecycle, including model management and production deployment. Corey is an active developer of MLflow.
He holds a master’s degree in computer science from UC Berkeley.

// MLOps Jobs board
jobs.mlops.community
// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Corey on LinkedIn: https://www.linkedin.com/in/corey-zumar/

Timestamps:
[00:00] Origin story of MLFlow
[02:12] Spark as a big player
[03:12] Key insights
[04:42] Core abstractions and principles of MLFlow's success
[07:08] Product development with open-source
[09:29] Fine line between competing principles
[11:53] Shameless way to pursue collaboration
[12:24] Right go-to-market open-source
[16:27] Vanity metrics
[18:57] First gate of MLOps drug
[22:11] Project fundamentals
[24:29] Through the pillars
[26:14] Best in breed or one tool to rule them all
[29:16] MLOps space is mature with the MLOps tool
[30:49] Ultimate vision for MLFlow
[33:56] Alignment of end-users and business values
[38:11] Adding a project abstraction separate from the current ML project
[42:03] Implementing bigger bets in certain directions
[44:54] Log in features to the experiment page
[45:46] Challenge when operationalizing MLFlow in their stack
[48:34] What would you work on if it weren't MLFlow?
[49:52] Something to put on top of MLFlow
[51:42] Proxy metric
[52:39] Feature Stores and MLFlow
[54:33] Lightning round
[57:36] Wrap up
Jun 10, 2022 • 52min

Fixing Your ML Data Blind Spots // Yash Sheth // MLOps Coffee Sessions #102

MLOps Coffee Sessions #102 with Yash Sheth, Fixing Your ML Data Blindspots, co-hosted by Adam Sroka.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Improving your dataset quality is absolutely critical for effective ML. Finding errors in your datasets is generally a slow, iterative, and painstaking process.
Data scientists should be proactively fixing their models’ blind spots by improving their training data. In this talk, Yash discusses how Galileo helps data scientists identify, fix, and track data across the entire ML workflow.

// Bio
Co-founder and VP of Engineering. Prior to starting Galileo, Yash spent the last decade working on Automatic Speech Recognition (ASR) at Google, leading their core speech recognition platform team, which powers speech-to-text across 20+ products at Google in over 80 languages, along with thousands of businesses through their Cloud Speech API.

// MLOps Jobs board
jobs.mlops.community
// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Website: https://www.rungalileo.io/
Trade-Off: Why Some Things Catch On, and Others Don't book by Kevin Maney: https://www.amazon.com/Trade-Off-Some-Things-Catch-Others/dp/0385525958

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Adam on LinkedIn: https://www.linkedin.com/in/aesroka/
Connect with Yash on LinkedIn: https://www.linkedin.com/in/yash-sheth-72111216/

Timestamps:
[00:00] Introduction to Yash Sheth
[02:53] Takeaways
[04:35] Why unstructured data?
[06:59] Fitting in the workflow
[10:56] Digging into the different pains
[18:23] Vision around the democratization of machine learning
[24:31] Unstructured data problem
[25:49] Galileo handling unified tools
[27:21] Calculus for ML
[28:45] Gatekeep
[29:49] Synthetic data in the unstructured data world of Galileo
[33:10] Tips for data scientists who have unstructured data but a small data set
[35:00] Benefits of users from Galileo
[37:15] Business case for dummies
[42:36] War stories
[44:49] Rapid-fire questions
[50:55] Wrap up
Jun 3, 2022 • 59min

Declarative Machine Learning Systems: Big Tech Level ML Without a Big Tech Team // Piero Molino // MLOps Coffee Sessions #101

MLOps Coffee Sessions #101 with Piero Molino, Declarative Machine Learning Systems: Big Tech Level ML Without a Big Tech Team, co-hosted by Vishnu Rachakonda.

// Abstract
Declarative Machine Learning Systems are the next step in the evolution of Machine Learning infrastructure.
With such systems, organizations can marry the flexibility of low-level APIs with the simplicity of AutoML.
Companies adopting such systems can increase the speed of machine learning development, reaching the quality and scalability that only big tech companies could achieve until now, without the need for a team of several thousand people.
Predibase is the turnkey solution for adopting declarative ML systems at an enterprise scale.

// Bio
Piero Molino is CEO and co-founder of Predibase, a company redefining ML tooling. Most recently, he has been a Staff Research Scientist at Stanford University, working on Machine Learning systems and algorithms in Prof. Chris Ré's Hazy group. Piero completed a Ph.D. in Question Answering at the University of Bari, Italy. He founded QuestionCube, a startup that built a framework for semantic search and QA. He worked for Yahoo Labs in Barcelona on learning to rank, IBM Watson in New York on natural language processing with deep learning, and then joined Geometric Intelligence, where he worked on grounded language understanding.
After Uber acquired Geometric Intelligence, Piero became one of the founding members of Uber AI Labs. At Uber, he worked on research topics including Dialogue Systems, Language Generation, Graph Representation Learning, Computer Vision, Reinforcement Learning, and Meta-Learning. He also worked on several deployed systems like COTA, an ML and NLP model for Customer Support, Dialogue Systems for driver's hands-free dispatch, the Uber Eats Recommender System with graph learning and collusion detection.
He is the author of Ludwig, a Linux-Foundation-backed open source declarative deep learning framework.

// MLOps Jobs board
jobs.mlops.community
// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Website: http://w4nderlu.st
http://ludwig.ai
https://medium.com/ludwig-ai
Declarative Machine Learning Systems paper by Piero Molino, Christopher Ré: https://cacm.acm.org/magazines/2022/1/257445-declarative-machine-learning-systems/fulltext
Slip of the Keyboard by Sir Terry Pratchett: https://www.terrypratchettbooks.com/books/a-slip-of-the-keyboard/
The Listening Society book series by Hanzi Freinacht: https://www.amazon.com/Listening-Society-Metamodern-Politics-Guides-ebook/dp/B074MKQ4LR

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Piero on LinkedIn: https://www.linkedin.com/in/pieromolino/?locale=en_US

Timestamps:
[00:00] Introduction to Piero Molino
[01:09] Takeaways
[02:52] Blogpost ideas of Demetrios and Vishnu
[03:31] MLOps Swag/Merch
[04:37] What does Predibase do?
[07:40] Valuable paradigm of configuration over code
[10:31] Predibase for ML business outcome
[12:50] Query language to apply and configure models on top of data
[13:17] Query meaning in Predibase
[16:43] Training phase
[19:20] Predibase Pequel System
[20:30] Building Predibase?
[22:52] Perception of one configuration is the right way to do things
[26:10] Predibase edges and limits
[30:09] Strong opinions about Predibase
[32:56] Open-sourcing Ludwig
[35:47] Future of work in the context of Predibase
[40:27] Broadening skill sets
[44:38] Declarative Machine Learning Systems paper
[49:49] Lightning round
[57:26] Predibase is hiring!
[57:49] Wrap up
