MLOps.community

Demetrios
Apr 19, 2021 • 51min

War Stories Productionising ML // Nick Masca // Coffee Session #35

Coffee Sessions #35 with Nick Masca of Marks and Spencer, War Stories Productionising ML.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
A conversation about MLOps war stories. Better said, a war story conversation. The kind that informs modern MLOps best practices.

Nick shared how to make MLOps organizational changes at large companies. I loved one tidbit he mentioned: "it's an evolution, not a revolution". That's a frank observation about the speed of practical change. As we all know, it doesn't happen overnight.

Another great learning Nick shared focused on the value of delivering incremental results regularly. Oftentimes, ML projects suffer because of a focus on delivering too much too soon. This can then lead to a trough of disappointment with the way things actually pan out. Nick shared his experience on how to avoid such pitfalls with us, so you don't have to learn the hard way.

// Bio
Nick currently serves as Head of Data Science at Marks and Spencer, a large retailer based in the UK. With a background originally in statistics, he transitioned into data science in 2014 and has picked up many battle scars and learnings since.

// Link to the MLOps War Stories
https://www.linkedin.com/posts/dpbrinkm_what-is-your-mlops-war-story-activity-6772604800971370496-LxtX

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Nick on LinkedIn: www.linkedin.com/in/nick-masca-09454956/

Timestamps:
[00:00] Introduction to Nick Masca
[01:36] Nick's background in tech
[05:01] Nick's current job
[06:19] Building the basics
[08:18] "If you can gain trust and demonstrate value early, you could also freeze yourself up to the tidy marks later."
[09:19] Strategy on long-running vision
[10:25] "Historically, the legacy waterfall processes in the business where teams have specialist responsibilities."
[11:14] KPIs
[12:36] KPI translations into action plans
[15:43] Data scientists call
[17:13] Nick's nightmarish story
[22:52] Making the case on such a nightmarish story
[25:06] Tools used by Marks and Spencer in 2015
[27:15] More complicated process
[28:08] Takeaways from experience
[30:57] Obstacles in deploying
[34:53] Simplifying models
[37:31] Combining environments into one
[38:45] "Having written standards can be quite helpful to take ownership and responsibility around that."
[40:23] M&S team interaction
[41:31] "It's an evolution, it's not a revolution, I'd say at the moment, but there's definitely real emphasis where we are to improve things and work towards goals to enable our team to work quicker, empower them."
[42:10] Team moralizing
[43:11] Takeaways from war stories
[43:30] "The biggest takeaway for me is to start small, keep things simple, try things, and it can be surprising sometimes what you'll find. Something simple can give you surprising results."
[44:35] Opinions on Data Science and Machine Learning businesses democratize and commoditize
Apr 16, 2021 • 58min

Deploying Machine Learning Models at Scale in Cloud // Vishnu Prathish // MLOps Meetup #60

MLOps community meetup #60! Last Wednesday, we talked to Vishnu Prathish, Director of Engineering, AI Products, Innovyze.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
The way Data Science is done is changing. Notebook sharing and collaboration were messy, and there was minimal visibility or QA into the model deployment process. Vishnu will talk about building an ops platform that deploys hundreds of models at scale every month: a platform that supports typical features of MLOps (CI/CD; separated QA, Dev, and PROD environments; experiment tracking; isolated retraining; real-time model monitoring; automatic retraining with live data) and ensures quality and observability without compromising the collaborative nature of data science.

// Bio
With 10 years of building production-grade data-first software at BBM & HP Labs, I started building Emagin's AI platform about three years ago with the goal of optimizing operations for the water industry. At Innovyze post-acquisition, we are part of the org building a world-leading water infrastructure data analytics product.

// Takeaways
Why is MLOps necessary for model building at scale?
What are various cloud-based models for MLOps?
Where can ops help at various points in the ML pipeline: data prep, feature engineering, model building, training, retraining, evaluation, and inference?

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vishnuprathish/

Timestamps:
[00:00] Introduction to Vishnu Prathish
[00:16] Vishnu's background
[04:18] Use cases on wooden pipes for freshwater
[04:55] Virtual representation of actual, physical, tangible assets
[06:56] Platform built by Vishnu
[08:30] Build a reliable representation of the network
[11:52] Pipeline architecture
[16:17] "MLOps is still an evolving discipline. You need to try and fail many times before you figure out what's right for you."
[17:11] Open-sourcing
[18:17] Platform for virtual twin
[20:02] Entirely Amazon SageMaker
[20:43] Data quality issues
[23:21] Reproducibility
[23:40] "Reproducibility is important for everybody. Most of the frameworks do that for you."
[25:00] Reproducibility as Innovyze's core business
[26:38] Each model is individual to each customer
[27:50] Solving reproducibility problems
[28:24] "Reproducibility applies to the process of training pipelines. It starts with collecting historical raw data from customers. In real-time, there's also this data being collected directly from sensors coming from a certain pipeline."
[31:55] "Reusable training is step one to attaining automated retraining."
[32:17] Collaboration of Vishnu's team
[36:23] War stories
[41:36] Data prediction
[44:24] "A data scientist is the most expensive hire you can make."
[47:55] 3 Tiers
[48:53] MLOps problems
[52:25] Automatically retraining
[52:34] "Because of the number of models that go through this pipeline, it's impossible for somebody to manually monitor and retrain as necessary. It's not easy, it takes a lot of time."
[54:22] Metrics on retraining
[56:42] "Retraining is a little less prevalent for our industry compared to a churn prediction model that changes a lot. There are external factors that depend on it, but a pump is a pump."
Apr 12, 2021 • 59min

Machine Learning at Atlassian // Geoff Sims // Coffee Session #34

Coffee Sessions #34 with Geoff Sims of Atlassian, Machine Learning at Atlassian.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
As one of the world's most visible software companies, Atlassian's vast data and deep product suite pose an interesting MLOps challenge, and we're grateful to Geoff for taking us behind the curtain.

// Bio
Geoff is a Principal Data Scientist at Atlassian, the software company behind Jira, Confluence & Trello. He works with the product teams and focuses on delivering smarter in-product experiences and recommendations to our millions of active users by using machine learning at scale. Prior to this, he was in the Customer Support & Success division, leveraging a range of NLP techniques to automate and scale the support function.

Prior to Atlassian, Geoff applied data science methodologies across the retail, banking, media, and renewable energy industries. He began his foray into data science as a research astrophysicist, where he studied astronomy from the coldest & driest location on Earth: Antarctica.

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Geoff on LinkedIn: https://www.linkedin.com/in/geoff-sims-0a37999b/

Timestamps:
[00:00] Introduction to Geoff Sims
[01:20] Geoff's background
[04:00] Evolution of ML Ecosystem in Atlassian
[06:50] Figure out by necessity
[08:47] Machine Learning is not priority number one and disconnected from MLOps
[11:53] Atlassian being behind or advanced?
[16:38] Serious switch of Atlassian around machine learning
[17:47] What data org did it come from?
[20:00] Consolidation of the stack
[21:21] Tooling - blessing and curse
[24:37] Tackling play out
[29:38] Staying on the same page
[30:48] Priority of needs
[31:55] How did it evolve?
[35:12] Where is Atlassian now?
[40:21] "Architecturally, Tecton is very, very similar (to ours), it was just way more mature."
[41:17] What unleashed you to do now?
[41:36] "The biggest thing is independence from a data science perspective. Less reliance and less dependence on an army of engineers to help deploy features and models."
[44:25] Have you bought other tools?
[45:43] "At any given time, there's something that's a bottleneck. Look where the bottleneck is, then fix it and move on to the next thing."
[48:20] Atlassian is bringing a model into production
[50:01] "When we undertake whatever the project is, it's days or weeks to go to a prototype rather than months or quarters."
[53:10] "Conceptually, you're struggling walking towards that place because that's the place you want to be. If that's your problem, that's good. That's the promised land."
[54:45] "Using our own tools is paramount because we are customers as well. So we see and feel the pain, which helps us identify the problems and understand them."
Apr 9, 2021 • 59min

MLOps Community 1 Year Anniversary! // Demetrios Brinkmann, David Aponte & Vishnu Rachakonda // MLOps Meetup #59

MLOps community meetup #59! Last Wednesday was the celebration of the MLOps Community's 1 Year Anniversary! This was a conversation between Demetrios Brinkmann, David Aponte, and Vishnu Rachakonda!

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Over the past year, Demetrios, David, and Vishnu have interviewed many of the top names in MLOps. During this time, they have been able to apply these learnings at their jobs and see what works for them. In this one-year anniversary meetup, the three of them will discuss some of the most impactful advice they have received in the last year and how they have put it into practice.

// Bio
Demetrios Brinkmann
At the moment, Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building LEGO houses with his daughter.

David Aponte
David is one of the organizers of the MLOps Community. He is an engineer, teacher, and lifelong student. He loves to build solutions to tough problems and share his learnings with others. He works out of NYC and loves to hike and box for fun. He enjoys meeting new people, so feel free to reach out to him!

Vishnu Rachakonda
Vishnu is the operations lead for the MLOps Community and co-hosts the MLOps Coffee Sessions podcast. He is a machine learning engineer at Tesseract Health, a 4Catalyzer company focused on retinal imaging. In this role, he builds machine learning models for clinical workflow augmentation and diagnostics in on-device and cloud use cases. Since studying bioengineering at Penn, Vishnu has been actively working in the fields of computational biomedicine and MLOps. In his spare time, Vishnu enjoys suspending all logic to watch Indian action movies, playing chess, and writing.

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/

Timestamps:
[01:07] Big shoutout to everybody that's in these meetups!
[02:03] Big shoutout to Ivan Nardini for leading the Engineering Labs and to everyone who took part in the Engineering Labs!
[02:26] Big shoutout to Charlie, you're leading the Reading Group, and to everyone who takes part in it!
[02:39] Big shoutout to everyone who takes part in the Office Hours!
[02:49] Big shoutout to the people who are helping with shaping the website!
[03:34] Thanks to all the people in Slack! Laszlo, Ariel, and people answering Slack questions.
[04:10] Big thanks to all our Sponsors FiddlerAI, Algorithmia, and Tecton!
[06:13] David's Background
[08:08] Vishnu's Background
[09:55] High-Level Points
[15:57] Starting small
[24:05] Over-optimization - the root of all evil
[26:42] Keeping text deck open
[36:45] Missing from current MLOps tooling
[48:00] How to communicate in these data products?
Apr 6, 2021 • 46min

MLOps Investments // Sarah Catanzaro // Coffee Session #33

Coffee Sessions #33 with Sarah Catanzaro of Amplify Partners, MLOps Investments.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Bio
Sarah Catanzaro is a Partner at Amplify Partners, where she focuses on investing in and advising high-potential startups in machine intelligence, data management, and distributed systems. Her investments at Amplify include startups like RunwayML, Maze Design, OctoML, and Metaphor Data, among others. Sarah also has several years of experience defining data strategy and leading data science teams at startups and in the defense/intelligence sector, including through roles at Mattermark, Palantir, Cyveillance, and the Center for Advanced Defense Studies.

// We had a wide-ranging discussion with Sarah. Three takeaways stood out:

The relationship between unstructured data and structured data is due for change. In most settings, you have some form of structured data (e.g., a metadata table) and unstructured data (e.g., images and text). Managing the relationship between these forms of data can constitute the bulk of MLOps. Because of this difficulty, Sarah forecasted new tooling arising to make data management easier.

Academic benchmarks suffer from a lack of transparency on production/industry use cases. In conversation with Andrew Ng, Sarah shared her lesson that despite all the blame industry professionals place on academics for narrowly optimizing to benchmarks with little practical meaning, they also share the blame for making it difficult to create meaningful benchmarks. Companies are loath to share realistic data and the true context in which ML has to operate.

MLOps is due for consolidation, especially as companies adopt platform-driven strategies. As many of you know, there are tons and tons of MLOps tools out there. As more companies address these challenges, Sarah predicted that many of the point solutions would start to be consolidated into larger platforms.

// Related Links
https://amplifypartners.com/team/sarah/
https://projectstoknow.amplifypartners.com/ml-and-data
https://twitter.com/sarahcat21/status/1360105479620284419

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Sarah on LinkedIn: https://www.linkedin.com/in/sarah-catanzaro-9770b98/

Timestamps:
[00:00] Introduction to Sarah Catanzaro
[02:07] Sarah's background in tech
[06:00] Staying engineer-oriented despite being an investment firm
[08:50] Tools you wished you had earlier in your career
[12:36] 2 Motives of ML Engineers and ML Platform Team
[16:36] Open-sourcing
[21:29] Startup focuses on resources
[23:57] Playout of open-source project
[27:32] Consolidation
[33:18] Finding solutions
[36:18] Evolution of the MLOps industry in the coming years
[42:36] Frameworks
[43:14] Structured data sets available to researchers. Meaningful advances in deep learning have been applied to structured data as well.
Apr 4, 2021 • 53min

Model Watching: Keeping Your Project in Production // Ben Wilson // MLOps Meetup #58

MLOps community meetup #58! Last Wednesday, we talked to Ben Wilson, Practice Lead Resident Solutions Architect, Databricks.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

Model Monitoring Deep Dive with the author of Machine Learning Engineering in Action. It was a pleasure getting to talk to Ben about difficulties in monitoring in machine learning. His expertise obviously comes from experience, and as he said a few times in the meetup, "I learned the hard way over 10 years as a data scientist, so you don't have to!"

Ben was also kind enough to give us a 35% off promo code for his book! Use the link: http://mng.bz/n2P5

// Abstract
A great deal of time is spent building out the most effectively tuned model, production-hardened code, and elegant implementation for a business problem. Shipping our precious and clever gems to production is not the end of the solution lifecycle, though, and many abandoned projects can attest to this. In this talk, we will discuss how to think about model attribution, monitoring of results, and how (and when) to report those results to the business to ensure a long-lived and healthy solution that actually solves the problem you set out to solve.

// Bio
Ben Wilson has worked as a professional data scientist for more than ten years. He currently works as a resident solutions architect at Databricks, where he focuses on machine learning production architecture with companies ranging from 5-person startups to global Fortune 100. Ben is the creator and lead developer of the Databricks Labs AutoML project, a Scala- and Python-based toolkit that simplifies machine learning feature engineering, model tuning, and pipeline-enabled modeling. He's the author of Machine Learning Engineering in Action, a primer on building, maintaining, and extending production ML projects.

// Takeaways
Understanding why attribution and performance monitoring are critical for long-term project success.
Borrowing hypothesis testing, stratification for latent confounding variable minimization, and statistical significance estimation from other fields can help to explain the value of your project to a business.
Unlike in street racing, drifting is not cool in ML, but it will happen. Being prepared to know when to intervene will help keep your project running.

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Ben on LinkedIn: www.linkedin.com/in/benjamin-wilson-arch/

Timestamps:
[00:00] Introduction to Ben Wilson
[00:11] Ben's background in tech
[03:40] Human aspect of Machine Learning in MLOps
[05:51] MLOps is an organizational problem
[09:27] Fragile Models
[12:36] Fraud Cases
[15:21] Data Monitoring
[18:37] Importance of knowing what to monitor for
[22:00] Monitoring for outliers
[24:16] Staying out of Alert Hell
[29:40] Ground Truth
[31:25] Model vs Data Drift on Ground Truth Unavailability
[34:25] Benefit to monitor system or business-level metrics
[38:20] Experiment in the beginning, not at the end
[40:30] Adaptive windowing
[42:22] Bridge the gap
[46:42] What scarred you really bad?
Mar 26, 2021 • 56min

A Missing Link in the ML Infrastructure Stack // Josh Tobin // MLOps Meetup #57

MLOps community meetup #57! Last Wednesday, we talked to Josh Tobin, Founder, Stealth-Stage Startup.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract:
Machine learning is quickly becoming a product engineering discipline. Although several new categories of infrastructure and tools have emerged to help teams turn their models into production systems, doing so is still extremely challenging for most companies. In this talk, we survey the tooling landscape and point out several parts of the machine learning lifecycle that are still underserved. We propose a new category of tool that could help alleviate these challenges and connect the fragmented production ML tooling ecosystem. We conclude by discussing similarities and differences between our proposed system and those of a few top companies.

// Bio:
Josh Tobin is the founder and CEO of a stealth machine learning startup. Previously, Josh worked as a deep learning & robotics researcher at OpenAI and as a management consultant at McKinsey. He is also the creator of Full Stack Deep Learning (fullstackdeeplearning.com), the first course focused on the emerging engineering discipline of production machine learning. Josh did his PhD in Computer Science at UC Berkeley, advised by Pieter Abbeel.

// Related Links
https://josh-tobin.com
course.fullstackdeeplearning.com

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Josh on LinkedIn: https://www.linkedin.com/in/josh-tobin-4b3b10a9/

Timestamps:
[00:00] Introduction to Josh Tobin
[01:18] Background of Josh in tech
[08:27] Were you guys behind the Rubik's Cube?
[09:26] Rubik's Cube Project
[09:51] "Research is meant to show you what's possible to solve."
[11:07] "That's one of the things that's started to change, and I think the MLOps world is maybe a part of that. What I'm excited about this is that people are focusing on the impact of their models."
[13:18] Insights on Testing
[17:11] Evaluation Store
[18:33] "Production Machine Learning is data-driven products that have predictions in the loop."
[23:40] Analyzing and moving forward
[24:02] "My medium-term mindset on how machine learning is created is that there's still gonna be humans involved, but humans will be more efficient with tools."
[25:50] Is there a market for this?
[27:40] "The long tail of machine learning use cases is becoming part of every product and service, more or less, the companies create, but it's the same way the software part of the products and services the companies create these days. It's going to create an enormous amount of value."
[30:09] Talents
[32:52] Organizational buy-ins and knowledge
[35:16] Tools used for Evaluation Store
[39:59] Difference from Monitoring Tool
[42:10] Who is the right person to interact with in the Evaluation Store?
[50:05] Technical challenges of Apple and Tesla
[53:30] "As Machine Learning use cases are getting more and more complicated, higher and higher dimensional data, bigger and bigger models, larger training sets, many companies would need in order to continually improve their systems over time."
Mar 23, 2021 • 52min

The Godfather Of MLOps // D. Sculley // MLOps Coffee Sessions #32

Coffee Sessions #32 with D. Sculley of Google, The Godfather Of MLOps.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Bio
D is currently a director in Google Brain, leading research teams working on robust, responsible, reliable, and efficient ML and AI. In his time at Google, D worked on nearly every aspect of machine learning and has led both product and research teams, including those on some of the most challenging business problems.

// Links to D. Sculley's Papers
ML Test Score: https://research.google/pubs/pub46555/
Machine Learning: The High-Interest Credit Card of Technical Debt: https://research.google/pubs/pub43146/
Google Scholar: https://scholar.google.com/citations?user=l_O64B8AAAAJ&hl=en

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with D. Sculley on LinkedIn: https://www.linkedin.com/in/d-sculley-90467310/

Timestamps:
[00:00] Introduction to D. Sculley
[00:40] The Biggest Papers were written by D for Machine Learning
[02:08] What's changed since you wrote those papers?
[02:56] "No 1, there is an MLOps community."
[04:38] Old best practices
[05:12] "The fact that there are jobs titled MLOps, this is different than it was 5 or 6 years ago."
[06:30] Machine Learning Systems then and now
[07:08] "There wasn't the level of general infrastructure that was looking to offer the large-scale integrated solutions."
[07:57] ML Test Score
[11:09] "The Test Score was really written for situations where you don't care about one prediction. You care about millions or billions of predictions per day."
[12:27] "In the end, it's not about the score. It's about the process of asking the questions, making sure that each of the important questions that you're asking yourself, you have a good answer to."
[13:04] What else is needed in the Test Score?
[14:36] Stratified testing
[17:05] Counterfactual testing
[18:34] Boundaries
[19:15] Dark ages
[20:27] How do you try in Triage?
[21:10] "Reliability is important. There are no small mistakes. If there are errors, they're going to get spotted and publicized. They're going to impact users' lives. The bar is really high, and it's worth the effort to ensure strong reliability."
[23:11] How do you build that interest stress test?
[24:39] "I believe that stress test is going to look like a useful way to encode expert knowledge about domain areas."
[25:37] How do I bring robustness?
[27:22] Underspecification Paper
[30:58] "It's important to be evaluating models on this auto domain stress test and make sure that we understand the implications of what we're thinking about while we are in deployment land."
[32:27] Principal challenges in productionizing Machine Learning
[34:57] "As we expose our models to more specifics, this means there are more potential places our models might be exhibiting unexpected or undesirable behaviour."
[42:37] Splintering of ML Engineering
[46:00] Communities shaping the MLOps sphere
[46:42] "It's much better to have one large community than three smaller communities because of those edufacts."
[47:47] Concept of technical debt in machine learning
[49:28] "The good ideas tend to make their way into the community if they are in a form that people can digest and share."
Mar 19, 2021 • 1h 5min

Operationalizing Machine Learning at a Large Financial Institution // Daniel Stahl // MLOps Meetup #56

MLOps community meetup #56! Last Wednesday, we talked to Daniel Stahl, Head of Data and Analytics Platforms, Regions Bank.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract:
The Data Science practice has evolved significantly at Regions, with a corresponding need to scale and operationalize machine learning models. Additionally, highly regulated industries such as finance require a heightened focus on reproducibility, documentation, and model controls. In this session with Daniel Stahl, we will discuss how the Regions team designed and scaled their data science platform using DevOps and MLOps practices. This has allowed Regions to meet the increased demand for machine learning while embedding controls throughout the model lifecycle. In the 2 years since the data science platform has been onboarded, 100% of data products have been successfully operationalized.

// Bio:
Daniel Stahl leads the ML platform team at Regions Bank and is responsible for tooling, data engineering, and process development to make operationalizing models easy, safe, and compliant for Data Scientists. Daniel has spent his career in financial services and has developed novel methods for computing tail risk in both credit risk and operational risk, resulting in peer-reviewed publications in the Journal of Credit Risk and the Journal of Operational Risk. Daniel has a Master's in Mathematical Finance from the University of North Carolina at Charlotte. Daniel lives in Birmingham, Alabama, with his wife and two daughters.

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Dan on LinkedIn: https://www.linkedin.com/in/daniel-stahl-6685a52a/

Timestamps:
[00:00] Introduction to Ben Wilson
[00:11] Ben's background in tech
[01:17] "How do you do what I have always done pretty well, which is being as lazy as possible in order to automate things that I hate doing. So I learned about Regression Problems."
[03:40] Human aspect of Machine Learning in MLOps
[05:51] MLOps is an organizational problem
[09:27] Fragile Models
[12:36] Fraud Cases
[15:21] Data Monitoring
[18:37] Importance of knowing what to monitor for
[22:00] Monitoring for outliers
[24:16] Staying out of Alert Hell
[29:40] Ground Truth
[31:25] Model vs Data Drift on Ground Truth Unavailability
[34:25] Benefit to monitor system or business-level metrics
[38:20] Experiment in the beginning, not at the end
[40:30] Adaptive windowing
[42:22] Bridge the gap
[46:42] What scarred you really bad?
Mar 12, 2021 • 58min

How to Avoid Suffering in MLOps/Data Engineering Role // Igor Lushchyk // MLOps Meetup #55

MLOps community meetup #55! Last Wednesday, we talked to Igor Lushchyk, Data Engineer, Adyen.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract:
Building Data Science and Machine Learning platforms at a scale-up, where the main difficulty is finding the correct processes. It is basically like being a toddler learning to walk on a steep staircase. Transitioning from homegrown platforms to open-source solutions, supporting and maturing old solutions, and making data scientists happy.

// Bio:
Igor is a software engineer with more than 10 years of experience. With a background in bioinformatics, he even started a PhD but didn't finish it. He has worked as a data engineer for the last 6 or 7 years, or maybe more, because he was doing almost the same data engineering work under differently named positions. Igor has also been doing a lot of MLOps for the last 4-5 years. He doesn't know which he has done more of, Data Engineering or MLOps. And that's how this topic came about.

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Igor on LinkedIn: https://www.linkedin.com/in/igor-lushchyk/

Timestamps:
[00:00] Introduction to Igor Lushchyk
[02:05] Igor's background in tech
[07:42] Tips you can pass on
[11:05] How these tools work, and how they play together, and what is underneath?
[13:18] Dedicated MLOps team
[13:55] Central Data Infrastructure Section
[16:57] Transfer over to open-source
[20:24] If you don't plan for production from the beginning, then it's going to be painful trying to go from POC to production.
[22:08] How do you handle data lineage?
[25:09] You chose that back in the day, but you're regretting it.
[26:34] "Try to use tools which solve 80% of your use cases, and maybe 20% you'll have the suffering, but at least it's not 100% suffering."
[27:27] Friction points
[28:53] Interaction with Data Scientists
[29:21] "We have alignment sessions. We have different levels of representation. We share our progress."
[32:42] Build versus buy decisions
[34:04] When to build or grab an open-source tool
[35:51] Build your own or buy open-source?
[37:11] Certain maturity and a certain number of engineers
[38:11] Startup to go with open-source
[40:14] Correct transition process
[40:56] "There are no other ways but to communicate with data scientists. Your team needs to have a close loop for future priorities, what to take with you, and what to leave behind."
[44:51] What to use in the monitoring piece
[45:36] Prometheus and Grafana
[48:07] Do you have automatic retriggering monitoring of Models set up?
[51:55] Hardware for on-prem model training
[52:38] "Machine Learning model prediction is a spear bomb."
[53:55] War or horror stories
[54:15] "Guys, don't do context switching!"
[55:54] "I won't say that Adyen is a company that allows you to make mistakes, but you can make mistakes."
