

MLOps.community
Demetrios
Relaxed conversations around getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, vibes, etc.)
Episodes

Jul 4, 2023 • 1h
Open Source and Fast Decision Making // Rob Hirschfeld // #164
MLOps Coffee Sessions #164 with Rob Hirschfeld, Open Source and Fast Decision Making. This episode is brought to you by.

// Abstract
Rob Hirschfeld, the CEO and co-founder of RackN, discusses his extensive experience in the DevOps movement. He shares his notable achievement of coining the term "the cloud" and obtaining patents for infrastructure management and API provisioning. Rob highlights the stagnant progress in operations and the persistent challenges in security and access controls within the industry. The absence of standardization in areas such as Kubernetes and single sign-on complicates the development of robust solutions. To address these issues, Rob underscores the significance of open-source practices, automation, and version control in achieving operational independence and resilience in infrastructure management.

// Bio
Rob is the CEO and Co-founder of RackN, an Austin-based start-up that develops software to help automate data centers, which they call Digital Rebar. This platform helps connect all the different pieces and tools that people use to manage infrastructure into workflow pipelines through seamless multi-component automation across the different pieces and parts needed to bring up IT systems, platforms, and applications. Rob has a background in Scale Computing, Mechanical and Systems Engineering, and specializes in large-scale complex systems that are integrated with the physical environment. He has founded companies and been in the cloud and infrastructure space for nearly 25 years, and has done everything from building the first Clouds using ESXi betas to serving four terms on the OpenStack Foundation Board. Rob was trained as an Industrial Engineer and holds degrees from Duke University and Louisiana State University.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
https://rackn.com/
https://robhirschfeld.com/about/

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Rob on LinkedIn: https://www.linkedin.com/in/rhirschfeld/

Timestamps:
[00:00] Rob's preferred coffee
[00:17] Rob Hirschfeld's background
[01:42] Takeaways
[02:36] Please like, share, and subscribe to this channel!
[03:09] Creation of Cloud
[08:38] Changes in Cloud after 25 Years
[10:54] Pros and cons of microservices
[13:06] Secure Access Provisioning
[15:46] Parallelism with ads
[18:08] Redfish protocol
[20:21] Impact of using open source vs using a SaaS provider
[26:15] Automation
[32:39] Embrace Operational Flexibility
[35:08] Automating infrastructure inefficiently
[41:26] Legacy code and resiliency
[43:39] Collection of metadata
[45:50] RackN
[51:23] Granular Cloud Preferences
[54:35] Reframing of perceived complexity
[57:32] Generative DevOps
[58:50] Wrap up

Jun 27, 2023 • 54min
Democratizing AI // Yujian Tang // #163
MLOps Coffee Sessions #163 with Yujian Tang, Democratizing AI, co-hosted by Abi Aryan.

// Abstract
The popularity of ChatGPT has brought large language model (LLM) apps and their supporting technologies to the forefront. One of those supporting technologies is vector databases. Yujian shares how vector databases like Milvus are used in production and how they solve one of the biggest problems in LLM app building - data issues. They also discuss how Zilliz is democratizing vector databases through education, expanding access to technologies, and technical evangelism.

// Bio
Yujian Tang is a Developer Advocate at Zilliz. He has a background as a software engineer working on AutoML at Amazon. Yujian studied Computer Science, Statistics, and Neuroscience with research papers published at conferences, including IEEE Big Data. He enjoys drinking bubble tea, spending time with family, and being near water.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Yujian on LinkedIn: https://www.linkedin.com/in/yujiantang

Timestamps:
[00:00] Yujian's preferred coffee
[02:40] Takeaways
[05:14] Please share this episode with your friends!
[06:39] Vector databases trajectory
[09:00] 2 start-up companies created by Yujian
[09:39] Uninitiated Vector Databases
[12:20] Vector Databases trade-off
[14:16] Difficulties in training LLMs
[23:30] Enterprise use cases
[27:38] Process/rules not to use LLMs unless necessary
[32:14] Setting up returns
[33:13] When not to use Vector Databases
[35:30] Elasticsearch
[36:07] Generative AI apps' common pitfalls
[39:35] Knowing your data
[41:50] Milvus
[48:28] Actual Enterprise use cases
[49:32] Horror stories
[50:31] Data mesh
[51:06] GPTCache
[52:10] Shout out to the Seattle Community!
[53:44] Wrap up

Jun 20, 2023 • 45min
From Arduinos to LLMs: Exploring the Spectrum of ML // Soham Chatterjee // #162
MLOps Coffee Sessions #162 with Soham Chatterjee, From LLMs to TinyML: The Dynamic Spectrum of MLOps, co-hosted by Abi Aryan.

// Abstract
Explore the spectrum of MLOps from large language models (LLMs) to TinyML. Soham highlights the difficulties of scaling machine learning models and cautions against relying exclusively on OpenAI's API due to its limitations. Soham is particularly interested in the effective deployment of models and the integration of IoT with deep learning. He offers insights into the challenges and strategies involved in deploying models in constrained environments, such as remote areas with limited power, and utilizing small devices like the Arduino Nano.

// Bio
Soham leads the machine learning team at Sleek, where he builds tools for automated accounting and back-office management. As an electrical engineer, Soham has a passion for the intersection of machine learning and electronics, specifically TinyML/Edge Computing. He has several courses on MLOps and TinyMLOps available on Udacity and LinkedIn, with more courses in the works.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Abi on LinkedIn: https://www.linkedin.com/in/goabiaryan/
Connect with Soham on LinkedIn: https://www.linkedin.com/in/soham-chatterjee

Timestamps:
[00:00] Soham's preferred coffee
[01:49] Takeaways
[05:33] Please share this episode with
[07:02] Soham's background
[09:00] From electrical engineering to Machine Learning
[10:40] Deep learning, Edge Computing, and Quantum Computing
[11:34] TinyML
[13:29] Favorite area in the TinyML chain
[14:03] Applications explored
[16:56] Operational challenges transformation
[18:49] Building with Large Language Models
[25:44] Most Optimal Model
[26:33] LLMs path
[29:19] Prompt engineering
[33:17] Migrating infrastructures to a new product
[37:20] Your success where others failed
[38:26] API Accessibility
[39:02] Reality about LLMs
[40:39] The Compression angle adds to the bias
[43:28] Wrap up

Jun 13, 2023 • 51min
The Long Tail of ML Deployment // Tuhin Srivastava // #161
MLOps Coffee Sessions #161 with Tuhin Srivastava, The Long Tail of ML Deployment, co-hosted by Abi Aryan. This episode is brought to you by QuantumBlack.

// Abstract
Baseten is an engineer-first platform designed to alleviate the engineering burden for machine learning and data engineers. Tuhin's perspective, based on research with Stanford students, emphasizes the importance of engineers embracing the engineering aspects and considering them from a reproducibility perspective.

// Bio
Tuhin Srivastava is the co-founder and CEO of Baseten. Tuhin has spent the better part of the last decade building machine learning-powered products and is currently working on empowering engineers to build production-grade services with machine learning.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
QuantumBlack: https://www.mckinsey.com/capabilities/quantumblack/contact-us
Baseten: https://www.baseten.co/
Baseten Careers: https://www.baseten.co/careers

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Abi on LinkedIn: https://www.linkedin.com/in/goabiaryan/
Connect with Tuhin on LinkedIn: https://www.linkedin.com/in/tuhin-srivastava-60601114/

Timestamps:
[00:00] Partnership with QuantumBlack
[00:16] Nayur Khan presenting QuantumBlack
[03:35] QuantumBlack is hiring!
[03:47] Tuhin's preferred coffee
[05:03] Takeaways
[07:00] Please share this episode with a friend!
[07:12] Comments/Reviews
[08:49] Tuhin's background
[10:13] Finance and Law common complaint culture
[11:40] Doing Machine Learning in 2010 - 2011
[14:31] Gum broad or the next company shape?
[16:33] Engineers need to learn machine learning
[20:18] Software engineers need to dig deeper
[24:49] Cambrian Explosion
[27:53] The Holy Trifecta
[28:54] Objective truth and prompting
[31:23] Limitations of LLMs
[35:37] Documentation challenges
[38:25] Baseten creating valuable models
[40:37] Advocate for Microservices or API-based solution
[42:54] Learning Git pains
[44:16] Baseten backups
[48:00] Baseten is hiring!
[49:32] Wrap up

Jun 7, 2023 • 46min
Clean Code for Data Scientists // Matt Sharp // #160
MLOps Coffee Sessions #160 with Matt Sharp, Data Developer at Shopify, Clean Code for Data Scientists, co-hosted by Abi Aryan.

// Abstract
Let's delve into Shopify's real-time serving platform, Merlin, which enables features like recommender systems, inbox classification, and fraud detection. Matt shares his insights on clean coding and the new book he is writing about LLMs in production.

// Bio
Matt is a Chemical Engineer turned Data Scientist turned Data Engineer. A self-described "recovering Data Scientist", Matt got tired of all the inefficiencies he faced as a Data Scientist and made the switch to Data Engineering. At his last job, he ended up building the entire MLOps platform from scratch for a fintech startup called MX. Matt gives tips to data scientists on LinkedIn on how to level up their careers and has become known for his clean code tips in particular. Matt recently started a new job at Shopify.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Abi on LinkedIn: https://www.linkedin.com/in/goabiaryan/
Connect with Matt on LinkedIn: https://www.linkedin.com/in/matthewsharp/

Timestamps:
[00:00] Matt's preferred drink
[00:54] Takeaways
[03:04] Watch out for Matt's LLMs in Production book coming up!
[03:29] Please like, share, subscribe, and join the upcoming LLMs in Production Conference Part 2!
[05:07] Raising awareness about the fundamental problems of writing clean code
[07:57] Definition of clean code
[09:46] Communicable clean code
[13:52] Getting out of Jupyter notebooks at the end of their life
[17:21] Exploratory data analysis
[21:22] Most popular post on LinkedIn
[26:41] Zilliz Ad
[27:44] Best practices on production-level software engineering
[29:41] Merlin
[33:51] Upcoming Shopify projects
[39:10] Matt's upcoming LLMs in Production book
[45:06] LLMs in Production book Early Access
[46:00] Wrap up

May 30, 2023 • 55min
Why is MLOps Hard in an Enterprise? // Maria Vechtomova & Basak Eskili // #159
MLOps Coffee Sessions #159 with Maria Vechtomova, Lead ML Engineer, and Basak Eskili, Machine Learning Engineer, at Ahold Delhaize. Why is MLOps Hard in an Enterprise? co-hosted by Abi Aryan.

// Abstract
MLOps is particularly challenging to implement in enterprise organizations due to the complexity of the data ecosystem, the need for collaboration across multiple teams, and the lack of standardization in ML tooling and infrastructure. In addition to these challenges, at Ahold Delhaize there is a requirement for the reusability of models, as our brands seek to have similar data science products, such as personalized offers, demand forecasts, and cross-sell.

// Bio
Maria Vechtomova
Maria is a Machine Learning Engineer at Ahold Delhaize. Maria is bridging the gap between data scientists, infra, and IT teams at different brands and focuses on the standardization of machine learning operations across all the brands within Ahold Delhaize. During nine years in Data & Analytics, Maria has taken on different roles, from data scientist to machine learning engineer, was part of teams in various domains, and has built broad knowledge. Maria believes that a model only starts living when it is in production. For this reason, for the last six years her focus has been on the automation and standardization of processes related to machine learning.

Basak Eskili
Basak Eskili is a Machine Learning Engineer at Ahold Delhaize. She is working on creating new tools and infrastructure that enable data scientists to quickly operationalize algorithms. She is bridging the space between data scientists and platform engineers while improving the way of working in accordance with MLOps principles. In her previous role, she was responsible for bringing models to production. She focused on NLP projects and building data processing pipelines. Basak also implemented new solutions by using cloud services for existing applications and databases to improve time and efficiency.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
MLOps Maturity Assessment Blog: https://mlops.community/mlops-maturity-assessment/
The Minimum Set of Must-Haves for MLOps Blog: https://mlops.community/the-minimum-set-of-must-haves-for-mlops/
Traceability & Reproducibility Blog: https://mlops.community/traceability-reproducibility/

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Abi on LinkedIn: https://www.linkedin.com/in/goabiaryan/
Connect with Maria on LinkedIn: https://www.linkedin.com/in/maria-vechtomova/
Connect with Basak on LinkedIn: https://www.linkedin.com/in/ba%C5%9Fak-tu%C4%9F%C3%A7e-eskili-61511b58/

Timestamps:
[00:00] Maria & Basak's preferred coffee
[00:59] LLMs in Production Conference Part 2 coming up on June 15-16!
[02:08] Maria & Basak's background
[02:47] Takeaways
[04:52] A colorful history
[06:59] 4 levels of evolution
[08:15] Standardization and Model Registry Evolution
[11:52] Ahold Delhaize Standard task
[15:05] Ahold Delhaize Workflow
[25:19] Avoiding tooling sprawl
[28:10] Guardrails
[29:50] Secret sharing and credential sharing sloppy processes
[32:23] Distrust between DevOps engineers and data scientists
[33:29] MLOps vs DevOps
[35:31] Monitoring pieces heroes
[38:32] Future accumulative cost issues
[40:09] Exploratory phase in notebooks

May 16, 2023 • 1h 15min
Large Language Models at Cohere // Nils Reimers // #158
MLOps Coffee Sessions #158 with Nils Reimers, Large Language Models at Scale, co-hosted by Abi Aryan.

// Abstract
Large Language Models with billions of parameters have the possibility to change how we work with textual data. However, running them at scale, at potentially hundreds of millions of texts a day, is a massive challenge. Nils talks about finding the right model size for respective tasks, model distillation, and promising new ways of transferring knowledge from large to smaller models.

// Bio
Nils Reimers is highly recognized throughout the AI community for creating and maintaining the now-famous Sentence Transformers library (www.SBERT.net) used to develop, train, and use state-of-the-art LLMs. The project has 900+ stars on GitHub and 30M+ installations. Nils is currently the Director of Machine Learning at Cohere, where he leads the team that develops and trains Large Language Models (LLMs) with billions of parameters. Prior to Cohere, Nils created and led the science team for Neural Search at HuggingFace. Nils holds a Ph.D. in Computer Science from UKP in Darmstadt.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
www.SBERT.net
https://www.nils-reimers.de/

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Abi on LinkedIn: https://www.linkedin.com/in/goabiaryan/
Connect with Nils on LinkedIn: https://www.linkedin.com/in/reimersnils/

Timestamps:
[00:00] Nils' preferred coffee
[00:45] Nils' background
[01:30] Takeaways
[06:47] Subscribe to our Newsletters and IRL Meetups, and leave your reviews!
[07:32] Nils' history
[10:39] From IT Security to Machine Learning
[13:22] Tangibility of IT and Security
[14:46] NLP transition
[15:55] Bad augmentation to new capabilities of LLMs
[16:59] Nils' concern during his Ph.D.
[19:55] Making Money from Machine Learning
[22:06] Catastrophic forgetting
[26:34] Updating solutions
[28:42] Neural search space and building adaptive models
[31:23] Filtering models
[32:32] Latency issues
[36:53] Models running in parallel
[37:54] Generative models problems
[38:43] Nils' role at Cohere
[41:41] To build or not to build API
[43:00] Search models
[45:38] Large use cases
[46:43] Open source discussion within Cohere
[50:48] Competitive Edge
[55:27] Future world of API
[58:14] LLMs in Production Conference Part 2 announcement!
[1:00:17] Hopeful direction of Cohere's future
[1:02:33] Data silos
[1:04:34] Where to update the database and code
[1:05:24] Nils' focus
[1:08:49] Make money or save money
[1:10:30] Cohere's upcoming project
[1:12:37] Time spent red teaming the models
[1:14:05] Wrap up

May 12, 2023 • 25min
Data Privacy and Security // LLMs in Production Conference Panel Discussion
We are having another LLMs in Production Virtual Conference: 50+ speakers combined with in-person activities around the world on June 15 & 16.
Sign up free here:
https://home.mlops.community/home/events/llm-in-prod-part-ii-2023-06-20
// Abstract
This panel discussion is centered around a crucial topic in the tech industry - data privacy and security in the context of large language models and AI systems. The discussion highlights several key themes, such as the significance of trust in AI systems, the potential risks of hallucinations, and the differences between low- and high-affordability use cases.
The discussion promises to be thought-provoking and informative, shedding light on the latest developments and concerns in the field. We can expect to gain valuable insights into an issue that is becoming increasingly relevant in our digital world.
// Bio
Diego Oppenheimer
Diego Oppenheimer is an entrepreneur, product developer, and investor with an extensive background in all things data. Currently, he is a Partner at Factory, a venture fund specializing in AI investments, as well as interim head of product at two LLM startups. Previously he was an executive vice president at DataRobot, and Founder and CEO at Algorithmia (acquired by DataRobot), and shipped some of Microsoft's most used data analysis products, including Excel, Power BI, and SQL Server.
Diego is active in AI/ML communities as a founding member and strategic advisor for the AI Infrastructure Alliance and MLOps Community, and works with leaders to define ML industry standards and best practices. Diego holds a Bachelor's degree in Information Systems and a Master's degree in Business Intelligence and Data Analytics from Carnegie Mellon University.
Gevorg Karapetyan
Gevorg Karapetyan is the co-founder and CTO of ZERO Systems where he oversees the company's product and technology strategy. He holds a Ph.D. in Computer Science and is the author of multiple publications, including a US Patent.
Vin Vashishta
Vin is a C-level Technical Strategy Advisor and the Founder of V Squared, one of the first data science consulting firms, whose mission is to provide support and clarity for clients' complete data and AI monetization journeys.
He has over a decade in data science and a quarter century in technology, building and leading teams and delivering products with $100M+ in ARR.
Saahil Jain
Saahil Jain is an engineering manager at You.com. At You.com, Saahil builds search, ranking, and conversational AI systems. Previously, Saahil was a graduate researcher in the Stanford Machine Learning Group under Professor Andrew Ng, where he researched topics related to deep learning and natural language processing (NLP) in resource-constrained domains like healthcare. Prior to Stanford, Saahil worked as a product manager at Microsoft on Office 365. He received his B.S. and M.S. in Computer Science at Columbia University and Stanford University respectively.
Shreya Rajpal
Shreya is the creator of Guardrails AI, an open-source solution designed to establish guardrails for large language models. As a founding engineer at Predibase, she helped build the Applied ML and ML infra teams. Previously, she worked at Apple's Special Projects Group on cross-functional ML, and at Drive.ai building computer vision models.

May 9, 2023 • 50min
MLOps Build or Buy, Startup vs. Enterprise? // Aaron Maurer & Katrina Ni #157
MLOps Coffee Sessions #157 with Katrina Ni & Aaron Maurer, MLOps Build or Buy, Startup vs. Enterprise? co-hosted by Jake Noble of tecton.ai. This episode is sponsored by Tecton - check out their feature store to get your real-time ML journey started.

// Abstract
There are a bunch of challenges with building useful machine learning at a B2B software company like Slack, but we've built some cool use cases over the years, particularly around recommendations. One of the key challenges is how to train powerful models while being prudent stewards of our clients' essential business data, and how to do so while respecting the increasingly complex landscape of international data regulation.

// Bio
Katrina Ni
Katrina is a Machine Learning Engineer on Slack's ML Services Team, where they build ML platforms and integrate ML (e.g., the Recommend API and spam detection) across product functionalities. Prior to Slack, she was a Software Engineer on Tableau's Explain Data Team, where they built tools that utilize statistical models and propose possible explanations to help users inspect, uncover, and dig deeper into the viz.

Aaron Maurer
Aaron is a senior engineering manager in the infra organization at Slack, managing both the machine learning team and the real-time services team. In six years at Slack, most of which Aaron spent as an engineer, he worked on search ranking, recommendations, spam detection, performance anomaly detection, and many other ML applications. Aaron is also an advisor to Eppo, an experimentation platform. Prior to Slack, Aaron worked as a data scientist at Airbnb, earned a Master's in statistics at the University of Chicago, and helped develop econometric models projecting the Obamacare rollout at Acumen LLC.

// MLOps Jobs board
jobs.mlops.community

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Jake on LinkedIn: https://www.linkedin.com/in/jakednoble/
Connect with Katrina on LinkedIn: https://www.linkedin.com/in/katrina-ni-660b2590/
Connect with Aaron on LinkedIn: https://www.linkedin.com/in/aaron-maurer-4003b638/

Timestamps:
[00:00] Aaron and Katrina's preferred coffee
[00:41] Recommender and System and Jake
[02:06] Takeaways
[05:38] Introduction to Aaron Maurer & Katrina Ni
[06:53] Aaron Maurer & Katrina Ni's Recommend API blog post
[08:36] 10-pole machine learning use case and Rex's use case
[10:14] Genesis of Slack's recommender system framework
[11:47] The Special Sauce
[12:58] Speaking the same language
[15:23] Use case sources
[17:08] Slack's feature engineering
[17:52] Main CTR models
[18:40] Data privacy
[21:33] Slack's recommendations problem
[22:09] Fine-tuning the generative models
[23:30] Cold start problem
[26:02] Underrated
[28:24] Baseline
[28:55] Cold sore space
[30:15] LLMs in Production Conference Part 2 announcement!
[32:32] Data scientists transition to ML
[33:35] Unicorns do exist!
[34:43] Diversity of skill set
[36:02] The future of ML
[38:34] Model Serving
[40:11] MLOps Maturity level
[43:06] AWS Analogy
[45:05] Primary difficulty
[48:07] Wrap up

May 6, 2023 • 36min
Cost/Performance Optimization with LLMs [Panel]
Sign up for the next LLM in production conference here: https://go.mlops.community/LLMinprod
Watch all the talks from the first conference: https://go.mlops.community/llmconfpart1
// Abstract
In this panel discussion, the topic of the cost of running large language models (LLMs) is explored, along with potential solutions. The benefits of bringing LLMs in-house, such as latency optimization and greater control, are also discussed. The panelists explore methods such as structured pruning and knowledge distillation for optimizing LLMs. OctoML's platform is mentioned as a tool for the automatic deployment of custom models and for selecting the most appropriate hardware for them. Overall, the discussion provides insights into the challenges of managing LLMs and potential strategies for overcoming them.
// Bio
Lina Weichbrodt
Lina is a pragmatic freelancer and machine learning consultant who likes to solve business problems end-to-end and make machine learning, or a simple, fast heuristic, work in the real world.
In her spare time, Lina likes to exchange ideas with other people on how to implement best practices in machine learning; talk to her on the Machine Learning Ops Slack: shorturl.at/swxIN.
Luis Ceze
Luis Ceze is Co-Founder and CEO of OctoML, which enables businesses to seamlessly deploy ML models to production, making the most out of the hardware. OctoML is backed by Tiger Global, Addition, Amplify Partners, and Madrona Venture Group. Ceze is the Lazowska Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, where he has taught for 15 years.
Luis co-directs the Systems and Architectures for Machine Learning lab (sampl.ai), which co-authored Apache TVM, a leading open-source ML stack for performance and portability that is used in widely deployed AI applications.
Luis is also co-director of the Molecular Information Systems Lab (misl.bio), which led pioneering research in the intersection of computing and biology for IT applications such as DNA data storage. His research has been featured prominently in the media including New York Times, Popular Science, MIT Technology Review, and the Wall Street Journal. Ceze is a Venture Partner at Madrona Venture Group and leads their technical advisory board.
Jared Zoneraich
Co-Founder of PromptLayer, enabling data-driven prompt engineering. Compulsive builder. Jersey native, with a brief stint in California (UC Berkeley '20) and now residing in NYC.
Daniel Campos
Hailing from Mexico, Daniel started his NLP journey with his BS in CS from RPI. He then worked at Microsoft on ranking at Bing with LLMs (back when they had two commas) and helped build out popular datasets like MS MARCO and TREC Deep Learning. While at Microsoft, he got his MS in Computational Linguistics from the University of Washington, with a focus on curriculum learning for language models. Most recently, he has been pursuing his Ph.D. at the University of Illinois Urbana-Champaign, focusing on efficient inference for LLMs and robust dense retrieval. During his Ph.D., he worked for companies like Neural Magic, Walmart, Qualtrics, and Mendel.AI, and now works on bringing LLMs to search at Neeva.
Mario Kostelac
Mario is currently building AI-powered products at Intercom in a small, highly effective team. He roams between practical research and engineering but leans more toward engineering and the challenges around running reliable, safe, and predictable ML systems. You can imagine how much fun that is in the LLM era :).
He is generally interested in the intersection of product and tech, and in building differentiation by solving hard challenges (technical or non-technical).
Software engineer turned Machine Learning engineer 5 years ago.


