
Mixture-of-Experts and Trends in Large-Scale Language Modeling with Irwan Bello - #569
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Innovations in Language Modeling
This chapter explores the application of a novel modeling technique, originally developed for natural language processing, to other deep learning tasks, including vision. The speakers address the complexities of implementing the mixture-of-experts approach and of integrating it with established models like BERT or GPT. They also highlight the potential of sparse models and retrieval systems to improve predictive capabilities and model performance, particularly in large-scale applications.


