Mixture-of-Experts and Trends in Large-Scale Language Modeling with Irwan Bello - #569

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Exploring the Dynamics of Sparse Model Fine-Tuning in NLP

This chapter covers recent research on how well sparse models work in natural language processing, focusing on the challenges of pre-training and fine-tuning. It highlights the differences in the hyperparameters required, the potential loss of gains during fine-tuning, and the need to balance parameter count against computational resources for optimal performance.

Play episode from 17:31