
684: Get More Language Context out of your LLM

Super Data Science: ML & AI Podcast with Jon Krohn


Exploring Open Source Large Language Models and the Solution of Flash Attention

This chapter highlights the benefits of open-source large language models over recent GPT architectures, focusing on their smaller size and parameter efficiency, which make fine-tuning feasible on a single GPU. It also addresses their limited context windows compared to GPT-4 and discusses Flash Attention as a solution to the quadratic scaling of self-attention in LLMs.
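The quadratic-scaling problem the chapter mentions comes from naive self-attention materializing an n-by-n score matrix for a sequence of n tokens. The sketch below (an illustrative NumPy implementation, not the actual Flash Attention kernel, which instead computes attention in tiles without ever storing the full matrix) shows where that quadratic term appears:

```python
import numpy as np

def self_attention(Q, K, V):
    """Naive scaled dot-product self-attention.

    Materializes the full (n x n) score matrix, which is why memory
    and compute grow quadratically with sequence length n. Flash
    Attention avoids this by processing the scores in blocks so the
    full matrix is never held in GPU memory at once.
    """
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                # shape (n, n): the quadratic term
    scores -= scores.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over each row
    return weights @ V                           # shape (n, d)

# Doubling the context length quadruples the score-matrix size:
for n in (1024, 2048):
    print(n, "tokens ->", n * n, "score entries")
```

For a 2,048-token context the score matrix already holds over four million entries per attention head, which is why longer context windows are so costly for standard attention.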

