Super Data Science: ML & AI Podcast with Jon Krohn

692: Lossless LLM Weight Compression: Run Huge Models on a Single GPU

Jun 30, 2023
Discover the SpQR approach to lossless LLM weight compression, which makes it possible to run large models on a single GPU. Learn about QLoRA, which combines low-rank adaptation with quantization for better performance.