
Super Data Science: ML & AI Podcast with Jon Krohn 767: Open-Source LLM Libraries and Techniques, with Dr. Sebastian Raschka
Mar 19, 2024 Dr. Sebastian Raschka, Author of Machine Learning Q and AI, talks about PyTorch Lightning, LLM development opportunities, DoRA vs LoRA, and being a successful AI educator in a fascinating discussion with Jon Krohn.
Fine-Tuning Gives Faster Research Iterations Than Pre-Training
- Fine-tuning delivers much faster researcher feedback loops than pre-training, making it a pragmatic focus for many teams.
- Parameter-efficient methods like LoRA let you adapt large models in days using only millions of added parameters.
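The parameter savings behind that point can be sketched with a quick calculation. The matrix sizes below are illustrative assumptions (a 4096×4096 projection is typical of a 7B-class model, but the exact shapes vary by architecture):

```python
# Sketch: parameter savings from LoRA's low-rank update (illustrative numbers).
# A full fine-tune of one weight matrix W (d x k) updates d*k parameters;
# LoRA instead trains two small matrices A (d x r) and B (r x k),
# so only r*(d + k) parameters are added per adapted matrix.

def lora_param_count(d: int, k: int, r: int) -> tuple[int, int]:
    """Return (full fine-tune params, LoRA params) for one d x k weight matrix."""
    full = d * k
    lora = r * (d + k)
    return full, lora

# Example: a 4096 x 4096 attention projection with rank 8.
full, lora = lora_param_count(4096, 4096, r=8)
print(full, lora, full // lora)  # 16777216 65536 256 -> ~256x fewer trained params
```

Summed over all adapted layers, this is how a multi-billion-parameter model can be tuned with only millions of added parameters.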
DoRA Makes LoRA More Parameter Efficient
- LoRA approximates the weight update as a low-rank product A×B, vastly reducing the number of tunable parameters; DoRA adds weight normalization to decouple the update's magnitude from its direction.
- DoRA often matches or beats LoRA at a smaller rank (e.g., rank 4 vs. 8), roughly halving the fine-tuning parameters.
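The magnitude/direction decoupling can be sketched in a few lines of NumPy. This is a minimal illustration of the idea, not the library implementation; shapes and initialization follow the common description of DoRA (magnitude taken from the pretrained weight's column norms, LoRA factor B starting at zero):

```python
import numpy as np

# Sketch: DoRA splits the weight into a learnable magnitude vector m and a
# direction matrix V = W0 + B @ A (the LoRA-style low-rank update).
# The effective weight rescales each column of V to magnitude m, so
# magnitude and direction can be adapted independently.

rng = np.random.default_rng(0)
d, k, r = 6, 4, 2

W0 = rng.normal(size=(d, k))        # frozen pretrained weight
B = np.zeros((d, r))                # LoRA factors; B initialized to zero
A = rng.normal(size=(r, k))
m = np.linalg.norm(W0, axis=0)      # magnitude initialized from W0's column norms

V = W0 + B @ A                              # direction component
W_eff = m * V / np.linalg.norm(V, axis=0)   # rescale columns to magnitude m

# At initialization (B = 0), the effective weight equals W0 exactly.
print(np.allclose(W_eff, W0))  # True
```

During fine-tuning, m, A, and B are trained while W0 stays frozen, which is why a smaller rank can suffice: the magnitude no longer has to be absorbed into the low-rank update.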
Store Adapters Not Full Models For Client Customization
- Keep a single base model and store only the small LoRA/DoRA adapter weights per client, rather than duplicating the full model.
- This keeps storage from exploding when serving many customized variants of the same 7B base model.
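A back-of-the-envelope comparison makes the storage argument concrete. The counts below are illustrative assumptions (a 7B base model, a few million adapter parameters per client):

```python
# Sketch: storage cost of serving many customized clients from one base model,
# with full-model copies vs. per-client adapters (illustrative parameter counts).

BASE_MODEL_PARAMS = 7_000_000_000       # one 7B base model, stored once
ADAPTER_PARAMS_PER_CLIENT = 4_000_000   # e.g. a few million LoRA/DoRA params

def storage_params(num_clients: int, duplicate_full_model: bool) -> int:
    """Total parameters stored to serve num_clients customized variants."""
    if duplicate_full_model:
        # Naive approach: one full fine-tuned copy per client.
        return num_clients * BASE_MODEL_PARAMS
    # Adapter approach: one shared base plus a small adapter per client.
    return BASE_MODEL_PARAMS + num_clients * ADAPTER_PARAMS_PER_CLIENT

naive = storage_params(100, duplicate_full_model=True)
adapters = storage_params(100, duplicate_full_model=False)
print(naive // adapters)  # 94 -> roughly two orders of magnitude less storage
```

At serving time the base weights are loaded once and the relevant client's adapter is applied on top, so the saving holds for memory as well as disk.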

