
Gradient Dissent: Conversations on AI Providing Greater Access to LLMs with Brandon Duderstadt, Co-Founder and CEO of Nomic AI
Jul 27, 2023

Episode notes
Small Models Win On Cost And Privacy
- Smaller models will dominate many production roles because they offer better cost and privacy tradeoffs.
- Domain-specific models plus careful data curation beat brute-force large models for many tasks.
Fine-Tune Efficiently With QLoRA
- Use QLoRA or other adapter methods to fine-tune efficiently on a quantized base model.
- Prefer cheap adapter training before full-weight fine-tuning to cut compute costs dramatically.
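The adapter idea behind QLoRA can be sketched in plain NumPy (an illustrative sketch, not Nomic's code; the layer sizes and rank below are hypothetical): the base weight matrix `W` stays frozen, and only a low-rank update `B @ A` is trained, shrinking the trainable parameter count from `d_out * d_in` to `r * (d_in + d_out)`.

```python
import numpy as np

# LoRA-style adapter sketch (illustrative; layer sizes and rank are assumptions).
rng = np.random.default_rng(0)
d_in, d_out, r = 512, 512, 8            # hypothetical layer width and adapter rank

W = rng.standard_normal((d_out, d_in))  # frozen base weights (quantized in QLoRA)
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))                # B starts at zero, so the adapter is a no-op

def forward(x, scale=1.0):
    """Adapted layer: y = W x + scale * B (A x). Only A and B are trained."""
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0 the adapted layer exactly reproduces the frozen base layer.
assert np.allclose(forward(x), W @ x)

full_params = d_out * d_in
adapter_params = r * (d_in + d_out)
print(f"trainable: {adapter_params} vs full fine-tune: {full_params}")
```

Here the adapter trains about 3% of the layer's weights (8,192 vs 262,144), which is why adapter training is so much cheaper than full-weight fine-tuning.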
Quantization Trades Few Bits For Big Gains
- Quantization compresses model weights to lower precision (e.g., 4-bit or 8-bit) to drastically reduce memory and compute.
- Intermediate quantization levels keep most of a model's performance while enabling order-of-magnitude speedups.
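The memory/accuracy tradeoff above can be sketched with simple symmetric per-tensor quantization (an illustrative assumption; production schemes like GPTQ or NF4 use per-group scales and non-uniform bins): round each weight to an integer grid, store only the integers plus one scale, and dequantize on the fly.

```python
import numpy as np

# Symmetric per-tensor quantization sketch (illustrative, not a specific library's scheme).
def quantize(w, bits):
    qmax = 2 ** (bits - 1) - 1                  # 127 for 8-bit, 7 for 4-bit
    scale = np.abs(w).max() / qmax              # one fp scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)  # stand-in for a weight tensor

for bits in (8, 4):
    q, scale = quantize(w, bits)
    err = np.abs(dequantize(q, scale) - w).mean()
    print(f"{bits}-bit: {32 // bits}x smaller than fp32, mean abs error {err:.4f}")
```

The 8-bit version is 4x smaller than fp32 with small reconstruction error; 4-bit is 8x smaller with a larger but often tolerable error, which is the tradeoff that makes quantized local models practical.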
