
The Ruby AI Podcast: Running Self-Hosted Models with Ruby and Chris Hasinski
Dec 2, 2025
Chris Hasinski, an AI and Ruby expert with a machine learning background from UC Davis, shares insights into self-hosting AI models. He discusses the benefits of control and cost savings, along with challenges like latency. Chris recounts his ML journey, covering applications beyond text and fine-tuning techniques. He highlights Ruby's potential in ML, the importance of quality data, and the nuances of local model performance. With thoughts on monitoring and developer experience, his vision includes enhancing Ruby's role in the evolving AI landscape.
AI Snips
Ruby Is Strong For Inference, Not Distributed Training
- Ruby handles inference and many ML tasks well, but it lacks mature distributed-training tooling like PyTorch's DDP.
- A practical workflow: train in Python when necessary, then run the resulting models inside Ruby apps for product integration.
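A minimal sketch of that handoff, assuming the Python side exported the model to ONNX and the Ruby side loads it with the onnxruntime gem; the file name and tensor names here are placeholders for whatever your training script produced:

```ruby
# Turn raw model logits into probabilities (pure Ruby, no framework needed).
def softmax(logits)
  max = logits.max
  exps = logits.map { |l| Math.exp(l - max) }
  sum = exps.sum
  exps.map { |e| e / sum }
end

# "sentiment.onnx", "input", and "output" are hypothetical names; substitute
# whatever your Python export used. Guarded so the sketch runs without the file.
if File.exist?("sentiment.onnx")
  require "onnxruntime" # gem install onnxruntime

  model = OnnxRuntime::Model.new("sentiment.onnx")
  result = model.predict({ "input" => [[0.1, 0.2, 0.3]] })
  probs = softmax(result["output"].first)
  puts probs.inspect
end
```

The post-processing stays in plain Ruby, so only the model weights cross the language boundary.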
Use LLMs To Discover Hugging Face Models
- Don't browse Hugging Face manually; use an LLM to search and recommend models for your task.
- Start with a suggested model from ChatGPT or Claude and test it rather than diving into obscure model names.
Host Models As External Services Locally
- Replace cloud services with local inference containers (Ollama, vLLM, llama.cpp) for reproducible testing and fixed-cost hosting.
- Treat an LLM service like Postgres or Redis: run it in Docker for local tests and CI.
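Treated that way, a local Ollama container is just another backing service your Ruby code talks to over HTTP. A minimal sketch against Ollama's default port (11434) and its /api/generate endpoint; the model name "llama3" is an assumption, substitute whatever you have pulled:

```ruby
require "json"
require "net/http"

# Thin client for a local Ollama container, used like any other service client.
class LocalLLM
  def initialize(host: "localhost", port: 11434, model: "llama3")
    @uri = URI("http://#{host}:#{port}/api/generate")
    @model = model
  end

  # The JSON body Ollama's /api/generate expects; stream: false returns
  # the whole completion in a single response.
  def payload(prompt)
    { model: @model, prompt: prompt, stream: false }
  end

  # POST the prompt and return the generated text.
  def generate(prompt)
    res = Net::HTTP.post(@uri, JSON.dump(payload(prompt)),
                         "Content-Type" => "application/json")
    JSON.parse(res.body).fetch("response")
  end
end
```

Because the endpoint is plain HTTP on a fixed port, the same client works unchanged whether the container runs on a laptop or in CI.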
