The Ruby AI Podcast

Running Self-Hosted Models with Ruby and Chris Hasinski

Dec 2, 2025
Chris Hasinski, an AI and Ruby expert with a machine learning background from UC Davis, shares practical insights into self-hosting AI models. He weighs the benefits of control and cost savings against challenges like latency, and recounts his ML journey, covering applications beyond text and fine-tuning techniques. He highlights Ruby's potential in ML, the importance of quality data, and the nuances of local model performance. He also discusses monitoring and developer experience, and shares his vision for strengthening Ruby's role in the evolving AI landscape.
INSIGHT

Ruby Is Strong For Inference, Not Distributed Training

  • Ruby can run inference and many ML tasks well, but it lacks mature distributed-training tooling comparable to PyTorch's DistributedDataParallel (DDP).
  • A practical workflow is to train in Python when necessary, then run the models inside Ruby apps for product integration.
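That train-in-Python, serve-from-Ruby workflow can be as simple as a thin HTTP client in the Ruby app talking to a locally hosted model server. A minimal sketch, assuming an Ollama server on its default port with its `/api/generate` endpoint (the host, port, and model name here are illustrative):

```ruby
require "json"
require "net/http"

# Thin client for a locally hosted model server. Endpoint and payload
# shape assume Ollama's /api/generate API; adapt for vLLM, llama.cpp, etc.
class LocalLLM
  def initialize(host: "localhost", port: 11_434, model: "llama3")
    @host  = host
    @port  = port
    @model = model
  end

  # Pure helper: build the JSON request body for a single, non-streaming prompt.
  def build_payload(prompt)
    { model: @model, prompt: prompt, stream: false }
  end

  # POST the prompt to the local server and return the generated text.
  def generate(prompt)
    http = Net::HTTP.new(@host, @port)
    response = http.post("/api/generate", JSON.dump(build_payload(prompt)),
                         "Content-Type" => "application/json")
    JSON.parse(response.body).fetch("response")
  end
end
```

From the app's point of view this is just another backing service call, so the Python training pipeline and the Ruby product code stay fully decoupled.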
ADVICE

Use LLMs To Discover Hugging Face Models

  • Don't browse Hugging Face manually; use an LLM to search and recommend models for your task.
  • Start with a suggested model from ChatGPT or Claude and test it rather than diving into obscure model names.
ADVICE

Host Models As External Services Locally

  • Replace cloud services with local inference containers (Ollama, vLLM, llama.cpp) for reproducible testing and fixed-cost hosting.
  • Treat an LLM service like Postgres or Redis: run it in Docker for local tests and CI.
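Treating the model server like Postgres or Redis might look like the following docker-compose sketch. The image tag, port, and volume path assume Ollama's defaults and are illustrative:

```yaml
# docker-compose.yml -- run a local model server alongside the usual
# backing services, so local tests and CI hit a fixed, reproducible endpoint.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: password
    ports:
      - "5432:5432"
  llm:
    image: ollama/ollama:latest    # swap for a vLLM or llama.cpp image as needed
    ports:
      - "11434:11434"              # Ollama's default API port
    volumes:
      - ollama-data:/root/.ollama  # cache pulled model weights between runs
volumes:
  ollama-data:
```

With this in place, `docker compose up` gives every developer and CI run the same model endpoint, and swapping models is a config change rather than a cloud-account change.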