
The Ruby AI Podcast: Running Self-Hosted Models with Ruby and Chris Hasinski
Dec 2, 2025
Chris Hasinski, an AI and Ruby expert with a machine learning background from UC Davis, shares insights into self-hosting AI models. He discusses the benefits of control and cost savings, along with challenges like latency. Chris recounts his ML journey, covering applications beyond text and fine-tuning techniques. He highlights Ruby's potential in ML, the importance of quality data, and the nuances of local model performance. With thoughts on monitoring and developer experience, his vision includes enhancing Ruby's role in the evolving AI landscape.
AI Snips
Ruby Is Strong For Inference, Not Distributed Training
- Ruby handles inference and many ML tasks well, but it lacks mature distributed-training tooling like PyTorch's DDP.
- A practical workflow: train in Python when necessary, then run the resulting models inside Ruby apps for product integration.
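A minimal sketch of that handoff, assuming the Python side exported the model to ONNX and the Ruby side loads it with the onnxruntime gem; the file name and tensor names here are placeholders for whatever your training script produced:

```ruby
# Turn raw model logits into probabilities (pure Ruby, no framework needed).
def softmax(logits)
  max = logits.max
  exps = logits.map { |l| Math.exp(l - max) }
  sum = exps.sum
  exps.map { |e| e / sum }
end

# "sentiment.onnx", "input", and "output" are hypothetical names; substitute
# whatever your Python export used. Guarded so the sketch runs without the file.
if File.exist?("sentiment.onnx")
  require "onnxruntime" # gem install onnxruntime

  model = OnnxRuntime::Model.new("sentiment.onnx")
  result = model.predict({ "input" => [[0.1, 0.2, 0.3]] })
  probs = softmax(result["output"].first)
  puts probs.inspect
end
```

The post-processing stays in plain Ruby, so only the model weights cross the language boundary.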
Use LLMs To Discover Hugging Face Models
- Don't browse Hugging Face manually; use an LLM to search and recommend models for your task.
- Start with a suggested model from ChatGPT or Claude and test it rather than diving into obscure model names.
Host Models As External Services Locally
- Replace cloud services with local inference containers (Ollama, vLLM, llama.cpp) for reproducible testing and fixed-cost hosting.
- Treat an LLM service like Postgres or Redis: run it in Docker for local tests and CI.
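Treated that way, a local Ollama container is just another backing service your Ruby code talks to over HTTP. A minimal sketch against Ollama's default port (11434) and its /api/generate endpoint; the model name "llama3" is an assumption, substitute whatever you have pulled:

```ruby
require "json"
require "net/http"

# Thin client for a local Ollama container, used like any other service client.
class LocalLLM
  def initialize(host: "localhost", port: 11434, model: "llama3")
    @uri = URI("http://#{host}:#{port}/api/generate")
    @model = model
  end

  # The JSON body Ollama's /api/generate expects; stream: false returns
  # the whole completion in a single response.
  def payload(prompt)
    { model: @model, prompt: prompt, stream: false }
  end

  # POST the prompt and return the generated text.
  def generate(prompt)
    res = Net::HTTP.post(@uri, JSON.dump(payload(prompt)),
                         "Content-Type" => "application/json")
    JSON.parse(res.body).fetch("response")
  end
end
```

Because the endpoint is plain HTTP on a fixed port, the same client works unchanged whether the container runs on a laptop or in CI.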
