
The Infra Pod: Coding agents need infra to apply code changes! (Chat with Tejas from Morph)
Feb 9, 2026 Tejas Bhakta, CEO and co-founder of Morph, builds ultra-fast file-edit APIs and subagent infra for coding agents. He explains how Morph hits 10,000 tokens/sec with speculative decoding. The chat covers fast apply vs search-and-replace, subagent architecture and SDKs, code-specific semantic search, and a vision for autonomous software that updates itself.
AI Snips
Fast Apply Uses Lazy Edits And Speculation
- Fast Apply outputs lazy edits and uses a second model to merge changes, avoiding brittle search-and-replace formats.
- Speculative decoding plus using the original code as a prior yields large speedups for edits.
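The lazy-edit idea above can be sketched in a few lines. Everything here is a hypothetical illustration: the marker string, the file contents, and the naive string merge are invented for the example (Morph's fast apply uses a second model to perform the merge, not string matching), but it shows why lazy edits avoid the brittleness of exact search-and-replace blocks.

```python
# Hypothetical "lazy edit": the agent emits only the changed region,
# with a marker standing in for unchanged code; a merge step then
# reconstructs the full file. Real fast-apply systems do this merge
# with a model; this toy version anchors on the edited def line.

ORIGINAL = """def greet(name):
    print("Hello, " + name)

def farewell(name):
    print("Bye, " + name)
"""

LAZY_EDIT = """# ... existing code ...
def farewell(name):
    print(f"Goodbye, {name}!")
"""

MARKER = "# ... existing code ..."

def naive_merge(original: str, lazy_edit: str, marker: str = MARKER) -> str:
    """Splice the edited segment into the original file (toy version:
    the segment replaces everything from its anchor line onward)."""
    segments = [s.strip("\n") for s in lazy_edit.split(marker) if s.strip()]
    merged = original
    for seg in segments:
        anchor = seg.splitlines()[0]      # e.g. the "def farewell" line
        start = merged.index(anchor)
        merged = merged[:start] + seg + "\n"
    return merged

print(naive_merge(ORIGINAL, LAZY_EDIT))
```

Unlike a search-and-replace block, the agent never has to reproduce the old code exactly, which is where those formats tend to break.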
Speculative Decoding Powers Extreme Throughput
- Morph achieves ~10,000 tokens/sec by using the original code as a prior and speculative decoding tuned to this niche task.
- They build a task-specific inference engine and kernel tuning rather than using generic chat models.
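Why using the original code as a prior pays off can be shown with a toy model of speculative decoding. This is a character-level simulation with invented inputs, not Morph's engine: the "target model" is just the known edited text, whereas a real system verifies drafted tokens with one batched forward pass of the LLM. The point is that when most of the output repeats the input, long drafted runs are accepted per step.

```python
# Toy speculative decoding where the ORIGINAL file supplies the draft.
# Each step either accepts a whole window of drafted characters that
# match the target, or falls back to the matching prefix plus one
# corrected character (the "slow" path).

original = "def add(a, b):\n    return a + b\n"
edited   = "def add(a, b):\n    return a * b\n"   # the model's true output

def speculative_decode(draft: str, target: str, window: int = 8):
    """Return (decoded_text, decode_steps)."""
    out, steps = "", 0
    while len(out) < len(target):
        pos = len(out)
        chunk = draft[pos:pos + window]
        steps += 1
        if chunk and target.startswith(chunk, pos):
            out += chunk                      # whole drafted run accepted
        else:
            k = 0                             # accept matching prefix...
            while k < len(chunk) and target[pos + k] == chunk[k]:
                k += 1
            out += target[pos:pos + k + 1]    # ...plus one corrected char
    return out, steps

text, fast = speculative_decode(original, edited)
slow = len(edited)  # one step per character without speculation
print(fast, slow)
```

With only a one-character edit, the draft is accepted in a handful of steps instead of one step per character, which is the intuition behind the throughput numbers for edit-shaped workloads.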
Optimize For Wall-Clock Latency
- Prioritize wall-clock latency on vibe-coding platforms, because faster responses measurably increase conversion.
- Optimize for speed without sacrificing accuracy to boost user engagement and conversion rates.
