The Infra Pod

Coding agents need infra to apply code changes! (Chat with Tejas from Morph)

Feb 9, 2026
Tejas Bhakta, CEO and co-founder of Morph, builds ultra-fast file-edit APIs and subagent infra for coding agents. He explains how Morph hits 10,000 tokens/sec with speculative decoding. The chat covers fast apply vs search-and-replace, subagent architecture and SDKs, code-specific semantic search, and a vision for autonomous software that updates itself.
INSIGHT

Fast Apply Uses Lazy Edits And Speculation

  • With Fast Apply, the coding model emits lazy edits (only the changed lines, with the unchanged code elided) and a second model merges them into the file, avoiding brittle search-and-replace formats.
  • Speculative decoding plus using the original code as a prior yields large speedups for edits.
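To make the contrast concrete, here is a minimal sketch of the two edit formats. It is not Morph's implementation: the marker string, the function names, and the line-matching merge are all hypothetical, and the deterministic `merge_lazy_edit` is only a toy stand-in for the second model that performs the merge in the real system.

```python
# Hypothetical elision marker; the real lazy-edit format may use something else.
MARKER = "# ... existing code ..."

ORIGINAL = '''def area(w, h):
    return w * h

def perimeter(w, h):
    return 2 * (w + h)

def describe(w, h):
    return f"{w}x{h}"'''

# A lazy edit: only the changed region is spelled out, the rest is elided.
LAZY_EDIT = '''# ... existing code ...
def perimeter(w, h):
    """Return the perimeter of a rectangle."""
    return 2 * (w + h)
# ... existing code ...'''


def search_and_replace(source, search, replace):
    """Exact-match edit: fails unless `search` is a verbatim substring."""
    if search not in source:
        raise ValueError("search block not found verbatim")
    return source.replace(search, replace, 1)


def merge_lazy_edit(original, edit, marker=MARKER):
    """Toy merge (assumption: stands in for a merge model).

    A marker line copies original lines up to the next explicit edit line.
    Explicit lines are emitted as written; when one also appears in the
    original, the read position jumps past it. Raises ValueError if the
    line after a marker cannot be found in the original.
    """
    orig = original.splitlines()
    lines = edit.splitlines()
    out, pos = [], 0
    for i, line in enumerate(lines):
        if line.strip() == marker:
            nxt = lines[i + 1] if i + 1 < len(lines) else None
            end = len(orig) if nxt is None else orig.index(nxt, pos)
            out.extend(orig[pos:end])
            pos = end
        else:
            out.append(line)
            if line.strip() and line in orig[pos:]:
                pos = orig.index(line, pos) + 1
    return "\n".join(out)


merged = merge_lazy_edit(ORIGINAL, LAZY_EDIT)
print(merged)
```

A search-and-replace edit whose search block differs from the file by a single space (for example `def perimeter(w,h):`) raises here, which is the brittleness the snip refers to. The toy merge breaks on harder cases, such as a replaced line followed directly by a marker, which is part of why a learned merge is attractive.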
INSIGHT

Speculative Decoding Powers Extreme Throughput

  • Morph achieves ~10,000 tokens/sec by using the original code as a prior and speculative decoding tuned to this niche task.
  • They build a task-specific inference engine with tuned kernels rather than relying on generic chat models.
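The idea of using the original code as a prior can be sketched as greedy draft-and-verify decoding, where the unedited file supplies the draft tokens. This is a toy under stated assumptions, not Morph's engine: `make_toy_target`, `speculative_apply`, the chunk size `k`, and the re-sync heuristic are all hypothetical, and the "model" is a lookup into a fixed target sequence.

```python
from typing import Callable, List, Optional, Tuple

EOS = None  # sentinel the toy target returns when it has nothing left to emit


def make_toy_target(final: List[str]) -> Callable[[List[str]], Optional[str]]:
    """Hypothetical stand-in for the expensive model: it always 'wants' `final`."""
    def target_next(prefix: List[str]) -> Optional[str]:
        return final[len(prefix)] if len(prefix) < len(final) else EOS
    return target_next


def speculative_apply(target_next, draft: List[str], k: int = 4) -> Tuple[List[str], int]:
    """Greedy draft-and-verify loop with the original file as the draft.

    Each loop iteration models ONE verification pass. A real engine would
    score all k proposed positions in a single batched forward pass; this
    toy calls the target once per position and counts the batch as one pass.
    """
    out: List[str] = []
    passes = 0
    d = 0  # read position in the draft (the original code)
    while True:
        passes += 1
        # Accept draft tokens for as long as the target agrees with them.
        for tok in draft[d:d + k]:
            if target_next(out) != tok:
                break
            out.append(tok)
            d += 1
        # The same pass also yields the target's own next token.
        nxt = target_next(out)
        if nxt is EOS:
            return out, passes
        out.append(nxt)
        # Re-sync heuristic (assumption): if that token occurs later in the
        # draft, skip ahead past it; otherwise treat it as an insertion.
        if nxt in draft[d:]:
            d = draft.index(nxt, d) + 1


original = ["a", "b", "c", "d", "e", "f", "g", "h"]
edited = ["a", "b", "c", "d", "X", "e", "f", "g", "h"]  # one token inserted

out, passes = speculative_apply(make_toy_target(edited), original, k=4)
print(out == edited, passes)
```

Because every emitted token is checked against the target, the output always matches what the target alone would produce; the draft only changes how many passes are needed. Since most of an edited file is unchanged, most draft tokens are accepted, which is where the speedup comes from.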
ADVICE

Optimize For Wall‑Clock Latency

  • Prioritize wall-clock latency on vibe-coding platforms, because faster responses increase conversion.
  • Cut latency without sacrificing accuracy; the speedup is what lifts user engagement and conversion rates.