Agents Hour

Mastra
Apr 6, 2026 • 24min

Anthropic Leaked Their Own Source Code, OpenAI Raised $122b, and Axios Got Hacked (This Week In AI)

Shane and Abhi bring you your weekly roundup of AI news! Claude Code's entire source code leaked via an exposed .map file in npm — 512,000 lines of TypeScript, 50K GitHub stars before DMCAs started flying. What people found: Claude Code uses ~20 tools, and there's a regex that silently logs user frustration to analytics. Same week, a CMS misconfiguration exposed a draft blog post revealing Mythos and Capybara — a new model tier above Opus described as posing "unprecedented cybersecurity risks." Fortune separately confirmed a source saying Opus 5 is "so good it poses a danger." Claude Code auto mode shipped — a classifier between constant interrupts and the skip-permissions flag. Computer use landed in Claude Code too, letting it open apps and click through UI from the CLI. Rate limits were tightened during peak hours to community backlash. A federal judge blocked the Pentagon's attempt to label Anthropic a supply chain risk. A North Korea-linked group hijacked the npm account of Axios' lead maintainer and published malicious versions that stole env variables then cleaned up after themselves. With ~100M weekly downloads and Claude Code depending on Axios, the blast radius was significant. An Anthropic researcher also demoed Claude finding a zero-day in Ghost in 90 minutes. Agents are the new hackers, and the hackers have agents too. OpenAI closed $122B at an $852B valuation. Sora is shutting down. Mistral raised $830M for an NVIDIA-powered EU data center. Redpoint's 2026 market update argues this isn't the dot-com bubble, while noting agent maturity is early, and incumbents face a structural disadvantage against AI-native startups. Rapid fire: Gemini 3.1 Flash Live, Veo 3.1 Lite, pg-micro, Cloudflare runs Kimi K2.5, OpenCode remote sandboxes, Chroma 20B search agent, Cohere open-source transcription, Linear says issue tracking is dead, Microsoft M365 council mode, Mario Zechner's "Slow the fuck down," GLM-5.1, Google Translate live in headphones. 
AI Agents Hour is a weekly livestream by Mastra CPO Shane Thomas and CTO Abhi Aiyer. Mondays 12PM Pacific.

📚 READ MORE
Claude Code leak: https://x.com/fried_rice/status/2038894956459290963
Frustration tracking: https://x.com/rahatcodes/status/2038995503141065145
Axios attack: https://x.com/mvxvvll/status/2038797094861918332
Claude zero-day: https://x.com/chiefofautism/status/2037951563931500669
OpenAI $122B: https://x.com/sawyermerritt/status/2039073153922539901
Sora shutdown: https://x.com/soraofficialapp/status/2036546752535470382
Auto mode: https://x.com/claudeai/status/2036503582166393240
Computer use in Code: https://x.com/claudeai/status/2038663014098899416
Mythos/Capybara: https://x.com/testingcatalog/status/2037394888577216617
Opus 5 danger: https://x.com/kimmonismus/status/2037461154088296748
Rate limits: https://x.com/trq212/status/2037254607001559305
Pentagon ruling: https://www.cnn.com/2026/03/26/business/anthropic-pentagon-injunction-supply-chain-risk
Mistral $830M: https://x.com/ft/status/2038531872272040374
Redpoint market update: https://www.redpoint.com/reports/2026-market-update/
Gemini 3.1 Flash Live: https://x.com/officiallogank/status/2037187750005240307
pg-micro: https://x.com/glcst/status/2037254698898432278
OpenCode sandboxes: https://x.com/jlongster/status/2036924361379037224
Linear: https://x.com/linear/status/2036502198062821842

📚 MASTRA RESOURCES
Mastra: https://mastra.ai
Mastra on X: https://x.com/mastra_ai
Mastra Discord: https://mastra.ai/community/discord
Mastra GitHub: https://github.com/mastra-ai
Learn Mastra in the world's first MCP-Based Course: https://mastra.ai/course
Principles of Building AI Agents (Book): https://mastra.ai/books/principles-of-building-ai-agents
Patterns for Building AI Agents (New Book): https://mastra.ai/books/patterns-of-building-ai-agents

WHAT IS MASTRA?
Mastra is an open-source TypeScript framework designed for building and shipping AI-powered applications and agents with minimal friction. It supports the full lifecycle of agent development—from prototype to production. You can integrate it with frontend and backend stacks (e.g., React, Next.js, Node) or run agents as standalone services. If you're a JavaScript or TypeScript developer looking to build an agentic or AI-powered product without starting from first principles, Mastra provides the scaffolding, tools, and integrations to accelerate that process.

00:00 — Claude Code source code leaked
04:30 — Axios supply chain attack
06:11 — Claude finds a zero-day in Ghost in 90 minutes
06:50 — OpenAI closes $122B round at $852B valuation
08:24 — Sora is shutting down
11:03 — Anthropic ships: auto mode & computer use in Claude Code
11:41 — Mythos & Capybara: Anthropic's next model tier leaked
14:35 — Claude rate limits tightened during peak hours
15:51 — Judge blocks Pentagon's supply chain risk label on Anthropic
16:08 — Mistral $830M & Redpoint's 2026 AI market update
20:45 — Rapid fire: Google, pg-micro, OpenCode, Chroma, Cohere & more
Mar 28, 2026 • 28min

Claude Uses Your Computer, OpenAI Buys Python Tools & The Cursor/Kimi Plot Twist (This Week In AI)

Shane and Abhi kick off with a viral quote: if your $500K engineer isn't burning $250K in tokens, something is wrong. OpenAI is acquiring Astral — the team behind uv and Ruff — joining the Codex team. OpenAI bets on Python; Anthropic bet on TypeScript with Bun. Then Cursor drama: someone found Composer 2 is powered by Kimi K2.5, Kimi confirmed it, and raised another $1B at an $18B valuation — three rounds in 90 days. Anthropic shipped Claude Code Channels (Telegram/Discord control), Cowork Dispatch (persistent agent, message from phone), and a deep dive on how they use Skills. Matt Pocock found quality drops past 100K on the 1M context window. And 52 million views on enabling Claude to use your computer — Mac only. Stripe launched MPP for agent-to-agent payments. Better Auth launched the Agent Auth Protocol. Cloudflare shipped Dynamic Workers for AI-generated code in isolates. LangChain open-sourced Deep Agents, Composio shipped 30-parallel-agent orchestration, OpenCode lost its Claude Max plugin after Anthropic sent lawyers, and Netlify and Google Stitch entered vibe coding and design. EsoLang-Bench: LLMs score 85–95% on standard benchmarks but collapse to 0–11% on esoteric languages — memorization, not reasoning. Quick hits: GPT-5.4 mini/nano, Minimax M2.7, Morph FlashCompact, AI CMO, Letta pivots to coding agents, GLM-OCR, LiteLLM supply chain attack. AI Agents Hour is a weekly livestream by Mastra CPO Shane Thomas and CTO Abhi Aiyer. Mondays 12PM Pacific. 
📚 READ MORE
$500K engineers: https://x.com/sundeep/status/2034829022082080846
OpenAI acquires Astral: https://openai.com/index/openai-to-acquire-astral/
Cursor Composer 2: https://x.com/cursor_ai/status/2034668943676244133
Composer 2 is Kimi K2.5: https://x.com/fynnso/status/2034706304875602030
Kimi confirms: https://x.com/kimi_moonshot/status/2035074972943831491
Kimi raises $1B: https://x.com/CodeByPoonam/status/2034940587942846665
Claude Code Channels: https://x.com/trq212/status/2034761016320696565
Cowork Dispatch: https://x.com/felixrieseberg/status/2034005731457044577
Anthropic Skills post: https://x.com/trq212/status/2033949937936085378
1M context quality: https://x.com/mattpocockuk/status/2034572011175907474
Claude computer use: https://x.com/claudeai/status/2034991044109184388
Stripe MPP: https://stripe.com/blog/machine-payments-protocol
Agent Auth Protocol: https://github.com/better-auth/agent-auth-protocol
Cloudflare Dynamic Workers: https://x.com/CloudflareDev/status/2034510221044736342
LangChain Deep Agents: https://x.com/hasantoxr/status/2033213054859792859
Composio Orchestrator: https://x.com/hasantoxr/status/2033999352008741376
OpenCode/Anthropic: https://x.com/thdxr/status/2034730036759339100
Netlify: https://x.com/Netlify/status/2034303709832773711
Google Stitch: https://stitch.withgoogle.com
EsoLang-Bench: https://arxiv.org/abs/2603.09678
GPT-5.4 mini: https://x.com/openai/status/2033953592424731072
Morph FlashCompact: https://x.com/morphllm/status/2033968877345116200

00:00 — If your $500K engineer isn't burning $250K in tokens, something is wrong
01:36 — OpenAI acquires Astral
02:31 — Cursor's Composer 2 is secretly Kimi K2.5
05:35 — Kimi raises another $1B
05:57 — Anthropic ships
08:00 — Opus 4.6 1M context: quality drops noticeably past 100K tokens
08:46 — Claude can now use your computer (Mac only)
11:02 — Stripe's Machine Payments Protocol
12:28 — Better Auth launches the Agent Auth Protocol
13:12 — Cloudflare Dynamic Workers & the vibe coding platform wave
14:08 — LangChain Deep Agents, Composio's 30-agent orchestrator & cloud coding agents
17:00 — OpenCode removes the Claude Max plugin
19:26 — Google Stitch & Netlify's prompt-to-project
19:59 — LLMs aren't reasoning, they're memorizing
21:09 — Quick hits: GPT-5.4 mini, Minimax M2.7, Morph FlashCompact, AI CMO
23:55 — Letta goes all-in on coding agents, GLM-OCR
24:46 — LiteLLM supply chain attack
Mar 25, 2026 • 21min

Email Broke Productivity - It's Time To Fix It (with Brett and Naveen from Micro)

Guests: Naveen Sreekandan, the technical co-founder and engineer who built Micro's graph-based architecture, and Brett Goldstein, the co-founder who drives product and demos of the unified productivity platform. They discuss transforming email into a connected workspace, then demo a daily orchestrator, CRM autofill, and integrated meeting notes, along with the underlying Mastra graph, Prism query layer, and agent/sub-agent approach.
Mar 21, 2026 • 9min

Two Lines of Code to Lock Down Your Agents - Mastra Studio Auth

Mastra Studio started as a local playground for developers to test agents and workflows without having to spin up a custom UI. But as the feature set grew, teams started asking: how do we share this with non-technical teammates? How do we control what different users can do? Ryan, an engineer at Mastra, walks through the new Mastra Studio Auth — now baked directly into Studio. Starting with simple token-based auth (two lines of config), you can lock down your Studio from the open internet. From there, RBAC lets you map roles to granular permissions — 80 auto-generated permissions derived directly from Studio's routes and handlers, controllable via wildcard patterns. Out-of-the-box providers include WorkOS, Auth0, Supabase, Firebase, and Clerk, with GitHub and others in open PRs. The team also discusses what's coming next: audit logs so you can see exactly what an agent did, why it accessed a given tool, and whether it should have. Auth for agents in production isn't magic — your tool files still need to check permissions — but Mastra handles the plumbing so you can focus on building securely.

Read more: https://mastra.ai/blog/announcing-studio-auth

AI Agents Hour is a weekly livestream hosted by Mastra CPO Shane Thomas and CTO Abhi Aiyer. Airing Mondays at 12PM Pacific on YouTube and X, the show covers breaking AI news, agent development techniques, and features interviews with industry experts building AI applications today.

📌 CHAPTERS
00:00 — Why Mastra Studio needed auth
01:22 — Token-based auth: the simplest setup
02:32 — RBAC: roles, permissions & wildcards
05:00 — Auth for agents vs auth for humans
06:41 — Think securely!
07:22 — Supported providers & what's coming next
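The idea of mapping roles to permissions through wildcard patterns, as described in this episode, can be illustrated with a minimal TypeScript sketch. The permission strings (such as `agents:read`), the role names, and the `can` helper are all hypothetical and chosen only for illustration; they are not Mastra Studio's actual permission names or API.

```typescript
// Hypothetical roles and "resource:action" permission patterns (illustration only).
type Role = "admin" | "viewer" | "operator";

const rolePermissions: Record<Role, string[]> = {
  admin: ["*"], // everything
  viewer: ["*:read"], // read access to any resource
  operator: ["agents:*", "workflows:read"], // full agent access, read-only workflows
};

// Turn a wildcard pattern into an anchored RegExp where "*" matches any run of characters.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// A role may perform an action if any of its patterns matches the requested permission.
function can(role: Role, permission: string): boolean {
  return rolePermissions[role].some((p) => patternToRegExp(p).test(permission));
}
```

A tool handler could call something like `can(user.role, "agents:execute")` before doing any work, which matches the episode's point that tool code still has to check permissions itself.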
Mar 18, 2026 • 32min

NVIDIA GTC, The Death of MCP, and AI Agents Are Hiring Humans - This Week in AI

Shane hosts this week's news from his usual studio while Abhi joins remotely from NVIDIA GTC 2026 in San Jose. Jensen Huang's keynote set the tone: NVIDIA is doubling down on AI factories, pushing 100x more token throughput, and helping bring OpenAI onto AWS infrastructure. RentAHuman is a startup that lets AI agents hire humans for physical tasks they can't do themselves.  Perplexity's CTO said internally they're moving away from MCPs toward APIs and CLIs, but Chrome 146 shipping native MCP support may have undercut that argument immediately. Anthropic had a strong week: 1M context window is now GA for Opus 4.6 and Sonnet 4.6 with no beta header required, Opus 4.6 1M is now the default model for Claude Code on Max/Team/Enterprise with no long context price premium, and the new /btw command lets you have side conversations while Claude is working. Vercel and Cloudflare reignited their ongoing drama over the just-bash fork. Ramp launched credit cards for agents and Perplexity announced Personal Computer, an always-on local agent running on a Mac mini. Developer stack coverage includes Resend's open-source CLI with 53 commands, pnpm 11's git worktrees support for multi-agent monorepos, and OpenAI pushing a full computer environment behind the Responses API. Deeper reads from Sunil Pai on generative UI post-WIMP interfaces, Elliot Arledge on the RL environment business, and Jay Scambler's Autocontext harness. Quick hits: Replit Agent 4, Manus Desktop, NemoClaw from NVIDIA, llmock by CopilotKit, ContextKing raising to kill vector DBs, Google Maps getting Gemini, and Z.ai's GLM-5-Turbo optimized for Claude Code. AI Agents Hour is a weekly livestream hosted by Mastra CPO Shane Thomas and CTO Abhi Aiyer. Airing Mondays at 12PM Pacific on YouTube and X. 
📚 READ MORE
RentAHuman: https://x.com/polymarket/status/2032470045217939723
Perplexity CTO moving away from MCPs: https://x.com/morganlinton/status/2031795683897077965
Chrome 146 native MCP support: https://x.com/xpasky/status/2032252486145253865
Claude 1M context window GA: https://x.com/claudeai/status/2032509548297343196
Opus 4.6 1M default for Claude Code: https://x.com/alexalbert__/status/2032522722551689363
Claude Code /btw command: https://x.com/trq212/status/2031506296697131352
Vercel CTO on just-bash fork: https://x.com/cramforce/status/2033285112478171373
Guillermo Rauch on Cloudflare: https://x.com/rauchg/status/2033291143715455458
Ramp Agent Cards: https://x.com/i/trending/2031832827063648342
Perplexity Personal Computer: https://x.com/perplexity_ai/status/2031790180521427166
Resend CLI: https://x.com/zenorocha/status/2032459310341800314
pnpm 11 git worktrees: https://pnpm.io/11.x/git-worktrees
OpenAI Responses API computer environment: https://openai.com/index/equip-responses-api-computer-environment/
Sunil Pai — After WIMP: https://sunilpai.dev/posts/after-wimp/
Elliot Arledge — The RL Environment Business: https://x.com/elliotarledge/status/2032753593535574433
Autocontext — Jay Scambler: https://x.com/JayScambler/status/2032508829959868690
Kimi Attention Residuals: https://x.com/Kimi_Moonshot/status/2033378587878072424
Replit Agent 4: https://x.com/amasad/status/2031755113694679094
Manus Desktop: https://x.com/ManusAI/status/2033558672152854712
NemoClaw — NVIDIA: https://nemoclaw.so
llmock by CopilotKit: https://llmock.copilotkit.dev
ContextKing — killing vector DBs: https://x.com/contextkingceo/status/2032098309029220456
Google Maps biggest upgrade: https://x.com/google/status/2032079594191261938
Z.ai GLM-5-Turbo: https://x.com/Zai_org/status/2033221428640674015

00:00 — NVIDIA GTC 2026
04:16 — RentAHuman
07:56 — Is MCP dead?
13:08 — Anthropic ships
14:59 — Vercel vs Cloudflare: the just-bash fork drama
18:59 — Ramp Agent Cards & Perplexity Personal Computer
20:52 — Resend CLI, pnpm worktrees, OpenAI Responses API
23:04 — Developer insights
Mar 13, 2026 • 25min

Meta Acquires Moltbook, OpenAI Releases GPT-5.4, TypeScript Is #1 on GitHub (This Week In AI)

A lot happened in eight days. Meta acquired Moltbook, a social network built entirely for AI agents, not humans. OpenAI dropped GPT-5.4 Thinking and GPT-5.4 Pro, Codex got forks for multi-agent workflows and Windows support, and there are rumblings of OpenAI building a GitHub alternative. Anthropic fired back hard: multi-agent PR code review for Claude Code, while loops via /loop, the Claude Marketplace, and a way to pull your context from other AI tools. Plus: voice mode for CLI coding is apparently real, and people are using it. This episode also covers the explosion of coding agents: Theo's T3 Code, OpenAI's Symphony orchestration layer, OpenCode workspaces, and swyx's thesis that this is the Year of the Subagent. Donald Knuth is making headlines for being impressed by Opus 4.6 solving a long-standing math conjecture. TypeScript is overtaking Python and JavaScript on GitHub. Gemini 3.1 Flash-Lite drops. AMI raises $1B. OpenClaw is getting government-backed adoption in China. Developer insights: Stanford's paper on RAG breaking at 10K documents, Karpathy's autoresearch project, and Justin Poehnelt on why your CLI needs to be rewritten for agents. Plus: Raycast Glaze, Google Workspace CLI, Copilot Cowork, Exa Deep, Expo Agent, and a discussion on whether humans should be reviewing code at all. AI Agents Hour is a weekly livestream hosted by Mastra CPO Shane Thomas and CTO Abhi Aiyer. Airing Mondays at 12PM Pacific on YouTube and X, the show covers breaking AI news, agent development techniques, and features interviews with industry experts building AI applications today.
CHAPTERS
00:00 — Meta acquires Moltbook
01:41 — OpenAI updates
05:47 — Claude fights back
09:05 — The coding agent explosion
11:28 — Donald Knuth
12:15 — TypeScript #1 on GitHub
13:05 — AI products and platforms
15:02 — Model releases and funding
15:50 — OpenClaw goes mainstream in China
16:43 — Developer insights: RAG collapse, Karpathy's autoresearch & rewriting CLIs for agents
20:29 — Quick hits & is it time to kill the code review?
Mar 11, 2026 • 15min

The Biggest Threat to AI Agents (with Ismail Pelaseyed)

Ismail Pelaseyed from Superagent is back on Agents Hour, and this time he's talking about something most builders aren't thinking about yet — supply chain attacks on AI agents. Guardrails protect against what you tell your agent to do. But what about everything your agent reads, fetches, and installs on its own? That's the gap Brin is built to fill. Brin is a free, open-source credit score for agent context. Before your agent acts on an external package, MCP server, skill, or web page, Brin scores it — identity, behavior, and content — and returns a verdict in under 10ms. No signup, no auth, one GET request. Ismail walks through how supply chain attacks actually work in production, the three-tier scoring model behind Brin, how the Cline NPM incident illustrates exactly this problem, and why securing the context — not the agent — is the right mental model.

AI Agents Hour is a weekly livestream hosted by Mastra CPO Shane Thomas and CTO Abhi Aiyer. Airing Mondays at 12PM Pacific on YouTube and X, the show covers breaking AI news, agent development techniques, and features interviews with industry experts building AI applications today.

🔗 CHECK OUT BRIN
Brin: https://brin.sh/
Brin docs: https://brin.sh/docs
Brin GitHub: https://github.com/superagent-ai/brin
Superagent: https://superagent.sh
Superagent on X: https://x.com/superagent_ai
https://x.com/pelaseyed

00:00 — From guardrails to supply chain attacks
03:32 — Introducing Brin: a credit score for agent context
05:14 — How to integrate Brin into your agent
07:17 — The three-tier scoring model
10:50 — What's next for agent security
Mar 4, 2026 • 34min

Missile Strikes Disrupt AWS and Claude, Anthropic Banned from US Government, Cloudflare vs Vercel

This week in AI saw geopolitical turmoil, major funding news, and a shift in software development. Missile strikes in the UAE and Bahrain disrupted AWS and Claude services. Meanwhile, after Anthropic barred its models from use in autonomous weapons and mass surveillance, the Trump administration banned Anthropic from government contracts, posing a major supply chain risk. On the same day, Sam Altman secured a deal with the Department of War as OpenAI announced a $110 billion funding round, highlighting a sharp contrast in approaches. AI coding is evolving rapidly. Andrej Karpathy noted that coding agents, ineffective before December, now work well with improved quality and coherence. Yet perfect accuracy remains elusive. New tools emerged: Cursor demos code, Linear markets itself as an AI coding assistant, and Perplexity Computer offers an all-in-one system for managing AI projects. Smaller models like Qwen 3.5 are getting faster and more efficient for edge use. Other highlights include Anthropic acquiring Vercept AI, Claude's remote coding controls, and Stanford confirming that major AI firms use user conversations to train models.

AI Agents Hour is a weekly livestream hosted by Mastra CPO Shane Thomas and CTO Abhi Aiyer. Airing Mondays at 12PM Pacific on YouTube and X, the show covers breaking AI news, agent development techniques, and features interviews with industry experts building AI applications today.

📚 MASTRA RESOURCES
Mastra: https://mastra.ai
Learn Mastra in the world's first MCP-Based Course: https://mastra.ai/course
Principles of Building AI Agents (Book): https://mastra.ai/book
Patterns for Building AI Agents (New Book): https://mastra.ai/blog/patterns-book
https://docs.google.com/forms/d/e/1FAIpQLSduJjc515f6RZJqtkR2ByqJZrB0iP8B7SUKnjjZE9IajH_I8w/viewform

CHAPTERS
00:00 – Intro
00:25 – AWS Data Center Strikes & Claude Outages
01:20 – Anthropic Ban
05:30 – Sam Altman's Government Deal
10:05 – Cloudflare/Vercel Drama & NPM Namespaces
11:09 – Block Cuts 40% of Staff (4,000 People)
12:30 – AI & Job Market: Fear vs Reality
18:27 – OpenAI Raises $110B, Codex Growth
18:52 – Claude Releases: Vercept Acquisition, Remote Control, Auto Memory
20:13 – State of AI Coding
26:10 – AI Products and Platforms
28:42 – Open Source Models
31:30 – Quick Hits & GitHub Star Party
Mar 1, 2026 • 17min

How to Build Reliable AI Agents with Datasets, Experiments, and Error Analysis

Yujohn from Mastra explains why datasets and experiments are essential for building production-grade AI agents. If you're building an agent, you need a way to verify it's working correctly before and after you make changes. Datasets provide that baseline. You create a collection of test cases (ground truth) that represent the scenarios your agent should handle. Then you run experiments: pass each test case through your agent and measure the results. This is error analysis in practice. You start by identifying where your agent fails, then build scorers to quantify those failure modes over time. Smaller teams often ship first and add datasets later, once they have user feedback. Larger teams need them earlier. But eventually, every production agent needs this. The demo shows how Mastra makes this accessible. You can create datasets through the UI, add items manually or import from CSV, and run experiments with a single click. The results show you exactly what went wrong: which tool calls failed, what the agent output was, and how it compared to ground truth. You can also compare experiments side by side to see if your prompt tweaks actually improved things. And because all the data lives in your own database, you can write your own agents to analyze the results, dig into traces, and iterate. The SDK makes it easy to integrate into CI/CD: run experiments on pull requests, gate deployments on eval scores, or just collect data from production and curate datasets later. 🔗 RESOURCES Mastra Datasets docs: https://mastra.ai/docs/observability/datasets Running Experiments: https://mastra.ai/docs/observability/datasets/running-experiments Mastra GitHub: https://github.com/mastra-ai/mastra Yujohn on X: https://x.com/YujohnNatt Mastra Discord: https://discord.gg/mastra AI Agents Hour is a weekly livestream hosted by Mastra CPO Shane Thomas and CTO Abhi Aiyer. 
Airing Mondays at 12PM Pacific on YouTube and X, the show covers breaking AI news, agent development techniques, and features interviews with industry experts building AI applications today.

00:00 – Intro
00:48 – What are Datasets and Experiments
01:55 – Error Analysis
03:35 – When to Use Datasets (Team Size Matters)
05:43 – Demo: Creating a Dataset
07:04 – Demo: Ground Truth
07:53 – Demo: Running Experiments
09:34 – Demo: Comparing Results
11:00 – Your Data, Your Database
12:24 – SDK & CI Integration
14:30 – Collecting Data from Production
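The dataset-and-experiment loop described in this episode (ground-truth test cases, one agent run per case, a scorer, and a pass rate that can gate a deployment) can be sketched in a few lines of TypeScript. Everything here is an assumption made for illustration: `DatasetItem`, `runExperiment`, `exactMatch`, `passRate`, and the stub agent are hypothetical names, not the Mastra SDK.

```typescript
// One test case: an input plus the expected (ground truth) answer.
interface DatasetItem {
  input: string;
  groundTruth: string;
}

interface ExperimentResult extends DatasetItem {
  output: string;
  passed: boolean;
}

// Stand-in for a real agent call (hypothetical signature).
type Agent = (input: string) => Promise<string>;

// Simple scorer: exact match after trimming whitespace and lowercasing.
function exactMatch(output: string, groundTruth: string): boolean {
  const normalize = (s: string) => s.trim().toLowerCase();
  return normalize(output) === normalize(groundTruth);
}

// Run every dataset item through the agent and record whether it passed.
async function runExperiment(agent: Agent, dataset: DatasetItem[]): Promise<ExperimentResult[]> {
  const results: ExperimentResult[] = [];
  for (const item of dataset) {
    const output = await agent(item.input);
    results.push({ ...item, output, passed: exactMatch(output, item.groundTruth) });
  }
  return results;
}

// Fraction of passing cases; an empty experiment counts as 0.
function passRate(results: ExperimentResult[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r.passed).length / results.length;
}

// Stub agent and a tiny dataset, for demonstration only.
const stubAgent: Agent = async (input) => (input === "2+2" ? "4" : "unknown");
const dataset: DatasetItem[] = [
  { input: "2+2", groundTruth: "4" },
  { input: "capital of France", groundTruth: "Paris" },
];
```

In CI, a check along the lines of `passRate(results) >= 0.9` could fail a pull request when a prompt change regresses the agent; the 0.9 threshold is an arbitrary example, and real scorers are usually richer than exact match.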
Feb 27, 2026 • 15min

A Coding Agent That Never Compacts

Abhi walks through Mastra Code, a new open-source coding agent with observational memory that compresses context without losing it. When we built Observational Memory, we needed a way to test it in production. Instead of a standard bot, we dogfooded it: we built a coding agent and used it ourselves. Writing code every day quickly revealed whether the memory actually worked. Eventually, something clicked. Long, multi-day coding sessions ran smoothly, without hitting the compaction limits that trip up other agents. The agent actually seemed to understand the conversation. That gave us confidence to release Observational Memory... and the agent itself also  became something worth sharing. The core innovation is the Harness primitive. Rather than just a coding agent, Harness can power any type of agentic workflow: customer support, design tools, electrical engineering, and more. In this demo, you’ll see the Mastra Code CLI, a production-ready coding agent, and how Corbin from Artifact used the Harness to create an in-app agent for electrical engineering. Same primitive, completely different use case. 🔗 RESOURCES Mastra Code announcement: https://mastra.ai/blog/announcing-mastra-code Mastra Code GitHub: https://github.com/mastra-ai/mastra Mastra Code NPM: https://www.npmjs.com/package/mastra-code Observational Memory: https://mastra.ai/blog/observational-memory Mastra documentation: https://docs.mastra.ai AI Agents Hour is a weekly livestream hosted by Mastra CPO Shane Thomas and CTO Abhi Aiyer. Airing Mondays at 12PM Pacific on YouTube and X, the show covers breaking AI news, agent development techniques, and features interviews with industry experts building AI applications today. 
CHAPTERS
00:00 – Intro & The Origin Story
02:09 – The Trend of Building Your Own Claude Code
03:11 – Demo
08:28 – No More Compaction
09:35 – Demo: Harness Primitive + Electrical Engineering Tool
14:08 – Getting Started
