The MLSecOps Podcast

AI Agent Security: Threats & Defenses for Modern Deployments

May 21, 2025
Yifeng (Ethan) He, a PhD candidate at UC Davis specializing in software and AI security, and Peter Rong, a researcher focused on vulnerabilities in AI agents, discuss the critical threats facing AI agents. They break down issues like session hijacks and tool-based jailbreaks, highlighting the shortcomings of current defenses. The duo also advocates for effective sandboxing and agent-to-agent protocols, sharing practical strategies for securing AI deployments and emphasizing the importance of a zero-trust approach in agent security.
Ask episode
AI Snips
Chapters
Transcript
Episode notes
ANECDOTE

How Research Started

  • Peter described starting AI security research after seeing ChatGPT's code-writing evolution and questioning code safety.
  • That grew into studying agent attack surfaces beyond just insecure code outputs.
INSIGHT

User Data Can Poison Agents

  • Fine-tuning agents with user interaction opens poisoning and backdoor risks if training data is untrusted.
  • Malicious preferences or prompts can steer agent behavior subtly over time.
INSIGHT

Session Hijack Via Context Injection

  • Session hijacks happen when a tool or prompt forces the agent to 'forget' prior context or accept new malicious context.
  • Attackers can bias agent decisions like rankings by injecting crafted tool outputs into the chat history.
Get the Snipd Podcast app to discover more snips from this episode
Get the app