Vasilije Markovic, founder of Cognee and former data engineer turned cognitive-science-informed entrepreneur, builds agentic memory and knowledge-layer systems. He discusses permanent vs session memory, graph+vector architectures, storage and latency trade-offs, metadata and decay strategies, trace-based scoring, multi-tenant isolation, and practical vertical uses like pharma, logistics, and security.
ADVICE
Use Hot Session Memory For Low Latency Responses
Use session (hot) memory for low-latency agent interactions so the agent avoids slow retrievals, and sync to permanent memory when needed.
Cognee stores transformed embedding triplets in Redis for fast search and syncs to the permanent store for quality.
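The hot-session pattern described in this snip can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Cognee's implementation: plain dictionaries stand in for Redis and for the permanent store, plain strings stand in for the embedding triplets, and the names `SessionMemory`, `PermanentMemory`, `sync`, and `recall_with_fallback` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionMemory:
    """Hot, per-session store. A dict stands in for Redis (assumption)."""
    items: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)

    def remember(self, key: str, value: str) -> None:
        self.items[key] = value
        self.dirty.add(key)  # mark the entry for the next sync

    def recall(self, key: str) -> Optional[str]:
        return self.items.get(key)


@dataclass
class PermanentMemory:
    """Slower, durable store. A dict stands in for the real backend (assumption)."""
    items: dict = field(default_factory=dict)


def sync(session: SessionMemory, permanent: PermanentMemory) -> int:
    """Copy entries written since the last sync and return how many were copied."""
    copied = 0
    for key in sorted(session.dirty):
        permanent.items[key] = session.items[key]
        copied += 1
    session.dirty.clear()
    return copied


def recall_with_fallback(
    key: str, session: SessionMemory, permanent: PermanentMemory
) -> Optional[str]:
    """Read hot memory first and fall back to the permanent store on a miss."""
    hit = session.recall(key)
    if hit is not None:
        return hit
    value = permanent.items.get(key)
    if value is not None:
        session.items[key] = value  # warm the session without marking it dirty
    return value
```

Reads stay on the fast path while the session is warm, and the slower store is only touched on a miss or during an explicit `sync`, which is the trade-off the snip describes.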
ADVICE
Start With Files And Upgrade Only When Needed
Start simple: use prompt templating or MD files for small projects and only move to vector/graph stores when you need complex relationships or reconciliation.
LanceDB, Qdrant, Milvus, Neo4j, and others are good next steps; LanceDB can back up to S3 for fast iteration.
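As a rough illustration of the "start with files" advice, the sketch below keeps memory in a single Markdown file and injects it into a prompt template. The file layout, the template wording, and the helper names `append_note` and `build_prompt` are assumptions made for this example; the episode does not prescribe a specific format.

```python
from pathlib import Path
from string import Template

# Hypothetical prompt template; the wording is illustrative only.
PROMPT = Template(
    "You are a helpful assistant.\n"
    "Known facts about this project:\n$notes\n\n"
    "Question: $question\n"
)


def append_note(path: Path, note: str) -> None:
    """Append one remembered fact as a Markdown bullet."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"- {note.strip()}\n")


def build_prompt(path: Path, question: str) -> str:
    """Load the whole memory file and substitute it into the template."""
    notes = path.read_text(encoding="utf-8") if path.exists() else "(none yet)"
    return PROMPT.substitute(notes=notes.strip(), question=question)
```

This approach has no retrieval step at all: the entire file goes into every prompt. That is fine while the file is small, and it is roughly the point at which a vector or graph store starts to pay off once the notes no longer fit or need reconciling.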
INSIGHT
Separate Graph Metadata From Embedding Content
Memory design splits into metadata/graph filters and embedding content; agents often operate mainly on embeddings with the graph storing source and relationships.
Cognee uses node sets, timestamps, and post-processing to add edges and enable traversals across concepts.
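The split between graph metadata and embedding content can be sketched as a two-step lookup: filter on metadata first, then rank the survivors by embedding similarity, and use edges only for traversal. The code below is a toy under stated assumptions, not Cognee's API: `MemoryNode`, `search`, and `neighbors` are hypothetical names, embeddings are hand-written two-dimensional vectors, and edges are plain `(source, target)` pairs.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    node_id: str
    text: str                      # content that would be embedded
    embedding: list                # toy vector; a real system would use a model
    node_sets: set = field(default_factory=set)  # metadata tags used as filters
    timestamp: float = 0.0         # metadata kept alongside the content


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(nodes, query_embedding, node_set: str, top_k: int = 2):
    """Filter by node set (metadata), then rank by embedding similarity."""
    candidates = [n for n in nodes if node_set in n.node_sets]
    ranked = sorted(
        candidates,
        key=lambda n: cosine(n.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


def neighbors(edges, node_id: str):
    """Follow outgoing edges from one node to related nodes."""
    return sorted(dst for src, dst in edges if src == node_id)
```

Keeping the filter and the ranking separate means the metadata side can grow (more node sets, timestamps, edges added in post-processing) without re-embedding any content.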
Summary
In this episode of the Data Engineering Podcast, Vasilije "Vas" Markovic, founder of Cognee, discusses building agentic memory, a crucial aspect of artificial intelligence that enables systems to learn, adapt, and retain knowledge over time. He explains the concept of agentic memory, highlighting the importance of distinguishing between permanent and session memory, graph+vector layers, latency trade-offs, and multi-tenant isolation to ensure safe knowledge sharing or protection. The conversation covers practical considerations such as storage choices (Redis, Qdrant, LanceDB, Neo4j), metadata design, temporal relevance and decay, and emerging research areas like trace-based scoring and reinforcement learning for improving retrieval. Vas shares real-world examples of agentic memory in action, including applications in pharma hypothesis discovery, logistics control towers, and cybersecurity feeds, as well as scenarios where simpler approaches may suffice. He also offers guidance on when to add memory, pitfalls to avoid (naive summarization, uncontrolled fine-tuning), human-in-the-loop realities, and Cognee's future plans: revamped session/long-term stores, decision-trace research, and richer time and transformation mechanisms. Additionally, Vas touches on policy guardrails for agent actions and the potential for more efficient "pseudo-languages" for multi-agent collaboration.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
If you lead a data team, you know this pain: Every department needs dashboards, reports, custom views, and they all come to you. So you're either the bottleneck slowing everyone down, or you're spending all your time building one-off tools instead of doing actual data work. Retool gives you a way to break that cycle. Their platform lets people build custom apps on your company data while keeping it all secure. Type a prompt like 'Build me a self-service reporting tool that lets teams query customer metrics from Databricks' and they get a production-ready app with the permissions and governance built in. They can self-serve, and you get your time back. It's data democratization without the chaos. Check out Retool at dataengineeringpodcast.com/retool today and see how other data teams are scaling self-service. Because let's be honest: we all need to Retool how we handle data requests.
Your host is Tobias Macey and today I'm interviewing Vasilije Markovic about agentic memory architectures and applications.
Interview
Introduction
How did you get involved in the area of data management?
Can you start by giving an overview of the different elements of "memory" in an agentic context?
storage and retrieval mechanisms
how to model memories
how does that change as you go from short-term to long-term?
managing scope and retrieval triggers
What are some of the useful triggers in an agent architecture to identify whether/when/what to create a new memory?
How do things change as you try to build a shared corpus of memory across agents?
What are the most interesting, innovative, or unexpected ways that you have seen agentic memory used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Cognee?
When is a dedicated memory layer the wrong choice?
What do you have planned for the future of Cognee?
From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.