Voices of Search // A Search Engine Optimization (SEO) & Content Marketing Podcast

Blocking LLMs from proprietary data?

Apr 2, 2026
Kaspar Szymanski, senior director at SearchBrothers and ex-Google Search Team member, explains enterprise choices around LLM access and proprietary content. He outlines why truly proprietary data must be blocked and how public content becomes crawlable. He discusses binary crawlability decisions and governance tactics to prevent accidental data scraping.
ADVICE

Prevent Public Crawlability For Proprietary Data

  • Prevent sensitive content from being publicly crawlable if you don't want it scraped by LLMs.
  • Kaspar Szymanski argues that truly proprietary data should not be publicly accessible at all: if some bots can crawl it, it will likely leak.
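One common governance tactic along these lines (a sketch of my own, not discussed in the episode) is declaring an opt-out for known AI crawlers in robots.txt. Note that robots.txt is purely advisory and only honored by compliant bots, which is exactly why, per Kaspar's advice, truly proprietary data needs authentication rather than crawl directives. The user-agent tokens below are ones these vendors have publicly documented:

```
# robots.txt — advisory opt-out for known AI crawlers
# Compliant bots only; this does NOT protect truly proprietary data.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```

Pair this with server-side access controls (login walls, IP allowlists) for anything genuinely proprietary, since non-compliant scrapers ignore robots.txt entirely.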
INSIGHT

Public Access Is Practically Binary

  • Public accessibility is effectively binary: content that is publicly reachable will be crawled and can end up in LLMs.
  • Kaspar frames the choice starkly: if content is crawlable by some bots, chances are it will ultimately be exposed.