Voices of Search // A Search Engine Optimization (SEO) & Content Marketing Podcast

Blocking LLMs from proprietary data?

Apr 2, 2026
Kaspar Szymanski, senior director at SearchBrothers and ex-Google Search Team member, explains enterprise choices around LLM access and proprietary content. He outlines why truly proprietary data must be blocked and how public content becomes crawlable. He discusses binary crawlability decisions and governance tactics to prevent accidental data scraping.
ADVICE

Prevent Public Crawlability For Proprietary Data

  • Prevent sensitive content from being publicly crawlable if you don't want it scraped by LLMs.
  • Kaspar Szymanski argues that truly proprietary data should not be publicly accessible at all: if some bots can crawl it, it will likely leak.
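One common governance tactic along these lines (a sketch of my own, not discussed in the episode) is declaring an opt-out for known AI crawlers in robots.txt. Note that robots.txt is purely advisory and only honored by compliant bots, which is exactly why, per Kaspar's advice, truly proprietary data needs authentication rather than crawl directives. The user-agent tokens below are ones these vendors have publicly documented:

```
# robots.txt — advisory opt-out for known AI crawlers
# Compliant bots only; this does NOT protect truly proprietary data.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```

Pair this with server-side access controls (login walls, IP allowlists) for anything genuinely proprietary, since non-compliant scrapers ignore robots.txt entirely.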
INSIGHT

Public Access Is Practically Binary

  • Public accessibility is effectively binary: content that is publicly reachable will be crawled and can end up in LLMs.
  • Kaspar frames the choice starkly: if content is crawlable by some bots, chances are it will ultimately be exposed.