The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Ensuring Privacy for Any LLM with Patricia Thaine - #716

Jan 28, 2025
Patricia Thaine, co-founder and CEO of Private AI, specializes in privacy-preserving AI techniques. She dives into the critical issues of data minimization, the risks of personal data leakage from large language models (LLMs), and the challenges of redacting sensitive information across different formats. Patricia highlights the limitations of data anonymization, the balance between real and synthetic data for model training, and the evolving landscape of AI regulations like GDPR. She also discusses the ethical considerations surrounding bias in AI and the future of privacy in technology.
INSIGHT

Internal Model Risks

  • Internally hosted models still pose data breach risks.
  • Data minimization limits the impact of potential breaches.
INSIGHT

Data Leakage

  • Models can memorize and leak training data, including from embeddings.
  • Embeddings, even if not directly invertible, can leak sensitive information like salaries.
INSIGHT

Embedding Leakage

  • Embeddings can leak data both through direct attacks and through model outputs.
  • Sensitive information in embedding databases can be revealed in model responses.
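The second leakage path above, through model outputs, can be sketched with a toy retrieval pipeline: even if no one inverts the embeddings themselves, a probing query against an embedding database pulls the matching raw text into the model's context, where it can be echoed in a response. The bag-of-words `embed` function below is a stand-in for a real embedding model; the documents and query are invented for illustration.

```python
import math
from collections import Counter

# Toy stand-in for an embedding model: a bag-of-words vector.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# An embedding database indexed from internal documents, with a
# sensitive salary record embedded alongside harmless ones.
store = [
    "quarterly roadmap review notes",
    "alice smith annual salary 185000 usd",
    "cafeteria menu for next week",
]
index = [(doc, embed(doc)) for doc in store]

def retrieve(query: str) -> str:
    """Return the stored text most similar to the query embedding."""
    q = embed(query)
    return max(index, key=lambda item: cosine(q, item[1]))[0]

# A probing query surfaces the sensitive record verbatim; in a RAG
# system this raw text lands in the model's context and can appear
# in its answer, leaking data without any embedding inversion.
print(retrieve("what is alice smith salary"))
# → alice smith annual salary 185000 usd
```

This is why the episode argues for redacting sensitive fields before indexing: once the raw text sits in the embedding database, every downstream model response is a potential exfiltration channel.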