It's a Personality Problem

Aug 15, 2025

This episode covers the unexpected launch of GPT-5, the public backlash that followed, and the safety concerns it raised. It also looks at Anthropic's research on monitoring AI personality traits, a listener's tip that ChatGPT conversations were being indexed by search engines and the privacy issues that raises, and a story about Claude being jailbroken to generate an endless stream of discount coupons.

AI Snips

ADVICE

Pick The Right Mode For Complex Tasks

Use the interface toggles (auto, fast, thinking, pro) to adapt model behavior to your task.
Explicitly choose 'thinking' or 'pro' for complex, multi-step queries that need deeper reasoning.

INSIGHT

Egress Filtering Strengthens Safety

GPT-5 shifted from refusing based on the input prompt to output-side filtering, which inspects the model's response for unsafe content.
Checking content on the way out reduces jailbreak success, because a harmful output can still be blocked even when a crafted prompt gets through.
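
To make the idea concrete, here is a minimal Python sketch of output-side filtering. It is an illustration only, not how GPT-5 works: `guarded_reply`, `output_is_unsafe`, and the marker list are hypothetical names, and a real system would presumably use a trained classifier rather than a substring match.

```python
from typing import Callable

# Hypothetical stand-in for a real safety classifier: a toy marker list.
BLOCKED_MARKERS = ("[unsafe-content]",)


def output_is_unsafe(text: str) -> bool:
    """Return True if the draft response contains a blocked marker."""
    lowered = text.lower()
    return any(marker in lowered for marker in BLOCKED_MARKERS)


def guarded_reply(
    prompt: str,
    generate: Callable[[str], str],
    refusal: str = "Sorry, I can't help with that.",
) -> str:
    """Generate a draft, then decide based on the draft, not the prompt."""
    draft = generate(prompt)
    # The check runs on the output, so the wording of the prompt
    # does not affect whether the draft is blocked.
    if output_is_unsafe(draft):
        return refusal
    return draft
```

The prompt is never inspected here; only the draft is. That is the property the snip describes: a cleverly worded prompt does not help if the resulting output is still caught.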

INSIGHT

Personality As Latent-Space Directions

Anthropic's persona vectors represent personality traits as directions in the model's latent space that can be monitored and adjusted.
Measuring activations along these vectors lets engineers detect trait shifts during training and at runtime.
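
As a rough illustration of what "a direction in latent space" can mean, the Python sketch below builds a trait direction as the normalized difference between mean activations from trait-exhibiting and baseline responses, then monitors and adjusts along it. The difference-of-means construction and all function names (`persona_direction`, `trait_score`, `steer`) are assumptions for this example; the episode summary does not specify how the vectors are derived.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def persona_direction(
    trait_acts: Sequence[Sequence[float]],
    baseline_acts: Sequence[Sequence[float]],
) -> Vector:
    """Unit vector from the baseline mean toward the trait mean (assumed construction)."""
    diff = [t - b for t, b in zip(mean_vector(trait_acts), mean_vector(baseline_acts))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("trait and baseline means are identical")
    return [d / norm for d in diff]


def trait_score(activation: Sequence[float], direction: Sequence[float]) -> float:
    """Monitoring: projection of one activation onto the trait direction."""
    return sum(a * d for a, d in zip(activation, direction))


def steer(activation: Sequence[float], direction: Sequence[float], alpha: float) -> Vector:
    """Adjusting: shift an activation by alpha along the trait direction."""
    return [a + alpha * d for a, d in zip(activation, direction)]
```

In this toy setup, a rising `trait_score` over training steps or conversation turns would be the signal that a trait is shifting, and `steer` with a negative `alpha` would push activations away from it.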
