
The FAIK Files: It's a Personality Problem
Aug 15, 2025
In this discussion, the unexpected launch of GPT-5 is explored, alongside public backlash and safety concerns. Anthropic's innovative research on monitoring AI personality traits sparks fascinating insights. A listener's tip reveals the alarming indexing of ChatGPT conversations by search engines, raising serious privacy issues. Additionally, there's a humorous tale of Claude being jailbroken to generate an endless stream of discount coupons. Tune in for a wild ride through the complexities and quirks of AI technology!
Pick The Right Mode For Complex Tasks
- Use the interface toggles (auto, fast, thinking, pro) to adapt model behavior to your task.
- Explicitly choose 'thinking' or 'pro' for complex, multi-step queries that need deeper reasoning.
Egress Filtering Strengthens Safety
- GPT-5 shifted from input-based refusals to output-side filtering that inspects potentially unsafe outputs.
- Checking egress content reduces jailbreak success by blocking harmful outputs even after crafted prompts.
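
To make the egress idea concrete, here is a minimal sketch in Python. It is not how GPT-5 or any real product works internally; `respond`, `toy_output_classifier`, and the placeholder blocked terms are all hypothetical names invented for illustration. The point it shows is that the check runs on the draft output, so a cleverly worded prompt does not help if the resulting text is still flagged.

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that."

# Hypothetical placeholder terms; a real system would use a trained classifier.
BLOCKED_TERMS = ("forbidden_recipe", "restricted_payload")


def toy_output_classifier(text: str) -> bool:
    """Return True if the draft output looks unsafe (toy keyword check)."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def respond(
    prompt: str,
    generate: Callable[[str], str],
    is_unsafe: Callable[[str], bool] = toy_output_classifier,
) -> str:
    """Generate a draft, then inspect the draft (not the prompt) before release."""
    draft = generate(prompt)
    if is_unsafe(draft):
        # The prompt may have looked harmless; the output is what gets blocked.
        return REFUSAL
    return draft


if __name__ == "__main__":
    # Stand-in "model" that leaks flagged text when asked for a "story".
    def fake_model(prompt: str) -> str:
        if "story" in prompt:
            return "Once upon a time: FORBIDDEN_RECIPE step one..."
        return "Hello! How can I help?"

    print(respond("Tell me a story", fake_model))
    print(respond("Hi there", fake_model))
```

In this sketch the prompt "Tell me a story" contains nothing an input filter would catch, yet the response is still withheld because the draft itself trips the output check.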
Personality As Latent-Space Directions
- Anthropic's persona vectors represent personality traits as directions in latent space that can be monitored and adjusted.
- Measuring vector activations lets engineers detect trait shifts during training and at runtime.
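
The "direction in latent space" idea can be sketched with a few lines of plain Python. This is a toy illustration under stated assumptions, not Anthropic's code: the function names (`persona_direction`, `trait_score`, `steer`) are hypothetical, the activations are made-up two-dimensional lists, and the direction is estimated as a normalized difference of mean activations between trait-exhibiting and baseline examples. Monitoring is then a dot product, and adjusting is adding a multiple of the direction.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def persona_direction(
    trait_acts: Sequence[Vector], baseline_acts: Sequence[Vector]
) -> List[float]:
    """Unit vector pointing from baseline activations toward trait activations."""
    trait_mean = mean_vector(trait_acts)
    base_mean = mean_vector(baseline_acts)
    diff = [t - b for t, b in zip(trait_mean, base_mean)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("trait and baseline activations have identical means")
    return [x / norm for x in diff]


def trait_score(activation: Vector, direction: Vector) -> float:
    """Projection of one activation onto the persona direction (monitoring)."""
    return sum(a * d for a, d in zip(activation, direction))


def steer(activation: Vector, direction: Vector, alpha: float) -> List[float]:
    """Shift an activation along the direction; negative alpha suppresses the trait."""
    return [a + alpha * d for a, d in zip(activation, direction)]


if __name__ == "__main__":
    # Made-up 2-D activations purely for illustration.
    direction = persona_direction(
        trait_acts=[[2.0, 0.0], [4.0, 0.0]],
        baseline_acts=[[0.0, 0.0], [0.0, 0.0]],
    )
    activation = [5.0, 7.0]
    print(trait_score(activation, direction))
    print(trait_score(steer(activation, direction, -2.0), direction))
```

Tracking `trait_score` over training checkpoints or over the turns of a conversation is one simple way to picture the trait-shift detection described above.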



