
LessWrong (Curated & Popular) “AI in 2025: gestalt” by technicalities
Dec 8, 2025
This discussion surveys the state of AI in 2025, highlighting clear gains on specific tasks alongside a continuing lack of generalization to broader applications. It weighs arguments for and against the anticipated growth, including concerns about evaluation reliability and safety trends, examines emerging alignment strategies and governance challenges, and considers the future of LLMs as models and metrics evolve. Open questions remain about the real implications for AI safety.
Personal Flip To Using LLMs
- The host started using LLMs for real tasks in May and found that search agents replaced a degraded Google Search.
- A small poll of researchers corroborated this practical adoption, despite mixed results from formal studies.
Reasoning Models: Mixed Safety Tradeoffs
- Reasoning models are more monitorable via their chain-of-thought, though that chain-of-thought is not fully faithful.
- They refuse malicious requests more often, lowering accident risk but raising agentic risks.
The Looming End Of Eval Trust
- Evals are under pressure from cheating, sandbagging, and models' awareness that they are being evaluated.
- This undermines confidence in automated benchmarks and increases reliance on human evals.
