
Are AI agents ready for the workplace? A new benchmark raises doubts; plus, Ring is adding a new content verification feature
TechCrunch Industry News
Benchmark results: best models still fall short
An unidentified speaker summarizes the scores: Gemini 3 Flash and GPT-5.2 lead, but top accuracy remains around 24%.


