AI and "Do No Harm"
JAMA+ AI Conversations
Motivation for a 'Do No Harm' benchmark
David outlines the need for scalable evaluation of LLMs using a harm‑focused benchmark and live leaderboard.
Transcript