GPT Takes the Bar Exam
Paper •
The referenced work documents experiments evaluating GPT family models on bar exam sections to measure legal reasoning and knowledge.
Initial results showed GPT‑3.5 performed variably, prompting follow‑up tests that controlled for memorisation and training‑data overlap.
The team recreated exam inputs and reran tests, resulting in a highly cited paper discussing GPT‑4's performance and implications for legal practice.
The studies sparked widespread public and professional debate about AI capabilities in legal tasks and the future of legal work.
Note: although referred to conversationally as 'papers' in the episode, these items are academic papers rather than conventional trade books.
Mentioned by

Michael Bommarito, recounting the experiment that evaluated GPT‑3.5 on the bar exam and its resulting paper(s).

What Happens Next?


