
Episode 3: Notion AI Lead: Linus Lee
Thursday Nights in AI
00:00
Evaluating Performance and Advancements in Language Models
This chapter explores the challenges of evaluating text generation models: how to measure correctness, the need for human evaluation, and the use of unit tests and model-graded evaluations. It also highlights the importance of architectures beyond the transformer and touches on the application of foundation models to scientific discovery.
Transcript


