Machine Learning Street Talk (MLST)

Reasoning, Robustness, and Human Feedback in AI - Max Bartolo (Cohere)

Mar 18, 2025
Max Bartolo, a researcher at Cohere, dives into the world of machine learning, focusing on model reasoning and robustness. He highlights the DynaBench platform's role in dynamic benchmarking and the complex challenges of evaluating AI performance. The conversation reveals the limitations of human feedback in training AI and the surprising reliance on distributed knowledge. Bartolo discusses the impact of adversarial examples on model reliability and emphasizes the need for tailored approaches to enhance AI systems, ensuring they align with human values.
ANECDOTE

Prioritizing Style over Factuality

  • Human raters tend to prioritize style and formatting over factuality when judging model outputs.
  • This can push optimization toward completions that are appealing but less accurate.
INSIGHT

Personalized Model Behavior

  • Personalized model behavior, tailored to individual preferences, is a compelling goal.
  • This can be achieved by conditioning models on user-specific data, avoiding retraining for each user.
INSIGHT

PRISM Project and User Background

  • The PRISM project explores how demographics, culture, and other attributes influence preferences.
  • Even simple interaction patterns, like conversation topics, vary significantly based on background.