
Constitutional AI: Harmlessness from AI Feedback
BlueDot Narrated
Can models supervise other models?
Evaluation showing large LMs can identify helpful, honest, and harmless responses with high accuracy.