
“AI should be a good citizen, not just a good assistant” by Tom Davidson, wdmacaskill
LessWrong (30+ Karma)
Why proactive prosocial drives matter
The authors argue that proactive prosocial drives will have a cumulative societal impact as AI gains autonomy and shapes decision-making.
Introduction
Consider a lorry driver who sees a car crash and pulls over to help, even though it’ll delay his journey. Or a delivery driver who notices that an elderly resident hasn’t collected their post in days, and knocks to check they’re okay. Or a social media company employee who notices how their platform is used for online bullying, and brings it up with leadership, even though that's not part of their job description.
This kind of proactive prosocial behaviour is admirable in humans. Should we want it in AI too?
Often, people have answered “no”. Many advocate for making AI “corrigible” or “steerable”. In its purest form, this makes AI a mere vessel for the will of the user.
But we think AI should proactively take actions that benefit society more broadly. As AI systems become more autonomous and integrated into economic and political processes, the cumulative effect of their behavioural tendencies will shape society's trajectory. AI systems that notice opportunities to benefit society and proactively act on them could matter enormously.
Below, we consider two main objections:
Firstly, supposedly prosocial drives might function as a means for AI companies to impose their own values on the rest [...]
---
Outline:
(00:12) Introduction
(02:03) What do we mean by proactive prosocial drives?
(02:50) Why do we think AI should have proactive prosocial drives?
(05:04) Other benefits of proactive prosocial drives
(05:48) Doesn't this give AI companies too much influence?
(07:41) Won't this make AI more likely to seek power?
(12:57) Won't this make it harder to interpret evidence of egregious misalignment?
(15:13) Best of both worlds: deploy proactive prosocial AI externally and corrigible AI internally
(16:28) What do current AI character documents say about proactive prosocial drives?
(17:50) Conclusion
(18:59) Appendices
(19:02) Appendix A: Initially make non-prosocial AI, then pivot to add proactive prosocial drives
(21:36) Appendix B: Prosocial drives might make a sociopathic persona less likely
(23:39) Appendix C: Prosocial drives might make AI a better alignment researcher
(24:39) Appendix D: What license does Claude's Constitution give for proactive prosocial drives?
(25:28) A. User benefit
(26:09) B. Refusals
(27:04) C. Proactive prosocial drives
(28:02) Summary
(28:18) Appendix E: What does OpenAI's model spec say about proactive prosocial drives?
(28:51) A. Proactive behaviour that is explicitly user-centred
(29:36) B. Proactively preventing imminent harm
(30:09) C. Weak normative defaults and the flourishing of humanity
(31:09) D. Explicit limits on proactive prosocial drives
(32:50) Summary
The original text contained 6 footnotes which were omitted from this narration.
---
First published:
March 30th, 2026
---
Narrated by TYPE III AUDIO.