

LessWrong (Curated & Popular)
LessWrong
Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “LessWrong (30+ karma)” feed.
Episodes

Jun 12, 2023 • 41min
"Unifying Bargaining Notions (2/2)" by Diffractor
Alright, time for the payoff, unifying everything discussed in the previous post. This post is a lot more mathematically dense; you might want to digest it in more than one sitting.

Imaginary Prices, Tradeoffs, and Utilitarianism

Harsanyi's Utilitarianism Theorem can be summarized as: "if a bunch of agents have their own personal utility functions U_i, and you want to aggregate them into a collective utility function U with the property that everyone agreeing that option x is better than option y (i.e., U_i(x) ≥ U_i(y) for all i) implies U(x) ≥ U(y), then that collective utility function must be of the form b + ∑_{i∈I} a_i·U_i for some number b and nonnegative numbers a_i."

Basically, if you want to aggregate utility functions, the only sane way to do so is to give everyone importance weights and take a weighted sum of everyone's individual utility functions.

Closely related to this is a result that says that any point on the Pareto frontier of a game can be post-hoc interpreted as the result of maximizing a collective utility function. This related result is one where it's very important for the reader to understand the actual proof, because the proof gives you a way of reverse-engineering "how much everyone matters to the social utility function" from the outcome alone.

First up, draw all the outcomes and the utilities that both players assign to them; the convex hull will be the "feasible set" F, since we have access to randomization. Pick some Pareto-frontier point (u_1, u_2, ..., u_n) (although the drawn image is for only two players).

https://www.lesswrong.com/posts/RZNmNwc9SxdKayeQh/unifying-bargaining-notions-2-2
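The aggregation property in Harsanyi's theorem is easy to check numerically. A minimal sketch (my own illustration, not from the post; the utilities and weights below are made up) showing that a weighted sum with nonnegative weights preserves unanimous preferences:

```python
# Harsanyi-style aggregation: U(x) = b + sum_i a_i * U_i(x),
# with nonnegative importance weights a_i.

def collective_utility(utilities, weights, b=0.0):
    """Aggregate per-agent utilities for a single outcome."""
    return b + sum(a * u for a, u in zip(weights, utilities))

# Two outcomes, three agents: every agent weakly prefers x to y.
U_x = [3.0, 1.0, 2.0]
U_y = [2.0, 1.0, 0.5]
weights = [0.5, 0.2, 0.3]  # arbitrary nonnegative weights

# Unanimity holds, so the weighted sum must agree.
assert all(ux >= uy for ux, uy in zip(U_x, U_y))
assert collective_utility(U_x, weights) >= collective_utility(U_y, weights)
```

The theorem runs the other way too: any aggregation rule with this unanimity property must take this weighted-sum form.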

Jun 6, 2023 • 10min
"The ants and the grasshopper" by Richard Ngo
Inspired by Aesop, Soren Kierkegaard, Robin Hanson, sadoeuphemist and Ben Hoffman.

One winter a grasshopper, starving and frail, approaches a colony of ants drying out their grain in the sun, to ask for food.

“Did you not store up food during the summer?” the ants ask.

“No”, says the grasshopper. “I lost track of time, because I was singing and dancing all summer long.”

The ants, disgusted, turn away and go back to work.

https://www.lesswrong.com/posts/GJgudfEvNx8oeyffH/the-ants-and-the-grasshopper

May 18, 2023 • 1h 43min
"Steering GPT-2-XL by adding an activation vector" by TurnTrout et al.
Summary: We demonstrate a new scalable way of interacting with language models: adding certain activation vectors into forward passes. Essentially, we add together combinations of forward passes in order to get GPT-2 to output the kinds of text we want. We provide a lot of entertaining and successful examples of these "activation additions." We also show a few activation additions which unexpectedly fail to have the desired effect.

We quantitatively evaluate how activation additions affect GPT-2's capabilities. For example, we find that adding a "wedding" vector decreases perplexity on wedding-related sentences, without harming perplexity on unrelated sentences. Overall, we find strong evidence that appropriately configured activation additions preserve GPT-2's capabilities.

Our results provide enticing clues about the kinds of programs implemented by language models. For some reason, GPT-2 allows "combination" of its forward passes, even though it was never trained to do so. Furthermore, our results are evidence of linear feature directions, including "anger", "weddings", and "create conspiracy theories."

We coin the phrase "activation engineering" to describe techniques which steer models by modifying their activations. As a complement to prompt engineering and finetuning, activation engineering is a low-overhead way to steer models at runtime. Activation additions are nearly as easy as prompting, and they offer an additional way to influence a model’s behaviors and values. We suspect that activation additions can adjust the goals being pursued by a network at inference time.

https://www.lesswrong.com/posts/5spBue2z2tw4JuDCx/steering-gpt-2-xl-by-adding-an-activation-vector

May 16, 2023 • 1h 2min
"An artificially structured argument for expecting AGI ruin" by Rob Bensinger
Philosopher David Chalmers asked: "Is there a canonical source for 'the argument for AGI ruin' somewhere, preferably laid out as an explicit argument with premises and a conclusion?"

Unsurprisingly, the actual reason people expect AGI ruin isn't a crisp deductive argument; it's a probabilistic update based on many lines of evidence. The specific observations and heuristics that carried the most weight will vary from individual to individual, and can be hard to accurately draw out. That said, Eliezer Yudkowsky's So Far: Unfriendly AI Edition might be a good place to start if we want a pseudo-deductive argument just for the sake of organizing discussion. People can then say which premises they want to drill down on.

In The Basic Reasons I Expect AGI Ruin, I wrote: "When I say 'general intelligence', I'm usually thinking about 'whatever it is that lets human brains do astrophysics, category theory, etc. even though our brains evolved under literally zero selection pressure to solve astrophysics or category theory problems'." It's possible that we should already be thinking of GPT-4 as "AGI" on some definitions, so to be clear about the threshold of generality I have in mind, I'll specifically talk about "STEM-level AGI", though I expect such systems to be good at non-STEM tasks too. STEM-level AGI is AGI that has "the basic mental machinery required to do par-human reasoning about all the hard sciences", though a specific STEM-level AGI could (e.g.) lack physics ability for the same reasons many smart humans can't solve physics problems, such as "lack of familiarity with the field".

https://www.lesswrong.com/posts/QzkTfj4HGpLEdNjXX/an-artificially-structured-argument-for-expecting-agi-ruin

May 10, 2023 • 33min
"How much do you believe your results?" by Eric Neyman
You are the director of a giant government research program that’s conducting randomized controlled trials (RCTs) on two thousand health interventions, so that you can pick out the most cost-effective ones and promote them among the general population.

The quality of the two thousand interventions follows a normal distribution, centered at zero (no harm or benefit) and with standard deviation 1. (Pick whatever units you like — maybe one quality-adjusted life-year per ten thousand dollars of spending, or something in that ballpark.)

Unfortunately, you don’t know exactly how good each intervention is — after all, then you wouldn’t be doing this job. All you can do is get a noisy measurement of intervention quality using an RCT. We’ll call this measurement the intervention’s performance in your RCT.

https://www.lesswrong.com/posts/nnDTgmzRrzDMiPF9B/how-much-do-you-believe-your-results
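The setup simulates directly. A quick sketch under the post's stated assumptions (quality ~ N(0, 1)), plus one of my own: the RCT noise is also standard normal, so performance M = Q + noise. With both normal, the Bayesian best guess for quality given a measurement shrinks it toward zero: E[Q | M] = M / (1 + σ²).

```python
import numpy as np

rng = np.random.default_rng(42)
n, sigma = 2000, 1.0  # 2000 interventions; noise scale is my assumption

quality = rng.normal(0.0, 1.0, size=n)          # true quality Q ~ N(0, 1)
measurement = quality + rng.normal(0.0, sigma, size=n)  # RCT performance

# Posterior mean shrinks the measurement toward the prior mean of zero.
posterior_mean = measurement / (1.0 + sigma**2)

# Selecting the top performers by raw measurement overstates their true
# quality (the winner's curse): their average measurement exceeds their
# average quality.
top = np.argsort(measurement)[-20:]
assert measurement[top].mean() > quality[top].mean()

# Shrunk estimates are closer to the truth on average.
assert np.mean((quality - posterior_mean) ** 2) < np.mean((quality - measurement) ** 2)
```

With σ = 1, the shrinkage factor is 1/2: you should believe your results about half as much as they claim.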

Apr 27, 2023 • 38min
"Mental Health and the Alignment Problem: A Compilation of Resources (updated April 2023)" by Chris Scammell & DivineMango
This is a post about mental health and disposition in relation to the alignment problem. It compiles a number of resources that address how to maintain wellbeing and direction when confronted with existential risk. Many people in this community have posted their emotional strategies for facing Doom after Eliezer Yudkowsky’s “Death With Dignity” generated so much conversation on the subject. This post intends to be more touchy-feely, dealing more directly with emotional landscapes than questions of timelines or probabilities of success.

The resources section would benefit from community additions. Please suggest any resources that you would like to see added to this post.

Please note that this document is not intended to replace professional medical or psychological help in any way. Many preexisting mental health conditions can be exacerbated by these conversations. If you are concerned that you may be experiencing a mental health crisis, please consult a professional.

https://www.lesswrong.com/posts/pLLeGA7aGaJpgCkof/mental-health-and-the-alignment-problem-a-compilation-of

Apr 19, 2023 • 37min
"On AutoGPT" by Zvi
The primary talk of the AI world recently is about AI agents (whether or not it includes the question of whether we can’t help but notice we are all going to die). The trigger for this was AutoGPT, now number one on GitHub, which allows you to turn GPT-4 (or GPT-3.5 for us clowns without proper access) into a prototype version of a self-directed agent.

We also have a paper out this week where a simple virtual world was created, populated by LLMs that were wrapped in code designed to make them simple agents, and then several days of activity were simulated, during which the AI inhabitants interacted, formed and executed plans, and it all seemed like the beginnings of a living and dynamic world. Game version hopefully coming soon.

How should we think about this? How worried should we be?

https://www.lesswrong.com/posts/566kBoPi76t8KAkoD/on-autogpt
https://thezvi.wordpress.com/

Apr 12, 2023 • 6min
"GPTs are Predictors, not Imitators" by Eliezer Yudkowsky
(Related text posted to Twitter; this version is edited and has a more advanced final section.)

Imagine yourself in a box, trying to predict the next word — assign as much probability mass to the next token as possible — for all the text on the Internet.

Koan: Is this a task whose difficulty caps out as human intelligence, or at the intelligence level of the smartest human who wrote any Internet text? What factors make that task easier, or harder? (If you don't have an answer, maybe take a minute to generate one, or alternatively, try to predict what I'll say next; if you do have an answer, take a moment to review it inside your mind, or maybe say the words out loud.)

https://www.lesswrong.com/posts/nH4c3Q9t9F3nJ7y8W/gpts-are-predictors-not-imitators

Apr 5, 2023 • 17min
"A stylized dialogue on John Wentworth's claims about markets and optimization" by Nate Soares
https://www.lesswrong.com/posts/fJBTRa7m7KnCDdzG5/a-stylized-dialogue-on-john-wentworth-s-claims-about-markets

(This is a stylized version of a real conversation, where the first part happened as part of a public debate between John Wentworth and Eliezer Yudkowsky, and the second part happened between John and me over the following morning. The below is combined, stylized, and written in my own voice throughout. The specific concrete examples in John's part of the dialog were produced by me. It's over a year old. Sorry for the lag.)

(As to whether John agrees with this dialog, he said "there was not any point at which I thought my views were importantly misrepresented" when I asked him for comment.)

Apr 5, 2023 • 40min
"Discussion with Nate Soares on a key alignment difficulty" by Holden Karnofsky
https://www.lesswrong.com/posts/iy2o4nQj9DnQD7Yhj/discussion-with-nate-soares-on-a-key-alignment-difficulty

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

In late 2022, Nate Soares gave some feedback on my Cold Takes series on AI risk (shared as drafts at that point), stating that I hadn't discussed what he sees as one of the key difficulties of AI alignment. I wanted to understand the difficulty he was pointing to, so the two of us had an extended Slack exchange, and I then wrote up a summary of the exchange that we iterated on until we were both reasonably happy with its characterization of the difficulty and our disagreement.[1]

My short summary is: Nate thinks there are deep reasons that training an AI to do needle-moving scientific research (including alignment) would be dangerous. The overwhelmingly likely result of such a training attempt (by default, i.e., in the absence of specific countermeasures that there are currently few ideas for) would be the AI taking on a dangerous degree of convergent instrumental subgoals while not internalizing important safety/corrigibility properties enough. I think this is possible, but much less likely than Nate thinks under at least some imaginable training processes.


