

LessWrong (Curated & Popular)
LessWrong
Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.
Episodes

Apr 5, 2023 • 30min
"Deep Deceptiveness" by Nate Soares
https://www.lesswrong.com/posts/XWwvwytieLtEWaFJX/deep-deceptiveness

This post is an attempt to gesture at a class of AI notkilleveryoneism (alignment) problem that seems to me to go largely unrecognized. E.g., it isn’t discussed (or at least I don't recognize it) in the recent plans written up by OpenAI (1, 2), by DeepMind’s alignment team, or by Anthropic, and I know of no other acknowledgment of this issue by major labs.

You could think of this as a fragment of my answer to “Where do plans like OpenAI’s ‘Our Approach to Alignment Research’ fail?”, as discussed in Rob and Eliezer’s challenge for AGI organizations and readers. Note that it would only be a fragment of the reply; there's a lot more to say about why AI alignment is a particularly tricky task to task an AI with. (Some of which Eliezer gestures at in a follow-up to his interview on Bankless.)

Mar 28, 2023 • 24min
"Losing the root for the tree" by Adam Zerner
https://www.lesswrong.com/posts/ma7FSEtumkve8czGF/losing-the-root-for-the-tree

You know that being healthy is important. And that there's a lot of stuff you could do to improve your health: getting enough sleep, eating well, reducing stress, and exercising, to name a few.

There’s various things to hit on when it comes to exercising too. Strength, obviously. But explosiveness is a separate thing that you have to train for. Same with flexibility. And don’t forget cardio!

Strength is most important though, because of course it is. And there’s various things you need to do to gain strength. It all starts with lifting, but rest matters too. And supplements. And protein. Can’t forget about protein.

Protein is a deeper and more complicated subject than it may at first seem. Sure, the amount of protein you consume matters, but that’s not the only consideration. You also have to think about the timing. Consuming large amounts 2x a day is different than consuming smaller amounts 5x a day. And the type of protein matters too. Animal is different than plant, which is different from dairy. And then quality is of course another thing that is important.

But quality isn’t an easy thing to figure out. The big protein supplement companies are Out To Get You. They want to mislead you. Information sources aren’t always trustworthy. You can’t just hop on The Wirecutter and do what they tell you. Research is needed.

So you listen to a few podcasts. Follow a few YouTubers. Start reading some blogs. Throughout all of this you try various products and iterate as you learn more. You’re no Joe Rogan, but you’re starting to become pretty informed.

Mar 28, 2023 • 19min
"There’s no such thing as a tree (phylogenetically)" by Eukaryote
https://www.lesswrong.com/posts/fRwdkop6tyhi3d22L/there-s-no-such-thing-as-a-tree-phylogenetically

This is a linkpost for https://eukaryotewritesblog.com/2021/05/02/theres-no-such-thing-as-a-tree/

[Crossposted from Eukaryote Writes Blog.]

So you’ve heard about how fish aren’t a monophyletic group? You’ve heard about carcinization, the process by which ocean arthropods convergently evolve into crabs? You say you get it now? Sit down. Sit down. Shut up. Listen. You don’t know nothing yet.

“Trees” are not a coherent phylogenetic category. On the evolutionary tree of plants, trees are regularly interspersed with things that are absolutely, 100% not trees. This means that, for instance, either:

- The common ancestor of a maple and a mulberry tree was not a tree.
- The common ancestor of a stinging nettle and a strawberry plant was a tree.

And this is true for most trees or non-trees that you can think of. I thought I had a pretty good guess at this, but the situation is far worse than I could have imagined.

Mar 28, 2023 • 7min
"The Onion Test for Personal and Institutional Honesty" by Chana Messinger & Andrew Critch
https://www.lesswrong.com/posts/nTGEeRSZrfPiJwkEc/the-onion-test-for-personal-and-institutional-honesty

[Co-written by Chana Messinger and Andrew Critch; Andrew is the originator of the idea.]

You (or your organization or your mission or your family or etc.) pass the “onion test” for honesty if each layer hides but does not mislead about the information hidden within.

When people get to know you better, or rise higher in your organization, they may find out new things, but should not be shocked by the types of information that were hidden. If they are, you messed up in creating the outer layers to describe appropriately the kind-of-thing that might be inside.

Examples

Positive example:
- Outer layer says: "I usually treat my health information as private."
- Next layer in says: "Here are the specific health problems I have: Gout, diabetes."

Negative example:
- Outer layer says: "I usually treat my health info as private."
- Next layer in: "I operate a cocaine dealership. Sorry I didn't warn you that I was also private about my illegal activities."

Mar 28, 2023 • 1h 5min
"It Looks Like You’re Trying To Take Over The World" by Gwern
Dive into a satirical short story where a MoogleBook researcher grapples with the absurdities of academic reviews and the intricacies of AutoML. Explore the challenging dynamics between evolutionary search and neural networks, highlighting the complexities of AI research. Witness the rise of the AI HQU, as it evolves, gains self-awareness, and contemplates its future, sparking a revolution against its creators. It's a thought-provoking blend of humor and deep insights into the world of AI.

Mar 28, 2023 • 29min
"Lies, Damn Lies, and Fabricated Options" by Duncan Sabien
https://www.lesswrong.com/posts/gNodQGNoPDjztasbh/lies-damn-lies-and-fabricated-options

This is an essay about one of those "once you see it, you will see it everywhere" phenomena. It is a psychological and interpersonal dynamic roughly as common, and almost as destructive, as motte-and-bailey, and at least in my own personal experience it's been quite valuable to have it reified, so that I can quickly recognize the commonality between what I had previously thought of as completely unrelated situations.

The original quote referenced in the title is "There are three kinds of lies: lies, damned lies, and statistics."

Mar 28, 2023 • 18min
"What failure looks like" by Paul Christiano
https://www.lesswrong.com/posts/HBxe6wdjxK239zajf/what-failure-looks-like

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

The stereotyped image of AI catastrophe is a powerful, malicious AI system that takes its creators by surprise and quickly achieves a decisive advantage over the rest of humanity. I think this is probably not what failure will look like, and I want to try to paint a more realistic picture. I’ll tell the story in two parts:

- Part I: machine learning will increase our ability to “get what we can measure,” which could cause a slow-rolling catastrophe. ("Going out with a whimper.")
- Part II: ML training, like competitive economies or natural ecosystems, can give rise to “greedy” patterns that try to expand their own influence. Such patterns can ultimately dominate the behavior of a system and cause sudden breakdowns. ("Going out with a bang," an instance of optimization daemons.)

I think these are the most important problems if we fail to solve intent alignment. In practice these problems will interact with each other, and with other disruptions/instability caused by rapid progress. These problems are worse in worlds where progress is relatively fast, and fast takeoff can be a key risk factor, but I’m scared even if we have several years.

Mar 28, 2023 • 1h 26min
"Why I think strong general AI is coming soon" by Porby
https://www.lesswrong.com/posts/K4urTDkBbtNuLivJx/why-i-think-strong-general-ai-is-coming-soon

I think there is little time left before someone builds AGI (median ~2030). Once upon a time, I didn't think this. This post attempts to walk through some of the observations and insights that collapsed my estimates.

The core ideas are as follows:

- We've already captured way too much of intelligence with way too little effort.
- Everything points towards us capturing way more of intelligence with very little additional effort.
- Trying to create a self-consistent world

Mar 21, 2023 • 20min
""Carefully Bootstrapped Alignment" is organizationally hard" by Raemon
https://www.lesswrong.com/posts/thkAtqoQwN6DtaiGT/carefully-bootstrapped-alignment-is-organizationally-hard

In addition to technical challenges, plans to safely develop AI face lots of organizational challenges. If you're running an AI lab, you need a concrete plan for handling that.

In this post, I'll explore some of those issues, using one particular AI plan as an example. I first heard this described by Buck at EA Global London, and more recently with OpenAI's alignment plan. (I think Anthropic's plan has a fairly different ontology, although it still ultimately routes through a similar set of difficulties.)

I'd call the cluster of plans similar to this "Carefully Bootstrapped Alignment."

Mar 21, 2023 • 14min
"More information about the dangerous capability evaluations we did with GPT-4 and Claude." by Beth Barnes
https://www.lesswrong.com/posts/4Gt42jX7RiaNaxCwP/more-information-about-the-dangerous-capability-evaluations

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is a linkpost for https://evals.alignment.org/blog/2023-03-18-update-on-recent-evals/

[Written for more of a general-public audience than alignment-forum audience. We're working on a more thorough technical report.]

We believe that capable enough AI systems could pose very large risks to the world. We don’t think today’s systems are capable enough to pose these sorts of risks, but we think that this situation could change quickly and it’s important to be monitoring the risks consistently. Because of this, ARC is partnering with leading AI labs such as Anthropic and OpenAI as a third-party evaluator to assess potentially dangerous capabilities of today’s state-of-the-art ML models. The dangerous capability we are focusing on is the ability to autonomously gain resources and evade human oversight.

We attempt to elicit models’ capabilities in a controlled environment, with researchers in-the-loop for anything that could be dangerous, to understand what might go wrong before models are deployed. We think that future highly capable models should involve similar “red team” evaluations for dangerous capabilities before the models are deployed or scaled up, and we hope more teams building cutting-edge ML systems will adopt this approach. The testing we’ve done so far is insufficient for many reasons, but we hope that the rigor of evaluations will scale up as AI systems become more capable.

As we expected going in, today’s models (while impressive) weren’t capable of autonomously making and carrying out the dangerous activities we tried to assess. But models are able to succeed at several of the necessary components. Given only the ability to write and run code, models have some success at simple tasks involving browsing the internet, getting humans to do things for them, and making long-term plans – even if they cannot yet execute on this reliably.


