

The Nonlinear Library
The Nonlinear Fund
The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org
Episodes

Jan 30, 2024 • 2min
LW - Things You're Allowed to Do: At the Dentist by rbinnn
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Things You're Allowed to Do: At the Dentist, published by rbinnn on January 30, 2024 on LessWrong.
Inspired by Milan Cvitkovic's article, Things You're Allowed to Do.
Going to the dentist can be uncomfortable. Some amount of this is unavoidable.
Yet most dentists and staff care a lot about patient comfort. Tell them what you need, and you may very well get it!
The hardest part is figuring out what's on the menu. Below are some items that I've discovered.
Caveats:
available options may vary a lot by dentist
options sometimes have tradeoffs, which you should discuss with your dentist
You can control the suction
Every time I go to the dentist, there is a segment featuring a water hose, a suction hose, and a third or fourth bonus tool in my mouth.
I find this uncomfortable for many reasons:
the suction hose is badly positioned and water is accumulating
the suction hose hits the back of my throat and I gag and cough
I am nervously anticipating any of the above
Luckily, if I ask, I can hold the suction hose myself. I can position it exactly as needed for my comfort.
Dentists seem to like this too since it frees up one of their hands.
You can get smaller x-ray films
I have a small jaw and find the bitewing x-rays to be super large and uncomfortable. Sometimes they make me gag, and that usually means I am more likely to gag on the next try. I don't like it.
It turns out that my dentist has smaller bitewings on hand. They are designed for children but work for adults too, and I find them to be much more comfortable.
The main downside is that they might make it a bit harder for the dentist to get the specific images they want.
You can refuse polish
I don't like the feeling of having my teeth polished, and often the sickly artificial flavour gives me a headache afterward*.
Usually the stains on my teeth are mild and have already been removed during scaling**.
So do I really need to have my teeth polished?
* I find that flavourless polish also helps here.
** Some people loathe scaling and tolerate polishing. Maybe you can trade more of one for less of the other?
Misc. roundup
ask for different painkiller options to get something more personally effective or less aversive (e.g. needles)
decline painkillers to save time during mild procedures
ask for water or a tissue at any time
ask to pause for a minute
decline the crappy free toothbrush they give you at the end
ask for a free brushhead that works with the electric toothbrush you use at home
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Jan 30, 2024 • 9min
AF - The case for more ambitious language model evals by Arun Jose
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The case for more ambitious language model evals, published by Arun Jose on January 30, 2024 on The AI Alignment Forum.
Here are some capabilities that I expect to be pretty hard to discover using an RLHF'd chat LLM:
Eric Drexler tried to use the GPT-4 base model as a writing assistant, and it [...] knew who he was from what he was writing. He tried to simulate a conversation to have the AI help him with some writing he was working on, and the AI simulacrum repeatedly insisted it was by Drexler.
A somewhat well-known Haskell programmer - let's call her Alice - wrote two draft paragraphs of a blog post she wanted to write, began prompting the base model with it, and after about two iterations it generated a link to her draft blog post repo with her name.
More generally, this is a cluster of capabilities that could be described as language models inferring a surprising amount about the data-generation process that produced their prompts, such as the identity, personality, intentions, or history of a user[1].
The reason I expect most capability evals people currently run on language models to miss out on most abilities like these is primarily that they're most naturally observed in much more open-ended contexts. For instance, continuing text as the user, predicting an assistant free to do things that could superficially look like hallucinations[2], and so on. Most evaluation mechanisms people use today involve testing the ability of fine-tuned[3] models to perform a broad array of specified tasks in some specified contexts, with or without some scaffolding - a setting that doesn't lend itself very well to the kind of contexts I describe above.
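To make the contrast concrete, here is a minimal illustrative sketch (mine, not the author's) of the kind of open-ended eval described above: feed a base model a snippet written by a known author and check whether its free continuation infers who wrote it. The `complete` function is a hypothetical stand-in for whatever base-model completion call you have; nothing here is a real API.

```python
# Illustrative sketch only: an "open-ended" author-inference eval.
# `complete` is a hypothetical placeholder for a base (non-RLHF'd) model
# completion call, not a real library function.

def complete(prompt: str, max_tokens: int = 64) -> str:
    raise NotImplementedError("plug in your own base-model completion call")

def author_inference_eval(samples: list[tuple[str, str]]) -> float:
    """samples: (snippet_written_by_author, author_name) pairs.
    Returns the fraction of free continuations that mention the true author."""
    hits = 0
    for snippet, author in samples:
        # No instructions, no chat template: just raw text continuation,
        # which is where these "inferring the author" behaviours tend to show up.
        continuation = complete(snippet, max_tokens=64)
        hits += author.lower() in continuation.lower()
    return hits / len(samples)
```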
A pretty reasonable question to ask at this point is why it matters at all whether we can detect these capabilities. A position one could have here is that there are capabilities much more salient to various takeover scenarios that are more useful to try and detect, such as the ability to phish people, hack into secure accounts, or fine-tune other models. From that perspective, evals trying to identify capabilities like these are just far less important. Another pretty reasonable position is that these particular instances of capabilities just don't seem very impressive, and are basically what you would expect out of language models.
My response to the first would be that I think it's important to ask what we're actually trying to achieve with our model eval mechanisms. Broadly, I think there are two different (and very often overlapping) things we would want our capability evals[4] to be doing:
Understanding whether or not a specific model is possessed of some dangerous capabilities, or prone to acting in a malicious way in some context.
Giving us information to better forecast the capabilities of future models. In other words, constructing good scaling laws for our capability evals.
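As an illustration only (not something from the post), here is a minimal sketch of what "constructing a scaling law for a capability eval" could look like in practice. The compute figures and accuracies are made-up placeholders, and the log-linear fit is just one simple choice of functional form; sigmoidal fits are also common for saturating evals.

```python
# Minimal sketch, under stated assumptions: fit eval accuracy against
# log10(training compute) for a hypothetical model family, then extrapolate
# to larger training runs. All numbers below are invented placeholders.
import numpy as np

compute = np.array([1e21, 3e21, 1e22, 3e22, 1e23])   # training FLOP
accuracy = np.array([0.12, 0.19, 0.31, 0.42, 0.55])  # eval scores

# Simple log-linear scaling law: accuracy ~ a * log10(compute) + b
slope, intercept = np.polyfit(np.log10(compute), accuracy, deg=1)

def forecast(flop: float) -> float:
    """Extrapolated eval score for a future training run of `flop` FLOP."""
    return float(np.clip(slope * np.log10(flop) + intercept, 0.0, 1.0))

print(f"forecast at 1e24 FLOP: {forecast(1e24):.2f}")
print(f"forecast at 1e25 FLOP: {forecast(1e25):.2f}")
```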
I'm much more excited about the latter kind of capability evals, and most of my case here is directed at that. Specifically, I think that if you want to forecast what future models will be good at, then by default you're operating in a regime where you have to account for a bunch of different emergent capabilities that don't necessarily look identical to what you've already seen.
Even if you really only care about a specific narrow band of capabilities that you expect to be very likely convergent to takeover scenarios - an expectation I don't really buy as something you can very safely assume because of the uncertainty and plurality of takeover scenarios - there is still more than one way in which you can accomplish some subtasks, some of which may only show up in more powerful models.
As a concrete example, consider the task of phishing someone on the internet. One straightforward way to achieve this would be to figure out how...

Jan 29, 2024 • 10min
EA - EV investigation into Owen and Community Health by EV US Board
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: EV investigation into Owen and Community Health, published by EV US Board on January 29, 2024 on The Effective Altruism Forum.
Introduction
Last year, Owen Cotton-Barratt resigned from EV UK's board of directors following reports of sexual misconduct. Prior to his resignation, accusations of misconduct from Owen had been reported to Julia Wise at CEA's Community Health team, which is led by Nicole Ross.
EV US and EV UK jointly commissioned an independent investigation, led by the law firm Herbert Smith Freehills, into Owen's conduct and whether the Community Health team had acted appropriately with the information they had been given. Following the investigation, the boards of EV US[1] and EV UK jointly deliberated over the findings and the appropriate response.
Below, the EV boards report their determinations and actions. We considered saying nothing or sharing significantly less information but decided it was in the best interests of the community to have some information upon which to update on the behavior of Owen Cotton-Barratt and the Community Health team. Our desire for transparency was not particularly motivated by the magnitude of the findings, and was instead motivated by the relevancy of the information for informing community members' future interactions with Owen and / or Community Health, the public nature of Owen's resignation, and community norms towards transparency and accountability.
Additionally, we felt that sharing as much information as we could was particularly important because of the recent news that EV's projects are spinning out, as the boards' decisions only have an effect for projects so long as they remain part of EV. Projects will eventually set their own policies and won't have access to all of the facts we do, so we wanted to provide some information to enable the broader EA ecosystem to make better-informed decisions.
With that being said, we are constrained in how much detail we can share without risking the anonymity of the interviewees. The investigators noted that multiple interviewees made requests to protect their anonymity, and given their voluntary participation, we want to respect their wishes. We want people to continue to feel comfortable coming forward in investigations knowing that potentially identifying information will not be made public.
This means that in some cases below we present claims and board actions without all of the underlying evidence or reasoning. We recognize that this post does not have the same level of reasoning transparency we would normally aim for and think readers should update less than they would if they had as much detail as we do, but we ultimately felt like this was a reasonable middle ground to strike to allow us to share as much information with the community as possible while protecting the anonymity of interviewees.
Further details are provided in the appendix below.
Determinations regarding Owen Cotton-Barratt
The boards unanimously agree on the following:
On multiple occasions, Owen expressed sexual and / or romantic interest in women who were younger and less influential than he was. There were important power differentials between Owen and the women involved, sometimes formal and sometimes informal.
Multiple women expressed being upset by Owen's advances. Both the frequency and the content of the advances contributed to the women's feelings.
Julia Wise from CEA's Community Health Team gave Owen feedback that his behavior was inappropriate prior to some of the later instances of similar behavior.
Owen was inconsistent at acknowledging potential conflicts of interest with persons whom he expressed sexual and / or romantic interest in. He recused himself in at least one professional context, but did not seem to consistently acknowledge other potential conflicts in other instances.
In at least one case, Owen did not stop m...

Jan 29, 2024 • 4min
LW - Processor clock speeds are not how fast AIs think by Ege Erdil
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Processor clock speeds are not how fast AIs think, published by Ege Erdil on January 29, 2024 on LessWrong.
I often encounter some confusion about whether the fact that synapses in the brain typically fire at frequencies of 1-100 Hz while the clock frequency of a state-of-the-art GPU is on the order of 1 GHz means that AIs think "many orders of magnitude faster" than humans. In this short post, I'll argue that this way of thinking about "cognitive speed" is quite misleading.
The clock speed of a GPU is indeed meaningful: there is a clock inside the GPU that provides some signal that's periodic at a frequency of ~ 1 GHz. However, the corresponding period of ~ 1 nanosecond does not correspond to the timescale of any useful computations done by the GPU. For instance, in the A100 any read/write access into the L1 cache takes ~ 30 clock cycles, and this number goes up to 200-350 clock cycles for the L2 cache. These latencies add up, along with other sources of delay such as kernel setup overhead, so that a single useful operation takes far longer than one clock cycle.
The timescale for a matrix multiplication gets longer still if we also demand that it achieves something close to the peak FLOP/s performance reported in the GPU datasheet: a matrix multiplication achieving good hardware utilization can't take much shorter than ~ 100 microseconds or so.
On top of this, state-of-the-art machine learning models today consist of chaining many matrix multiplications and nonlinearities in a row. For example, a typical language model could have on the order of ~ 100 layers with each layer containing at least 2 serial matrix multiplications for the feedforward layers[1].
If these were the only places where a forward pass incurred time delays, we would obtain the result that a sequential forward pass cannot occur faster than (100 microseconds/matmul) * (200 matmuls) = 20 ms or so. At this speed, we could generate 50 sequential tokens per second, which is not too far from human reading speed. This is why you haven't seen LLMs being serviced at per token latencies that are much faster than this.
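As a rough sanity check of the arithmetic above, using the post's own order-of-magnitude estimates rather than any measurements, a few lines of Python reproduce the ~20 ms forward pass and ~50 sequential tokens per second:

```python
# Back-of-the-envelope calculation with the post's rough figures (not measured):
# per-token latency, not GPU clock speed, bounds sequential generation.

matmul_latency_s = 100e-6       # ~100 microseconds per well-utilized matmul
layers = 100                    # ~100 transformer layers
serial_matmuls_per_layer = 2    # at least 2 serial matmuls in the feedforward block

forward_pass_s = matmul_latency_s * layers * serial_matmuls_per_layer
tokens_per_second = 1.0 / forward_pass_s

print(f"forward pass: {forward_pass_s * 1e3:.0f} ms")   # ~20 ms
print(f"sequential tokens/s: {tokens_per_second:.0f}")  # ~50
```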
We can, of course, process many requests at once in these 20 milliseconds: the bound is not that we can generate only 50 tokens per second, but that we can generate only 50 sequential tokens per second, meaning that the generation of each token needs to know what all the previously generated tokens were. It's much easier to handle requests in parallel, but that has little to do with the "clock speed" of GPUs and much more to do with their FLOP/s capacity.
The human brain is estimated to do the computational equivalent of around 1e15 FLOP/s. This performance is on par with NVIDIA's latest machine learning GPU (the H100) and the brain achieves this performance using only 20 W of power compared to the 700 W that's drawn by an H100.
In addition, each forward pass of a state-of-the-art language model today likely takes somewhere between 1e11 and 1e12 FLOP, so the computational capacity of the brain alone is sufficient to run inference on these models at speeds of 1k to 10k tokens per second. There's, in short, no meaningful sense in which machine learning models today think faster than humans do, though they are certainly much more effective at parallel tasks because we can run them on clusters of multiple GPUs.
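The same kind of back-of-the-envelope calculation, again using the post's order-of-magnitude figures rather than measured values, gives the 1k to 10k tokens per second claim:

```python
# Sketch of the post's throughput comparison (order-of-magnitude estimates only).

brain_flops = 1e15                      # ~1e15 FLOP/s, roughly an H100's ML throughput
flop_per_forward_pass = (1e11, 1e12)    # per-token cost of a current large model

for flop in flop_per_forward_pass:
    print(f"{flop:.0e} FLOP/token -> {brain_flops / flop:,.0f} tokens/s")
# ~10,000 tokens/s at 1e11 FLOP/token, ~1,000 tokens/s at 1e12 FLOP/token
```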
In general, I think it's more sensible for discussion of cognitive capabilities to focus on throughput metrics such as training compute (units of FLOP) and inference compute (units of FLOP/token or FLOP/s).
If all the AIs in the world are doing orders of magnitude more arithmetic operations per second than all the humans in the world (8e9 people * 1e15 FLOP/s/person = 8e24 FLOP/s is a big number!) we have a good case for saying that the cognition of AIs has become faster than t...

Jan 29, 2024 • 3min
EA - What roles do GWWC and Founders Pledge play in growing the EA donor pool? by BrownHairedEevee
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What roles do GWWC and Founders Pledge play in growing the EA donor pool?, published by BrownHairedEevee on January 29, 2024 on The Effective Altruism Forum.
I'm curious about the roles of Giving What We Can and Founders Pledge in driving donations to EA causes.
Both organizations play a similar role in persuading people to give money to highly effective causes, but it seems to me that GWWC focuses on getting a large number of average-income people to donate relatively small amounts of money, whereas Founders Pledge focuses on getting a smaller number of entrepreneurs to donate large amounts of money.
I think that both types of movement growth are important in different ways. On the one hand, having even a small number of large donors means we have a lot of funding, which allows us to make a great impact. (Even with the collapse of FTX in 2022, there is still a chance that the EA movement could have more billionaire backers by 2027 than it does now.) On the other hand, a large number of donors means there are a large number of individuals engaging in the philosophy and practice of EA, which helps spread the ideas of EA and demonstrate its accessibility.
What are your opinions on how GWWC and FP's roles in generating movement growth compare and contrast? Which kind of movement growth is more important for the EA movement right now?
Appendix: Relevant statistics
GWWC and FP's membership numbers:
GWWC has over 9,400 individuals with active pledges as of January 28, 2024[1]
Founders Pledge has 1,767 members as of 2022, per their latest impact report[2]
Amounts pledged:
FP: "$1.3 billion pledged to charity from 80 new members"[2]
GWWC estimates that $83 million of lifetime value will be generated from the new pledges taken in 2020-2022 (a three year period), or an average of $27 million of lifetime value from new pledges per year.[3]
Giving multiplier: GWWC estimates that it generates $30 for every $1 invested in its operations. I couldn't find an estimate of FP's giving multiplier effect, but I think it would be useful for comparison.
[1] Our members - GWWC
[2] 2022 Impact Report - Founders Pledge
[3] 2020-2022 Impact evaluation - GWWC
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Jan 29, 2024 • 6min
LW - Why I take short timelines seriously by NicholasKees
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why I take short timelines seriously, published by NicholasKees on January 29, 2024 on LessWrong.
I originally started writing this as a message to a friend, to offer my personal timeline takes. It ended up getting kind of long, so I decided to pivot toward making this into a post.
These are my personal impressions gathered while doing a bachelors and a masters degree in artificial intelligence, as well as working for about a year and a half in the alignment space.
AI (and AI alignment) has been the center of my attention for a little over 8 years now. For most of that time, if you asked me about timelines I'd gesture at an FHI survey that suggested a median timeline of 2045-2050, and say "good chance it happens in my lifetime." When I thought about my future in AI safety, I imagined that I'd do a PhD, become a serious academic, and by the time we were getting close to general intelligence I would already have a long tenure of working in AI (and be well placed to help).
I also imagined that building AI would involve developing a real "science of intelligence," and I saw the work that people at my university (University of Groningen) were doing as pursuing this great project. People there were working on a wide range of machine learning methods (of which neural networks were just one idea), logic, knowledge systems, theory of mind, psychology, robotics, linguistics, social choice, argumentation theory, etc. I heard very often that "neural networks are not magic," and was encouraged to embrace an interdisciplinary approach to understanding how intelligence worked (which I did).
At the time, there was one big event that caused a lot of controversy: the success of AlphaGo (2016). To a lot of people, including myself, this seemed like "artificial intuition." People were not very impressed with the success of DeepBlue in chess, because this was "just brute force" and this would obviously not scale. Real intelligence was about doing more than brute force. AlphaGo was clearly very different, though everyone disagreed on what the implications were. Many of my professors bet really hard against deep learning continuing to succeed, but over and over again they were proven wrong. In particular I remember OpenAI Five (2017/2018) as being an extremely big deal in my circles, and people were starting to look at OpenAI as potentially changing everything.
There was this other idea that I embraced, which was something adjacent to Moravec's paradox: AI would be good at the things humans are bad at, and vice versa. It would first learn to do a range of specialized tasks (which would be individually very impressive), gradually move toward more human-like systems, and the very last thing it would learn to do was master human language. This particular idea about language has been around since the Turing test: mastering language would require general, human-level intelligence. If you had told me there would be powerful language models in less than a decade, I would have been quite skeptical.
When GPT happened, this dramatically changed my future plans.
GPT-2 and especially GPT-3 were both extremely unnerving to me (though mostly exciting to all my peers). This was, in my view:
"mastering language", which was not supposed to happen until we were very close to human level
demonstrating general abilities. I can't overstate how big of a deal this was. GPT-2 could correctly use newly invented words, do some basic math, and a wide range of unusual things that we now call in-context learning. There was nothing even remotely close to this anywhere else in AI, and people around me struggled to understand how this was even possible.
a result of scaling. When GPT-3 came out, this was especially scary, because they hadn't really done anything to improve upon the design of GPT-2, they just made it bigger....

Jan 28, 2024 • 2min
LW - Palworld development blog post by bhauth
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Palworld development blog post, published by bhauth on January 28, 2024 on LessWrong.
Palworld is currently the most-played game on Steam. It's by a small Japanese company called Pocketpair.
Shortly before Palworld released, the CEO put up this blog post; here's a translation. Here are some points I thought were interesting:
on the co-founder:
It took me three years to quit JP Morgan, where I joined as a new graduate. He quit after only a month. The more talented people are, the sooner they leave the company.
on one of the animators:
Looking for someone in Japan who had experience with guns in games, he looked on twitter and found someone posting gun reloading animations. Contacting this person, it turned out they were a 20-year-old high school dropout working part-time at a convenience store in Hokkaido. Pocketpair hired him as a remote employee for a month, then asked him to come to Tokyo. His parents thought it must definitely be some kind of scam, but he went, and did a lot of different animation work, and was a very effective employee.
on the main character designer:
She was rejected during the initial resume screening. A few months later, when they tried recruiting again, she DM'd him on twitter and ended up being hired. In the meantime, she'd applied to about 100 companies and was rejected by all of them.
And she now draws most of the characters in Palworld. She is a new graduate, and she applied to nearly 100 companies but was rejected by all of them. (...) She doesn't like to use the word genius, but she might be a genius.
I thought that post indicated some interesting things about typical hiring processes, credential evaluation, and how effectively society is utilizing talent.
Typically, people say that the market is mostly efficient, and if there was financial alpha to be gained by doing hiring differently from most corporations, then there would already be companies outcompeting others by doing that. Well, here's a company doing some things differently and outcompeting other companies. Maybe there aren't enough people willing to do such things (who have the resources to) for the returns to reach an equilibrium?
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Jan 28, 2024 • 20min
LW - Epistemic Hell by rogersbacon
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Epistemic Hell, published by rogersbacon on January 28, 2024 on LessWrong.
I.
From Scott Alexander's review of Joe Henrich's The Secret of our Success:
In the Americas, where manioc was first domesticated, societies who have relied on bitter varieties for thousands of years show no evidence of chronic cyanide poisoning. In the Colombian Amazon, for example, indigenous Tukanoans use a multistep, multi-day processing technique that involves scraping, grating, and finally washing the roots in order to separate the fiber, starch, and liquid. Once separated, the liquid is boiled into a beverage, but the fiber and starch must then sit for two more days, when they can then be baked and eaten.
Such processing techniques are crucial for living in many parts of Amazonia, where other crops are difficult to cultivate and often unproductive. However, despite their utility, one person would have a difficult time figuring out the detoxification technique. Consider the situation from the point of view of the children and adolescents who are learning the techniques. They would have rarely, if ever, seen anyone get cyanide poisoning, because the techniques work.
And even if the processing was ineffective, such that cases of goiter (swollen necks) or neurological problems were common, it would still be hard to recognize the link between these chronic health issues and eating manioc. Most people would have eaten manioc for years with no apparent effects. Low cyanogenic varieties are typically boiled, but boiling alone is insufficient to prevent the chronic conditions for bitter varieties. Boiling does, however, remove or reduce the bitter taste and prevent the acute symptoms (e.g., diarrhea, stomach troubles, and vomiting).
So, if one did the common-sense thing and just boiled the high-cyanogenic manioc, everything would seem fine. Since the multistep task of processing manioc is long, arduous, and boring, sticking with it is certainly non-intuitive. Tukanoan women spend about a quarter of their day detoxifying manioc, so this is a costly technique in the short term.
Now consider what might result if a self-reliant Tukanoan mother decided to drop any seemingly unnecessary steps from the processing of her bitter manioc. She might critically examine the procedure handed down to her from earlier generations and conclude that the goal of the procedure is to remove the bitter taste. She might then experiment with alternative procedures by dropping some of the more labor-intensive or time-consuming steps.
She'd find that with a shorter and much less labor-intensive process, she could remove the bitter taste. Adopting this easier protocol, she would have more time for other activities, like caring for her children. Of course, years or decades later her family would begin to develop the symptoms of chronic cyanide poisoning.
Thus, the unwillingness of this mother to take on faith the practices handed down to her from earlier generations would result in sickness and early death for members of her family. Individual learning does not pay here, and intuitions are misleading. The problem is that the steps in this procedure are causally opaque - an individual cannot readily infer their functions, interrelationships, or importance. The causal opacity of many cultural adaptations had a big impact on our psychology.
Scott continues:
Humans evolved to transmit culture with high fidelity. And one of the biggest threats to transmitting culture with high fidelity was Reason. Our ancestors lived in epistemic hell, where they had to constantly rely on causally opaque processes with justifications that couldn't possibly be true, and if they ever questioned them then they might die. Historically, Reason has been the villain of the human narrative, a corrosive force that tempts people away from adaptive behavio...

Jan 28, 2024 • 8min
LW - Don't sleep on Coordination Takeoffs by trevor
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Don't sleep on Coordination Takeoffs, published by trevor on January 28, 2024 on LessWrong.
It's important to remember that the culture we grew up in is deeply nihilistic at its core. People expect Moloch, assume Moloch as a given, even defer to Moloch. If you read enough about business and international affairs (not news articles, those don't count, not for international affairs at least, I don't know about business), and then read about dath ilan, it becomes clear that our world is ruled by Moloch cultists who nihilistically optimized for career advancement.
Humans are primates; we instinctively take important concepts and turn them into dominance/status games, including that concept itself; resulting in many people believing that important concepts do not exist at all.
So it makes sense that Moloch would be an intensely prevalent part of our civilization, even ~a century after decision theory took off and ~4 centuries after mass literacy took off.
One of the first attempts by lots of people to get together and build a really big movement to enlighten and reform the world was the Counter Culture movement starting in the 60s, which overlapped with the Vietnam Antiwar movement and the Civil Rights movement.
The Counter Culture movement failed because its members were mainly inept teens and 20-somethings: they not only lacked knowledge of decision theory, economics, or Sequence-level understanding of heuristics/biases, but also lived in a world where social psychology and thinking-about-society were still in their infancy. Like the European Enlightenment and the French Revolution before them, they started out profoundly confused about the direction to aim for and the correct moves to make (see Anna Salamon's Humans are not Automatically Strategic).
The Antiwar movement permanently damaged the draft-based American military apparatus, permanently made western culture substantially more cosmopolitan than the conformist 1950s, but their ignorance and ineptitude and blunders were so immense that they shrank the Overton window on people coming together and choosing to change the world for the better.
As soon as lots of people acquired an incredibly primitive version of the understandings now held by the EA, rationalist, and AI safety communities, those people started the Counter Culture movement of the 1960s in order to raise the sanity waterline above the deranged passivity of the 1950s conformist culture. And they botched it so hard, in so many ways, that everyone now cringes at the memory; the Overton window on changing the world was fouled up, perhaps intractably. Major governments and militaries also became predisposed to nip similar movements in the bud, for example through the use of AI technology to psychologically disrupt groups of highly motivated people.
Since then, there hasn't been a critical mass behind counter culture or societal reform, other than Black Lives Matter, the Women's March, Occupy Wall Street, and the Jan 6th Riots, which only got that many people by heavily optimizing for memetic spread among the masses via excessively simple messages and by prevailing on already-popular sentiment such as post-2008 anger at banking institutions, and likely only got that far due to the emergence of the social media paradigm (which governments are incentivized to hijack).
Game theory didn't take off until the 1950s, when it was basically absorbed by the US military, just like how economics was absorbed by the contemporary equivalent of Wall Street (and remains absorbed to this day). I'm pretty sure that the entire 20th century came and went with nearly none of them spending an hour a week thinking about solving the coordination problems facing the human race, so that the world could be better for them and their children.
Even though virtually all of them would prefer to live ...

Jan 27, 2024 • 3min
LW - Aligned AI is dual use technology by lc
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Aligned AI is dual use technology, published by lc on January 27, 2024 on LessWrong.
Humans are mostly selfish most of the time. Yes, many of us dislike hurting others, are reliable friends and trading partners, and care genuinely about those we have personal relationships with. Despite this, spontaneous strategic altruism towards strangers is extremely rare. The median American directs exactly $0 to global poverty interventions, and that is a true statement regardless of whether you limit it to the Americans that make ten, fifty, a hundred, a thousand times as much money as Nigerians.
Some people hope that with enough tech development we will eventually reach a "post-scarcity" regime where people have so much money that there is a global commons of resources people can access largely to their hearts' content. But this has always sounded to me like a 1023 AD peasant hoping that in 2023, Americans will be so rich that no one outside America will die of a preventable disease. There will always be more for people with money to consume; even in the limits of global wealth, the free energy or resources that a person could devote to helping poor people or defending them from abuse could also be devoted to extending a personal lifespan before heat death.
So in keeping with this long tradition of human selfishness, it sounds likely that if we succeed at aligning AI, the vast, vast majority of its output will get directed toward satisfying the preferences and values of the people controlling it (or possessing leverage over its continued operation) - not the "CEV of all humans", let alone the "CEV of all extant moral persons". A person deciding to use their GPUs to optimize for humanity's betterment would be the equivalent of a person hiring a maid for humanity instead of their own home; it's simply not what you expect people to do in practice, effective altruists aside.
In practice, I expect the people controlling aligned AI to direct much of its output toward things like:
Extracting any significant extant resources from the remainder of people vulnerable to manipulation or coercion.
Creating new people of moral value to serve as romantic partners, friends, and social subordinates.
Getting admiration, prestige, and respect from legacy humans, possibly to extreme degrees, possibly in ways we would dislike upon reflection.
Engineering new worlds where they can "help" or "save" others, depending on the operational details of their ethics.
In this scenario the vast majority of beings of moral worth spread across the galaxy are not the people the AIs are working to help. They're the things that surround those people, because those oligarchs enjoy their company. And it doesn't take a genius to see why that might be worse overall than just paperclipping this corner of the cosmos, depending on who's in charge and what their preferences for "company" are, how they react to extreme power, or how much they care about the internal psychology of their peers.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org


