

The Nonlinear Library
The Nonlinear Fund
The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org
Episodes

Jan 22, 2024 • 17min
AF - We need a science of evals by Marius Hobbhahn
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: We need a science of evals, published by Marius Hobbhahn on January 22, 2024 on The AI Alignment Forum.
This is a linkpost for https://www.apolloresearch.ai/blog/we-need-a-science-of-evals
In this post, we argue that if AI model evaluations (evals) want to have meaningful real-world impact, we need a "Science of Evals", i.e. the field needs rigorous scientific processes that provide more confidence in evals methodology and results.
Model evaluations allow us to reduce uncertainty about properties of neural networks and thereby inform safety-related decisions. For example, evals underpin many Responsible Scaling Policies, and future laws might directly link risk thresholds to specific evals. Thus, we need to ensure that we accurately measure the targeted property and that we can trust the results from model evaluations. This is particularly important when a decision not to deploy the AI system could lead to significant financial implications for AI companies, e.g. when these companies then fight these decisions in court.
Evals are a nascent field and we think current evaluations are not yet resistant to this level of scrutiny. Thus, we cannot trust the results of evals as much as we would in a mature field. For instance, one of the biggest challenges Language Model (LM) evaluations currently face is the model's sensitivity to the prompts used to elicit a certain capability (Liang et al., 2022; Mizrahi et al., 2023; Sclar et al., 2023; Weber et al., 2023; Bsharat et al., 2023).
Sclar et al., 2023, for example, find that "several widely used open-source LLMs are extremely sensitive to subtle changes in prompt formatting in few-shot settings, with performance differences of up to 76 accuracy points [...]". A post by Anthropic also suggests that simple formatting changes to an evaluation, such as "changing the options from (A) to (1) or changing the parentheses from (A) to [A], or adding an extra space between the option and the answer can lead to a ~5 percentage point change in accuracy on the evaluation." As an extreme example, Bsharat et al., 2023 find that "tipping a language model $300K for a better solution" leads to increased capabilities. Overall, this suggests that under current practices, evaluations are much more an art than a science.
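To make this fragility concrete, here is a minimal Python sketch of how one might measure format sensitivity on a multiple-choice eval. It is illustrative only: query_model is a hypothetical stand-in for whatever LM API is being evaluated, and the question schema is assumed.

```python
# Illustrative sketch of measuring prompt-format sensitivity (in the spirit
# of Sclar et al., 2023). `query_model` is a hypothetical stand-in for an
# actual LM API call.
def query_model(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: call your model here

FORMATS = {
    "paren":       lambda label, text: f"({label}) {text}",   # (A) ...
    "square":      lambda label, text: f"[{label}] {text}",   # [A] ...
    "extra_space": lambda label, text: f"({label})  {text}",  # extra space
}

def accuracy(questions, fmt):
    # Each question: {"stem": str, "options": {"A": str, ...}, "answer": "A"}
    correct = 0
    for q in questions:
        options = "\n".join(fmt(l, t) for l, t in q["options"].items())
        answer = query_model(f"{q['stem']}\n{options}\nAnswer:")
        correct += answer.strip().startswith(q["answer"])
    return correct / len(questions)

# scores = {name: accuracy(benchmark, fmt) for name, fmt in FORMATS.items()}
# The spread across `scores` is the instability the post is describing.
```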
Since evals often aim to estimate an upper bound of capabilities, it is important to understand how to elicit maximal rather than average capabilities. Different improvements to prompt engineering have continuously raised the bar and thus make it hard to estimate whether any particular negative result is meaningful or whether it could be invalidated by a better technique. For example, prompting techniques such as Chain-of-Thought prompting (Wei et al., 2022), Tree of Thought prompting (Yao et al., 2023), or self-consistency prompting (Wang et al., 2022) show how LM capabilities can be greatly improved with principled prompts compared to previous prompting techniques. To point to a more recent example, the newly released Gemini Ultra model (Gemini Team Google, 2023) achieved a new state-of-the-art result on MMLU with a new inference technique called uncertainty-routed chain-of-thought, outperforming even GPT-4. However, when doing inference with chain-of-thought@32 (sampling 32 results and taking the majority vote), GPT-4 still outperforms Gemini Ultra. Days later, Microsoft introduced a new prompting technique called Medprompt (Nori et al., 2023), which again yielded a new state-of-the-art result on MMLU, barely outperforming Gemini Ultra. These examples should overall illustrate that it is hard to make high-confidence statements about maximal capabilities with current evaluation techniques.
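For readers unfamiliar with the "@32" notation, here is a minimal sketch of majority-vote inference; sample_answer is a hypothetical stand-in for one sampled chain-of-thought completion reduced to its final answer.

```python
# Minimal sketch of "@32" inference: sample k chain-of-thought completions
# and return the most common final answer.
from collections import Counter

def sample_answer(question: str) -> str:
    raise NotImplementedError  # hypothetical: one sampled CoT completion

def majority_vote(question: str, k: int = 32) -> str:
    votes = Counter(sample_answer(question) for _ in range(k))
    return votes.most_common(1)[0][0]
```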
In contrast, even everyday products like shoes undergo extensive testing, such as repeated bending to assess material fatigue. For higher-stake things l...

Jan 22, 2024 • 14min
LW - legged robot scaling laws by bhauth
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: legged robot scaling laws, published by bhauth on January 22, 2024 on LessWrong.
Fiction has lots of giant walking robots. Those designs are generally considered impractical or impossible, but they've been discussed for thousands of years, so there must be something appealing about them. Let's consider, then, exactly what's impractical about large walking robots and what properties they'd have if they could be made.
practicality
Suppose you have a humanoid robot that operates in a factory. It never needs to leave the factory, so it can just sit in a wheelchair, which means it doesn't need legs, thus reducing costs. (Or you could give it tracks.) Better yet, it could just stay one place on an assembly line, so you don't even need the wheels. And then maybe it only needs one arm, so you could just take the arm.
Now you're down to 1/4 the limbs of the original robot, and the legs would've been heavier because they handle more weight. And then maybe the hand can be replaced with something much simpler, like a vacuum gripper or pincer. So the result of all the cost reduction is cheap, right? Not really; commercial robotic arms are fairly expensive. Industrial equipment does only what's necessary, and it's still expensive.
A lot of people designing stuff don't really understand costs. Large-scale production of goods has been heavily optimized, and the costs are very different from what they are for individuals. I've seen chemists who develop a lab-scale process using something expensive like palladium catalyst and expect it to be a good idea for industrial plants.
Making a giant humanoid robot wouldn't be practical, but that's part of the point. Going to the moon wasn't practical. Giant robots are difficult, so maybe they're good for developing technology and/or showing off how good the stuff you designed is.
scaling laws
"Still, it is possible to make walking machines with hydraulics; they're just slow and inefficient. So, that only makes sense where movement speed and efficiency don't matter much, but it turns out that those are usually important." - me
The scaling laws for walking animals and robots are:
mass ~= height^3
sustained_power/mass ~= height^(1/2)
walk_speed ~= height^(1/2)
run_speed ~= height^(1/2)
walk_cadence ~= height^-(1/2)
run_cadence ~= height^-(1/2)
joint_torque/mass ~= height
structural_mass/mass ~= height/material_strength
As height increases, the potential energy of falls also increases. Current humanoid robots fall over a lot during testing, but a giant robot would probably be destroyed if it fell over, and could damage property or kill someone. So, safety and reliability become more of an issue.
Now, let's use those scaling laws to go from human numbers to a giant robot.
human baseline:
height = 1.8m
mass = 75 kg
sustained_power/mass = 4 W/kg
walk_speed = 1.45 m/s
run_speed = 4 m/s
walk_cadence = 1.7/s
run_cadence = 2.4/s
giant robot:
height = 12m
mass = 22 tons
sustained_power/mass = 10.33 W/kg
sustained_power = 230 kW
walk_speed = 3.74 m/s
run_speed = 10.3 m/s
walk_cadence = 0.66 Hz
run_cadence = 0.93 Hz
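These figures follow mechanically from the human baseline and the scaling laws above; a quick Python sketch to reproduce them:

```python
# Reproducing the giant-robot numbers from the scaling laws above.
r = 12 / 1.8                         # height ratio, giant / human

mass = 75 * r**3                     # ~22,200 kg, i.e. ~22 tons
power_density = 4 * r**0.5           # ~10.33 W/kg
power = power_density * mass / 1000  # ~230 kW
walk_speed = 1.45 * r**0.5           # ~3.74 m/s
run_speed = 4 * r**0.5               # ~10.3 m/s
walk_cadence = 1.7 * r**-0.5         # ~0.66 Hz
run_cadence = 2.4 * r**-0.5          # ~0.93 Hz
```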
Some animals run faster than humans, of course. If we apply those scaling laws to ostriches, this 12m robot would have a run_speed more like 35 m/s. But humans do have some advantages over ostriches and other faster-running animals:
Humans can run long distances.
Humans can carry heavier backpacks than most animals. (But that's probably bad for you. Abolish textbooks etc etc.)
Lots of humans can reach 9 m/s while sprinting. The above numbers are for a long-distance run.
While ostriches run fast, their efficient walking speed is actually slightly slower than human walking.
Natural walking speed is related to pendulum frequency. Human leg bone length is ~50% of height. If we consider a 0.9m pendulum, its natural frequency is ~0.525/s. The center of gravity...

Jan 22, 2024 • 11min
EA - Grantmakers should give more feedback by Ariel Pontes
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Grantmakers should give more feedback, published by Ariel Pontes on January 22, 2024 on The Effective Altruism Forum.
Background
I've been actively involved in EA since 2020, when I started EA Romania. In my experience, one problem that frustrates many grant applicants is the limited feedback offered by grantmakers. In 2022, at the EAG in London, while trying to get more detailed feedback regarding my own application at the EAIF office hours, I realized that many other people had similar complaints. EAIF's response seemed polite but not very helpful. Shortly after this experience, I also read a forum post where Linch, a junior grantmaker at the time, argued that it's "rarely worth your time to give detailed feedback." The argument was:
[F]rom a grantmaking perspective, detailed feedback is rarely worthwhile, especially to rejected applicants. The basic argument goes like this: it's very hard to accurately change someone's plans based on quick feedback (and it's also quite easy to do harm if people overupdate on your takes too fast just because you're a source of funding). Often, to change someone's plans enough, it requires careful attention and understanding, multiple followup calls, etc.
And this time investment is rarely enough for you to change a rejected (or even marginal) grant to a future top grant. Meanwhile, the opportunity cost is again massive.
Similarly, giving useful feedback to accepted grants can often be valuable, but it just isn't high impact enough compared to a) making more grants, b) making grants more quickly, and c) soliciting creative ways to get more highest-impact grants out.
Since then I have heard many others complain about the lack of feedback when applying for grants in the EA space. My specific experience was with the EAIF, but based on what I've heard I have the feeling this problem might be endemic in the EA grantmaking culture in general.
The case for more feedback
Linch's argument that "the opportunity cost of giving detailed feedback is massive" is only valid if by "detailed feedback" he means something really time consuming. However, it cannot be used to justify EAIF's current policy of giving no feedback at all by default, and giving literally a one-sentence piece of feedback upon request. Using this argument to justify something so extreme would be an example of what some might call "act utilitarianism", "naive utilitarianism", or "single-level" utilitarianism: it may seem that, in certain cases, giving feedback is a waste of resources compared to other counterfactual actions. If you only consider first-order consequences, however, killing a healthy patient who came in for a checkup and using his organs to save five others is also effective. In reality, we need to also consider higher-order consequences. Is it healthy for a movement to adopt a policy of not giving feedback to grant applicants?
Personally, I feel such a policy runs the risk of seeming disrespectful towards grant applicants who spend time and energy planning projects that end up never being implemented. This is not to say that the discomfort of disappointed applicants counts more than the suffering of malaria-infected children. But we are human and there is a limit to how much we can change via emotional resilience workshops. Besides, there is such a thing as too much resilience. I have talked to other EAs who applied for funds, 1:1 advice from 80k, etc., and many of them felt frustrated and somewhat disrespected after being rejected multiple times with no feedback or explanation. I find this particularly worrisome in the case of founders of national groups, since our experience may influence the development of the local movement. There is a paragraph from an article by The Economist which I think adds to my point:
As the community has expanded, it has also become more exclusive. Conference...

Jan 22, 2024 • 11min
EA - Why I'm skeptical about using subjective time experience to assign moral weights by Andreas Mogensen
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why I'm skeptical about using subjective time experience to assign moral weights, published by Andreas Mogensen on January 22, 2024 on The Effective Altruism Forum.
This post provides a summary of my working paper "Welfare and Felt Duration." The goal is to make the content of the paper more accessible and to add context and framing for an EA audience, including a more concrete summary of practical implications. It's also an invitation for you to ask questions about the paper and/or my summary of it, to which I'll try to reply as best I can below.
What's the paper about?
The paper is about how duration affects the goodness and badness of experiences that feel good or bad. For simplicity, I mostly focus on how duration affects the badness of pain.
In some obvious sense, pains that go on for longer are worse for you. But we can draw some kind of intuitive distinction between how long something really takes and how long it is felt as taking. Suppose you could choose between two pains: one feels longer but is objectively shorter, and the other feels shorter but is objectively longer. Now the choice isn't quite so obvious. Still, some people are quite confident that you ought to choose the second: the one that feels shorter.
They think it's how long a pain feels that's important, not how long it is. The goal of the paper is to argue that that confidence isn't warranted.
Why is this important?
This issue affects the moral weights assigned to non-human animals and digital minds.
The case for thinking that subjective time experience varies across the animal kingdom is summarized in this excellent post by Jason Schukraft, which was a huge inspiration for this paper. One particular line of evidence comes from variation in the critical flicker-fusion frequency (CFF), the frequency at which a light source that's blinking on and off is perceived as continuously illuminated. Some birds and insects can detect flickering that you and I would completely miss unless we watched a slow-motion recording. That might be taken to indicate that time passes more slowly from their subjective perspective, and so, if felt duration is what matters, that suggests we should give additional weight to the lifetime welfare of those animals.
A number of people also argue that digital minds could experience time very differently from us, and here the differences could get really extreme. Because of the speed advantages of digital hardware over neural wetware, a digital mind could conceivably be run at speeds many orders of magnitude higher than the brain's own processing speed, which might again lead us to expect that time will be felt as passing much more slowly. As above, this may be taken to suggest that we should give those experiences significantly greater moral weight (see, e.g., this paper on digital minds).
What's the argument?
You can think of the argument of the paper as having three key parts.
Part 1: What is felt duration?
The first thing I want to do in the paper is emphasize that we don't really have a very strong idea of what we're talking about when we talk about the subjective experience of time. That should make us skeptical of our intuitions about the ethical importance of felt duration.
It seems clear that it doesn't matter in itself how much time you think has passed: e.g., if you think the pain went on for six minutes, but actually it lasted five. If subjective duration is going to matter, it can't be just a matter of your beliefs about time's passage. Something about the way the pain is experienced has got to be different. But what exactly? I expect you probably don't have an obvious answer to that question at your fingertips. I certainly don't. It's also worth noting that some psychologists who study time perception claim that we can't distinguish empirically between judged and felt durati...

Jan 22, 2024 • 14min
AF - InterLab - a toolkit for experiments with multi-agent interactions by Tomáš Gavenčiak
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: InterLab - a toolkit for experiments with multi-agent interactions, published by Tomáš Gavenčiak on January 22, 2024 on The AI Alignment Forum.
This post introduces InterLab, a toolkit for experiments with multi-agent interaction. We plan to release more posts on the overall project, technical details, design considerations, and concrete research projects and ideas over the next few weeks to months.
This post focuses on the motivation behind the project and touches upon more high-level considerations; if you want to see the implementation itself or start experimenting, you can jump directly to the Getting started section, to the InterLab GitHub repo with project overview and further links, or explore the example Google Colab notebook.
Motivation
The research agenda of ACS is primarily focused on understanding complex interactions of humans and AI agents, both on the individual level and between systems or institutions, and both theoretically and empirically.
The future going well depends, in our view, not just on the narrow ability to point one AI system at some well-specified goal, but broadly on the whole complex system composed of both AIs and humans developing in a way which is conducive to human flourishing. This points to a somewhat different set of questions than traditional AI safety, including problems such as "how to deal with the misalignment between individual humans, or between humans and institutions?", "how to avoid AIs amplifying conflict?", "how will institutions running on AI cognition (rather than human cognition) work?", "how do aggregate multi-agent entities evolve?", or "what happens if you replace part of human-human interactions in society with AI-AI interactions?".
While many of these questions likely require a better theoretical and conceptual understanding, it is also possible to study them empirically, using LLMs and LLM-based agents, which can also inform our models and intuitions.
We may build more comprehensive language model evaluations for near-term alignment, in particular in the direction of multi-agent evaluations - this is indeed one of the goals of InterLab.
We may learn about strategies for resolving conflicts and disagreements, and robust cooperation, as well as models of manipulation and coercion, in particular under information and power imbalances.
We may create new technologies for human coordination, cooperation and empowerment, such as negotiation aids or aids for solving internal conflicts in individual humans.
Multi-agent systems of humans and AIs come with a specific and understudied set of risks (longer report forthcoming).
Better empirical understanding of systems of interacting LLMs can help us better understand the space of intelligent systems occupied by collective intelligences and superagents.
There is some risk of over-updating our models and intuitions based on the current AI systems that needs to be taken into account, but alignment theory developed more in touch with experiments seems like a useful direction.
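As a flavor of what such experiments look like in code, here is a deliberately generic two-agent interaction loop. This is not InterLab's actual API (see the GitHub repo and Colab notebook for that); llm is a hypothetical completion function.

```python
# Generic sketch of a two-agent LLM interaction experiment. This is NOT
# InterLab's actual API; `llm` is a hypothetical text-completion function.
def llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: call your model of choice

def interact(persona_a: str, persona_b: str, opening: str, rounds: int = 4):
    transcript = [("A", opening)]
    for _ in range(rounds):
        for name, persona in (("B", persona_b), ("A", persona_a)):
            history = "\n".join(f"{n}: {msg}" for n, msg in transcript)
            reply = llm(f"{persona}\n\nConversation so far:\n{history}\n{name}:")
            transcript.append((name, reply))
    return transcript  # raw material for multi-agent evaluations and analysis
```

Structured transcripts like this are the raw material that the multi-agent evaluations and negotiation studies above would consume.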
Another intuition behind this work is the insight that sometimes it is easier to understand or predict the behavior of a system of agents as a whole and based on simplified models, rather than to e.g. model the individuals accurately and then model the system primarily as a collection of individuals. For example, modeling the flow of passengers in a metropolitan transit system is notably easier than understanding individual humans and their reasons why they move in some particular ways. (In fact, some systems in human civilization are specifically designed to avoid the outcome being too influenced by properties of individuals, e.g.
Empirical language model research and experimentation are taking off quickly both within industry and mainstream ML and other fields (social sciences, fairness) and it is hardl...

Jan 22, 2024 • 12min
EA - Rates of Criminality Amongst Giving Pledge Signatories by Ben West
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Rates of Criminality Amongst Giving Pledge Signatories, published by Ben West on January 22, 2024 on The Effective Altruism Forum.
Summary
I investigate the rates of criminal misconduct amongst people who have taken The Giving Pledge (roughly: ~200 [non-EA] billionaires who have pledged to give most of their money to charity).
I find that rates are fairly high:
25% of signatories have been accused of financial misconduct, and 10% convicted[1]
4% of signatories have spent at least one day in prison
Overall, 41% of signatories have had at least one allegation of substantial misconduct (financial, sexual, or otherwise)
I estimate that Giving Pledgers are not less likely, and possibly more likely, to commit financial crimes than YCombinator entrepreneurs.
I am unable to find evidence of The Giving Pledge doing anything to limit the risk of criminal behavior amongst its members, though I have heard second-hand that they do some sort of screening.
I conclude that the rate of criminal behavior amongst major philanthropists is high, which means that we should not expect altruism to substantially lower the risks compared to that of the general population, and that negative impacts to EA's public perception may occur independently of whether our donors actually commit crimes (e.g. because even noncriminal billionaires have a negative public image).
Methodology
I copied the list of signatories from their website.
Gina Stuessy and I searched the internet for "(name) lawsuit", "(name) crime" and also looked at their Wikipedia page.
I categorized any results into "financial", "sexual", and "other", and also marked if they had spent at least one day in jail.
Gina and I eventually decided that the data collection process was too time-consuming, and we stopped partway through. The final dataset includes 115 of the 232 signatories.[2][3]
Data can be found here.
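For reference, the headline rates above reduce to simple column means over the dataset; a minimal pandas sketch, assuming hypothetical file and column names (the actual spreadsheet schema may differ):

```python
# Sketch of how the headline rates reduce to column means. The filename and
# column names here are hypothetical, not the actual spreadsheet schema.
import pandas as pd

df = pd.read_csv("giving_pledge_signatories.csv")
for col in ["financial_allegation", "financial_conviction",
            "spent_day_in_prison", "any_allegation"]:
    print(f"{col}: {df[col].mean():.0%}")
```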
How well do convictions correspond with immoral behavior?
It is a well-worn take that our[4] legal system overly protects white-collar criminals: If an employee steals $20 from the cash register, that's a criminal offense that the police will prosecute, but if an employer under-pays their employees by $20 that's a civil offense where the police don't get involved.
I found that the punishment of the criminals in my data set correlated extremely poorly with my intuition for how immorally they had behaved.
It would be funny if it weren't sad that one of the longest prison sentences in my data set is from Kjell Inge Røkke, a Norwegian businessman who was convicted of having an illegal license for his yacht.
One particular way in which white-collar offenses are weird is that they often allow the defendant to settle without admitting wrongdoing.[5] E.g. my guess is that Philip Frost is guilty, but his settlement with the SEC does not require him to admit wrongdoing.
I wasn't able to find a single person who admitted guilt in a sexual misconduct case, despite ~7% of the signatories being accused, including in high-profile cases like people involved with Jeffrey Epstein.[6]
I was considering trying to add some classification like "Ben thinks this person is guilty" but decided that this would be too time-consuming and subjective.
Nonetheless, if you want my subjective opinion, my guess is that most of the people who were accused of financial misconduct are guilty of immoral behavior, under a commonsense-morality definition of the term.
Less controversially, some of these cases are ongoing, and presumably at least some of them will result in convictions, which makes looking only at the current conviction rate misleading.
In any case though, I believe that this data set establishes that the base rate of both criminal and immoral behavior is fairly high among major philanthropists, no matter how you slice the data.
Some Representative Case...

Jan 22, 2024 • 13min
LW - On "Geeks, MOPs, and Sociopaths" by alkjash
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On "Geeks, MOPs, and Sociopaths", published by alkjash on January 22, 2024 on LessWrong.
Hey, alkjash! I'm excited to talk about some of David Chapman's work with you. Full disclosure, I'm a big fan of Chapman's in general and also a creator within the meta/post-rationality scene with him (to use some jargon to be introduced very shortly).
You mentioned being superficially convinced of a post he wrote a while ago about how subcultures collapse called "Geeks, MOPs, and sociopaths in subculture evolution". In it he makes a few key claims that, together, give a model of how subcultures grow and decline:
Subcultures come into existence when a small group of creators start a scene (people making things for each other) and then draw a group of fanatics who support the scene. Creators and fanatics are the "geeks".
A subculture comes into existence around the scene when it gets big and popular enough to attract MOPs (members of the public). These people are fans but not fanatics. They don't contribute much other than showing up and having a good time.
If a subculture persists long enough, it attracts sociopaths who prey on the MOPs to exploit them for money, sex, etc.
Although MOPs sometimes accidentally destroy subcultures by diluting the scene too much, sociopaths reliably kill subcultures by converting what was cool about the scene into something that can be packaged and sold to MOPs as a commodity devoid of everything that made it unique and meaningful.
The main way to fight this pattern is to defend against too many MOPs overwhelming the geeks (Chapman suggests a 6:1 MOP to geek ratio) and to aggressively keep out the sociopaths.
There's also a 6th claim that we can skip for now, which is about what Chapman calls the fluid mode and the complete stance, as talking about it would require importing a lot of concepts from his hypertext book Meaningness.
To get us started, I'd be interested to know what you find convincing about his claims, and what, if anything, makes you think other models may better explain how subcultures evolve.
In my head I'm running this model against these examples: academic subfields, gaming subreddits and discords, fandoms, internet communities, and startups. Do tell me which of these count as "subcultures" in Chapman's framing. Let me start with the parts of the model I find convincing.
When subcultures grow (too) rapidly, there is an influx of casual members that dilutes the culture and some tension between the old guard and the new fans. This agrees with what I know about startups, gaming subcultures, and fandoms. It does explain the longevity of academic cultures known for our extreme gatekeeping.
In Chinese there is a saying/meme 有人的地方就是江湖, which I would loosely translate as "where there are people there is politics." It seems obvious to me that in the initial stage a subculture will be focused on object reality (e.g. a fandom focused on an anime, a subreddit focused on a video game, etc.), but as people join, politics and social reality will play a larger and larger role (competition over leadership positions, over power and influence, over abstractions like community values not directly tied to the original thing).
As the low-hanging fruits of innovation in object reality (e.g. geeks coming up with new build orders in starcraft, bloggers coming up with new rationality techniques) dry up, there is a tendency for those good at playing social reality games to gain progressively more influence.
Here are some parts that I'm not sure about, or find suspicious, or disagree with:
At least on a superficial reading there seems to be an essentialist pigeonholing of people into the Geek/Mop/Sociopath trichotomy. It seems to me more persuasive that all members of a scene have the capacity for all 3 roles, and on average the "meta" shifts as the ev...

Jan 22, 2024 • 20min
LW - Book review: Cuisine and Empire by eukaryote
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Book review: Cuisine and Empire, published by eukaryote on January 22, 2024 on LessWrong.
People began cooking our food maybe two million years ago and have not stopped since. Cooking is almost a cultural universal. Bits of raw fruit or leaves or flesh are a rare occasional treat or garnish - we prefer our meals transformed. There are other millennia-old procedures we use to turn raw ingredients into food: separating parts, drying, soaking, slicing, grinding, freezing, fermenting. We do all of this for good reason: Cooking makes food more calorically efficient and less dangerous.
Other techniques contribute to this, or help preserve food over time. Also, it tastes good.
Cuisine and Empire by Rachel Laudan is an overview of human history by major cuisines - the kind of things people cooked and ate. It is not trying to be a history of cultures, agriculture, or nutrition, although it touches on all of these things incidentally, as well as some histories of things you might not expect, like identity and technology and philosophy.
Grains (plant seeds) and roots were the staples of most cuisines. They're relatively calorically dense, storeable, and grow within a season.
Remote islands really had to make do with whatever early colonists brought with them. Not only did pre-Columbian Hawaii not have metal, they didn't have clay to make pots with! They cooked stuff in pits.
Running in the background throughout a lot of this is the clock of domestication - with enough time and enough breeding you can make some really naturally-digestible varieties out of something you'd initially have to process to within an inch of its life. It takes time, quantity, and ideally knowledge and the ability to experiment with different strains to get better breeds.
Potatoes came out of the Andes and were eaten alongside quinoa. Early potato cuisines didn't seem to eat a lot of whole or cut-up potatoes - they processed the shit out of them, chopping, drying or freeze-drying them, soaking them, reconstituting them. They had to do a lot of these because the potatoes weren't as consumer-friendly as modern breeds - less digestible composition, more phytotoxins, etc.
As cities and societies caught on, so did wealth. Wealthy people all around the world started making "high cuisines" of highly-processed, calorically dense, tasty, rare, and fancifully prepared ingredients. Meat and oil and sweeteners and spices and alcohol and sauces. Palace cooks came together and developed elaborate philosophical and nutritional theories to declare what was good to eat.
Things people nigh-universally like to eat:
Salt
Fat
Sugar
Starch
Sauces
Finely-ground or processed things
A variety of flavors, textures, options, etc
Meat
Drugs
Alcohol
Stimulants (chocolate, caffeine, tea, etc)
Things they believe are healthy
Things they believe are high-class
Pure or uncontaminated things (both morally and from, like, lead)
All people like these things, and low cuisines were not devoid of joy, but these properties showed up way more in high cuisines than low cuisines. Low cuisines tended to be a lot of grain or tubers and bits of whatever cooked or pickled vegetables or meat (often wild-caught, like fish or game) could be scrounged up.
In the classic way that oppressive social structures become self-reinforcing, rich people generally thought that rich people were better off eating this kind of diet - carefully balanced - whereas it wasn't just necessary, it was good for the poor to eat meager, boring foods. They were physically built for that. Eating a wealthy diet would harm them.
In lots of early civilizations, food and sacrifice of food was an important part of religion. Gods were attracted by offered meals or meat and good smells, and blessed harvests. There were gods of bread and corn and rice.
One thing I appreciate about this...

Jan 22, 2024 • 5min
LW - When Does Altruism Strengthen Altruism? by jefftk
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: When Does Altruism Strengthen Altruism?, published by jefftk on January 22, 2024 on LessWrong.
Joey Savoie recently wrote that Altruism Sharpens Altruism:
I think many EAs have a unique view about how one altruistic action affects the next altruistic action, something like altruism is powerful in terms of its impact, and altruistic acts take time/energy/willpower; thus, it's better to conserve your resources for these topmost important altruistic actions (e.g., career choice) and not sweat it for the other actions.
However, I think this is a pretty simplified and incorrect model that leads to the wrong choices being taken. I wholeheartedly agree that certain actions constitute a huge % of your impact. In my case, I do expect my career/job (currently running Charity Entrepreneurship) will be more than 90% of my lifetime impact. But I have a different view on what this means for altruism outside of career choices. I think that being altruistic in other actions not only does not decrease my altruism on the big choices but actually galvanizes them and increases the odds of me making an altruistic choice on the choices that really matter.
(more...)
How motivation works varies a lot between people, but I think both of these models have elements of truth and elements where they lead people in less helpful directions, mostly depending on their current situation.
An analogy: say you need to carry important heavy things. If you only rarely need to do this, then an approach of 'conserving' your strength by avoiding carrying anything but the most important things would work terribly: your strength grows as you use it. You'd do much better to often carry unimportant heavy things, growing stronger, so that when it's important you're in good shape.
On the other hand, if you're carrying important heavy things most of the day and are about as strong as you're going to get, carrying additional unimportant ones can cut into your ability to carry the important ones. And if you overload yourself you can get injured, possibly severely.
This is still a pretty simplified model, and we don't know that capacity for altruism functions analogously to muscle strength, but I do think it fits observations pretty well. Most of us probably know people who (or ourselves have):
Dove into altruism, picked up a bunch of new habits (ex: volunteering, donating blood, donating money, veganism, frugality, tutoring, composting, switching jobs, avoiding wasteful packaging, using a clothesline, adopting shelter animals, taking cold showers), and found these energizing and mutually reinforcing. While some of these are far more impactful than others, bundling some together can help build a new self-image as a more ethical and caring person. You can't practice altruistically switching jobs every day, but you can practice taking the bus.
Had an altruistic habit expand to take much more of their efforts than really made sense, or even became counterproductive. Like, much less effective at their normally-impactful work because they're unwilling to put money into prioritizing parental sleep, running into health issues around veganism, or exhausted by house drama while trying to save money living in groups.
Had altruistic habits that made sense in one situation stop making sense when their situation changed, by which point they were ingrained and hard to change. It's easier to be vegetarian in Delhi than Manila, and generally easier in urban areas than rural ones. Donating a lot makes less sense if you're altruistically-funded. Thriftiness or volunteering make less sense if they're keeping you from more valuable work.
Pushed themself too hard, and burned out.
On the other hand, just as there are far more opportunities for carrying heavy things than you could possibly take on, there are also far more opportunities for ...

Jan 21, 2024 • 10min
AF - A Shutdown Problem Proposal by johnswentworth
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A Shutdown Problem Proposal, published by johnswentworth on January 21, 2024 on The AI Alignment Forum.
Goal
First things first: this is not (yet) aimed at solving the whole corrigibility problem, or even the whole shutdown problem.
The main thing this proposal is intended to do is to get past the barriers MIRI found in their old work on the shutdown problem. In particular, in a toy problem basically identical to the one MIRI used, we want an agent which:
Does not want to manipulate the shutdown button
Does respond to the shutdown button
Does want to make any child-agents it creates responsive-but-not-manipulative to the shutdown button, recursively (i.e. including children-of-children etc)
If I understand correctly, this is roughly the combination of features which MIRI had the most trouble achieving simultaneously.
Both this proposal and the argument that it works are simple enough to be easy to extend. The main hope of this post is that other people will pick up the proposal and run with it, beyond the very simple toy problem considered here, to more complex problems (and more complex failure modes). Insofar as this proposal gets us past the previous barriers, we want to find out what the next barriers will be.
We will not be very formal here.
Setup
The agent takes actions in two timesteps. First, the agent takes an action (possibly given some observations). In between steps, a shutdown button is either pressed or not-pressed. Then, in the second timestep, the agent observes the button state (and possibly other things), then takes another action.
Note that each "action" could itself be a rather large composite of many subactions, like e.g. spinning up child-agents.
We'll assume that we have two utility functions on hand: utility function u2 incentivizes shutdown, and utility function u1 provides non-shutdown behavior. Of course we don't actually know how to construct two such utility functions; we're using these assumptions because they're basically the same assumptions used in the MIRI paper.
Very roughly speaking, we want the agent to follow utility function u1 until the shutdown button is pressed, and then follow u2. A little less roughly speaking, we want the agent to maintain optionality: if it were just to follow u1 until the shutdown button were pressed, then e.g. it might create u1-maximizing child agents which would resist shutdown, or more generally it might precommit in ways which decrease its ability to switch to u2 after the button is pressed, etc, because those things would all likely result in higher u1.
Also, of course, we want the agent to not manipulate the shutdown button, the humans deciding whether to press the button, etc.
Proposal
First key piece: we'll use a subagent architecture. The AI itself contains two expected-utility-maximizing subagents: one which pursues u1, and one which pursues u2. Each timestep, both subagents must agree on which action the AI will take, otherwise the AI takes some "default" action. (Put differently: each agent has a veto, and if either vetoes then the AI takes the default action.)
Second key piece: each of the two subagents maximizes expected utility under a different counterfactual.
First subagent maximizes E[u1 | do(button = unpressed), observations]
Second subagent maximizes E[u2 | do(button = pressed), observations]
So conceptually:
The first subagent maximizes u1, and acts as though the button will magically be unpressed in between timesteps, in a way which is not causally downstream of anything in the universe.
The second subagent maximizes u2, and acts as though the button will magically be pressed in between timesteps, in a way which is not causally downstream of anything in the universe.
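To make the structure concrete, here is a minimal Python sketch of the architecture under the stated assumptions. expected_utility is a hypothetical stand-in for the counterfactual expectations above, and the negotiation between subagents is reduced here to simple agreement.

```python
# Minimal sketch of the two-subagent veto architecture. `expected_utility`
# is a hypothetical stand-in for E[u | do(button = state), observations];
# subagent negotiation is reduced here to simple agreement.
def expected_utility(u, action, button_state, observations) -> float:
    raise NotImplementedError  # hypothetical counterfactual expectation

def choose_action(candidates, default, u1, u2, observations):
    def favorite(u, button_state):
        return max(candidates, key=lambda a:
                   expected_utility(u, a, button_state, observations))
    a1 = favorite(u1, "unpressed")  # subagent 1: button magically unpressed
    a2 = favorite(u2, "pressed")    # subagent 2: button magically pressed
    # Each subagent holds a veto: act only if both endorse the same action.
    return a1 if a1 == a2 else default
```

The full proposal assumes richer negotiation between the subagents than bare agreement, which is the infrastructure described next.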
We will assume that the AI internals include infrastructure for the two subagents to negotiate with each other, form...


