The Nonlinear Library

The Nonlinear Fund
Jan 4, 2024 • 5min

EA - Project ideas for making transformative AI go well, other than by working on alignment by Lukas Finnveden

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Project ideas for making transformative AI go well, other than by working on alignment, published by Lukas Finnveden on January 4, 2024 on The Effective Altruism Forum.

This is a series of posts with lists of projects that it could be valuable for someone to work on. The unifying theme is that they are projects that:

- Would be especially valuable if transformative AI is coming in the next 10 years or so.
- Are not primarily about controlling AI or aligning AI to human intentions.[1] Most of the projects would be valuable even if we were guaranteed to get aligned AI. Some of the projects would be especially valuable if we were inevitably going to get misaligned AI.

The posts contain some discussion of how important it is to work on these topics, but not a lot. For previous discussion (especially: discussing the objection "Why not leave these issues to future AI systems?"), you can see the section "How ITN are these issues?" from my previous memo on some neglected topics.

The lists are definitely not exhaustive. Failure to include an idea doesn't necessarily mean I wouldn't like it. (Similarly, although I've made some attempts to link to previous writings when appropriate, I'm sure to have missed a lot of good previous content.)

There's a lot of variation in how sketched out the projects are. Most of the projects just have some informal notes and would require more thought before someone could start executing. If you're potentially interested in working on any of them and you could benefit from more discussion, I'd be excited if you reached out to me![2]

There's also a lot of variation in the skills needed for the projects. If you're looking for projects that are especially suited to your talents, you can search the posts for any of the following tags (including brackets): [ML] [Empirical research] [Philosophical/conceptual] [survey/interview] [Advocacy] [Governance] [Writing] [Forecasting]

The projects are organized into the following categories (which are in separate posts). Feel free to skip to whatever you're most interested in.

Governance during explosive technological growth

It's plausible that AI will lead to explosive economic and technological growth. Our current methods of governance can barely keep up with today's technological advances. Speeding up the rate of technological growth by 30x+ would cause huge problems and could lead to rapid, destabilizing changes in power. This section is about trying to prepare the world for this: either generating policy solutions to problems we expect to appear, or addressing the meta-level problem of how we can coordinate to tackle this in a better and less rushed manner. A favorite direction is to develop norms/proposals for how states and labs should act under the possibility of an intelligence explosion.

Epistemics

This is about helping humanity get better at reaching correct and well-considered beliefs on important issues. If AI capabilities keep improving, AI could soon play a huge role in our epistemic landscape. I think we have an opportunity to affect how it's used: increasing the probability that we get great epistemic assistance and decreasing the extent to which AI is used to persuade people of false beliefs.
A couple of favorite projects are: create an organization that gets started with using AI for investigating important questions, or develop & advocate for legislation against bad persuasion.

Sentience and rights of digital minds

It's plausible that there will soon be digital minds that are sentient and deserving of rights. This raises several important issues that we don't know how to deal with. It seems tractable both to make progress in understanding these issues and in implementing policies that reflect this understanding. A favorite direction is to take existing ideas for what labs could be doing and spell ou...
Jan 3, 2024 • 14min

EA - Research summary: farmed yellow mealworm welfare by abrahamrowe

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Research summary: farmed yellow mealworm welfare, published by abrahamrowe on January 3, 2024 on The Effective Altruism Forum.

This post is a short summary of a peer-reviewed, open-access publication on yellow mealworm welfare in the Journal of Insects as Food and Feed. The paper and supplemental information can be accessed here. The original paper was written by Meghan Barrett, Rebekah Keating Godfrey, Alexandra Schnell, and Bob Fischer; the research conducted in the paper was funded by Rethink Priorities. This post was written by Abraham Rowe and reviewed by Meghan Barrett. Unless cited otherwise, all information is derived from the Barrett et al. 2023 publication.

Summary

As of 2020, around 300 billion yellow mealworms (Tenebrio molitor) are farmed annually (though recent estimates now put this figure at over 3 trillion individuals (Pells, 2023)). Barrett et al. 2023 is the first publication to consider species-specific welfare concerns for farmed mealworms. The authors identify 15 current and future welfare concerns, including more pressing current concerns such as:

- Disease - Bacterial, fungal, protist, and viral pathogens can cause sluggishness, tissue damage, slowed growth, increased susceptibility to other diseases, and even mass-mortality events.
- High larval rearing densities - Density can cause a range of negative effects, including increased cannibalism and disease, higher chances of heat-related death, competition over food leading to malnutrition, and behavioral restriction near pupation.
- Inadequate larval nutrition - This may result from not providing enough protein in the animals' largely grains-based diet.
- Light use during handling - Photophobic adults and larvae may experience significant stress due to light use during handling.
- Slaughter methods - While we have high empirical uncertainty about the relative harms of slaughter methods, it is clear that some approaches to slaughter and depopulation on farms are more harmful than others.

Future concerns that haven't yet been realized on farms include:

- Novel, potentially toxic, or inadequate feed substrates - Polymers (like plastics) and mycotoxin-contaminated grains may be more likely to be used in the future.
- Selective breeding and genetic modification - In vertebrate animals, selective breeding has caused a large number of welfare issues. The same might be expected to become true for mealworms.

Current rearing and slaughter practices

Yellow mealworms are the larval instars of a species of darkling beetle, Tenebrio molitor. Larvae go through a number of molts prior to pupation, which can take between a few months and two years depending on nutrition and abiotic conditions. Mealworms take up to 20 days to pupate. After pupating, the emerged adult beetles will mate within 3-5 days. Mealworms are a popular insect to farm for food due to their rapid growth, high nutrient content, and ease of handling. Adults are typically only used for breeding, while large larvae are sold as food and feed. Mealworms typically consume decaying grains, but have been reported to eat a wide variety of other foods in certain circumstances (including dead insects, other mealworms, and decaying wood). In farmed conditions, larval mealworms are fed a diet of 70%-85% cereals and other carbohydrates, and may be provided with supplementary protein, fruit, or vegetables.
Mealworms are reared in stackable crates, usually with screened bottoms to allow frass (insect excrement) to fall through and not accumulate. Mealworms may be reared in up to 24-hour darkness, as they are photophobic. Insects bound for slaughter are collected at around 100 mg. Prior to slaughter, insects are sieved out of the substrate, washed (to remove frass and other waste from the exterior surface of their bodies), and prevented from eating for up to two days (ca...
Jan 3, 2024 • 26min

AF - What's up with LLMs representing XORs of arbitrary features? by Sam Marks

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What's up with LLMs representing XORs of arbitrary features?, published by Sam Marks on January 3, 2024 on The AI Alignment Forum.

Thanks to Clément Dumas, Nikola Jurković, Nora Belrose, Arthur Conmy, and Oam Patel for feedback.

In the comments of the post on Google Deepmind's CCS challenges paper, I expressed skepticism that some of the experimental results seemed possible. When addressing my concerns, Rohin Shah made some claims along the lines of "If an LLM linearly represents features a and b, then it will also linearly represent their XOR, a⊕b, and this is true even in settings where there's no obvious reason the model would need to make use of the feature a⊕b."[1]

For reasons that I'll explain below, I thought this claim was absolutely bonkers, both in general and in the specific setting that the GDM paper was working in. So I ran some experiments to prove Rohin wrong. The result: Rohin was right and I was wrong. LLMs seem to compute and linearly represent XORs of features even when there's no obvious reason to do so. I think this is deeply weird and surprising. If something like this holds generally, I think this has importance far beyond the original question of "Is CCS useful?"

In the rest of this post I'll:

- Articulate a claim I'll call "representation of arbitrary XORs (RAX)": LLMs compute and linearly represent XORs of arbitrary features, even when there's no reason to do so.
- Explain why it would be shocking if RAX is true. For example, without additional assumptions, RAX implies that linear probes should utterly fail to generalize across distributional shift, no matter how minor the distributional shift. (Empirically, linear probes often do generalize decently.)
- Present experiments showing that RAX seems to be true in every case that I've checked.
- Think through what RAX would mean for AI safety research: overall, probably a bad sign for interpretability work in general, and work that relies on using simple probes of model internals (e.g. ELK probes or coup probes) in particular.
- Make some guesses about what's really going on here.

Overall, this has left me very confused: I've found myself simultaneously having (a) an argument that A implies not-B, (b) empirical evidence of A, and (c) empirical evidence of B. (Here A = RAX and B = other facts about LLM representations.)

The RAX claim: LLMs linearly represent XORs of arbitrary features, even when there's no reason to do so

To keep things simple, throughout this post, I'll say that a model linearly represents a binary feature f if there is a linear probe out of the model's latent space which is accurate for classifying f; in this case, I'll denote the corresponding direction as v_f. This is not how I would typically use the terminology "linearly represents" - normally I would reserve the term for a stronger notion which, at minimum, requires the model to actually make use of the feature direction when performing cognition involving the feature[2]. But I'll intentionally abuse the terminology here because I don't think this distinction matters much for what I'll discuss.

If a model linearly represents features a and b, then it automatically linearly represents a∧b (AND) and a∨b (OR). However, a⊕b (XOR) is not automatically linearly represented - no linear probe in the figure above would be accurate for classifying a⊕b.
Thus, if the model wants to make use of the feature a⊕b, then it needs to do something additional: allocate another direction[3] (more model capacity) to representing a⊕b, and also perform the computation of a⊕b so that it knows what value to store along this new direction. The representation of arbitrary XORs (RAX) claim, in its strongest form, asserts that whenever a LLM linearly represents features a and b, it will also linearly represent a⊕b. Concretely, this might look something like: in layer 5, the model computes and linearly r...
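To see concretely why a⊕b is the odd one out (a toy sketch of my own, not an experiment from the post): if a and b each get their own latent coordinate, a brute-force search over linear probes classifies a∧b and a∨b perfectly but tops out at 75% accuracy on a⊕b.

```python
# Toy illustration (not from the post): with features a and b stored along
# separate latent coordinates, linear probes can classify a AND b / a OR b
# perfectly, but no linear probe achieves 100% on a XOR b.
import itertools
import numpy as np

# Latent states for the four (a, b) combinations: one coordinate per feature.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

labels = {
    "a AND b": np.array([0, 0, 0, 1]),
    "a OR b":  np.array([0, 1, 1, 1]),
    "a XOR b": np.array([0, 1, 1, 0]),
}

def best_linear_probe_accuracy(X, y, grid=np.linspace(-2, 2, 41)):
    """Brute-force the best accuracy any probe sign(w.x + c) can reach."""
    best = 0.0
    for w1, w2, c in itertools.product(grid, repeat=3):
        preds = (X @ np.array([w1, w2]) + c > 0).astype(int)
        best = max(best, (preds == y).mean())
    return best

for name, y in labels.items():
    print(f"{name}: best linear probe accuracy = {best_linear_probe_accuracy(X, y):.2f}")

# Prints 1.00 for AND and OR, but only 0.75 for XOR: representing the XOR
# requires allocating a new direction and explicitly computing its value.
```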
Jan 3, 2024 • 5min

AF - Safety First: safety before full alignment. The deontic sufficiency hypothesis. by Chipmonk

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Safety First: safety before full alignment. The deontic sufficiency hypothesis., published by Chipmonk on January 3, 2024 on The AI Alignment Forum.

It could be the case that these two goals are separable and independent:

- "AI safety": avoiding existential risk, s-risk, and actively negative outcomes
- "AI getting-everything-we-want" (CEV)

This is what Davidad calls the Deontic Sufficiency Hypothesis. If the hypothesis is true, it should be possible to de-pessimize and mitigate the urgent risk from AI without necessarily ensuring that AI creates actively positive outcomes, because, for safety, it is only necessary to ensure that actively harmful outcomes do not occur. And hopefully this is easier than achieving "full alignment". Safety first! We can figure out the rest later.

Quotes from Davidad's The Open Agency Architecture plans

This is Davidad's plan with the Open Agency Architecture (OAA).

A list of core AI safety problems and how I hope to solve them (2023 August)

1.1. First, instead of trying to specify "value", instead "de-pessimize" and specify the absence of a catastrophe, and maybe a handful of bounded constructive tasks like supplying clean water. A de-pessimizing OAA would effectively buy humanity some time, and freedom to experiment with less risk, for tackling the CEV-style alignment problem - which is harder than merely mitigating extinction risk.

Davidad's Bold Plan for Alignment: An In-Depth Explanation - LessWrong (2023 April)

Deontic Sufficiency Hypothesis: This hypothesis posits that it is possible to identify desiderata that are adequate to ensure the model doesn't engage in undesirable behavior. Davidad is optimistic that it's feasible to find desiderata ensuring safety for a few weeks before a better solution is discovered, making this a weaker approach than solving outer alignment. For instance, Davidad suggests that even without a deep understanding of music, you can be confident your hearing is safe by ensuring the sound pressure level remains below 80 decibels. However, since the model would still be executing a pivotal process with significant influence, relying on a partial solution for decades could be risky.

Getting traction on the deontic feasibility [sic] hypothesis

Davidad believes that using formalisms such as Markov Blankets would be crucial in encoding the desiderata that the AI should not cross boundary lines at various levels of the world-model. We only need to "imply high probability of existential safety", so according to davidad, "we do not need to load much ethics or aesthetics in order to satisfy this claim (e.g. we probably do not get to use OAA to make sure people don't die of cancer, because cancer takes place inside the Markov Blanket, and that would conflict with boundary preservation; but it would work to make sure people don't die of violence or pandemics)". Discussing this hypothesis more thoroughly seems important.

An Open Agency Architecture for Safe Transformative AI (2022 December)

Deontic Sufficiency Hypothesis: There exists a human-understandable set of features of finite trajectories in such a world-model, taking values in (−∞, 0], such that we can be reasonably confident that all these features being near 0 implies high probability of existential safety, and such that saturating them at 0 is feasible[2] with high probability, using scientifically-accessible technologies.
I am optimistic about this largely because of recent progress toward formalizing a natural abstraction of boundaries by Critch and Garrabrant. I find it quite plausible that there is some natural abstraction property Q of world-model trajectories that lies somewhere strictly within the vast moral gulf of "All Principles That Human CEV Would Endorse" ≻ Q ≻ "Don't Kill Everyone".

AI Neorealism: a threat model & success criterion for existential safety (2022...
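As a loose illustration of the shape of this formalism (my own sketch with hypothetical feature names and thresholds, not anything from Davidad's writing): each desideratum is a feature of a finite trajectory taking values in (−∞, 0], and safety means every feature saturates near 0.

```python
# Hypothetical sketch: safety desiderata as trajectory features valued in (-inf, 0],
# where 0 means "fully satisfied". Names and thresholds are illustrative only.
from typing import Callable, Dict, List

Trajectory = List[dict]  # a trajectory is a sequence of world states

def hearing_safety(traj: Trajectory) -> float:
    # Davidad's example: hearing is safe if sound pressure stays below 80 dB.
    # The feature is 0 when satisfied and goes negative by the worst violation.
    worst = max(state["sound_db"] for state in traj)
    return min(0.0, 80.0 - worst)

def boundary_preservation(traj: Trajectory) -> float:
    # Illustrative: count of boundary crossings, mapped into (-inf, 0].
    return -float(sum(state["boundary_violations"] for state in traj))

DESIDERATA: Dict[str, Callable[[Trajectory], float]] = {
    "hearing_safety": hearing_safety,
    "boundary_preservation": boundary_preservation,
}

def is_acceptably_safe(traj: Trajectory, tolerance: float = -1e-6) -> bool:
    """Deontic sufficiency: all features near 0 => high probability of safety."""
    return all(f(traj) >= tolerance for f in DESIDERATA.values())

example = [{"sound_db": 62.0, "boundary_violations": 0},
           {"sound_db": 74.5, "boundary_violations": 0}]
print(is_acceptably_safe(example))  # True: every desideratum saturates at 0
```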
Jan 3, 2024 • 7min

EA - Apply now to CE's second Research Training Program by CE

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Apply now to CE's second Research Training Program, published by CE on January 3, 2024 on The Effective Altruism Forum.

What we have learned from our pilot and when our next program is happening

TL;DR: We are excited to announce the second round of our Research Training Program. This online program is designed to equip participants with the tools and skills needed to identify, compare, and recommend the most effective charities, interventions, and organisations. It is a full-time (35 hours per week), fully cost-covered program that will run remotely for 12 weeks. [APPLY HERE]

Deadline for application: January 28, 2024. The program dates are April 15 - July 5, 2024. If you progress to the last stage of the application process, you will receive a final decision by the 15th of March at the latest. Please let us know if you need a decision before that date.

What have we learned from our pilot?

The theory of change for the Research Training Program has three outputs: helping train people to switch into impactful research positions, creating intervention reports that influence CE's decisions about which new organisations to start, and creating evaluations that help organisations have the most impact and help funders make impact-maximising decisions. We have outlined what we have learned about each of these aspects below.

Intervention reports: For eight out of the eleven weeks, the sixteen research fellows investigated fifteen cause areas, created forty-six shallow reviews, and wrote twenty-two deep dives, five of which have already been published on the EA forum (find them here). Although we are planning some changes to improve the fellows' experience in the program, we are deeply impressed by these results and look forward to replicating them with a slightly different approach.

People: Since the program ended only a couple of weeks ago, it is too early to tell what career switches will happen because of the program. We have some early and very promising results, with two research fellows already having made career changes that we consider highly impactful. If you are currently hiring and are interested in applicants with intervention prioritisation skills, please contact us.

Charity evaluations: Traditionally, Charity Entrepreneurship has focused most of its research on investigating the potential impact of interventions. We believe that more impact-focused accountability is essential for the sector, and we would like to support the evaluated organisations and funders in making more informed decisions. This is why, at the end of the program, the research fellows focused on writing charity evaluations in group projects. We were too confident in our timelines and are planning a major restructuring of this part of the program. However, we are happy that three evaluations could be shared directly with the evaluated organisations. We are looking forward to learning from other evaluators in the space.

What will the next program look like?

Content: The program will start with a week providing an overview of the most important research skills. The program's first part will then focus on writing cause area reports in groups, in which fellows take a problem and identify the most promising solutions. Afterwards, the fellows investigate those most promising ideas through a shallow review.
After conducting a shallow review, research fellows will evaluate the most promising interventions through a deep dive, which will be polished and published, and influence decision-making within Charity Entrepreneurship and beyond. After these reports are published, there will be some time to think about careers and apply to different opportunities, before jumping into some charity evaluations that can influence the decisions of funders as well as strategic decisions within the evaluated o...
Jan 3, 2024 • 12min

EA - My Experience Donating Blood Stem Cells, or: Why You Should Join a Bone Marrow Registry by Silas Strawn

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My Experience Donating Blood Stem Cells, or: Why You Should Join a Bone Marrow Registry, published by Silas Strawn on January 3, 2024 on The Effective Altruism Forum.

Note: I'm not a doctor. Please don't make decisions about your health based on an EA forum post before at least talking with a physician or other licensed healthcare practitioner.

TLDR: I donated blood stem cells in early 2021. Immediately prior, I had been identified as the best match for someone in need of a bone marrow transplant, likely with leukemia, lymphoma, or a similar condition. Although the first attempt to collect my blood stem cells failed, my experience was overwhelmingly positive as well as fulfilling on a personal level. The foundation running the donation took pains to make it as convenient as possible - and free, other than my time. I recovered quickly and have had no long-term issues related to the donation[1]. I would encourage everyone to at least do the cheek swab to join the registry if they are able. Visit this page to join the Be The Match registry.

This post was prompted - very belatedly - by a comment from "demost_" on Scott Alexander's post about his experience donating a kidney[2]. The commenter was speculating about the differences between bone marrow donation and kidney donation[3]. I'm typically a lurker, but I figured this is a case where I actually do have something to say[4]. According to demost_, fewer than 1% of those on the bone marrow registry get matched, so my experience is relatively rare. I checked and couldn't find any other forum posts about being a blood stem cell or bone marrow donor. I hope to shine a light on what the experience is like as a donor. I know EAs are supposed to be motivated by cold, hard facts and rationality, so this post may stick out since it's recounting a personal experience[5]. Nevertheless, given how close-to-home matters of health are, I figured this could be useful for those considering joining the registry or donating.

My Donation Experience

I joined the registry toward the end of my college years. I don't recall the exact details, but I've pieced together the timeline from my email archives. Be The Match got my cheek swab sample in December 2019 and I officially joined the registry in January 2020. If you're a university student (at least in America[6]), there's a good chance that at some point there will be a table in your commons or quad where volunteers will be offering cheek swabs to join the bone marrow donor registry. The whole process takes a few minutes and I'd encourage everyone to at least join the registry if they can.

In mid-December 2020, I was matched and started the donation process. For the sake of privacy, they don't tell you anything about the recipient at that point beyond the vaguest possible demographic info. I think they told me the gender and an age range, but nothing besides. demost_ supposed that would-be donors should be more moved to donate bone marrow than kidneys, since there's a particular, identifiable person in need (and marrow is much more difficult to match, so you're less replaceable as a donor). I can personally attest to this. Even though I didn't know much about the recipient at all, I felt an extreme moral obligation to see the process through. I knew that my choice to donate could make a massive difference to this person.
I imagined how I would feel if it were a friend or loved one in need or even myself. The minor inconveniences of donating felt doubly minor next to the weight of someone's life. As a college student, I had a fluid schedule. I was also fortunate that my distributed systems professor was happy to let me defer an exam scheduled for the donation date. To their credit, Be The Match offered not only to compensate any costs associated with the donation, but also to replace any wages missed...
Jan 3, 2024 • 21min

EA - Why EA should (probably) fund ceramic water filters by Bernardo Baron

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why EA should (probably) fund ceramic water filters, published by Bernardo Baron on January 3, 2024 on The Effective Altruism Forum.

Epistemic status: after researching for more than 80 hours each, we are moderately certain that ceramic filters (CFs) can be more cost-effective than chlorination to prevent waterborne diseases in at least some - and possibly many - LMICs. We are less certain of the real size of the effects from CFs, and of how some factors like household size affect the final cost-effectiveness.

At least 1.7 billion people globally used drinking water sources contaminated with feces in 2022, leading to significant health risks from waterborne enteric infections. According to the Global Burden of Disease (GBD) 2019 study, more than 2.5% of total DALYs lost that year were linked to unsafe water consumption - and there is some evidence that this burden may be even bigger. This makes improving access to clean water a particularly pressing problem in the Global Health and Development area.

As a contribution toward targeting this problem, we have put together a report on ceramic water filters as a potential intervention to improve access to safe water in low- and middle-income countries. It was written during our time as research fellows at Charity Entrepreneurship's Research Training Program (Fall 2023). In this post, we summarize the main findings of the report. Nonetheless, we invite people interested in the subject to check out the full report, which provides much more detail on each topic we outline here.

Key takeaways:

- There are several (controlled, peer-reviewed) studies that link the distribution of ceramic filters to less frequent episodes of diarrhea in LMICs. Those studies have been systematically reviewed and graded low to medium quality.
- Existing evidence supports the hypothesis that ceramic filters are even more effective than chlorination at reducing diarrhea episodes. However, percentage reductions here should be taken with a grain of salt due to lack of masking and to self-report and publication biases.
- Despite limitations in current evidence, we are cautiously optimistic that ceramic filters can be more cost-effective than chlorination, especially in countries where diarrheal diseases are primarily caused by bacteria and protozoa (and not by viruses). Average household sizes can also play a role, but we are less certain about the extent to which this is true.
- We provide a Geographic Weighted Factor Model and a country-specific back-of-the-envelope analysis of the cost-effectiveness for a hypothetical charity that wants to distribute free ceramic filters in LMICs. Our central scenario for the cost-effectiveness of the intervention in the top prioritized country (Nigeria) is $8.47 per DALY averted.
- We ultimately recommend that EA donors and meta-organizations invest at least some resources in the distribution of ceramic filters, either by starting new charities in this area, or by supporting existing, non-EA organizations that already have lots of expertise in how to manufacture, distribute, and monitor the usage of the filters.

Why ceramic filters?

There are plenty of methods to provide access to safe(r) water in very low-resource settings.
Each of these has pros and cons, but ceramic filters stand out for being cheap to make, easy to install and operate, effective at improving health, and durable (they are said to last for a minimum of 2 years). In short, a ceramic filter is a combination of a porous ceramic element and a receptacle for the filtered water (usually made of plastic). Water is manually poured into the ceramic part and flows through its pores due to gravity. Since the pores are very small, they let water pass, but physically block bigger particles - including bacteria, protozoa and sediments - from passing....
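To give a feel for what a back-of-the-envelope estimate like the one above involves, here is a hypothetical sketch; every number below is an illustrative placeholder, not a figure from the report:

```python
# Hypothetical back-of-the-envelope sketch of cost per DALY averted for free
# ceramic filter distribution. All numbers are illustrative placeholders,
# NOT the report's figures.
cost_per_filter_usd = 25.0        # manufacture + distribution + monitoring
filter_lifetime_years = 2.0       # filters are said to last at least 2 years
people_per_household = 5.0        # average household size matters a lot here
diarrheal_dalys_per_person_year = 0.01   # placeholder burden in target country
relative_risk_reduction = 0.4     # placeholder effect of filtration on burden
usage_adherence = 0.6             # placeholder rate of correct, sustained use

dalys_averted = (filter_lifetime_years * people_per_household
                 * diarrheal_dalys_per_person_year
                 * relative_risk_reduction * usage_adherence)
print(f"Cost per DALY averted: ${cost_per_filter_usd / dalys_averted:.2f}")

# A real model also weighs mortality effects, discounting, and country-level
# burden and pathogen mix, which can change the answer by orders of magnitude.
```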
Jan 3, 2024 • 3min

LW - Trading off Lives by jefftk

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Trading off Lives, published by jefftk on January 3, 2024 on LessWrong.

Let's say someone proposes that to reduce deaths from overly chaotic airplane evacuations we ban passenger distractions during the most dangerous parts: takeoff and landing. How could we decide whether a ban like this would be worth it?

The argument for the ban is that the safe window for evacuating a plane can be very narrow, and evacuation could potentially go better if everyone were alert. For example, in the 2005 AF358 disaster the plane was completely on fire within ~3min of landing. While I think the benefit of a ban would likely be even smaller, let's assume that global adoption of a ban would cause an average of one fewer person a year to die.

On the other side, there's the cost of ~10min of boredom, for every passenger, on every flight. Instead of playing games, watching movies, or reading, people would mostly be talking, looking out the window, or staring off into space.

One common reaction is to say that on one side of this ledger we have someone's life, while on the other side we have a bit of boredom, so of course we should go with the policy that saves lives. Is there any amount of minor boredom that could equal a life?

Many of us have a sense that there are some kinds of tradeoffs that you just shouldn't make, such as accepting deaths in exchange for reducing inconvenience. If you take that perspective seriously, however, you'll have somewhat fewer deaths and unbearable levels of inconvenience. We could prohibit radios in cars because the music and adjustment can lead to collisions. Set the highway speed limit to 25mph. Ban cars entirely since they're more dangerous than walking and public transport. Require an N95 indoors at all times. Ban paternosters. Limit swimming pools to 3ft deep.

In our normal lives we make these kinds of tradeoffs all the time, for example in deciding whether to drive somewhere: you have about a 1 in a million chance of dying ("one micromort") for each 175mi in a car. Thinking through this kind of more normal tradeoff can give better intuitions for approaching more unusual ones like airline policies; let's try that here.

There are ~9B passengers annually, so one fewer death would save the average passenger ~0.0001 micromorts at a cost of ~10min of boredom. Is that a good trade? Imagine you were choosing between two potential ~10min car journeys: one being 6mi, and one being 200ft shorter but you're not allowed to use your phone, read a book, listen to music, etc. I think nearly everyone would choose the extra 200ft, no? At one micromort per 175mi, avoiding 200ft saves you ~0.0002 micromorts. This is ~2x what we're positing travelers would save by making a similar trade on planes. If you wouldn't give up 10min of reading to save 200ft in a car, it's probably not worth doing to make flight safer either.

Comment via: facebook, mastodon

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
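The post's arithmetic, as a quick sketch (using only the figures quoted above: ~9B passengers, one death averted per year, and 175 mi per micromort):

```python
# Reproducing the post's back-of-the-envelope arithmetic.
MILES_PER_MICROMORT = 175          # ~1-in-a-million chance of death per 175 mi driven
passengers_per_year = 9e9          # ~9B airline passengers annually
deaths_averted_per_year = 1        # the posited benefit of the ban

# Risk saved per passenger by the hypothetical ban, in micromorts.
ban_saving = deaths_averted_per_year / passengers_per_year * 1e6
print(f"Ban saves ~{ban_saving:.4f} micromorts per passenger")    # ~0.0001

# Risk saved by shortening a car trip by 200 feet.
feet_per_mile = 5280
car_saving = (200 / feet_per_mile) / MILES_PER_MICROMORT
print(f"200 ft less driving saves ~{car_saving:.6f} micromorts")  # ~0.000216

print(f"Car saving is ~{car_saving / ban_saving:.1f}x the ban's saving")  # ~1.9x
```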
Jan 3, 2024 • 47min

EA - How We Plan to Approach Uncertainty in Our Cost-Effectiveness Models by GiveWell

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How We Plan to Approach Uncertainty in Our Cost-Effectiveness Models, published by GiveWell on January 3, 2024 on The Effective Altruism Forum.

Author: Adam Salisbury, Senior Research Associate

Summary

In a nutshell

We've received criticism from multiple sources that we should model uncertainty more explicitly in our cost-effectiveness analyses. These critics argue that modeling uncertainty, via Monte Carlos or other approaches, would keep us from being fooled by the optimizer's curse[1] and have other benefits.

Our takeaways:

- We think we're mostly addressing the optimizer's curse already by skeptically adjusting key model inputs, rather than taking data at face value. However, that's not always true, and we plan to take steps to ensure we're doing this more consistently.
- We also plan to make sensitivity checks on our parameters and on bottom-line cost-effectiveness a more routine part of our research. We think this will help surface potential errors in our models and have other transparency and diagnostics benefits.
- Stepping back, we think taking uncertainty more seriously in our work means considering perspectives beyond our model, rather than investing more in modeling. This includes factoring in external sources of evidence and sense checks, expert opinion, historical track records, and qualitative features of organizations.

Ways we could be wrong:

- We don't know if our parameter adjustments and approach to addressing the optimizer's curse are correct. Answering this question would require comparing our best guesses to "true" values for parameters, which we typically don't observe.
- Though we think there are good reasons to consider outside-the-model perspectives, we don't have a fully formed view of how to bring qualitative arguments to bear across programs in a consistent way. We expect to consider this further as a team.

What is the criticism we've received?

In our cost-effectiveness analyses, we typically do not publish uncertainty analyses that show how sensitive our models are to specific parameters, or uncertainty ranges on our bottom-line cost-effectiveness estimates. We've received multiple critiques of this approach:

- Noah Haber argues that, by not modeling uncertainty explicitly, we are subject to the optimizer's curse. If we take noisy effect sizes, burden, or cost estimates at face value, then the programs that make it over our cost-effectiveness threshold will be those that got lucky draws. In aggregate, this would make us biased toward more uncertain programs. To remedy this, he recommends that (i) we quantify uncertainty in our models by specifying distributions on key parameters and then running Monte Carlo simulations, and (ii) we base decisions on a lower bound of the distribution (e.g., the 20th percentile).
- Others[2] have argued we're missing out on other benefits that come from specifying uncertainty. By not specifying uncertainty on key parameters or bottom-line cost-effectiveness, we may be missing opportunities to prioritize research on the parameters to which our model is most sensitive, and to be fully transparent about how uncertain our estimates are. (more)

What do we think about this criticism?

We think we're mostly guarding against the optimizer's curse by skeptically adjusting key inputs in our models, but we have some room for improvement.
The optimizer's curse would be a big problem if we, e.g., took effect sizes from study abstracts or charity costs at face value, plugged them into our models, and then just funded programs that penciled above our cost-effectiveness bar. We don't think we're doing this. For example, in our vitamin A supplementation cost-effectiveness analysis (CEA), we apply skeptical adjustments to treatment effects to bring them closer to what we consider plausible. In our CEAs more broadly, we triangulate our cost e...
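As a toy illustration of the optimizer's curse and of why skeptical adjustment helps (my own sketch, not GiveWell's actual model or numbers): rank many programs by noisy estimates and the "winner" is systematically overestimated; shrinking estimates toward a prior before ranking removes most of that bias.

```python
# Toy Monte Carlo illustration of the optimizer's curse (not GiveWell's model).
import numpy as np

rng = np.random.default_rng(0)
n_programs, n_trials = 50, 10_000
true_ce = rng.normal(5.0, 1.0, size=n_programs)    # true cost-effectiveness

naive_gap, shrunk_gap = [], []
for _ in range(n_trials):
    noise = rng.normal(0.0, 2.0, size=n_programs)
    estimate = true_ce + noise                     # noisy measured estimates

    # Naive: fund the program with the highest raw estimate.
    best = np.argmax(estimate)
    naive_gap.append(estimate[best] - true_ce[best])

    # Skeptical: shrink estimates toward the prior mean before choosing.
    # Shrinkage factor = prior_var / (prior_var + noise_var) = 1 / (1 + 4).
    shrunk = 5.0 + (estimate - 5.0) * (1.0 / (1.0 + 4.0))
    best_s = np.argmax(shrunk)
    shrunk_gap.append(shrunk[best_s] - true_ce[best_s])

print(f"naive winner overestimated by    {np.mean(naive_gap):+.2f}")
print(f"shrunken winner overestimated by {np.mean(shrunk_gap):+.2f}")
```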
Jan 2, 2024 • 7min

LW - AI Is Not Software by Davidmanheim

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Is Not Software, published by Davidmanheim on January 2, 2024 on LessWrong.

Epistemic Status: This idea is, I think, widely understood in technical circles. I'm trying to convey it more clearly to a general audience. Edit: See related posts like this one by Eliezer for background on how we should use words.

What we call AI in 2024 is not software. It's kind of natural to put it in the same category as other things that run on a computer, but thinking about LLMs, or image generation, or deepfakes as software is misleading, and confuses most of the ethical, political, and technological discussions. This seems not to be obvious to many users, but as AI gets more widespread, it's especially important to understand what we're using when we use AI.

Software

Software is how we get computers to work. When creating software, humans decide what they want the computer to do, think about what would make the computer do that, and then write an understandable set of instructions in some programming language. A computer is given those instructions, and they are interpreted or compiled into a program. When that program is run, the computer will follow the instructions in the software, and produce the expected output, if the program is written correctly.

Does software work? Not always, but if not, it fails in ways that are entirely determined by the human's instructions. If the software is developed properly, there are clear methods to check each part of the program. For example, unit tests are written to verify that the software does what it is expected to do in different cases. The set of cases is specified in advance, based on what the programmer expected the software to do. If it fails a single unit test, the software is incorrect, and should be fixed. When changes are wanted, someone with access to the source code can change it, and recreate the software based on the new code.

Given that high-level description, it might seem like everything that runs on a computer must be software. In a certain sense, it is, but thinking about everything done with computers as software is unhelpful or misleading. This essay was written on a computer, using software, but it's not software. And the difference between what is done on a computer and what we tell a computer to do with software is obvious in cases other than AI. Once we think about what computers do, and what software is, we shouldn't confuse "on a computer" with software.

Not Software

For example, photos of a wedding or a vacation aren't software, even if they are created, edited, and stored using software. When photographs are not good, we blame the photographer, not the software running on the camera. We don't check if the photography or photo editing worked properly by rerunning the software, or building unit tests. When photographs are edited or put into an album, it's the editor doing the work. If it goes badly, the editor chose the wrong software, or used it badly - it's generally not the software malfunctioning. If we lose the photographs, it's almost never a software problem. And if we want new photographs, we're generally out of luck - it's not a question of fixing the software. There's no source code to rerun. Having a second wedding probably shouldn't be the answer to bad or lost photographs.
And having a second vacation might be nice, but it doesn't get you photos of the first vacation. Similarly, a video conference runs on a computer, but the meeting isn't software - software is what allows it to run. A meeting can go well, or poorly, because of the preparation or behavior of the people in the meeting. (And that isn't the software's fault!) The meeting isn't specified by a programming language, doesn't compile into bytecode, and there aren't generally unit tests to check if the meeting went well. And when we want to ...
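To ground the "Software" half of the contrast above, here is a minimal, hypothetical example (mine, not the post's) of the contract the essay describes: behavior fully specified by human-written instructions, checked against pre-specified cases.

```python
# Minimal example of the "software" contract described above: the behavior is
# fully determined by the programmer's instructions, and unit tests check
# expectations that were specified in advance.
import unittest

def fahrenheit_to_celsius(f: float) -> float:
    # The instructions are explicit; any deviation is a bug in these lines.
    return (f - 32) * 5 / 9

class TestConversion(unittest.TestCase):
    def test_freezing_point(self):
        self.assertAlmostEqual(fahrenheit_to_celsius(32), 0.0)

    def test_boiling_point(self):
        self.assertAlmostEqual(fahrenheit_to_celsius(212), 100.0)

if __name__ == "__main__":
    unittest.main()  # a single failing test means the software is incorrect
```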
