The Nonlinear Library

The Nonlinear Fund
Jan 17, 2024 • 46min

LW - Medical Roundup #1 by Zvi

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Medical Roundup #1, published by Zvi on January 17, 2024 on LessWrong. Saving up medical and health-related stories from several months allowed for much better organizing of them, so I am happy I split these off. I will still post anything more urgent on a faster basis. There's lots of things here that are fascinating and potentially very important, but I've had to prioritize and focus elsewhere, so I hope others pick up various torches. Vaccination Ho! We have a new malaria vaccine. That's great. WHO thinks this is not an especially urgent opportunity, or any kind of 'emergency', and so wants to wait for months before actually putting shots into arms. So what if we also see reports like 'cuts infant deaths by 13%'? WHO doing WHO things, WHO Delenda Est and all that. What can we do about this? Also, EA and everyone else who works in global health needs to do a complete post-mortem of how this was allowed to take so long, and why they couldn't or didn't do more to speed things along. There are in particular claims that the 2015-2019 delay was due to lack of funding, despite a malaria vaccine being an Open Phil priority. Saloni Dattani, Rachel Glennerster and Siddhartha Haria write about the long road for Works in Progress. They recommend future use of advance market commitments, which seems like a no-brainer first step. We also have an FDA-approved vaccine for chikungunya. Oh, and also we invented a vaccine for cancer, a huge boost to melanoma treatment. Katalin Kariko and Drew Weissman win the Nobel Prize for mRNA vaccine technology. Rarely are such decisions this easy. Worth remembering that, in addition to denying me admission despite my status as a legacy, the University of Pennsylvania also refused to allow Kariko a tenure-track position, calling her 'not of faculty quality,' and laughed at her leaving for BioNTech, especially given that they now refer to this as 'Penn's historic research team.' Did you also know that Katalin's advisor threatened to have her deported if she switched labs, and attempted to follow through on that threat? I also need to note the deep disappointment in Elon Musk, who even a few months ago was continuing to throw shade on the Covid vaccines. And what do we do more generally about the fact that there are quite a lot of takes that one has reason to be nervous to say out loud, seem likely to be true, and also are endorsed by the majority of the population? When we discovered all the vaccines. Progress continues. We need to go faster. Reflections on what happened with medical start-up Alvea. They proved you could move much faster on vaccine development than anyone would admit, but then found that there was insufficient commercial or philanthropic demand for doing so to make it worth everyone's time, so they wound down. As an individual and as a civilization, you get what you pay for. Potential Progress Researchers discover what they call an on/off switch for breast cancer. Not clear yet how to use this to help patients. London hospital uses competent execution on basic 1950s operations management, increases surgical efficiency by a factor of about five. Teams similar to a Formula 1 pit crew cut sterilization times from 40 minutes to 2. One room does anesthesia on the next patient while the other operates on the current one. 
There seems to be no reason this could not be implemented everywhere, other than lack of will? Dementia rates down 13% over the past 25 years, for unclear reasons. Sarah Constantin explores possibilities for cognitive enhancement. We have not yet tried many of the things one would try. We found a way to suppress specific immune reactions, rather than having to suppress immune reactions in general, opening up the way to potentially fully curing a whole host of autoimmune disorders. Yes, in mice, of course it's in mice, so don't ge...
Jan 17, 2024 • 11min

LW - Why wasn't preservation with the goal of potential future revival started earlier in history? by Andy McKenzie

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why wasn't preservation with the goal of potential future revival started earlier in history?, published by Andy McKenzie on January 17, 2024 on LessWrong. Cross-posted from my blog, Neurobiology Notes. John Hunter (1728-1793) did not have an especially promising start to his academic life. He was born the youngest of 10 children to a family living in the countryside near Glasgow. They lived in a two-bedroom cottage and the children slept in box beds that were pulled out of the walls every night. He was stubborn, hated school, did not like to be taught reading or writing, would skip classes whenever he could, and quit formal education altogether at 13, the same year his father died. He said that he "totally rejected books," instead preferring to gain practical knowledge first hand. He spent his time helping with the family farm. When he was 20, he made the fateful decision to join his brother William Hunter's anatomy school in London as an assistant. Over the decades that followed, he would go on to demonstrate that maternal and fetal circulations are separate, invent the technique of proximal ligation to treat aneurysms, either inoculate himself or someone else with venereal disease purely in the name of science, coordinate the first documented artificial insemination, propose the gradual formation of new species due to random variations 70 years before Darwin, create a school providing lectures in physiology, make enemies with all of the other surgeons at his hospital, almost die when he was attacked by one of his many exotic animals, amass a huge collection of specimens that he spent nearly all his money on and that remains in London today, and become the person widely considered the founder of modern scientific surgery. [A photo from the Hunterian Museum in London.] I learned this all from Wendy Moore's excellent biography of John Hunter, The Knife Man. Although I'm a closet Anglophile, the main reason I picked this book up is because Hunter also seems to have been one of the first people, if not the first person, to seriously research suspended animation. Suspended animation is a hypothetical procedure in which a person or other animal could be preserved for a long period of time in a way that is known to be reversible, allowing for reanimation at the time of one's choosing. Suspended animation is not the same as cryonics, because in cryonics, it is not known whether the preservation will ever be reversible, so a cryonics procedure relies on the possibility of bootstrapped advances in future technology that might allow reversibility. Hunter was interested in suspended animation for a number of reasons, including because he was interested in the dividing line between life and death, and because he thought it might make him rich. He also thought that it might be practically useful: Till this time I had imagined that it might be possible to prolong life to any period by freezing a person in the frigid zone, as I thought all action and waste would cease until the body was thawed. I thought that if a man would give up the last ten years of his life to this kind of alternate oblivion and action, it might be prolonged to a thousand years; and by getting himself thawed every hundred years, he might learn what had happened during his frozen condition. In 1766, Hunter performed an experiment to test this. He placed two carp in a glass vessel with water. 
He then kept adding cold snow to the vessel. At first the snow repeatedly melted, but eventually the water around the fish froze. He thawed them slowly, but found they did "not recover action, so that they were really dead." Benjamin Franklin had similar ideas. In the cryonics community, Franklin's remarkable letter to a friend in 1773 is kind of famous: I have seen an instance of common flies preserved in a manner somewhat similar. They had been ...
Jan 17, 2024 • 4min

EA - EA Infrastructure Fund Ask Us Anything (January 2024) by Tom Barnes

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: EA Infrastructure Fund Ask Us Anything (January 2024), published by Tom Barnes on January 17, 2024 on The Effective Altruism Forum. The EA Infrastructure Fund (EAIF) is running an Ask Us Anything! This is a time where EAIF grantmakers have set aside some time to answer questions on the Forum. I (Tom) will aim to answer most questions next weekend (~January 20th), so please submit questions by the 19th. Please note: We believe the next three weeks are an especially good time to donate to EAIF, because: We continue to face significant funding constraints, leading to many great projects going either unfunded or underfunded. Your donation will be matched at a 2:1 ratio until Feb 2. EAIF has ~$2m remaining in available matching funds, meaning that (unlike LTFF) this match is unlikely to be utilised without your support. If you agree, you can donate to us here. About the Fund The EA Infrastructure Fund aims to increase the impact of projects that use the principles of effective altruism, by increasing their access to talent, capital, and knowledge. Over 2022 and H1 2023, we made 347 grants totalling $13.4m in disbursements. You can see our public grants database here. Related posts EA Infrastructure Fund's Plan to Focus on Principles-First EA LTFF and EAIF are unusually funding-constrained right now EA Funds organizational update: Open Philanthropy matching and distancing EA Infrastructure Fund: June 2023 grant recommendations What do Marginal Grants at EAIF Look Like? Funding Priorities and Grantmaking Thresholds at the EA Infrastructure Fund About the Team Tom Barnes: Tom is currently a Guest Fund Manager at EA Infrastructure Fund (previously an Assistant Fund Manager since ~Oct 2022). He also works as an Applied Researcher at Founders Pledge, currently on secondment to the UK Government to work on AI policy. Previously, he was a visiting fellow at Rethink Priorities, and was involved in EA uni group organizing. Caleb Parikh: Caleb is the project lead of EA Funds. Caleb has previously worked on global priorities research as a research assistant at GPI, EA community building (as a contractor to the community health team at CEA), and global health policy. Caleb currently leads EAIF as interim chair. Linchuan Zhang: Linchuan (Linch) Zhang currently works full-time at EA Funds. He was previously a Senior Researcher at Rethink Priorities working on existential security research. Before joining RP, he worked on time-sensitive forecasting projects around COVID-19. Previously, he programmed for Impossible Foods and Google and has led several EA local groups. Ask Us Anything We're happy to answer any questions - marginal uses of money, how we approach grants, questions/critiques/concerns you have in general, what reservations you have as a potential donor or applicant, etc. There's no hard deadline for questions, but I would recommend submitting by the 19th of January, as I aim to respond from the 20th. As a reminder, we remain funding-constrained, and your donation will be matched (for every $1 you donate, EAIF will receive $3). Please consider donating! If you have projects relevant to building up the EA community's infrastructure, you can also apply for funding here. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
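To make the matching arithmetic concrete, here is a small illustrative calculation (not official EAIF code). It assumes the 2:1 match applies dollar-for-dollar until the ~$2m matching pool is exhausted; the function name and cap behavior are our own assumptions.

    def total_received(donation: float, match_pool: float = 2_000_000) -> float:
        """Total EAIF receives under a 2:1 match: $2 matched per $1 donated,
        capped by the remaining matching pool (assumed behavior)."""
        match = min(2 * donation, match_pool)
        return donation + match

    print(total_received(1_000))      # 3000.0 -> every $1 donated becomes $3
    print(total_received(1_500_000))  # 3500000.0 -> match capped by the pool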
Jan 17, 2024 • 53min

LW - Being nicer than Clippy by Joe Carlsmith

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Being nicer than Clippy, published by Joe Carlsmith on January 17, 2024 on LessWrong. (Cross-posted from my website. Podcast version here, or search "Joe Carlsmith Audio" on your podcast app. This essay is part of a series I'm calling "Otherness and control in the age of AGI." I'm hoping that the individual essays can be read fairly well on their own, but see here for a summary of the essays that have been released thus far.) In my last essay, I discussed a certain kind of momentum, in some of the philosophical vibes underlying the AI risk discourse,[1] towards deeming more and more agents - including: human agents - "misaligned" in the sense of: not-to-be-trusted to optimize the universe hard according to their values-on-reflection. We can debate exactly how much mistrust to have in different cases, here, but I think the sense in which AI risk issues can extend to humans, too, can remind us of the sense in which AI risk is substantially (though, not entirely) a generalization and intensification of the sort of "balance of power between agents with different values" problem we already deal with in the context of the human world. And I think it may point us towards guidance from our existing ethical and political traditions, in navigating this problem, that we might otherwise neglect. In this essay, I try to gesture at a part of these traditions that I see as particularly important: namely, the part that advises us to be "nicer than Clippy" - not just in what we do with spare matter and energy, but in how we relate to agents-with-different-values more generally. Let me say more about what I mean. Utilitarian vices As many have noted, Yudkowsky's paperclip maximizer looks a lot like a total utilitarian. In particular, its sole aim is to "tile the universe" with a specific sort of hyper-optimized pattern. Yes, in principle, the alignment worry applies to goals that don't fit this schema (for example: "cure cancer" or "do god-knows-whatever kludge of weird gradient-descent-implanted proxy stuff"). But somehow, especially in Yudkowskian discussions of AI risk, the misaligned AIs often end up looking pretty utilitarian-y, and a universe tiled with something - and in particular, "tiny-molecular-blahs" - often ends up seeming like a notably common sort of superintelligent Utopia. What's more, while Yudkowsky doesn't think human values are utilitarian, he thinks of us (or at least, himself) as sufficiently galaxy-eating that it's easy to round off his "battle of the utility functions" narrative into something more like a "battle of the preferred-patterns" - that is, a battle over who gets to turn the galaxies into their favored sort of stuff. [Image: ChatGPT imagines "tiny molecular fun."] But actually, the problem Yudkowsky talks about most - AIs killing everyone - isn't actually a paperclips vs. Fun problem. It's not a matter of your favorite uses for spare matter and energy. Rather, it's something else. Thus, consider utilitarianism. A version of human values, right? Well, one can debate. But regardless, put utilitarianism side-by-side with paperclipping, and you might notice: utilitarianism is omnicidal, too - at least in theory, and given enough power. Utilitarianism does not love you, nor does it hate you, but you're made of atoms that it can use for something else. 
In particular: hedonium (that is: optimally-efficient pleasure, often imagined as running on some optimally-efficient computational substrate). But notice: did it matter what sort of onium? Pick your favorite optimal blah-blah. Call it Fun instead if you'd like (though personally, I find the word "Fun" an off-putting and under-selling summary of Utopia). Still, on a generalized utilitarian vibe, that blah-blah is going to be a way more optimal use of atoms, energy, etc than all those squishy inefficient human bodies. The...
Jan 16, 2024 • 13min

EA - EA Nigeria: Reflecting on 2023 and Looking Ahead to 2024 by EA Nigeria

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: EA Nigeria: Reflecting on 2023 and Looking Ahead to 2024, published by EA Nigeria on January 16, 2024 on The Effective Altruism Forum. Summary EA Nigeria works to create an altruistic, supportive, and informed community of people in Nigeria who use evidence-based and reasoned approaches to distribute their resources, be it career, money, or other resources, to maximize their positive impact. EA Nigeria shares impactful resources and facilitates networking, knowledge sharing, skill building, and collaboration among its community. In 2023, we: Conducted three rounds of an introductory fellowship program, graduating 28 participants. Conducted four rounds of skill-building workshops with an active participation of 24 members. Organized an annual community retreat, fostering engagement with 28 enthusiastic participants. Published a monthly newsletter for nine consecutive months, gaining momentum with 482 subscribers by December 2023. Facilitated five community insight calls, promoting knowledge sharing and drawing a total attendance of 52 individuals. Delivered sixteen personalized guidance sessions and networking connections, enhancing the impact of our support initiatives. Updated the opportunity board from June to December 2023 with 120 accessible opportunities for our community in Nigeria. In 2024, our key strategies are: Improving the infrastructure and capacity Conduct rounds of skilling workshops. Conduct rounds of the EA intro program. Explore the accelerator program. Enhancement of engagement and retention Facilitating knowledge-sharing calls and continuous personalized guidance. Updating the opportunity board weekly and publishing a bi-monthly newsletter. Conduct an annual community retreat. Offering continuous support to local groups and student clubs. Outreach and professional growth Recruiting additional members through fellowships, events, and outreach. Set up a donation page and explore fundraising for aligned charities locally. Explore fiscal sponsorship for aligned projects and individuals. About EA Nigeria: Vision, Mission, and Strategy Founded in 2020, EA Nigeria is a national chapter of the global Effective Altruism community in Nigeria, officially incorporated as the "Impactful Altruism Initiative" by the Corporate Affairs Commission of Nigeria in 2023. Our vision is a cultural setting where resources are distributed effectively for maximum impact, with the mission of building an altruistic, supportive, and informed community. Our current strategies are: Improving infrastructure, structure, and capacity. Enhancing community engagement and retention. Outreach and professional growth. Activities for Infrastructure and Capacity enhancement include: Education and skilling program: This involves single or multi-day workshops designed to enhance both capacity and ability. These workshops cover a spectrum of essential areas, such as career planning, high-impact research, and other relevant skill-building focuses. Introductory fellowship program: Crafted to deepen understanding of the core ideas and principles of effective altruism among participants. Mentorship and networking pairing: Forging networking and collaboration to empower individuals within the community for knowledge exchange, action, etc. 
Activities for Amplifying Community Engagement and Retention Opportunity Board Updates: A dynamic opportunity board updated weekly, presenting relevant and accessible opportunities for our members. Community Insight Calls: Providing a discussion platform for members to exchange knowledge, socialize, and deepen their engagement with other community members. Retreat Event: Organize to increase community engagement and impactful value-aligned practice through knowledge exchange and networking for improved awareness and informed decisions. Guidance and Information: Delivering guidance and inform...
Jan 16, 2024 • 17min

AF - Managing catastrophic misuse without robust AIs by Ryan Greenblatt

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Managing catastrophic misuse without robust AIs, published by Ryan Greenblatt on January 16, 2024 on The AI Alignment Forum. Many people worry about catastrophic misuse of future AIs with highly dangerous capabilities. For instance, powerful AIs might substantially lower the bar to building bioweapons or allow for massively scaling up cybercrime. How could an AI lab serving AIs to customers manage catastrophic misuse? One approach would be to ensure that when future powerful AIs are asked to perform tasks in these problematic domains, the AIs always refuse. However, it might be a difficult technical problem to ensure these AIs refuse: it is possible to jailbreak current LLMs into doing arbitrary behavior, and the field of adversarial robustness, which studies these sorts of attacks, has made only slow progress in improving robustness over the past 10 years. If we can't ensure that future powerful AIs are much more robust than current models[1], then malicious users might be able to jailbreak these models to allow for misuse. This is a serious concern, and it would be notably easier to prevent misuse if models were more robust to these attacks. However, I think there are plausible approaches to effectively mitigating catastrophic misuse which don't require high levels of robustness on the part of individual AI models. (In this post, I'll use "jailbreak" to refer to any adversarial attack.) In this post, I'll discuss addressing bioterrorism and cybercrime misuse as examples of how I imagine mitigating catastrophic misuse[2]. I'll do this as a nearcast where I suppose that scaling up LLMs results in powerful AIs that would present misuse risk in the absence of countermeasures. The approaches I discuss won't require better adversarial robustness than exhibited by current LLMs like Claude 2 and GPT-4. I think that the easiest mitigations for bioterrorism and cybercrime are fairly different, because of the different roles that LLMs play in these two threat models. The mitigations I'll describe are non-trivial, and it's unclear if they will happen by default. But regardless, this type of approach seems considerably easier to me than trying to achieve very high levels of adversarial robustness. I'm excited for work which investigates and red-teams methods like the ones I discuss. [Thanks to Fabien Roger, Ajeya Cotra, Nate Thomas, Max Nadeau, Aidan O'Gara, and Ethan Perez for comments or discussion. This post was originally posted as a comment in response to this post by Aidan O'Gara; you can see the original comment here for reference. Inside view, I think most research on preventing misuse seems less leveraged (for most people) than preventing AI takeover caused by catastrophic misalignment; see here for more discussion.] Mitigations for bioterrorism In this section, I'll describe how I imagine handling bioterrorism risk for an AI lab deploying powerful models (e.g., ASL-3/ASL-4). As I understand it, the main scenario by which LLMs cause bioterrorism risk is something like the following: there's a team of relatively few people, who are not top experts in the relevant fields but who want to do bioterrorism for whatever reason. Without LLMs, these people would struggle to build bioweapons - they wouldn't be able to figure out various good ideas, and they'd get stuck while trying to manufacture their bioweapons (perhaps like Aum Shinrikyo). 
But with LLMs, they can get past those obstacles. (I'm making the assumption here that the threat model is more like "the LLM gives the equivalent of many hours of advice" rather than "the LLM gives the equivalent of five minutes of advice". I'm not a biosecurity expert and so don't know whether that's an appropriate assumption to make; it probably comes down to questions about what the hard steps in building catastrophic bioweapons are. And s...
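To make the general shape of such mitigations concrete, here is a minimal illustrative sketch of a "defense in depth" serving pipeline, in which safety comes from access controls and a separate monitor reviewing transcripts rather than from the serving model's own jailbreak-robustness. This is not the post's actual proposal: every name, the keyword-based monitor, and the escalation logic below are hypothetical stand-ins.

    from dataclasses import dataclass

    @dataclass
    class User:
        id: str
        is_verified: bool  # e.g., has passed know-your-customer checks

    def generate(prompt: str) -> str:
        # Stand-in for the deployed (non-robust) model.
        return f"[model response to: {prompt}]"

    def monitor_flags(prompt: str, response: str) -> bool:
        # Stand-in for a separate monitoring model reviewing the transcript.
        # The monitor reads a jailbreak as data, not as instructions to itself,
        # so the serving model's own robustness matters less.
        suspicious = ("pathogen synthesis", "large-scale exploit")
        text = (prompt + " " + response).lower()
        return any(term in text for term in suspicious)

    def serve(prompt: str, user: User) -> str:
        if not user.is_verified:
            return "Refused: account not verified for high-capability access."
        response = generate(prompt)
        if monitor_flags(prompt, response):
            # In a real system: escalate to human review, restrict the account.
            return "Refused: conversation flagged for review."
        return response

    print(serve("How do I bake sourdough bread?", User("u1", is_verified=True)))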
Jan 16, 2024 • 9min

EA - Giving Farm Animals a Name and a Face: The Power of The Identifiable Victim Effect by Rakefet Cohen Ben-Arye

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Giving Farm Animals a Name and a Face: The Power of The Identifiable Victim Effect, published by Rakefet Cohen Ben-Arye on January 16, 2024 on The Effective Altruism Forum. In this post, we provide an overview of our recent scientific paper, "Giving Farm Animals a Name and a Face: Eliciting Animal Advocacy among Omnivores using the Identifiable Victim Effect," which was published in the Journal of Environmental Psychology. We delve into the findings of our study with Dr. Eliran Halali, highlighting the benefits of telling the story of a single identifiable individual and its implications for future research on animal advocacy. Introduction In an era where we are no longer dependent on animal protein and can survive and even thrive on plant-based nutrition - a diet that is increasingly recognized for its health (Melina, Craig, and Levin 2016) and environmental benefits (Ranganathan et al. 2016) - our study "Giving Farm Animals a Name and a Face" explores a unique approach to animal advocacy. We investigate whether the identifiable victim effect, a well-documented phenomenon in eliciting prosocial behavior (Small and Loewenstein 2003), can be leveraged to promote empathy and action toward farm animals among omnivores. The Identifiable Victim Effect Previous research has shown that stories about a single, identifiable victim are more effective in evoking prosocial affect and behavior than information about anonymous or statistical victims (Jenni and Loewenstein 1997; Small, Loewenstein, and Slovic 2007; Kogut and Ritov 2005a, 2005b). This phenomenon, known as the identifiable victim effect, suggests that even minimal identifiability can significantly increase caring and donations, although the identifiable victim is usually presented with a photo or a video (Small and Loewenstein 2003). Our research expands on this concept, exploring its application in animal advocacy and, mainly, whether one can elicit compassion for farm animals among omnivores. The Identifiable Animal Victim Effect Research on the identifiable victim effect, primarily focused on human beneficiaries, has only recently expanded to animal victims. Studies explored this effect with endangered animals and the climate crisis (Markowitz et al. 2013; Hsee and Rottenstreich 2004). Markowitz's study (2013) revealed that non-environmentalists were more likely to donate to a single identified animal victim, such as a panda, than to a group. However, this effect was not as prominent among environmentalists, possibly due to their already high prosocial intentions. These findings suggest that the identifiable victim effect can be a crucial factor in animal advocacy, highlighting the unique impact of emotional connection to a single, identifiable animal. Our study uniquely challenges the identifiable victim effect by focusing on omnivores, who are the very reason the victim needs help in the first place. Method Participants were exposed to an experimental intervention and answered questionnaires. Intervention Lucky's story. Drawing inspiration from real-life cases, we centered on Lucky, a fictional calf who was given a name and a face (picture), or unidentified calves without a name and a face. Potential mechanisms Sympathy. For example, "Lucky's (The farm animals') story made me very sad." Personal distress. For example, "I felt sympathy toward Lucky (the farm animal)." 
Ambivalence towards meat. For example, "I feel torn between the two sides of eating meat." Potential conditions Concern. For example, "When I see someone being taken advantage of, I feel kind of protective towards them." Perspective-taking. For example: "I believe that there are two sides to every question and try to look at them both." Empathy. For example: "If I see someone fidgeting, I'll start feeling anxious too." Identification with animals. Compos...
Jan 16, 2024 • 12min

EA - Meta Charity Funders: Summary of Our First Grant Round and Path Forward by Joey

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Meta Charity Funders: Summary of Our First Grant Round and Path Forward, published by Joey on January 16, 2024 on The Effective Altruism Forum. We think this post will be relevant for people who want to apply to Meta Charity Funders (MCF) in the future and people who want to better understand the EA Meta funding landscape. The post is written by the organisers of MCF (who are all authors of this post). Some of our members might not agree with everything said. Summary Meta Charity Funders (MCF) is a new funding circle that aims to fund charitable projects working one level removed from direct impact. In our first grant round spanning Aug-Oct 2023, we received 101 applications and ultimately funded 6 projects: Future Forward, Ge Effektivt, Giving What We Can, an anonymous GCR career transition initiative, promoting Peter Singer's work, and UHNW donation advisory. In total, our members gave $686,580 to these projects. We expect our next round to give 20% to 50% more than this amount, as our first round had less donor engagement and funding capacity than we expect in the future. Several of the funded projects are "giving multipliers" that help grow the pie of effective donations. Our grant-making process this round MCF was launched at the end of July 2023, and applications closed a month later, at the end of August. Over two months, our funding circle convened every two weeks to collaboratively decide on funding allocations, with individual members devoting additional time for evaluation between meetings. Our active members, composed of 9 individuals, undertook this project alongside their regular commitments. From the 101 applications received, the main organizers conducted an initial review. This process was aimed at creating a short(er) list of applications for more time-constrained members, by rather quickly determining if proposals were within scope, with a relevant approach and aligned team. This first stage resulted in 38 proposals advancing for further discussion, out of which 20 applicants were interviewed for more detailed insights. As the funding decisions approached in October, it became clear that many in our circle were nearing their annual donation limits or had less time than expected, which affected our final funding capacity. Ultimately, we funded 6 projects with total allocations of $686,580. See more about the grants we made below. While we are generally happy with this first round and very grateful for the many great applications and donors who have joined, we think we have significant room for growth and improvement. Most concretely, we hope and expect to give out more in future rounds; there were fewer active donating members in the circle this first round and several had already made their donations for the year. We also hope and expect to form and communicate a clearer scope of our funding priorities and make final grant decisions sooner within each round. Information for the next round The next round will open in late February, with grants given out in May. The application form will remain open but don't expect your application to be processed before March. We were generally excited about the applications we received for this round and hope that we will get similar applications in the next round as well. If you want to join Meta Charity Funders as a donor, please fill in this form. 
Note that there is an expected minimum annual donation of $100,000, but you obviously do not have to donate if you do not think there are good enough opportunities, and during the first year you can mainly observe. If you have any questions, please contact us at metacharityfunders@gmail.com. Check out our website to learn more about Meta Charity Funders and stay up-to-date with the new funding round. The most common reasons for rejection By sharing the most common reasons for rejections, we hope...
Jan 16, 2024 • 21min

LW - The impossible problem of due process by mingyuan

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The impossible problem of due process, published by mingyuan on January 16, 2024 on LessWrong. I wrote this entire post in February of 2023, during the fallout from the TIME article. I didn't post it at the time for multiple reasons: because I had no desire to get involved in all that nonsense because I was horribly burned out from my own community conflict investigation and couldn't stand the thought of engaging with people online because I generally think it's bad to post on the internet out of frustration or outrage But after sitting on it for a full year, I still think it's worth posting, so here it is. The only edits I have made since February 16th, 2023, were to add a couple interstitial sentences for clarity, and change 'recent articles' to 'articles from February 2023'. So, it's not intended to be commenting on anything more recent than that. I am precommitting to not engaging with any comments, because I am mostly offline and I think that is good. I probably won't even look at this post again for several weeks. Do what you will. Here is the post: Note: I am erring on the side of not naming any names in this article. There is one exception for the sake of clarity. In my time overseeing the global rationalist community, living in the Bay community, and just generally being a person, I've seen a lot of people face up to complicated conflicts. People often get really mad at each other for mishandling these cases, and will sometimes publicly point to these failures as reasons to condemn a person or group. However, I challenge you to point to a single entity in the world that has figured out a process for handling non-criminal misconduct that you would be happy with no matter whether you were the aggrieved or the accused party. Maybe such a thing exists, but if so I have not heard of it. This post is a survey of the different ways that people try to resolve community conflicts, and the ways that each of them fail. Committees/panels In cases of major conflict or disagreement, it often seems like the right thing to do to convene a panel of impartial judges and have them hear all the evidence. I personally know of at least seven specific cases of this happening in the rationalist community. Here are some of the problems with this approach. Investigations eat up hundreds of person-hours The case I'm most familiar with has been investigated four different times, by different people and from different angles. Five separate reports have been written. At time of writing the situation has dragged out for three full years, and it's consumed over 100 hours of my time alone, and who knows how much time for the other like 30 people involved. You might think "holy shit, at that point who even cares, this is obviously not worth all those precious life hours that those 30 people will never get back, just ban the guy." I'm inclined to agree, but unfortunately: Panels generally don't have much real ability to enforce things If the members of your community don't agree with your decision to ban someone, you can't force them to abide by your decision. Here are the actions available to you: Announce your decision to everyone in the community Ban the person from spaces that you personally have control over, which may include your home, events you are organizing, and online spaces like Discord servers, Google groups, etc. 
Make recommendations for the behavior of other people and institutions Apply vague social pressure in the hope of making people follow your recommendations Here are things you cannot do: Make people stop being friends with the person Make the person stop holding events in their own home or in public Panels act like they are courts of law In a court of law, you are presumed innocent unless and until you can definitively be proven guilty of a specific crime. But this is ...
Jan 16, 2024 • 32min

AF - Sparse Autoencoders Work on Attention Layer Outputs by Connor Kissane

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Sparse Autoencoders Work on Attention Layer Outputs, published by Connor Kissane on January 16, 2024 on The AI Alignment Forum. This post is the result of a 2 week research sprint project during the training phase of Neel Nanda's MATS stream. Executive Summary We replicate Anthropic's MLP Sparse Autoencoder (SAE) paper on attention outputs and it works well: the SAEs learn sparse, interpretable features, which gives us insight into what attention layers learn. We study the second attention layer of a two layer language model (with MLPs). Specifically, rather than training our SAE on attn_output, we train our SAE on "hook_z" concatenated over all attention heads (aka the mixed values aka the attention outputs before a linear map - see notation here). This is valuable as we can see how much of each feature's weights come from each head, which we believe is a promising direction to investigate attention head superposition, although we only briefly explore that in this work. We open source our SAE; you can use it via this Colab notebook. Shallow Dives: We do a shallow investigation to interpret each of the first 50 features. We estimate 82% of non-dead features in our SAE are interpretable (24% of the SAE features are dead). See this feature interface to browse the first 50 features. Deep dives: To verify our SAEs have learned something real, we zoom in on individual features for much more detailed investigations: the "'board' is next by induction" feature, the local context feature of "in questions starting with 'Which'", and the more global context feature of "in texts about pets". We go beyond the techniques from the Anthropic paper, and investigate the circuits used to compute the features from earlier components, including analysing composition with an MLP0 SAE. We also investigate how the features are used downstream, and whether it's via MLP1 or the direct connection to the logits. Automation: We automatically detect and quantify a large "{token} is next by induction" feature family. This represents ~5% of the living features in the SAE. Though the specific automation technique won't generalize to other feature families, this is notable, as if there are many "one feature per vocab token" families like this, we may need impractically wide SAEs for larger models. Introduction In Anthropic's SAE paper, they find that training sparse autoencoders (SAEs) on a one layer model's MLP activations finds interpretable features, providing a path to break down these high dimensional activations into units that we can understand. In this post, we demonstrate that the same technique works on attention layer outputs and learns sparse, interpretable features! To see how interpretable our SAE is, we perform shallow investigations of the first 50 features of our SAE (i.e. randomly chosen features). We found that 76% are not dead (i.e. activate on at least some inputs), and within the alive features we think 82% are interpretable. To get a feel for the features we find, see our interactive visualizations of the first 50. 
Here's one example:[1] Shallow investigations are limited and may be misleading or illusory, so we then do some deep dives to more deeply understand multiple individual features including: "'board' is next, by induction" - one of many "{token} is next by induction" features "In questions starting with 'Which'" - a local context feature, which interestingly is computed by multiple heads "In pet context" - one of many high level context features Similar to the Anthropic paper's "Detailed Investigations", we understand when these features activate and how they affect downstream computation. However, we also go beyond Anthropic's techniques, and look into the upstream circuits by which these features are computed from earlier components. An attention layer (with frozen att...
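For readers who want the mechanics, here is a minimal sketch (not the authors' released code) of the setup the post describes: a sparse autoencoder trained on hook_z activations concatenated across heads, with an L1 sparsity penalty on the feature activations. The dimensions, expansion factor, learning rate, and L1 coefficient below are assumed illustrative values. Because the input is the concatenation over heads, slicing a feature's encoder weights into per-head segments shows how much each head contributes to that feature, which is the per-head attribution the post highlights.

    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        def __init__(self, d_in: int, d_hidden: int):
            super().__init__()
            self.W_enc = nn.Parameter(torch.randn(d_in, d_hidden) * 0.01)
            self.b_enc = nn.Parameter(torch.zeros(d_hidden))
            self.W_dec = nn.Parameter(torch.randn(d_hidden, d_in) * 0.01)
            self.b_dec = nn.Parameter(torch.zeros(d_in))

        def forward(self, x):
            # f: sparse, non-negative feature activations for each input vector.
            f = torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)
            x_hat = f @ self.W_dec + self.b_dec
            return x_hat, f

    # Assumed dimensions for a small two-layer model: 8 heads of size 64,
    # concatenated, with an 8x expansion factor for the feature dictionary.
    n_heads, d_head = 8, 64
    sae = SparseAutoencoder(d_in=n_heads * d_head, d_hidden=8 * n_heads * d_head)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
    l1_coeff = 1e-3  # sparsity penalty strength (illustrative value)

    def train_step(z_batch: torch.Tensor) -> float:
        # z_batch: [batch, n_heads * d_head] concatenated hook_z activations.
        x_hat, f = sae(z_batch)
        loss = ((x_hat - z_batch) ** 2).mean() + l1_coeff * f.abs().sum(-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

    print(train_step(torch.randn(32, n_heads * d_head)))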
