

The Nonlinear Library
The Nonlinear Fund
The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org
Episodes

Nov 12, 2023 • 28min
LW - Don't Donate A Kidney To A Stranger by George3d6
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Don't Donate A Kidney To A Stranger, published by George3d6 on November 12, 2023 on LessWrong.
"Donate a kidney to a stranger" is a battle cry picking up some fervor in EA circles. It seems to stem from certainty in a flawed understanding of medical research, combined with conviction in a rather acrobatic view of morality.
My argument against donation is split into 3 parts, with the first being by far the most important:
Why kidney donations do great harm to the donor
Why kidney donations are ethically fuzzy and might be a net negative
Why the desire to donate a kidney is likely misplaced
I Health impact
Summary
Kidneys are important, and having fewer of them leads to a severe downgrade in markers associated with health and quality of life. Donating a kidney results in an over 1300% increase in the risk of kidney disease. A risk-averse interpretation of the data puts the increase in year-to-year mortality after donation at upwards of 240%.
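To see what these relative figures mean in absolute terms, here is a minimal arithmetic sketch; the baseline rates below are illustrative placeholders assumed for the example, not numbers taken from the post or from any particular study.

```python
# Converting relative-risk claims into absolute terms. The baseline rates are
# assumed placeholders for illustration, not figures from the post or any study.

baseline_kidney_disease_risk = 0.003  # assumed lifetime kidney disease risk for donor-like people
baseline_annual_mortality = 0.001     # assumed year-to-year mortality for the same group

disease_risk_after = baseline_kidney_disease_risk * (1 + 13.0)  # "over 1300% increase" ~= 14x baseline
mortality_after = baseline_annual_mortality * (1 + 2.4)         # "upwards of 240%" ~= 3.4x baseline

print(f"Kidney disease risk: {baseline_kidney_disease_risk:.2%} -> {disease_risk_after:.2%}")
print(f"Year-to-year mortality: {baseline_annual_mortality:.2%} -> {mortality_after:.2%}")
```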
While through a certain lens, you can claim kidney donation is not that big a deal, this perception stems mainly from comparing a (very healthy) donor population with your average American or European (prediabetic, overweight, almost never exercises, and classifies fruits as cake decoration as opposed to stand-alone food).
Furthermore, when research evidence is mixed due to the difficulty of the studied area, lack of data, and complete lack of open data, we should fall back to our theories about human physiology, as well as common sense, both of which paint a very bleak picture.
You should not donate a kidney if you aren't prepared to live the rest of your life with significantly decreased cognitive and physical capacity.
1.a Limitations of medical research
After more than 5 years of reading medical research as a hobby, the only thing I can conclude about it with certainty is that it's uniquely hard to do well. It sits at the intersection of:
Cutting through a very complicated part of nature that isn't amenable to the kind of experiments that yielded so much success in fields like physics and chemistry.
Being filled with actors who have misplaced motivations and do not care about "correct" interpretations, nor about data quality (or who outright fake data) - not because they are evil, but because getting "the right result" means a payoff in the billions of dollars.
Being filled with actors who impede doing science correctly under the guise of ethics and privacy - often with no real effect on what a normal person would think of as ethical or private… but that's another topic.
The reason kidney donation is considered "safe" is that, in a limited number of epidemiological and observational studies with follow-ups ranging from 2 to 30 years, there is, on average, no increase in mortality.
None of these studies were RCTs, and the sample size is quite low.
This amount and type of evidence would not be sufficient to approve a drug. The quality of these claims is about as good as the quality of claims one could make about a relatively niche diet.
There are two big generators of error here:
A) Matching Controls
Which is to say, any study that looks at this will pick some controls based on factors like demographics, biomarkers, and, sometimes, intent (i.e. people who wanted to donate a kidney to a family member but there wasn't a match). Being the kind of (naively?) good, selfless person who would donate a kidney can correlate with a lot of positive outcomes.
B) Researcher and Publication Bias
You rely on the researchers to get the data analysis right, and you rely on whatever gets published being representative as opposed to cherry-picked.
As it stands, the data on which these studies are based is usually not public, so you can't double-check the researchers here, and you can't pick a different lens through which to analyze the data.
More importantly, t...

Nov 12, 2023 • 19min
EA - Kids or No Kids by KidsOrNoKids
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Kids or No Kids, published by KidsOrNoKids on November 12, 2023 on The Effective Altruism Forum.
This post summarizes how my partner and I decided whether to have children or not. We spent hundreds of hours on this decision and hope to save others part of that time. We found it very useful to read the thoughts of people who share significant parts of our values on the topic and thus want to "pay it forward" by writing this up.
In the end, we decided to have children; our son is four months old now and we're very happy with how we made the decision and with how our lives are now (through a combination of sheer luck and good planning). It was a very narrow and very tough decision though.
Both of us care a lot about having a positive impact on the world and our jobs are the main way we expect to have an impact (through direct work and/or earning to give). As a result, both of us are quite ambitious professionally; we moved multiple times for our jobs and work 50-60h weeks. I expect this write-up to be most useful for people for whom the same is true.
Bear in mind this is an incredibly loaded and very personal topic - some of our considerations may seem alienating or outrageous. Please note I am not at all trying to argue how anyone should make their life decisions! I just want to outline what worked well for us, so others may pick and choose to use part of that process and/or content for themselves.
Finally, please note that while many readers will know who I am and that is fine, I don't want this post to be findable when googling my name. Thus, I posted it under a new account and request that you don't use any personal references when commenting or mentioning it online.
Process - how we decided
We had many sessions together and separately, totaling hundreds of hours over the course of 2 years, on this decision and the research around it. My partner tracked 200 Toggl hours; I estimate I spent a bit less time individually, but our conversations come on top of that. In retrospect, it seems obvious, but it took me longer than I wish it had to realize that this is important, very hard work, for which I needed high-quality, focused work time rather than the odd evening or lazy weekend.
We each made up our minds using roughly the considerations below - this took the bulk of the time. We then each framed our decision as "Yes/No if xyz", for instance, "Yes if I can work x hours in a typical week", and finally "negotiated" a plan under which we could agree on the conclusion "yes" or "no".
In this process, actually making a timetable of what a typical day would look like in 30-minute intervals was very useful. I'm rather agreeable, so I am likely to produce miscommunications of the sort "When you said 'sometimes', I thought it meant more than one hour a day" - writing down what a typical day could look like helped us catch those. When hearing about this meticulous plan, many people told me that having kids would be a totally unpredictable adventure.
I found that not to be true - my predictions about what I would want, what would and wouldn't work, etc. largely held true so far. My suspicion is most people just don't try as hard as we did to make good predictions. A good amount of luck is of course also involved - we are blessed with a healthy, relatively calm and content baby so far. Both of us feel happier than predicted, if anything.
I came away from this process with a personal opinion: If it seems weird to spend hours deliberating and negotiating over an Excel sheet with your partner, consider how weird it is not to do that - you are making a decision that will cost you hundreds of thousands of dollars and is binding for years; if you made this type of decision at work without running any numbers, you'd be out of a job and likely in court pretty quickly.
In our case, if you bu...

Nov 12, 2023 • 2min
EA - Webinar invitation: learn how to use Rethink Priorities' new prioritization tool by Rethink Priorities
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Webinar invitation: learn how to use Rethink Priorities' new prioritization tool, published by Rethink Priorities on November 12, 2023 on The Effective Altruism Forum.
What do your views imply about the relative cost-effectiveness of various causes?
With Giving Tuesday coming up, it's worth tackling this question. Rethink Priorities' new cross-cause cost-effectiveness model (CCM) might be able to help.
RP's Worldview Investigations Team created the CCM as a part of its project on Causes and uncertainty: Rethinking value in expectation.
About the virtual event
On November 28, the Worldview Investigations Team will lead a discussion that will encompass:
An explanation of why they created the CCM
A virtual walkthrough of the model itself
A practical workshop on how you can use the tool
A question-and-answer session
Attending from the Worldview Investigations Team will be: Philosophy Researcher Derek Shiller, Executive Research Coordinator Laura Duffy, and Senior Research Manager Bob Fischer.
Come explore how different assumptions interact, and potentially make some surprising discoveries!
Details
The workshop will be held on November 28 at 9 am PT / noon ET / 5 pm GMT / 6 pm CET.
If you're interested in attending (even if you think you can't make that particular time), please complete this form. We will send you further details as we get closer to the event.
Rethink Priorities (RP) is a think-and-do tank that addresses global priorities by researching solutions and strategies, mobilizing resources, and empowering our team and others.
Rachel Norman and Henri Thunberg wrote this post.
If you are interested in RP's work, please visit our research database and subscribe to our newsletter.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Nov 12, 2023 • 8min
EA - How we work, #2: We look at specific opportunities, not just general interventions by GiveWell
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How we work, #2: We look at specific opportunities, not just general interventions, published by GiveWell on November 12, 2023 on The Effective Altruism Forum.
This post is the second in a multi-part series, covering how GiveWell works and what we fund. The first post, on cost-effectiveness, is here. Through these posts, we hope to give a better understanding of our research and decision-making.
Author: Isabel Arjmand
Looking forward, not just backward
When we consider recommending funding, we don't just want to know whether a program has generally been cost-effective in the past - we want to know how additional funding would be used.
People sometimes think of GiveWell as recommending entire programs or organizations. This was more accurate in GiveWell's early days, but now we tend to narrow in on specific opportunities. Rather than asking whether it is cost-effective to deliver long-lasting insecticide-treated nets in general, we ask more specific questions, such as whether it is cost-effective to fund net distributions in 2023 in Benue, Plateau, and Zamfara states, Nigeria, given the local burden of malaria and the costs of delivering nets in those states.
Geographic factors affecting cost-effectiveness
The same program can vary widely in cost-effectiveness across locations. The burden of a disease in a particular place is often a key factor in determining overall cost-effectiveness. All else equal, it's much more impactful to deliver vitamin A supplements in areas with high rates of vitamin A deficiency than in areas where almost everyone consumes sufficient vitamin A as part of their diet.
As another example, we estimate it costs roughly the same amount for the Against Malaria Foundation to deliver an insecticide-treated net in Chad as it does in Guinea (about $4 in both locations). But, we estimate that malaria-attributable deaths of young children in the absence of nets would be roughly 5 times higher in Guinea than in Chad (roughly 8.8 deaths per 1,000 per year versus roughly 1.7 per 1,000), which leads AMF's program to be much more cost-effective in Guinea.
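A rough sketch of the arithmetic behind that comparison, using the per-net cost and mortality rates quoted above; the coverage and effectiveness parameters are illustrative assumptions, not GiveWell's actual model inputs.

```python
# Rough cost-per-death-averted comparison using the figures quoted above.
# Children per net, net lifespan, and net effectiveness are illustrative
# assumptions, not GiveWell's actual model inputs.

cost_per_net = 4.0  # USD, roughly the same in both countries per the post
counterfactual_deaths_per_1000_child_years = {"Guinea": 8.8, "Chad": 1.7}

children_per_net = 0.5            # assumed young children sleeping under each net
net_lifespan_years = 2.0          # assumed years of effective protection
fraction_of_deaths_averted = 0.5  # assumed protective effect of a net

for country, rate in counterfactual_deaths_per_1000_child_years.items():
    child_years_protected = children_per_net * net_lifespan_years
    deaths_averted_per_net = (rate / 1000) * child_years_protected * fraction_of_deaths_averted
    print(f"{country}: ~${cost_per_net / deaths_averted_per_net:,.0f} per under-5 death averted")
```

Whatever the exact parameters, the roughly fivefold gap in counterfactual mortality carries straight through to a roughly fivefold gap in cost per death averted, which is why location matters so much.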
This map from Our World in Data gives a sense of how deaths from malaria vary worldwide.[3]
Because cost-effectiveness varies with geography, we ask questions specific to the countries or regions where a program would take place. When we were investigating an opportunity to fund water chlorination in Malawi, for example, we wanted to know:
How does baseline mortality from poor water quality in Malawi compare with that in the regions where the key studies on water chlorination took place?
What is the overall morbidity burden from diarrhea in Malawi?
Might people be more or less likely to use chlorinated water in this area than in the areas where the key studies took place?
What does it cost to serve one person with in-line chlorination for one year? We calculate this, in part, by estimating how many people are served by each device (a toy version of this calculation is sketched after this list).
What proportion of the population is under the age of five? This is important to our calculations because we think young children are disproportionately susceptible to death from diarrhea.
What is the baseline level of water treatment in the absence of this program?
Where relevant, we also consider implementation challenges caused by security concerns or other contextual factors.
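As a toy illustration of the per-person cost question above, here is the basic structure of that estimate; every number is a made-up placeholder rather than a GiveWell figure.

```python
# Toy structure of a "cost to serve one person for one year" estimate for
# in-line chlorination. All numbers are made-up placeholders.

device_cost = 1200.0         # hypothetical hardware + installation cost (USD)
device_lifespan_years = 5.0  # hypothetical useful life of one device
annual_consumables = 150.0   # hypothetical chlorine + maintenance per device per year
people_per_device = 400      # hypothetical number of people served by each device

annual_cost_per_device = device_cost / device_lifespan_years + annual_consumables
print(f"~${annual_cost_per_device / people_per_device:.2f} per person served per year")
```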
Why do cost-effective funding gaps sometimes go unfilled?
People are often surprised that some high-impact funding gaps, like the ones GiveWell aims to fund, aren't already filled. Of course, many high-impact opportunities are already supported by other funders, like Gavi or the Global Fund, to name just a couple examples. When we see remaining gaps, we think about how our grant might affect other funders' decisions, and whether another funder would step in to fill a particular gap if we didn't.[4]
The...

Nov 12, 2023 • 9min
LW - It's OK to be biased towards humans by dr s
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: It's OK to be biased towards humans, published by dr s on November 12, 2023 on LessWrong.
Let's talk about art.
In the wake of AI art generators being released, it's become pretty clear this will have a seismic effect across the art industry - from illustrators to comic artists to animators, many categories see their livelihood threatened, with no obvious "higher level" opened up by this wave of automation for them to move to. On top of this, the AI generators seem to have mostly been trained on material whose copyright status is... dubious, at the very least. Images have been scraped from the internet, frames have been taken from movies, and in general lots of stuff that would usually count as "pirated" if you or I just downloaded it for our private use has been thrown by the terabyte into diffusion models that can now churn out endless variations on the styles and models they fitted over them.
On top of being a legal quandary, these issues border on the philosophical. Broadly speaking, one tends to see two interpretations:
the AI enthusiasts and companies tend to portray this process as "learning". AIs aren't really plagiarizing, they're merely using all that data to infer patterns, such as "what is an apple" or "what does Michelangelo's style look like". They can then apply those patterns to produce new works, but these are merely transformative remixes of the originals, akin to what any human artist does when drawing from their own creative inspirations and experiences.
the artists on the other hand respond that the AI is not learning in any way resembling what humans do, but is merely regurgitating minor variations on its training set materials, and as such it is not "creative" in any meaningful sense of the word - merely a way for corporations to whitewash mass plagiarism and resell illegally acquired materials.
Now, both these arguments have their good points and their glaring flaws. If I were hard pressed to say what I think AI models are really doing, I would probably end up answering "neither of these two, but a secret third thing". They probably don't learn the way humans do. They probably do learn in some meaningful sense of the word; they seem too good at generalizing for the idea that they are mere plagiarizers to be a defensible position.
I am similarly conflicted in matters of copyright. I am not a fan of our current copyright laws, which I think are far too strict, to the point of stifling rather than incentivizing creativity. But it is also a very questionable double standard that, after years of having to deal with DRM and restrictions imposed in an often losing war against piracy, I am now simply supposed to accept that a big enough company can build a billion-dollar business from terabytes of illegally scraped material.
None of these things, however, cuts to the heart of the problem, I believe. Even if modern AIs were not sophisticated enough to "truly" learn from art, future ones could be. Even if modern AIs have been trained on material that was not lawfully acquired, future ones could be. And I doubt that artists would then feel OK with said AIs replacing them, now that all philosophical and legal technicalities are satisfied; their true beef cuts far deeper than that.
Observe how the two arguments above go, stripped to their essence:
AIs have some property that is "human-like", therefore, they must be treated exactly as humans;
AIs should not be treated as humans because they lack any "human-like" property.
The thing to note is that argument 1 (A, hence B) sets the tone; argument 2 then strives to reject its premise so that it can deny the conclusion (Not A, hence Not B), but it accepts and in fact reinforces the unspoken assumption that having human-like properties means you get to be treated as a human.
I suggest an alter...

Nov 11, 2023 • 15min
EA - A robust earning to give ecosystem is better for EA by abrahamrowe
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A robust earning to give ecosystem is better for EA, published by abrahamrowe on November 11, 2023 on The Effective Altruism Forum.
(Written in a personal capacity, and not representing either my current employer or former one)
In 2016, I founded Utility Farm, and later merged it with Wild-Animal Suffering Research (founded by Persis Eskander) to form Wild Animal Initiative. Wild Animal Initiative is, by my estimation, a highly successful research organization.
The current Wild Animal Initiative staff deserve all the credit for where they have taken the organization, but I'm incredibly proud that I got to be involved early in the establishment of a new field of study, wild animal welfare science, and to see the tiny organization I started in an apartment with a few hundred dollars go on to be recommended by ACE as a Top Charity for 4 years in a row. In my opinion, Wild Animal Initiative has become, under the stewardship of more capable people than I, the single best bet for unlocking interventions that could tackle the vast majority of animal suffering.
Unlike most EA charities today, Utility Farm didn't launch with a big grant from Open Philanthropy, Survival and Flourish Foundation, or EA Funds. There was no bet made by a single donor on a promising idea. I launched Utility Farm with my own money, which I spent directly on the project. I was making around $35,000 a year at the time working at a nonprofit, and spending maybe $300 a month on the project.
Then one day, a donor completely changed the trajectory of the organization by giving us around $500. It's weird looking at that event through the lens of current EA funding levels - it was a tiny bet, but it took the organization from being a side project that was cash-strapped and completely reliant on my energy and time to an organization that could actually purchase some supplies or hire a contractor for a project.
From there, a few more donors gave us a few thousand dollars each. These funds weren't enough to hire staff or do anything substantial, but they provided a lifeline for the organization, allowing us to run our first real research projects and to publish our work online.
In 2018, we ran our first major fundraiser. We received several donations of a few thousand dollars, and (if I recall correctly) one gift of $20,000. Soon after, EA Funds granted us $40,000. We could then hire staff for the first time, and make tangible progress toward our mission.
As small as these funds were in the scheme of things, for Utility Farm, they felt sustainable. We didn't have one donor - we had a solid base of maybe 50 supporters, and no single individual dominated our funding. Our largest donor changing their mind about our work would have been a major disappointment, but not a financial catastrophe. Fundraising was still fairly easy - we weren't trying to convince thousands of people to give $25.
Instead, fundraising consisted of checking in with a few dozen people, sending some emails, and holding some calls. Most of the "fundraising" was the organization doing impactful work, not endless donor engagement.
I now work at a much larger EA organization with around 100x the revenue and 30x the staff. Oddly, we don't have that many more donors than Utility Farm did back then - maybe around 2-4 times as many small donors, and about the same number giving more than $1,000.
This probably varies between organizations - I have a feeling that many organizations doing more direct work than Rethink Priorities have many more donors - but most EA organizations seem to have strikingly few mid-sized donors (e.g., individuals who give maybe $1,000 - $25,000).
Often, organizations will have a large cohort of small donors, giving maybe $25-$100, and then they'll have 2-3 (or even just 1) giant donors, collectively giving 95%+ of the organi...

Nov 11, 2023 • 14min
EA - Who is Sam Bankman-Fried (SBF) really, and how could he have done what he did? - three theories and a lot of evidence by spencerg
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Who is Sam Bankman-Fried (SBF) really, and how could he have done what he did? - three theories and a lot of evidence, published by spencerg on November 11, 2023 on The Effective Altruism Forum.
As you may know, Sam Bankman-Fried ("SBF") was convicted of seven counts of fraud and conspiracy. He now faces the potential of more than 100 years in prison.
I've been trying to figure out how someone who appears to believe deeply in the principles of effective altruism could do what SBF did. It has been no surprise to me to see that the actions he was convicted of are nearly universally condemned by the EA community. Could it be that he did not actually believe in EA ideas despite promoting EA and claiming to believe in it? If he does believe in EA principles, there seems to be a genuine mystery here as to why he took those actions.
There are a few theories that could potentially explain the seeming mystery. In this post, I'll discuss the strongest evidence I've been able to find for and against each of the three theories that I find most plausible.
It seems important to me to seek an understanding of the deeper causes of this disaster to help prevent future such disasters. It also seems to me to be essential for the EA community, in particular, to understand why this happened. An understanding of the disaster and the person behind it might be necessary (though probably not sufficient) for the community to prevent similar events from happening in the future.
A few important things before we begin the analysis
In this piece, I assume that SBF committed all the crimes that he was convicted of. If it somehow turns out that SBF isn't guilty of these crimes, then some parts of this post would not apply (and you should consider most of this post withdrawn).
It's also important to note that the opinions I express in this post are, for the most part, informed by studying publicly available details about SBF and the FTX collapse, as well as confidential conversations I've had with a number of different people who knew SBF (some who worked with him, some who knew him as a friend).
I promised confidentiality to these people to help them be more comfortable sharing information honestly with me, so I won't use their names or other indications of how they know him. I shared this post with them prior to publishing it to help reduce the chance that I introduced errors in what they said to me.
I've also pulled quotes from the new book about SBF, Going Infinite, and from podcast interviews with its author, Michael Lewis. Lewis spent a lot of time with SBF (starting from late April 2022 and continuing into SBF's period under house arrest), so he had a lot of time to form impressions of him.
I also had some interactions with SBF myself, which I discuss in more detail in my podcast episode about the FTX disaster. The podcast episode is a good place to start if you are fuzzy on the basic facts of what happened during the FTX disaster and want to know more. I also recorded an earlier podcast episode with SBF about crypto tech (prior to accusations of wrongdoing against him), but it doesn't provide much information relevant to the topic of this post. My first-hand experience with him was limited; it informs my viewpoint on him much less than other evidence I've collected.
I am very interested in hearing your own arguments or evidence with regard to which theory you think is most likely about the FTX calamity (whether it is one of the three outlined below or another theory altogether).
Defining DAE
Throughout this post, I'll use the term DAE ("deficient affective experience") to refer to anyone who has at least one of these two traits:
Little or no ability or tendency to experience affective (i.e., emotional) empathy in response to someone else's suffering
Little or no ability or tendency to experi...

Nov 11, 2023 • 13min
EA - Memo on some neglected topics by Lukas Finnveden
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Memo on some neglected topics, published by Lukas Finnveden on November 11, 2023 on The Effective Altruism Forum.
I originally wrote this for the Meta Coordination Forum. The organizers were interested in a memo on topics other than alignment that might be increasingly important as AI capabilities rapidly grow - in order to inform the degree to which community-building resources should go towards AI safety community building vs. broader capacity building. This is a lightly edited version of my memo on that. All views are my own.
Some example neglected topics (without much elaboration)
Here are a few example topics that could matter a lot if we're in the most important century, which aren't always captured in a normal "AI alignment" narrative:
The potential moral value of AI. [1]
The potential importance of making AI behave cooperatively towards humans, other AIs, or other civilizations (whether it ends up intent-aligned or not).
Questions about how human governance institutions will keep up if AI leads to explosive growth.
Ways in which AI could cause human deliberation to get derailed, e.g. powerful persuasion abilities.
Positive visions about how we could end up on a good path towards becoming a society that makes wise and kind decisions about what to do with the resources accessible to us. (Including how AI could help with this.)
(More elaboration on these below.)
Here are a few examples of somewhat-more-concrete things that it might (or might not) be good for some people to do on these (and related) topics:
Develop proposals for how labs could treat digital minds better, and advocate for them to be implemented. (Cf. this nearcasted proposal.)
Advocate for people to try to avoid building AIs with large-scale preferences about the world (at least until we better understand what we're doing), in order to avoid a scenario where, if some generation of AIs turns out to be sentient and worthy of rights, we're forced to choose between "freely hand over political power to alien preferences" and "deny rights to AIs on no reasonable basis".
Differentially accelerate AI being used to improve our ability to find the truth, compared to being used for propaganda and manipulation.
E.g.: Start an organization that uses LLMs to produce epistemically rigorous investigations of many topics. If you're the first to do a great job of this, and if you're truth-seeking and even-handed, then you might become a trusted source on controversial topics. And your investigations would just get better as AI got better.
E.g.: Evaluate and write up facts about current LLMs' forecasting ability, to incentivize labs to make LLMs state correct and calibrated beliefs about the world (a toy calibration-scoring sketch appears after this list).
E.g.: Improve AI ability to help with thorny philosophical problems.
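As a toy illustration of the forecasting-evaluation idea above, here is a minimal accuracy and calibration scorer; the forecasts and outcomes are invented for the example, and a real evaluation would use resolved real-world questions.

```python
# Minimal sketch of scoring an LLM's probabilistic forecasts. The forecasts
# and outcomes below are invented; a real evaluation would use resolved
# real-world questions.
from collections import defaultdict

forecasts = [0.9, 0.7, 0.7, 0.2, 0.6, 0.1]  # model's stated probabilities
outcomes = [1, 1, 0, 0, 0, 0]               # what actually happened (1 = yes)

# Brier score: mean squared error between stated probability and outcome (lower is better).
brier = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)
print(f"Brier score: {brier:.3f}")

# Crude calibration check: within each stated-probability bucket, compare the
# stated probability to the observed frequency of "yes" outcomes.
buckets = defaultdict(list)
for p, o in zip(forecasts, outcomes):
    buckets[round(p, 1)].append(o)
for p, outs in sorted(buckets.items()):
    print(f"stated ~{p:.1f}: observed {sum(outs) / len(outs):.2f} (n={len(outs)})")
```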
Implications for community building?
…with a focus on "the extent to which community-building resources should go towards AI safety vs. broader capacity building".
Ethics, philosophy, and prioritization matter more for research on these topics than they do for alignment research.
For some issues in AI alignment, there's a lot of convergence on what's important regardless of your ethical perspective, which means that ethics & philosophy aren't that important for getting people to contribute. By contrast, when thinking about "everything but alignment", I think we should expect somewhat more divergence, which could raise the importance of those subjects.
For example:
How much to care about digital minds?
How much to focus on "deliberation could get off track forever" (which is of great longtermist importance) vs. short-term events (e.g. the speed at which AI gets deployed to solve all of the world's current problems).
But to be clear, I wouldn't want to go hard on any one ethical framework here (e.g. just utilitarianism). Some diversity and pluralism seems ...

Nov 11, 2023 • 5min
LW - Palisade is hiring Research Engineers by Charlie Rogers-Smith
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Palisade is hiring Research Engineers, published by Charlie Rogers-Smith on November 11, 2023 on LessWrong.
Palisade is looking to hire Research Engineers. We are a small team consisting of Jeffrey Ladish (Executive Director), Charlie Rogers-Smith (Chief of Staff), and Kyle Scott (part-time Treasurer & Operations). In joining Palisade, you would be a founding member of the team, and would have substantial influence over our strategic direction. Applications are rolling, and you can fill out our short (~10-20 minute) application form here.
Palisade's mission
We research dangerous AI capabilities to better understand misuse risks from current systems, and how advances in hacking, deception, and persuasion will affect the risk of catastrophic AI outcomes. We create concrete demonstrations of dangerous capabilities to advise policy makers and the public on AI risks.
We are working closely with government agencies, policy think tanks, and media organizations to inform relevant decision makers. For example, our work demonstrating that it is possible to effectively undo Llama 2-Chat 70B's safety fine-tuning for less than $200 has been used to confront Mark Zuckerberg in the first of Chuck Schumer's Insight Forums, cited by Senator Hassan in a Senate hearing on threats to national security, and used to advise the UK AI Safety Institute.
We plan to study dangerous capabilities in both open source and API-gated models in the following areas:
Automated hacking. Current AI systems can already automate parts of the cyber kill chain. We've demonstrated that GPT-4 can leverage known vulnerabilities to achieve remote code execution on unpatched Windows 7 machines. We plan to explore how AI systems could conduct reconnaissance, compromise target systems, and use information from compromised systems to pivot laterally through corporate networks or carry out social engineering attacks.
Spear phishing and deception. Preliminary research suggests that LLMs can be effectively used to phish targets. We're currently exploring how well AI systems can scrape personal information and leverage it to craft scalable spear-phishing campaigns. We also plan to study how well conversational AI systems could build rapport with targets to convince them to reveal information or take actions contrary to their interests.
Scalable disinformation. Researchers have begun to explore how LLMs can be used to create targeted disinformation campaigns at scale. We've demonstrated to policymakers how a combination of text, voice, and image generation models can be used to create a fake reputation-smearing campaign against a target journalist. We plan to study the cost, scalability, and effectiveness of AI-disinformation systems.
We are looking for
People who excel at:
Working with language models. We're looking for somebody who is or could quickly become very skilled at working with frontier language models. This includes supervised fine-tuning, using reward models/functions (RLHF/RLAIF), building scaffolding (e.g. in the style of AutoGPT), and prompt engineering / jailbreaking (a minimal sketch of such scaffolding appears after this list).
Software engineering. Alongside working with LMs, much of the work you do will benefit from a strong foundation in software engineering - such as when designing APIs, working with training data, or doing front-end development. Moreover, strong SWE experience will help you get up to speed with working with LMs, hacking, or new areas we want to pivot to.
Technical communication. By writing papers, blog posts, and internal documents; and by speaking with the team and external collaborators about your research.
While it's advantageous to excel at all three of these skills, we will strongly consider people who are either great at working with language models or at software engineering, while being able to communicate their work well.
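Since the post mentions scaffolding in the style of AutoGPT, here is a minimal sketch of what that can look like: a loop that lets a model call simple tools until it returns an answer. The query_model function is a hypothetical stand-in for whatever chat-completion client you use, and the JSON protocol is invented for the example.

```python
# Minimal sketch of LLM agent scaffolding: a loop that lets the model call
# tools until it returns a final answer. `query_model` is a hypothetical
# stand-in for a real chat-completion client; the JSON protocol is invented.
import json

def query_model(messages: list[dict]) -> str:
    """Hypothetical LLM call; swap in a real chat-completion client here."""
    raise NotImplementedError

TOOLS = {
    # Toy tool for illustration only; eval on untrusted input is unsafe in real use.
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}}, {})),
}

def run_agent(task: str, max_steps: int = 5) -> str:
    messages = [
        {"role": "system", "content": (
            'Complete the task. To use a tool, reply with JSON {"tool": ..., "input": ...}; '
            'to finish, reply with JSON {"answer": ...}. Available tools: ' + ", ".join(TOOLS))},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = json.loads(query_model(messages))
        if "answer" in reply:
            return reply["answer"]
        observation = TOOLS[reply["tool"]](reply["input"])
        messages.append({"role": "assistant", "content": json.dumps(reply)})
        messages.append({"role": "user", "content": f"Tool output: {observation}"})
    return "No answer within the step budget."
```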
Competenci...

Nov 11, 2023 • 4min
AF - Open Phil releases RFPs on LLM Benchmarks and Forecasting by Lawrence Chan
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Open Phil releases RFPs on LLM Benchmarks and Forecasting, published by Lawrence Chan on November 11, 2023 on The AI Alignment Forum.
As linked at the top of Ajeya's "do our RFPs accelerate LLM capabilities" post, Open Philanthropy (OP) recently released two requests for proposals (RFPs):
An RFP on LLM agent benchmarks: how do we accurately measure the real-world, impactful capabilities of LLM agents?
An RFP on forecasting the real-world impacts of LLMs: how can we understand and predict the broader real-world impacts of LLMs?
Note that the first RFP is both significantly more detailed and has narrower scope than the second one, and OP recommends you apply for the LLM benchmark RFP if your project may be a fit for both.
Brief details for each RFP below, though please read the RFPs for yourself if you plan to apply.
Benchmarking LLM agents on consequential real-world tasks
Link to RFP: https://www.openphilanthropy.org/rfp-llm-benchmarks
We want to fund benchmarks that allow researchers starting from very different places to come to much greater agreement about whether extreme capabilities and risks are plausible in the near-term. If LLM agents score highly on these benchmarks, a skeptical expert should hopefully become much more open to the possibility that they could soon automate large swathes of important professions and/or pose catastrophic risks. And conversely, if they score poorly, an expert who is highly concerned about imminent catastrophic risk should hopefully reduce their level of concern for the time being.
In particular, they're looking for benchmarks with the following three desiderata:
Construct validity: the benchmark accurately captures a potential real-world, impactful capability of LLM agents.
Consequential tasks: the benchmark features tasks that will have massive economic impact or can pose massive risks.
Continuous scale: scores on the benchmark improve relatively smoothly as LLM agents improve (that is, they don't jump from ~0% to >90% the way many existing LLM benchmarks have). A toy sketch of one way to achieve this follows below.
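To make the "continuous scale" point concrete, below is a toy partial-credit scorer over invented checkpointed tasks; graded subgoals are one simple way to get scores that move smoothly rather than jumping from near 0% to 90%.

```python
# Toy illustration of a continuous-scale benchmark: score an agent by the
# fraction of checkpoints it reaches within each task, rather than pass/fail
# on the whole task. Tasks and checkpoints are invented for illustration.

TASKS = {
    "book_flight": ["found_search_page", "entered_dates", "selected_flight", "completed_checkout"],
    "fix_bug": ["reproduced_bug", "located_fault", "patched_code", "tests_pass"],
}

def score_run(completed: dict[str, list[str]]) -> float:
    """Average, over tasks, of the fraction of checkpoints the agent reached."""
    fractions = []
    for task, checkpoints in TASKS.items():
        reached = sum(1 for c in checkpoints if c in completed.get(task, []))
        fractions.append(reached / len(checkpoints))
    return sum(fractions) / len(fractions)

# A weak agent earns partial credit instead of a flat zero:
weak_agent = {"book_flight": ["found_search_page", "entered_dates"], "fix_bug": ["reproduced_bug"]}
print(f"Weak agent score: {score_run(weak_agent):.3f}")  # 0.375
```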
Also, OP will do a virtual Q&A session for this RFP:
We will also be hosting a 90-minute webinar to answer questions about this RFP on Wednesday, November 29 at 10 AM Pacific / 1 PM Eastern (link to come).
Studying and forecasting the real-world impacts of systems built from LLMs
Link to RFP: https://www.openphilanthropy.org/rfp-llm-impacts/
This RFP is significantly less detailed, and primarily consists of a list of projects that OP may be willing to fund:
To this end, in addition to our request for proposals to create benchmarks for LLM agents, we are also seeking proposals for a wide variety of research projects which might shed light on what real-world impacts LLM systems could have over the next few years.
Here's the full list of projects they think could make a strong proposal:
Conducting randomized controlled trials to measure the extent to which access to LLM products can increase human productivity on real-world tasks. For example:
Polling members of the public about whether and how much they use LLM products, what tasks they use them for, and how useful they find them to be.
In-depth interviews with people working on deploying LLM agents in the real world.
Collecting "in the wild" case studies of LLM use, for example by scraping Reddit (e.g.
r/chatGPT), asking people to submit case studies to a dedicated database, or even partnering with a company to systematically collect examples from consenting customers.
Estimating key numbers and collecting them into one convenient place to support analysis.
Creating interactive experiences that allow people to directly make and test their guesses about what LLMs can do.
Eliciting expert forecasts about what LLM systems are likely to be able to do in the near future and what risks they might pose.
Synthesizing, summarizing, and ...


