

The Nonlinear Library
The Nonlinear Fund
The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org
Episodes

Nov 16, 2023 • 3min
AF - Evaluating AI Systems for Moral Status Using Self-Reports by Ethan Perez
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Evaluating AI Systems for Moral Status Using Self-Reports, published by Ethan Perez on November 16, 2023 on The AI Alignment Forum.
TLDR: In a new paper, we explore whether we could train future LLMs to accurately answer questions about themselves. If this works, LLM self-reports may help us test them for morally relevant states like consciousness.
We think it's possible to start preliminary experiments testing for moral status in language models now, so if you're interested in working with us, please reach out and/or apply to the Astra fellowship or SERI MATS (deadline November 17).
Tweet thread paper summary: Link
Abstract:
As AI systems become more advanced and widely deployed, there will likely be increasing debate over whether AI systems could have conscious experiences, desires, or other states of potential moral significance. It is important to inform these discussions with empirical evidence to the extent possible.
We argue that under the right circumstances, self-reports, or an AI system's statements about its own internal states, could provide an avenue for investigating whether AI systems have states of moral significance. Self-reports are the main way such states are assessed in humans ("Are you in pain?"), but self-reports from current systems like large language models are spurious for many reasons (e.g. often just reflecting what humans would say).
To make self-reports more appropriate for this purpose, we propose to train models to answer many kinds of questions about themselves with known answers, while avoiding or limiting training incentives that bias self-reports. The hope of this approach is that models will develop introspection-like capabilities, and that these capabilities will generalize to questions about states of moral significance.
We then propose methods for assessing the extent to which these techniques have succeeded: evaluating self-report consistency across contexts and between similar models, measuring the confidence and resilience of models' self-reports, and using interpretability to corroborate self-reports. We also discuss challenges for our approach, from philosophical difficulties in interpreting self-reports to technical reasons why our proposal might fail. We hope our discussion inspires philosophers and AI researchers to criticize and improve our proposed methodology, as well as to run experiments to test whether self-reports can be made reliable enough to provide information about states of moral significance.
See also this earlier post for more discussion on the relevance to alignment (AI systems that are suffering might be more likely to take catastrophic actions), as well as some initial criticisms of a preliminary version of our proposed test in the comments. We've added a significant amount of content to our updated proposal to help address several reservations people had about why the initial proposal might not work, so we're excited to get additional feedback and criticism on the latest version of our proposal as well.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

Nov 16, 2023 • 55min
LW - Social Dark Matter by [DEACTIVATED] Duncan Sabien
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Social Dark Matter, published by [DEACTIVATED] Duncan Sabien on November 16, 2023 on LessWrong.
You know it must be out there, but you mostly never see it.
Author's Note 1: In something like 75% of possible futures, this will be the last essay that I publish on LessWrong. Future content will be available on my substack, where I'm hoping people will be willing to chip in a little commensurate with the value of the writing, and (after a delay) on my personal site (not yet live). I decided to post this final essay here rather than silently switching over because many LessWrong readers would otherwise never find out that they could still get new Duncanthoughts elsewhere.
Author's Note 2: This essay is not intended to be revelatory. Instead, it's attempting to get the consequences of a few very obvious things lodged into your brain, such that they actually occur to you from time to time as opposed to occurring to you approximately never.
Most people could tell you that 17 + 26 = 43 after a few seconds of thought or figuring, and it would be silly to write an essay about 17 + 26 equaling 43 and pretend that it was somehow groundbreaking or non-obvious.
But! If the point was to get you to see the relationship between 17, 26, and 43 very, very clearly, and to remember it sufficiently well that you would reflexively think "43" any time you saw 17 and 26 together in the wild, it might be worth taking the time to go slowly and say a bunch of obvious things over and over until it started to stick.
Thanks to Karim Alaa for the concept title. If you seek tl;dr, read the outline on the left and then skip to IX.
I. #MeToo
In September of 2017, if you had asked men in the United States "what percentage of the women that you personally know have experienced sexual assault?" most of them would have likely said a fairly low number.
In October of 2017, the hashtag #MeToo went viral.
In November of 2017, if you had asked men in the United States "what percentage of the women that you personally know have experienced sexual assault?" most of them would have given a much higher number than before.
(It's difficult, for many people, to remember that they would have said a number that we now know to be outrageously low; by default most of us tend to project our present knowledge back onto our past selves. But the #MeToo movement was sufficiently recent, and the collective shock sufficiently well-documented, that we can, with a little bit of conscientious effort, resist the mass memory rewrite. Most of us were wrong. That's true even if you specifically were, in fact, right.)
Talking about sexual assault is not quite as taboo, in the United States, as it is in certain other cultures. There are places in the world where, if a woman is raped, she might well be murdered by her own family, or forcibly married off to the rapist, or any number of other horrible things, because the shame and stigma is so great that people will do almost anything to escape it.
(There are places in the world where, if a man is raped - what are you talking about? Men can't be raped!)
The U.S. is not quite that bad. But nevertheless, especially prior to October of 2017, sexual assault was still a thing that you Don't Ever Talk About At The Dinner Table, and Don't Bring Up At Work. It wasn't the sort of thing you spoke of in polite company
(or even in many cases with friends and confidants, because the subject is so charged and people are deeply uncomfortable with it and there are often entanglements when both parties know the perpetrator)
and since there was pressure to avoid discussing it, people tended not to discuss it.
(Like I said, a lot of this will be obvious.)
And because people didn't discuss it, a lot of people (especially though not always men) were genuinely shocked at just how common, prevalent, pervasive it ...

Nov 16, 2023 • 14min
LW - In Defense of Parselmouths by Screwtape
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: In Defense of Parselmouths, published by Screwtape on November 16, 2023 on LessWrong.
Prerequisites: The Quaker and the Parselmouth.
I.
First, a quick summary.
In the prerequisite post, Benjamin Hoffman describes three kinds of people. These people are hypothetical extremes: they're the social and epistemic equivalents of perfect spheres interacting in frictionless vacuums. There are Quakers, who always tell the truth and keep their word when they say they'll do something. There are Actors, who always say what seems good to say at the moment and who don't reliably keep their word even if they swear an oath. Lastly, there are Parselmouths, who can lie freely to Actors but speak only the truth to other Parselmouths and (by implication) speak only truth to Quakers.
I approve of this distinction. It is abstracted and the real world is never this clear, but in my experience it does get at something useful to understand. I think truthtelling is a powerful institutional advantage, and wish more people were Quakers in this dichotomy. Benjamin points out that Parselmouths are somewhat odd, in that habitually telling lies likely erodes the instinct or maybe even ability to tell the truth; it may not be possible for real people to stay consistently Parselmouths without slowly becoming Actors.
Speaking truth is hard. It's hard work to figure out what the true state of the world is. It's hard to quickly and accurately state what you think is true; the English language makes "I believe there's a ninety percent chance of rain tomorrow" a much longer sentence than "it's going to rain tomorrow." There's a lot of extra emotional sharp elbows you wind up throwing when someone asks you how you liked the (burned and unseasoned) casserole they brought to the potluck.
Quakers of the world, I salute you. Actors of the world, I get it.
My first claim is that it's reasonable to be a Parselmouth.
II.
Storytime! The following story details events that happened about two decades ago, when I was several feet shorter than I am now. Some details have been substantiated by other people who were around at the time, but many likely have morphed over the years.
When I was a kid, I had to get a bunch of shots. My mom took me into the office, and I goofed around in the waiting area for a little bit before a nurse waved me past the front desk and Mom and I went in. The nurse sat me down in the doctor's office on a big plastic chair and rubbed my shoulder with something cold while asking my mother questions, then she asked me to sit still for a moment and said "This won't hurt a bit. Are you ready?" I nodded. Then she stabbed me with a needle.
It hurt. I started crying, and continued crying for some time, well after the pain had faded to a dull ache. No amount of consoling from my parents or treats from the nurse changed this. I did not have the ability to articulate what made me upset then, but it was not the pain (even as a child, I had a remarkably high tolerance for pain when it had a purpose) but the confusion. It wasn't supposed to hurt - were they wrong about whether it would hurt? That didn't make sense; sticking a sharp thing into someone usually hurts them, so why would someone think it wouldn't? Did I misremember what they said, and they said it would hurt instead of that it wouldn't? Is my memory really that fallible? I was utterly confused, and couldn't make sense of what happened.
With the benefit of years of experience, it's obvious what happened. The nurse lied to keep a small child still while giving them a shot. This story would repeat itself for years, and I would be bewildered and confused each time. The hypothesis that someone would simply lie would not occur to me until much later, after an epiphany on how the world regarded truth.
While painful, that understanding turned out to be a useful skele...

Nov 16, 2023 • 37min
LW - 'Theories of Values' and 'Theories of Agents': confusions, musings and desiderata by Mateusz Bagiński
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: 'Theories of Values' and 'Theories of Agents': confusions, musings and desiderata, published by Mateusz Bagiński on November 16, 2023 on LessWrong.
Meta:
Content signposts: we talk about limits to expected utility theory; what values are (and ways in which we're confused about what values are); the need for a "generative"/developmental logic of agents (and their values); types of constraints on the "shape" of agents; relationships to FEP/active inference; and (ir)rational/(il)legitimate value change.
Context: we're basically just chatting about topics of mutual interests, so the conversation is relatively free-wheeling and includes a decent amount of "creative speculation".
Epistemic status: involves a bunch of "creative speculation" that we don't think is true at face value and which may or may not turn out to be useful for making progress on deconfusing our understanding of the respective territory.
Expected utility theory (stated in terms of the VNM axioms or something equivalent) thinks of rational agents as composed of two "parts", i.e., beliefs and preferences. Beliefs are expressed in terms of probabilities that are being updated in the process of learning (e.g., Bayesian updating). Preferences can be expressed as an ordering over alternative states of the world or outcomes or something similar. If we assume an agent's set of preferences to satisfy the four VNM axioms (or some equivalent desiderata), then those preferences can be expressed with some real-valued utility function u and the agent will behave as if they were maximizing that u.
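A compact way to state that representation result (a standard formulation of the VNM theorem, added here for reference; it is not part of the original dialogue):

```latex
% Standard VNM representation theorem (reference restatement, not from the dialogue):
% if the preference relation \succeq over lotteries p, q satisfies the four axioms
% (completeness, transitivity, continuity, independence), then there exists a
% utility function u, unique up to positive affine transformation, such that
\[
  p \succeq q \;\iff\; \mathbb{E}_{p}[u] \ge \mathbb{E}_{q}[u],
  \qquad \text{where } \mathbb{E}_{p}[u] = \sum_{x} p(x)\,u(x).
\]
```

In these terms, "maximizing preference satisfaction" below just means choosing whichever available option has the larger expected value of u.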
On this account, beliefs change in response to evidence, whereas values/preferences in most cases don't. Rational behavior comes down to (behaving as if one is) ~maximizing one's preference satisfaction/expected utility. Most changes to one's preferences are detrimental to their satisfaction, so rational agents should want to keep their preferences unchanged (i.e., utility function preservation is an instrumentally convergent goal).
Thus, for a preference modification to be rational, it would have to result in higher expected utility than leaving the preferences unchanged. My impression is that the most often discussed setup where this is the case involves interactions between two or more agents. For example, if you and some other agent have somewhat conflicting preferences, you may agree on a compromise where each of you makes their preferences somewhat more similar to the preferences of the other. This costs both of you a bit of (expected subjective) utility, but less than you would lose (in expectation) if you engaged in destructive conflict.
Another scenario justifying modification of one's preferences is when you realize the world is different than you expected on your priors, such that you need to abandon the old ontology and/or readjust it. If your preferences were defined in terms of (or strongly entangled with) concepts from the previous ontology, then you will also need to refactor your preferences.
You think that this is a confused way to think about rationality. For example, you see self-induced/voluntary value change as something that in some cases is legitimate/rational.
I'd like to elicit some of your thoughts about value change in humans. What makes a specific case of value change (il)legitimate? How is that tied to the concepts of rationality, agency, etc? Once we're done with that, we can talk more generally about arguments for why the values of an agent/system should not be fixed.
Sounds good?
On a meta note: I've been using the words "preference" and "value" more or less interchangeably, without giving much thought to it. Do you view them as interchangeable or would you rather first make some conceptual/terminological clarification?
Sounds great!
(And I'm happy to use "preferences" and "values" interc...

Nov 16, 2023 • 2min
EA - Economics of Animal Welfare: Call for Abstracts by Bob Fischer
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Economics of Animal Welfare: Call for Abstracts, published by Bob Fischer on November 16, 2023 on The Effective Altruism Forum.
Brown University's Department of Economics and Center for Philosophy, Politics, and Economics are hosting an interdisciplinary conference on the economics of animal welfare on July 11-12, 2024.
This conference aims to build on successful workshops on this topic at Duke University, Stanford University, and the Paris School of Economics. We welcome submissions on a range of topics that apply economic methods to understand how to value or improve animal welfare. This includes theoretical work on including losses or benefits to animals in economic analyses, applied empirical work on the effects of policies or industry structure on animal welfare, and anything else within the purview of economics as it relates to the well-being of commodity, companion, or wild animals.
We invite 300-word abstracts from economists and those in relevant fields, including animal welfare science, political science, and philosophy. In addition to full presentations, we also welcome "ideas in development" from graduate students or early-stage researchers that can be presented in less than 10 minutes.
Please submit abstracts and ideas-in-progress by January 15, 2024 via this form. General attendance registration will open in January 2024.
Travel support to Providence will be provided for all accepted speakers.
A limited number of travel bursaries are available for graduate students and predoctoral researchers to attend without presenting a paper. Please apply for non-speaker travel funding in the link above.
Vegan meals will be provided.
While this is an in-person event, a limited number of remote presentations may be possible.
ORGANIZED BY:
Bob Fischer, Department of Philosophy, Texas State University
Anya Marchenko, Department of Economics, Brown University
Kevin Kuruc, Population Wellbeing Initiative, University of Texas at Austin
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Nov 16, 2023 • 3min
LW - Extrapolating from Five Words by Gordon Seidoh Worley
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Extrapolating from Five Words, published by Gordon Seidoh Worley on November 16, 2023 on LessWrong.
If you only get about five words to convey an idea, what will someone extrapolate from those five words? Rather than guess, you can use LLMs to experimentally discover what people are likely to think those five words mean. You can use this to iterate on what five words you want to say in order to best convey your intended meaning.
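As a rough illustration of that iteration loop (a minimal sketch using the current Anthropic Python SDK; the model name, prompt wording, and candidate phrases are placeholders I've chosen for illustration, not the author's actual setup, which used the Claude chat interface):

```python
# Minimal sketch: ask an LLM what it thinks a ~five-word phrase means,
# so you can compare candidate phrasings before committing to one.
# Assumes the Anthropic Python SDK is installed and ANTHROPIC_API_KEY is set;
# the model ID and prompt wording are placeholders, not from the original post.
import anthropic

client = anthropic.Anthropic()

candidates = [
    "the problem of the criterion matters",
    "the problem of the criterion is important",
    "uncertainty undermines knowledge",
]

for phrase in candidates:
    response = client.messages.create(
        model="claude-3-haiku-20240307",  # placeholder model ID
        max_tokens=300,
        messages=[{
            "role": "user",
            "content": (
                f"If someone told you only this: '{phrase}', what would you "
                "take them to mean? Summarize the claim you think they are making."
            ),
        }],
    )
    print(f"--- {phrase} ---")
    print(response.content[0].text)
```

Comparing the outputs side by side is the experiment: small wording changes show up as different extrapolations.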
I got this idea because I tried asking Claude to summarize an article at a link. Claude doesn't follow links, so it instead hallucinated a summary from the title, which was included in the URL path. Here's an example of it doing this with one of my LessWrong posts:
It hallucinates some wrong details and leaves out lots of details that are actually in the post, but it's not totally off the mark here. If my ~Five Words were "the problem of the criterion matters", this would be a reasonable extrapolation of why I would say that.
Rather than using a link, I can also ask Claude to come up with what it thinks I would have put in a post with a particular title:
Strangely it does worse here in some ways and better in others. Unlike when it hallucinated the summary of the link, this time it came up with things I would absolutely not say or want someone to come away with, like the idea that we could resolve the problem of the criterion enough to have objective criteria for knowledge.
But maybe prompting it about LessWrong was the issue, since LessWrong puts off a lot of positivist vibes, Eliezer's claims to the contrary notwithstanding. So I tried a different prompt:
This is fine? It's not great. It sounds like a summary of the kind of essay a bored philosophy undergrad would write for their epistemology class.
Let me try asking it some version of "what do my ~Five Words mean?":
This is pretty good, and basically what I would expect someone to take away from me saying "the problem of the criterion matters". Let's see what happens if I tweak the language:
Neat! It's picked up on a lot of nuance implied by saying "important" rather than "matters". This would be useful for trying out different variations on a phrase to see what those small variations change about the implied meaning. I could see this being useful for tasks like wordsmithing company values and missions and other short phrases where each word has to carry a lot of meaning.
Now let's see if it can do the task in reverse!
Honestly, "uncertainty undermines knowledge" might be better than anything I've ever come up with. Thanks, Claude!
As a final check, can Claude extrapolate from its own summary?
Clearly it's lost some of the details, particularly about the problem of the criterion, and has made up some things I wasn't trying to have it get at. Seems par for the course in terms of condensing down a nuanced message into about five words and still having the core of the message conveyed.
Okay, final test, what can Claude extrapolate from typical statements I might make about my favorite topic, fundamental uncertainty?
Hmm, okay, but not great. Maybe I should try to find another phrase to point to my ideas? Let's see what it thinks about "fundamental uncertainty" as a book title:
Close enough. I probably don't need to retitle my book, but I might need to work on a good subtitle.
Based on the above experiment in prompt engineering, Claude is reasonably helpful at iterating on summaries of short phrases. It was able to pick up on subtle nuance, and that's really useful for finding the right short phrase to convey a big idea. The next time I need to construct a short phrase to convey a complex idea, I will likely iterate the wording using Claude or another LLM.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Nov 16, 2023 • 5min
EA - How we approach charity staff pay and benefits at Giving What We Can by Luke Freeman
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How we approach charity staff pay and benefits at Giving What We Can, published by Luke Freeman on November 16, 2023 on The Effective Altruism Forum.
As an international charity with a talented global team, one challenging decision we face is how to pay our team members and provide benefits ("remuneration"). We grapple with several key questions:
What's ethical?
What's fair?
What's expected?
What would funders approve of?
How do we attract and retain high-quality talent while maintaining a focus on our own cost-effectiveness?
These questions become even more challenging within the nonprofit sector, where perspectives on pay are incredibly varied. Yet, it's crucial we discuss this openly, as staff remuneration often represents one of the most significant expenditures for an organisation.
Our ethos
In line with our mission to create a culture of effective and significant giving, we believe it's a reasonable expectation that our team members earn a salary that would enable them to comfortably donate 10% of their income, should they choose to.
Working at GWWC should not necessitate undue financial sacrifice, nor should it be primarily motivated by financial gain. Rather, we seek to attract individuals who are both highly skilled and deeply committed to effective giving. If someone's primary motivation leans toward earning potential, we would wholeheartedly encourage them to explore 'earning-to-give' opportunities instead.
How our pay calculator works
So, how does this ethos translate into actual numbers? We have built a calculator that incorporates the following (a rough worked sketch follows the list):
We use a salary band system where our second band (e.g. a junior associate-level role) starts with a base salary pegged to the average income in Oxford.
With each promotion to a new level (within or between bands) the base pay increases by 10%.
Depending on the person's location, we adjust 50% of the base salary by relative cost-of-living as a starting point, and make ~annual adjustments to account for factors like inflation and location-based cost-of-living changes.
We adjust upwards for experience (500 GBP per pre-GWWC relevance-adjusted FTE year and 1,000 per year at GWWC) with a cap of 10,000 GBP.
We have a scaling "competitive skills bonus" for a few roles (e.g., software engineering) that are typically very highly compensated by current markets and therefore difficult to hire for in our context.
We recalculate each staff member's remuneration annually and after any significant change in their role or location.
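As a rough worked sketch of the calculation described above (the Oxford base figure, levels, locations, and the interpretation of the experience cap are illustrative assumptions, not GWWC's actual numbers):

```python
# Rough sketch of the pay formula described above. All figures (Oxford base
# salary, cost-of-living index, example inputs) are illustrative placeholders,
# not GWWC's actual numbers; the 10,000 GBP cap is assumed to apply to the
# total experience uplift.

def gwwc_salary_sketch(
    levels_above_band_two: int,     # promotions above the second band's starting point
    col_index: float,               # location cost of living relative to Oxford (1.0 = Oxford)
    pre_gwwc_years: float,          # relevance-adjusted FTE years before GWWC
    gwwc_years: float,              # years at GWWC
    oxford_base: float = 40_000.0,  # placeholder for the Oxford average income (GBP)
) -> float:
    # Base pay rises 10% with each promotion to a new level.
    base = oxford_base * (1.10 ** levels_above_band_two)

    # Only 50% of the base salary is adjusted by relative cost of living.
    location_adjusted = 0.5 * base + 0.5 * base * col_index

    # Experience uplift: 500 GBP per pre-GWWC year, 1,000 GBP per GWWC year,
    # capped at 10,000 GBP (cap interpretation assumed).
    experience = min(500 * pre_gwwc_years + 1_000 * gwwc_years, 10_000)

    return location_adjusted + experience


# Example: two promotions, a location 20% cheaper than Oxford,
# three relevance-adjusted prior years and two years at GWWC.
print(round(gwwc_salary_sketch(2, 0.8, 3, 2)))  # ~47,060 with these placeholder inputs
```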
It's not perfect, but we feel it's a good start that strikes a balance between vastly different potential approaches. We hope that by sharing it and receiving critiques, we can continue to make adjustments in consultation with our team and our funders.
Results
The pay calculator tends to result in salaries that are higher than at most non-profits but below what a similar role would pay at a for-profit, and often well below what someone with high earning potential could make if they were choosing a career with an eye to earning as much as possible. It also gives lower increases with seniority than are common in the for-profit world, resulting in a lower pay ratio between the highest-paid and lowest-paid employees. The financial sacrifice/incentive for working at GWWC does vary depending on your location, but we strive to make it reasonable and to find a good balance.
Benefits
Benefits are another critical aspect of our remuneration package. It can be challenging to harmonise benefits like retirement contributions, healthcare, childcare, training, parental leave, and office equipment across different locations, but we make a concerted effort to offer balanced packages for staff.
Offer letter
In our offer letter we share with the prospective team member their salary calculation and outline the benefit...

Nov 16, 2023 • 8min
EA - Announcing Giving Green's 2023 Top Climate Nonprofit Recommendations by Giving Green
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Announcing Giving Green's 2023 Top Climate Nonprofit Recommendations, published by Giving Green on November 16, 2023 on The Effective Altruism Forum.
What is Giving Green?
Giving Green is an EA-affiliated charity evaluator that helps donors direct funds to the highest-impact organizations looking to mitigate climate change. We believe that individuals can make a real impact by reshaping the laws, norms, and systems that perpetuate unsustainable emissions. Our annual list of recommendations helps direct donors towards high-impact climate nonprofits advocating for systemic change.
How does Giving Green work?
We spent the past year finding timely giving strategies that have a huge potential impact but are relatively neglected by traditional climate funding.
Our process starts by assessing various impact strategies and narrowing in on ones that we believed could substantially reduce emissions, were feasible, and needed more funding (Figure 1). After developing a short list of impact areas, we explored the ecosystem of nonprofits operating in each space by speaking directly with organizations and other stakeholders. We used our findings to evaluate each organization's theory of change and its capacity to absorb additional funding.
For more information, see Giving Green's Research Process.
Figure 1: Giving Green's process for identifying and assessing nonprofits
What climate nonprofits does Giving Green recommend for 2023?
Our findings led us to double down on one pathway where we believe climate donors can have an outsized impact:
Advancing key climate technologies through policy advocacy, research, and market support.
We think technological progress provides a uniquely powerful and feasible way to decarbonize, allowing the green transition to proceed while minimizing costs to quality of life and the economy.
For 2023, we highlight five key sectors ripe for innovation: next-generation geothermal energy, advanced nuclear, alternative protein innovation, industrial decarbonization, and shipping and aviation decarbonization; within those, we recommend six top climate charities (Figure 2).
Figure 2: Giving Green's 2023 top climate nonprofit recommendations
Below, you will find a brief overview of Giving Green's recommendations in reverse alphabetical order.
Project InnerSpace
Deep underground, the Earth's crust holds abundant heat that can supply renewable, carbon-free heat and reliable, on-demand electricity. However, conventional geothermal systems have been limited to places bordering the Earth's tectonic plates.
Project InnerSpace is fast-tracking next-generation technologies that can make geothermal energy available worldwide. It has a bold plan to reduce financial risks for new geothermal projects, making geothermal energy cheaper and more accessible, especially in densely populated areas in the Global South.
We believe Project InnerSpace is a top player in the geothermal sector and that its emphasis on fast technology development and cost reduction can help geothermal expand globally.
For more information, see our Project InnerSpace recommendation summary.
Opportunity Green
Aviation and maritime shipping are challenging sectors to decarbonize and have not received much support from philanthropy in the past.
Opportunity Green has a multi-pronged strategy for reducing emissions from aviation and maritime shipping. It pushes for ambitious regulations, promotes clean fuels, encourages companies to adopt greener fleets, and works to reduce demand for air travel.
We think Opportunity Green has a strong theory of change that covers multiple ways to make a difference. We are especially excited about Opportunity Green's efforts to elevate climate-vulnerable countries in policy discussions, as we think this could improve the inclusivity of the process and the ambition level of...

Nov 16, 2023 • 42sec
EA - TED talk on Moloch and AI by LivBoeree
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: TED talk on Moloch and AI, published by LivBoeree on November 16, 2023 on The Effective Altruism Forum.
Hey folks, Liv Boeree here - I recently did a TED talk on Moloch (a.k.a. the multipolar trap) and how it threatens safe AI development. Posting it here to a) raise awareness and b) get feedback from the community, given the relevancy of the topic.
And of course, if any of you are active on social media, I'd really appreciate it being shared as widely as possible, thank you!
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Nov 16, 2023 • 10min
EA - Some more marginal Long-Term Future Fund grants by calebp
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some more marginal Long-Term Future Fund grants, published by calebp on November 16, 2023 on The Effective Altruism Forum.
These fictional grants represent the most promising applications we turn down due to insufficient funding.
Throughout the text, 'I' refers to Caleb Parikh, and 'we' refers to both Caleb Parikh and Linch Zhang. This reflects the perspectives of two individuals who are very familiar with the Long-Term Future Fund (LTFF). However, others associated with the LTFF might not agree that this accurately represents their impression of the LTFF's marginal (rejected) grants.
Fictional grants that we rejected but were very close to our funding bar
Each grant is based on 1-3 real applications we have received in the past ~three months. You can see our original LTFF marginal funding post here, and our post on the usefulness of funding the EAIF and LTFF here.[1] Please note that these are a few of the most promising grants we've recently turned down - not the average rejected grant. [2]
($25,000)~ Funding to continue research on a multi-modal chess language model, focusing on alignment and interpretability. The project involves optimizing a data extraction pipeline, refining the model's behaviour to be less aggressive, and exploring ways to modify the model training. Additional tasks include developing a simple Encoder-Decoder chess language model as a benchmark and writing an article about AI safety.
The primary objective is to develop methods ensuring that multi-modal models act according to high-level behavioural priorities. The applicant's background includes experience as a machine learning engineer and in chess, both competing and developing predictive models. The past year's work under a previous LTFF grant resulted in a training dataset and some initial analysis, laying the groundwork for this continued research.
($25,000) ~ Four months' salary for a former academic to tackle some unusually tractable research problems in disaster resilience after large-scale GCRs. Their work would focus on researching Australia's resilience to a northern hemisphere nuclear war. Their track record included several papers in high-impact factor journals, and their past experiences and networks made them well-positioned for further work in this area. The grantee would also work on public outreach to inform the Australian public about nuclear risks and resilience strategies.
($50,000)~ Six months of career transition funding to help the applicant enter a technical AI safety role. The applicant has seven years of software engineering experience at prominent tech companies and aims to pivot their career towards AI safety. They'll focus on interpretability experiments with Leela Go Zero during the grant.
The grant covers 50% of their previous salary and will facilitate upskilling in AI safety, completion of technical courses, and preparation for interviews with AI safety organizations. They have pivoted their career successfully in the past and have been actively engaged in the effective altruism community, co-running a local group and attending international conferences. This is their first funding request.
($40,000)~ Six months dedicated to exploring and contributing to AI governance initiatives, focusing on policy development and lobbying in Washington, D.C. The applicant seeks to build expertise and networks in AI governance, aiming to talk with over 50 professionals in the field and apply to multiple roles in this domain. The grant will support efforts to increase the probability of the U.S. government enacting legislation to manage the development of frontier AI technologies.
The applicant's background includes some experience in AI policy and a strong commitment to effective altruism principles. The applicant has fewer than three years of professional experience and an undergraduate degree ...


