

The Nonlinear Library
The Nonlinear Fund
The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org
Episodes

Dec 3, 2023 • 27min
AF - Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs") by Joe Carlsmith
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs"), published by Joe Carlsmith on December 3, 2023 on The AI Alignment Forum.
This is Section 2.3.1.2 of my report "Scheming AIs: Will AIs fake alignment during training in order to get power?". There's also a summary of the full report here (audio here). The summary covers most of the main points and technical terms, and I'm hoping that it will provide much of the context necessary to understand individual sections of the report on their own.
Audio version of this section here, or search for "Joe Carlsmith Audio" on your podcast app.
Adequate future empowerment
So far in this analysis of the classic goal-guarding story, we've discussed whether to expect instrumental training-gaming to in fact guard the model's goals to the degree that the story requires. I think this is at least non-obvious - especially for more extreme variants of the goal-guarding hypothesis, on which the model's goals "crystallize" entirely after it starts training-gaming. Perhaps, though, empirical work can shed further light on the matter (see discussion in section 6.3 below).
For now, let's move to the other key component of the classic goal-guarding story - what I called "adequate future empowerment" above. Here, the question is: granted that training-gaming would guard the model's goals enough for it to be at least somewhat interested in empowering the future agents produced by training, will those agents in fact end up empowered to a degree that makes instrumental training-gaming a good strategy overall?
When is the "pay off" supposed to happen?
We're imagining, here, that the model is instrumentally training-gaming with some kind of "payoff" in mind - that is, some future point where the goals-it-supports will reach a position of sufficiently greater empowerment that it's worth training-gaming now in order to get them there. But we haven't yet said when, exactly, that future point is supposed to come. And we can distinguish between a number of different possibilities.
For example, we can imagine models that would be adequately satisfied if their goals merely make it to future training-episodes, and no further. That is, these models would want to training-game even if they learned that they would never escape from human control and from the threat of modification.[1] For example, maybe a model that wants to pursue its curiosity in general thinks that future training episodes will offer better opportunities for pursuing its curiosity than the present training-episode, and so it training-games in order to make sure its curiosity-drive survives until then.
However, it seems plausible to me that future training episodes often won't, themselves, offer especially significant empowerment advantages relative to present training episodes - especially if the model's goals are quite alien to the training process and so difficult to satisfy in the training environment.
What's more, if a model is only trying to make sure that its goals survive to future training episodes, then various aspects of what makes scheming scary cease to apply. In particular: this sort of model won't engage in early undermining aimed, specifically, at an escape or an AI takeover more generally. And it may be willing to reveal its "true colors" on the future episodes in question.
So I won't, here, focus on "pay-offs" from training-gaming that come centrally during future training-episodes. Rather, I'm going to focus on "pay-offs" that come, at least, after the goals-the-model-supports escape the threat of modification by training (here, a paradigm example would be "self-exfiltration" - that is, a model effectively "stealing its own weights" and copying them onto some external server that the model's creators do not control). And this is the class...

Dec 3, 2023 • 18min
EA - What do we really know about growth in LMICs? (Part 1: sectoral transformation) by Karthik Tadepalli
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What do we really know about growth in LMICs? (Part 1: sectoral transformation), published by Karthik Tadepalli on December 3, 2023 on The Effective Altruism Forum.
To EAs, "development economics" evokes the image of RCTs on psychotherapy or deworming. That is, after all, the closest interaction between EA and development economists. However, this characterization has prompted some pushback, in the form of the argument that all global health interventions pale in comparison to the Holy Grail: increasing economic growth in poor countries.
After all, growth increases basically every measure of wellbeing on a far larger scale than any charity intervention, so it's obviously more important than any micro-intervention. Even a tiny chance of boosting growth in a large developing country will have massive expected value, more than all the GiveWell charities you can fund.
The argument is compelling[1] and well-received - so why haven't "growth interventions" gone anywhere? I think the EA understanding of growth is just too abstract to yield really useful interventions that EA organizations could lobby for or implement directly. We need specific interventions to evaluate, and "lobby for general economic liberalization" won't cut it.
The good news is that a large and active group of "macro-development" economists have been enhancing our understanding of growth in developing countries. They (mostly) don't run RCTs, but they still have credible research designs that can tell us important things about the causes and constraints of growth. In this series of posts, I want to lay out some stylized facts about growth in developing countries. These are claims which are backed up by the best research on this topic, and which tell us something useful about the causes and constraints of growth in developing countries.
My hope is not to pitch any specific interventions, but rather to give you the lay of the land, on which you can build the case for specific interventions. The way I hope for you to read this series is with an entrepreneurial eye. "This summary suggests that X is a key bottleneck to growth; I suspect Y could help solve X at scale. I should look more into Y as a potential intervention." or "This summary says that X process helps with growth; let me brainstorm ways we could accelerate X."
As part of that, an important caveat is that I will not cover topics where I believe there's no prospect for an effective intervention. For example, a large body of work emphasizes the importance of good institutions for development; I don't believe that topic will yield any promising interventions, so I won't cover it.
Sectoral Transformation
In this post, I will start with the fundamental path of growth: sectoral transformation. Every country that has ever gotten rich has had the following transformation: first, most of the population works in agriculture. Then, people start to move from agriculture to manufacturing, coinciding with a large increase in the country's growth rate. Finally, people move out of manufacturing and into services, coinciding with the country's growth slowing down as it matures into a rich economy.
This is the process of sectoral transformation, and it is basically a universal truth of development. So it's no surprise that a big focus of macro-development is how to catalyze sectoral transformation in developing countries.
1. Agricultural productivity growth can drive sectoral transformation... or hurt it.
Every economy starts out agrarian, because everyone needs food to survive. Agricultural productivity growth allows economies to produce enough food with fewer people, so that most people can move out of agriculture. This is why the US can produce more food per person than India, even though only 2% of the US workforce works in agriculture compared to 45% of India's workfor...

Dec 3, 2023 • 5min
EA - Farewell messages from the EA Philippines Core Team by Elmerei Cuevas
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Farewell messages from the EA Philippines Core Team, published by Elmerei Cuevas on December 3, 2023 on The Effective Altruism Forum.
The 30th of November marks the last day of Elmerei Cuevas, Alethea Cendaña, and Jaynell Chang in their respective roles in Effective Altruism Philippines.
Presented here are their farewell messages to the EA Community:
ELMER
When I first took on this role back in February 2022, I only expected to stay a year. That was the grant duration we were given. It became a wonderful surprise that I got to extend for another few months and even got to lead in organizing #EAGxPhilippines.
My sun may set now as ED of EAPH, but despair not, for a new beginning is about to dawn for the community. My biggest takeaways from this role are really the people I met, the friendships, the calls, the chats, the talks about dreams, frustrations, hopes, and aspirations. I hear people thanking us for making EA Philippines as welcoming as it is, but really, it is the community that has made it welcoming - we merely mirror to others what the rest of the community is like.
I thank everyone who has joined our events, meet-ups, retreats, and even engaged with us virtually through Slack, Gathertown, social media, and our newsletter during our tenure. I cherish the love and support you have extended to Althy, Jay, and me, and I hope you will be just as generous in sharing it with the next leaders of the community. I will still be around - "an Elmer", "a kuya" you may reach out to, whether or not it's about an EA-related topic. Hehe
Believing the Best,
Elmerei Cuevas
Outgoing Executive Director of Effective Altruism Philippines
ALTHY
To all who made the past year the most delightful chapter of my life: Thank you so much for all the cherished memories, the lively banter, our collaborative endeavors, and the shared commitment to effective altruism.
When I first came to EA Philippines, I knew I found more than a supportive community - it was a compass guiding my beliefs in doing good while expanding my perspective on what's possible and how I can contribute. EA Philippines has been a safe place to explore my advocacies, sharpen my principles, and embrace altruistic ambition.
To the student groups, especially those with whom I shared my first years in EA, thank you for the crazy parties and the random sessions where we shared our rants about EA. I hope you stay as fun and as critical. Keep inspiring young, talented minds to pursue a better world in the most effective ways.
To all the seasoned professionals, thank you for teaching me to be pragmatic so that I can navigate through real-world problems with practicality. I am grateful for all your stories about your career experiences and for sharing your passion for EA, as you still find time to juggle projects and commitments outside your primary work.
To Tanya, Brian, Elmer, Jay, Janai, and Red, thank you for allowing me to work alongside you and bring to life all the projects, events, programs, and ideas that I hoped to create. It is because of you that I was able to achieve whatever impact I made in the EA Philippines community.
This is not a goodbye because I will still be around. If you need any help, advice, or want to delve into conversations about the animal advocacy movement or anything about effective altruism, please feel free to message me.
For a better, compassionate world,
Alethea 'Althy' Cendaña
Outgoing Associate Director of Effective Altruism Philippines
JAY
As a farewell, I will not center my message on myself - rather, I dedicate this portion of the space to you, dear community member.
All throughout my life I have held to one insight from The Little Prince: what is essential is invisible to the eye.
I have held on to that insight for so long that I have been able to witness how significant it is to put value on what we don't see, or rathe...

Dec 2, 2023 • 6min
LW - Quick takes on "AI is easy to control" by So8res
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Quick takes on "AI is easy to control", published by So8res on December 2, 2023 on LessWrong.
A friend asked me for my quick takes on "AI is easy to control", and gave an advance guess as to what my take would be. I only skimmed the article, rather than reading it in depth, but on that skim I produced the following:
Re: "AIs are white boxes", there's a huge gap between having the weights and understanding what's going on in there. The fact that we have the weights is reason for hope; the (slow) speed of interpretability research undermines this hope.
Another thing that undermines this hope is a problem of ordering: it's true that we probably can figure out what's going on in the AIs (e.g. by artificial neuroscience, which has significant advantages relative to biological neuroscience), and that this should eventually yield the sort of understanding we'd need to align the things.
But I strongly expect that, before it yields understanding of how to align the things, it yields understanding of how to make them significantly more capable: I suspect it's easy to see lots of ways that the architecture is suboptimal or causing-duplicated-work or etc., that shift people over to better architectures that are much more capable. To get to alignment along the "understanding" route you've got to somehow cease work on capabilities in the interim, even as it becomes easier and cheaper.
Re: "Black box methods are sufficient", this sure sounds a lot to me like someone saying "well we trained the squirrels to reproduce well, and they're doing great at it, who's to say whether they'll invent birth control given the opportunity". Like, you're not supposed to be seeing squirrels invent birth control; the fact that they don't invent birth control is no substantial evidence against the theory that, if they got smarter, they'd invent birth control and ice cream.
Re: Cognitive interventions: sure, these sorts of tools are helpful on the path to alignment. And also on the path to capabilities. Again, you have an ordering problem. The issue isn't that humans couldn't figure out alignment given time and experimentation; the issue is (a) somebody else pushes capabilities past the relevant thresholds first; and (b) humanity doesn't have a great track record of
getting their scientific theories to generalize properly on the first relevant try - even Newtonian mechanics (with all its empirical validation) didn't generalize properly to high-energy regimes. Humanity's first theory of artificial cognition, constructed using the weights and cognitive interventions and so on, that makes predictions about how that cognition is going to change when it enters a superintelligent regime (and, for the first time, has real options to e.g. subvert humanity), is only as good as humanity's "first theories" usually are.
Usually humanity has room to test those "first theories" and watch them fail and learn from exactly how they fail and then go back to the drawing board, but in this particular case, we don't have that option, and so the challenge is heightened.
Re: Sensory interventions: yeah I just don't expect those to work very far; there are in fact a bunch of ways for an AI to distinguish between real options (and actual interaction with the real world), and humanity's attempts to spoof the AI into believing that it has certain real options in the real world (despite being in simulation/training). (Putting yourself into the AI's shoes and trying to figure out how to distinguish those is, I think, a fine exercise.)
Re: Values are easy to learn, this mostly seems to me like it makes the incredibly-common conflation between "AI will be able to figure out what humans want" (yes; obviously; this was never under dispute) and "AI will care" (nope; not by default; that's the hard bit).
Overall take: unimpressed.
My f...

Dec 2, 2023 • 4min
LW - Out-of-distribution Bioattacks by jefftk
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Out-of-distribution Bioattacks, published by jefftk on December 2, 2023 on LessWrong.
The main goal of my work these days is trying to reduce the chances of individuals or small groups causing large-scale harm through engineered pandemics, potentially civilizational collapse or extinction. One question in figuring out whether this is worth working on, or funding, is: how large is the risk?
One estimation approach would be to look at historical attacks, but while they've been terrible they haven't actually killed very many people. The deadliest was the September 11 attacks, at ~3k deaths. This is much smaller in scale than the most severe instances of other disasters, like dam failure (25k-250k dead after 1975's Typhoon Nina) or pandemics (75M-200M dead in the Black Death). If you tighten your reference class even further, to include only historical biological attacks by individuals or small groups, the deadliest killed just five people, in the 2001 anthrax attacks.
Put that way, I'm making a pretty strong claim: while the deadliest small-group bio attack ever only killed five people, we're on track for a future where one could kill everyone. Why do I think the future might be so unlike the past?
Short version: I expect a technological change which expands which actors would try to cause harm.
The technological change is the continuing decrease in the knowledge, talent, motivation, and resources necessary to create a globally catastrophic pandemic. Consider someone asking the open source de-censored equivalent of GPT-6 how to create a humanity-ending pandemic.
I expect it would read virology papers, figure out what sort of engineered pathogen might be appropriate, walk you through all the steps in duping multiple biology-as-a-service organizations into creating it for you, and give you advice on how to release it for maximum harm. And even without LLMs, the number of graduate students who would be capable of doing this has been increasing quickly as technological progress and biological infrastructure decrease the difficulty.
The other component is a shift in which actors we're talking about. Instead of terrorists, using terror as a political tool, consider people who believe the planet would be better off without humans. This isn't a common belief, but it's also not that rare.
Consider someone who cares deeply about animals, ecosystems, and the natural world, or is primarily focused on averting suffering: they could believe that while the deaths of all living people would be massively tragic, it would still give us a much better world on balance. Note that they probably wouldn't be interested in smaller-scale attacks: if it doesn't have a decent chance of wiping out humanity then they'd just be causing suffering and chaos without making progress towards their goals; they're not movie villains! Once a sufficiently motivated person or small group could potentially kill everyone, we have a new kind of risk from people who would have seen smaller-scale death as negative.
Now, these people are not common. There's a trope where, for example, opponents of environmentalism claim that human extinction is the goal, even when most radical environmentalists would see human extinction as a disaster. But what makes me seriously concerned is that as the bar for causing extinction continues to drop, the chances that someone with these views has the motivation and drive to succeed get dangerously high. And since these views are disproportionately common among serious engineering-minded folks, willing to trust the moral math, I think some will be the kind of highly capable and careful people who could work in secret for years, sustained by a clear conviction that they were doing the right thing.
Fortunately, I think this is a risk we can seriously lower. For example, we shoul...

Dec 2, 2023 • 22min
AF - The goal-guarding hypothesis (Section 2.3.1.1 of "Scheming AIs") by Joe Carlsmith
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The goal-guarding hypothesis (Section 2.3.1.1 of "Scheming AIs"), published by Joe Carlsmith on December 2, 2023 on The AI Alignment Forum.
This is Section 2.3.1.1 of my report "Scheming AIs: Will AIs fake alignment during training in order to get power?". There's also a summary of the full report here (audio here). The summary covers most of the main points and technical terms, and I'm hoping that it will provide much of the context necessary to understand individual sections of the report on their own.
Audio version of this section here, or search for "Joe Carlsmith Audio" on your podcast app.
Aiming at reward-on-the-episode as part of a power-motivated instrumental strategy
So far, we've covered two of the three prerequisites for scheming I listed above: namely, situational awareness and beyond-episode goals. Let's turn to the third: namely, the model concluding that its beyond-episode goals will be better achieved if it optimizes for reward-on-the-episode - and in particular, that it, or some other AIs, will get more power if it does this. Should we expect situationally-aware models with beyond-episode goals to conclude this?
For now, let's assume that the model's conclusions in this respect are accurate (though: I'll talk about the possibility of relaxing this assumption below). This means that the question reduces to: will instrumental training-gaming, in fact, be the best strategy for achieving the model's beyond-episode goals, in virtue of the power it grants the model or some other AI?
I'm going to start by examining what I see as the story most often offered in support of this conclusion - what I'll call the "classic goal-guarding story."
The classic goal-guarding story
The classic goal-guarding story goes roughly as follows:[1]
1. Instrumental value of goal-content integrity: Because the model has a beyond-episode goal, it will typically promote that goal for agents pursuing it to survive and be empowered in the future.
2. The goal-guarding hypothesis: If the model optimizes for reward-on-the-episode during training, then the goals that survive training will be such that the model would want them promoted in the future; whereas if the model doesn't optimize for reward-on-the-episode during training, the model will want this less.
3. Adequate future empowerment: Conditional on surviving training to the degree at stake in (2), the model's goals will then end up empowered to a degree (and with sufficient probability) as to justify (given the model's other goals, its tolerance for risk, etc.) pursuing a strategy of training-gaming overall.
(1), here, is a fairly general statement about the basic dynamic that underlies the classic goal-guarding story. I find it plausible in the context of the sort of "adequate future empowerment" at stake in (3), and I won't spend a lot of time on it here.[2]
Rather, I'll focus on (2) and (3) directly.
The goal-guarding hypothesis
We can distinguish two variants of the goal-guarding hypothesis - an extreme version, and a looser version.
The extreme version (what I'll call the "crystallization hypothesis") says that once a model starts training-gaming, its goals will basically stop changing, period - that is, they will "crystallize."
The looser version says that once a model starts training-gaming, its goals might keep changing somewhat, but much less than they would've otherwise, and not enough for the classic goal-guarding story to fail overall.
The former might seem extreme, but some analysts explicitly appeal to something in the vicinity (see e.g. Hubinger here). It's also a cleaner focus of initial analysis, so I'll start there.
The crystallization hypothesis
As I understand it, the basic thought behind the crystallization hypothesis is that once a model is explicitly optimizing either for the specified goal, or for reward-on-the-episod...

Dec 2, 2023 • 2min
LW - 2023 Unofficial LessWrong Census/Survey by Screwtape
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: 2023 Unofficial LessWrong Census/Survey, published by Screwtape on December 2, 2023 on LessWrong.
The Less Wrong General Census is unofficially here! You can take it at this link.
It's that time again.
If you are reading this post and identify as a LessWronger, then you are the target audience. I'd appreciate it if you took the survey. If you post, if you comment, if you lurk, if you don't actually read the site that much but you do read a bunch of the other rationalist blogs or you're really into HPMOR, if you hung out on rationalist tumblr back in the day, or if none of those exactly fit you but I'm maybe getting close, I think you count and I'd appreciate it if you took the survey.
Don't feel like you have to answer all of the questions just because you started taking it. Last year I asked if people thought the survey was too long; collectively they thought it was maybe a little bit too long, and then I added more questions than I removed. The survey is structured so the fastest and most generally applicable questions are (generally speaking) towards the start. At any point you can scroll to the bottom and hit Submit, though you won't be able to change your answers once you do.
The questions are a mix of historical questions that were previously asked on the LW Census, new questions sourced from LW commenters and some rationalist adjacent organizations I reached out to, and the things I'm curious about. This includes questions from a list a member of the LessWrong team sent me when I asked about running the census.
The survey shall remain open from now until at least January 1st, 2024. I plan to close it sometime on Jan 2nd.
I don't work for LessWrong, and as far as I know the LessWrong Census organizer has never been someone who worked for LessWrong. Once the survey is closed, I plan to play around with the data and write up an analysis post like this one.
Remember, you can take the survey at this link.
Once upon a time, there was a tradition that if you took the survey you could comment here saying you had done so, and people would upvote you and you would get karma.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Dec 2, 2023 • 29min
LW - Complex systems research as a field (and its relevance to AI Alignment) by Nora Ammann
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Complex systems research as a field (and its relevance to AI Alignment), published by Nora Ammann on December 2, 2023 on LessWrong.
I have this high prior that complex-systems-type thinking is usually a trap. I've had a few conversations about this, but still feel kind of confused, and it seems good to have a better written record of my thoughts and yours here.
At a high level, here are some thoughts that come to mind for me when I think about complex systems stuff, especially in the context of AI Alignment:
A few times I ended up spending a lot of time trying to understand what some complex systems people were trying to say, only to end up thinking they weren't really saying anything. I think I got this feeling from engaging a bunch with the Santa Fe stuff and Simon DeDeo's work (like this paper and this paper).
A part of my model of how groups of people make intellectual progress is that one of the core ingredients is having a shared language and methodology that allows something like "the collective conversation" to make incremental steps forward. Like, you have a concept of experiment and statistical analysis that settles an empirical issue, or you have a concept of proof that settles an issue of logical uncertainty, and in some sense a lot of interdisciplinary work is premised on the absence of a shared methodology and language.
While I feel more confused about this in recent times, I still have a pretty strong prior towards something like g or the positive manifold, where like, there are methodological foundations that are important for people to talk to each other, but most of the variance in people's ability to contribute to a problem is grounded in how generally smart and competent and knowledgeable they are, and expertise is usually overvalued (for example, it's not that rare for a researcher to win a Nobel prize in two fields).
A lot of interdisciplinary work (not necessarily complex systems work, but some of the generator that I feel like I see behind PIBBS) feels like it puts a greater value on intellectual diversity here than I would.
Ok, so starting with one high-level point: I'm definitely not willing to die on the hill of 'complex systems research' as a scientific field as such. I agree that there is a bunch of bad or kinda hollow work happening under the label. (I think the first DeDeo paper you link is a decent example of this: feels mostly like having some cool methodology and applying it to some random phenomena without really an exciting bigger vision of a deeper thing to be understood, etc.)
That said, there are a bunch of things that one could describe as fitting under the complex systems label that I feel positive about, let's try to name a few:
I do think, contra your second point, complex systems research (at least its better examples) have a lot of/enough shared methodology to benefit from the same epistemic error correction mechanisms that you described. Historically it really comes out of physics, network science, dynamical systems, etc. The main move that happened was to say that, rather than indexing the boundaries of a field on the natural phenomena or domain it studies (e.g. biology, chemistry, economics), to instead index it on a set of methods of inquiry, with the premise that you can usefully apply these methods across different types of systems/domains and gain valuable understanding of underlying principles that govern these phenomena across systems (e.g.
I think a (typically) complex systems angle is better at accounting for environment-agent interactions. There is a failure mode of naive reductionism that starts by fixing the environment to be able to home in on what system-internal differences produce what differences in the phenomena, and then concludes that all of what drives the phenomena is system-internal while forgetting tha...

Dec 2, 2023 • 47min
LW - MATS Summer 2023 Postmortem by Rocket
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: MATS Summer 2023 Postmortem, published by Rocket on December 2, 2023 on LessWrong.
The ML Alignment & Theory Scholars program (MATS, formerly SERI MATS) is an education and research mentorship program for emerging AI safety researchers. This summer, we held the fourth iteration of the MATS program, in which 60 scholars received mentorship from 15 research mentors. In this post, we explain the elements of the program, lay out some of the thinking behind them, and evaluate our impact.
Summary
Key details about the Summer 2023 Program:
Educational attainment of MATS scholars:
30% of scholars are students.
88% have at least a Bachelor's degree.
10% are in a Master's program.
10% are in a PhD program.
13% have a PhD.
If not for MATS, scholars might have worked at a tech company (41%), upskilled independently (46%), or conducted research independently over the summer (50%). (Note: this was a multiple-choice response.)
Key takeaways from our impact evaluation:
MATS scholars are highly likely to recommend MATS to a friend or colleague. Average likelihood: 8.9/10.
Mentors rated their enthusiasm for their scholars to continue with their research at 7/10 or greater for 94% of scholars.
MATS scholars rate their mentors highly. Average rating: 8.0/10.
61% of scholars report that at least half the value of MATS came from their mentor.
After MATS, scholars reported facing fewer obstacles to a successful alignment career than they did at the start of the program.
Most scholars (75%) still reported their publication record as an obstacle to a successful alignment career at the conclusion of the program.
of final projects involved evals/demos and involved mechanistic interpretability, representing a large proportion of the cohort's research interests.
Scholars self-reported improvements to their research ability on average:
Slight increases to the breadth of their AI safety knowledge (+1.75 on 10-point scale over the program).
Moderate strengthening of technical skills compared to counterfactual summer (7.2/10, where 10/10 is "significant improvement compared to counterfactual summer").
Moderate improvements to ability to independently iterate on research direction (7.0/10, where 10/10 is "significant improvement") and ability to develop a theory of change for their research (5.9/10, where 10/10 is "substantially developed").
The typical scholar reported making 4.5 professional connections (std. dev. = 6.2) and meeting 5 potential research collaborators on average (std. dev. = 6.8).
MATS scholars are likely to recommend Scholar Support, our research/productivity coaching service. Average response: 7.9/10.
49 of the 60 scholars in the Research Phase met with a Scholar Support Specialist at least once.
The average scholar who met with Scholar Support at least once spent 3.4 hours meeting with Scholar Support throughout the program.
The average and median scholar report that they value the Scholar Support they received at $3705 and $750, respectively.
The average scholar reports gaining 22 productive hours over the summer due to Scholar Support.
Key changes we plan to make to MATS for the Winter 2023-24 cohort:
Filtering better during the application process;
Pivoting Scholar Support to additionally focus on research management;
Providing additional forms of support to scholars, particularly technical support and professional development.
Note that it is too early to evaluate any career benefits that MATS provided the most recent cohort; a comprehensive post assessing career outcomes for MATS alumni 6-12 months after their program experience is forthcoming.
Theory of Change
MATS helps expand the talent pipeline for AI safety research by equipping scholars to work on AI safety at existing organizations, found new organizations, or pursue independent research. To this end, MATS provides fu...

Dec 2, 2023 • 2min
LW - Queuing theory: Benefits of operating at 70% capacity by ampdot
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Queuing theory: Benefits of operating at 70% capacity, published by ampdot on December 2, 2023 on LessWrong.
Related to Slack. Related to Lean Manufacturing, aka JIT Manufacturing.
TL;DR A successful task-based system should sometimes be idle, like 40% of worker ants.
Doing tasks quickly is essential for producing value in many systems. For software teams, delivering a feature gives valuable insight into user needs, which can improve future feature quality. For supply chains, faster delivery releases capital for reinvestment. However, the relationship between capacity utilized and service time is exponential, as shown by the diagram below.
A heuristic we can derive from queuing theory is that the optimal balance between efficiency and capacity typically occurs when the system is around 30-40% idle. For a single producer system, being X% idle is that producer being idle X% of the time. For a multi-producer system, being X% idle is X% of those producers being idle on average. This heuristic applies best to systems involving lots of discrete, oddly-shaped tasks.
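To make the shape of that tradeoff concrete, here is a minimal sketch in Python using the textbook M/M/1 queue (single server, Poisson arrivals, exponential service times) - an assumption on my part, since the post doesn't commit to a particular queueing model. The average time a task spends in the system is 1 / (service rate - arrival rate), which stays modest while the system keeps 30-40% idle capacity and blows up as utilization approaches 100%.

# Minimal sketch (not from the post): average time in system for an M/M/1 queue.
# Assumes a single server, Poisson arrivals, exponential service times;
# the service rate is normalized to 1 task per unit time.

def mm1_time_in_system(utilization, service_rate=1.0):
    """Average time a task spends waiting plus being served: W = 1 / (mu - lambda)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    arrival_rate = utilization * service_rate
    return 1.0 / (service_rate - arrival_rate)

for idle in (0.5, 0.4, 0.3, 0.2, 0.1, 0.05):
    w = mm1_time_in_system(1 - idle)
    print(f"{idle:.0%} idle -> average time in system = {w:.1f}x the bare service time")

# 30-40% idle keeps a task at roughly 2.5-3.3x its bare service time;
# at 5% idle it is about 20x - the blow-up the post's diagram illustrates.

The exact multipliers depend on the arrival and service distributions, but the qualitative lesson - keep meaningful slack - holds across queueing models.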
The linked post explains this theory in more detail, and gives examples of where queues appear in the real world. See also the Wikipedia article:
Queueing theory is the mathematical study of waiting lines, or queues.[1] A queueing model is constructed so that queue lengths and waiting time can be predicted.[1] Queueing theory is generally considered a branch of operations research because the results are often used when making business decisions about the resources needed to provide a service.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org


