“There should be $100M grants to automate AI safety” by Marius Hobbhahn

Apr 3, 2026

Marius Hobbhahn, author and Apollo Research affiliate, proposes massive grants to scale automated AI-safety work. He calls for urgent, large-scale funding and a grant model that ramps up to $100M+ budgets for automated safety pipelines. He outlines concrete areas such as monitoring, automated red-teaming, white-box auditing, propensity evaluations, and automated conceptual alignment research.

ADVICE

Make Large Grants Conditional On Public Benefit

Require public benefit and publication when a grantee proves they can spend $100M meaningfully on safety.
Options include open-sourcing the pipeline, collaborating with labs, joining an AGI lab, or turning into an AGI-safety for-profit under publication conditions.

INSIGHT

Explicit Programs Attract Entrepreneurial Talent

Announce explicit grant programs because conservatism among funders means entrepreneurs won't attempt ambitious scaling without clear commitments.
Hobbhahn expects that entrepreneurial talent needs visible incentives to choose safety-first scaling projects.

INSIGHT

Goodharting Can Be A Useful Failure Signal

Goodharting is a real risk, but observing a metric fail is informative evidence that the program should be stopped.
Hobbhahn suggests robust metrics help, and failed metrics signal the need to halt particular scaling attempts.

This post reflects my personal opinion and not necessarily that of other members of Apollo Research.

TLDR: I think funders should heavily incentivize AI safety work that enables spending $100M+ in compute or API budgets on automated AI labor that directly and differentially translates to safety.

Motivation

I think we are in a short timeline world (and we should take the possibility seriously even if we don't have full confidence yet). This means that I think funders should aim to allocate large amounts of money (e.g. $1-50B per year across the ecosystem) to AI safety in the next 2-3 years.

I think that the AI safety funders have been allocating way too little funding and their spending has been far too conservative in the past 5 years. So, in my opinion, we should definitely continue ramping up “normal” spending, e.g. pay more competitive salaries, allow AI safety organizations to grow faster, and other things in that vein.

However, these “normal” spending patterns are not sufficient under short timeline assumptions and the obvious way to spend more money quickly is to aggressively encourage finding ways to use automated labor for AI safety.

What is an “automated AI [...]

---

Outline:

(00:31) Motivation

(01:25) What is an automated AI safety scaling grant?

(05:48) Other considerations

(05:51) Who should be able to receive such a grant?

(06:30) Why make this an explicit grant program?

(07:17) Aren't we just gonna goodhart all of these metrics?

(07:36) Concrete examples of potential grant areas

(07:41) Monitoring & Control

(09:31) Automated black box auditing

(10:36) White box auditing agents

(11:00) Propensity evals

(12:25) Automated conceptual alignment research

(14:58) Addressing various concerns I heard so far

(15:11) We should first understand the concrete areas for these grants in more detail

(15:40) Why should for-profits be recipients of such grants?

---

First published:
April 3rd, 2026

Source:
https://www.lesswrong.com/posts/qdhyrN4uKwBAftmQx/there-should-be-usd100m-grants-to-automate-ai-safety

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Line graph showing meaningful safety proxy increasing with money spent on pipeline.
