Progress under fixed cost caps

Anders models affordable 50% time horizons and finds them doubling roughly every 3 months, even under strict cost caps.

Episode notes

METR's frontier time horizons are doubling every few months, providing substantial evidence that AI will soon be able to automate many tasks or even jobs. But per-task inference costs have also risen sharply, and automation requires AI labor to be affordable, not just possible.[1] Many people look at the rising compute bills behind frontier models and conclude that automation will soon become unaffordable.

I think this misreads the data. The rise in inference cost reflects models completing longer tasks, not models becoming more expensive relative to the human labor they replace. Current frontier models complete tasks at their 50% reliability horizon for roughly 3% of human cost, and this hasn’t increased as capabilities have improved.

I define cost ratio as the inference cost of the average AI trajectory that solves a task divided by human cost to complete the same task. Using METR's data, I examine the trend in cost ratios over time. I show three things:

Across successive frontier models, the cost ratio at each model's 50% reliability time horizon hasn't increased.
Among tasks models successfully complete, longer tasks don't have higher cost ratios than shorter ones.
Time horizon trends barely slow when capping AI spending per [...]
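As a rough illustration of the cost ratio defined above, here is a minimal Python sketch. The function name `cost_ratio`, its parameters, and all dollar figures are hypothetical choices made for this example; they are not taken from METR's data or the author's analysis code.

```python
from statistics import mean


def cost_ratio(successful_run_costs_usd, human_hours, human_hourly_wage_usd):
    """Average inference cost of the AI trajectories that solved a task,
    divided by the human cost to complete the same task."""
    if not successful_run_costs_usd:
        raise ValueError("need at least one successful trajectory")
    human_cost_usd = human_hours * human_hourly_wage_usd
    return mean(successful_run_costs_usd) / human_cost_usd


# Hypothetical numbers for illustration only (not METR data):
# three successful runs on a task that takes a human one hour at $100/hour.
ratio = cost_ratio([2.0, 3.0, 4.0], human_hours=1.0, human_hourly_wage_usd=100.0)
print(f"cost ratio: {ratio:.2%}")
```

Only trajectories that solve the task enter the average, which follows the definition in the notes; how failed attempts should be priced is a separate modeling choice that this sketch does not address.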

---

Outline:

(02:42) Evidence from METR

(03:22) Cost ratio at models' 50% time horizon isn't increasing

(04:59) Time horizon improvements aren't driven by expensive long tasks

(06:07) Progress at a fixed cost is just as fast

(08:04) Limitations of my methodology

(09:42) Inference scaling will just make progress faster

(11:10) Conclusion

(12:12) Appendix A: Why I get different results from Ord

(15:32) Appendix B: Additional cost at time horizon graphs

(16:45) Appendix C: 80% affordable time horizon