
LessWrong (30+ Karma) “Adding Typos Made Haiku’s Accuracy Go Up” by bira
We were curious whether large language models behave consistently when user prompts contain typos. To explore this, we ran a small experiment that injected typos into BigCodeBench prompts and evaluated several Claude models under increasing noise levels. As the typo rate rose to 16%, Opus’ accuracy dropped by 9%. Surprisingly, Haiku's accuracy increased by 22%.
This post examines this unexpected “typo uplift” phenomenon and explores why noise appears to help certain models.
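As a rough illustration of what injecting typos at a given rate can look like, here is a minimal Python sketch. The function name `inject_typos`, the three corruption types (substitute, delete, duplicate), and the per-character sampling are all assumptions made for illustration; the excerpt above does not say how the post actually generated its typos.

```python
import random


def inject_typos(text: str, rate: float, seed: int = 0) -> str:
    """Corrupt roughly `rate` of the alphabetic characters in `text`.

    Hypothetical noise model: each letter is independently selected with
    probability `rate`, then substituted, deleted, or duplicated.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the same prompt gets the same typos
    letters = "abcdefghijklmnopqrstuvwxyz"
    out = []
    for ch in text:
        if ch.isalpha() and rng.random() < rate:
            op = rng.choice(("substitute", "delete", "duplicate"))
            if op == "substitute":
                out.append(rng.choice(letters))
            elif op == "duplicate":
                out.append(ch + ch)
            # "delete": append nothing, dropping the character
        else:
            # Digits, whitespace, and punctuation are left untouched.
            out.append(ch)
    return "".join(out)
```

Calling `inject_typos(prompt, 0.16)` would correspond to the 16% noise level mentioned above, under this assumed noise model. Seeding the generator keeps the corrupted prompts reproducible across models, so every model sees identical noise.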
Do Typos Make Haiku Try Harder?
We first hypothesized that Haiku's accuracy increased because harder-to-read text makes Haiku think harder. This would parallel findings in humans that hard-to-read fonts help students retain knowledge, because they force readers to expend more effort. As a proxy for effort, we plotted the number of output tokens generated by both models[1]. Contrary to our hypothesis, the number of output tokens decreased as the typo rate increased.
Typos don't make models think harder. As typo rates increase, the output lengths of Haiku and Opus go down.
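The effort proxy described above amounts to averaging output length per typo rate. A minimal sketch, assuming results are available as `(typo_rate, output_text)` pairs and using whitespace-separated words as a crude stand-in for tokens (the post's actual token counts would come from the model's tokenizer or API usage data, which this sketch does not reproduce):

```python
from collections import defaultdict
from statistics import mean


def mean_output_length_by_rate(records):
    """Average output length for each typo rate.

    `records` is an iterable of (typo_rate, output_text) pairs; this layout
    is hypothetical. Length is counted in whitespace-separated words, which
    only approximates a real token count.
    """
    lengths = defaultdict(list)
    for rate, output in records:
        lengths[rate].append(len(output.split()))
    # Sort by rate so the result reads in order of increasing noise.
    return {rate: mean(vals) for rate, vals in sorted(lengths.items())}
```

If the "try harder" hypothesis held, the averages would rise with the typo rate; the post reports the opposite trend.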
The Anomaly is Haiku-Specific
We then tested whether other small models show this typo uplift. We found that both Haiku 3.5 and Haiku 4.5 become more accurate as typos increase, while other smaller models from [...]
---
Outline:
(00:54) Do Typos Make Haiku Try Harder?
(01:34) The Anomaly is Haiku-Specific
(02:08) The Anomaly is Benchmark-Specific
(02:42) The Culprit
(04:02) Takeaways for the Eval Engineer
(04:06) Not all grading harnesses are created equal
(04:48) Scores are lower bounds
(05:15) Aligning the model to the eval
(05:43) Appendix
The original text contained 2 footnotes which were omitted from this narration.
---
First published:
March 16th, 2026
Source:
https://www.lesswrong.com/posts/tcic5c3BJuh3PybDZ/adding-typos-made-haiku-s-accuracy-go-up-1
---
Narrated by TYPE III AUDIO.
