Oxide and Friends

Are LLMs Insufficiently Lazy

May 3, 2026
Greg (gregorein), a Polish software engineer and critic of AI-generated code, walks through his audit of a high-profile AI-built site. He explains the HAR (HTTP Archive) findings, the viral fallout, and how he used Claude to automate part of the review. The conversation explores where LLMs should be used, the risks of measuring output by lines of code, and why human review and minimal, elegant solutions still matter.
AI Snips
INSIGHT

Betty Crocker Effect Explains LLM Overclaiming

  • CEOs discovering LLMs often fall into a Betty Crocker/IKEA effect, where minimal input creates a false sense of ownership.
  • Greg compared Garry Tan's enthusiasm to the classic cake-mix story: reintroducing a single step (cracking an egg) made users feel they had “made” the product.
INSIGHT

Lines Of Code And Tokens Create Perverse Incentives

  • Automated metrics like lines of code and token usage create perverse incentives that reward volume over quality.
  • Greg audited Garry Tan's G-Stack and found claims of 37,000 lines per day while commits rewrote roughly 40% of the code, exposing the volume figures as meaningless; a churn-measuring sketch follows below.
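
One way to see past raw volume numbers is to track deletions alongside additions: lines added per day are only impressive if they survive later commits. The sketch below is a hypothetical Python pass, not Greg's actual tooling; it tallies git log --numstat output per day, and days where deletions rival additions suggest the headline line counts were churn rather than progress.

    #!/usr/bin/env python3
    """Rough churn check: how much of each day's added code gets rewritten?

    A minimal sketch under stated assumptions, not Greg's methodology.
    Needs only a local clone of the repository under audit.
    """
    import subprocess
    from collections import defaultdict

    def daily_add_delete(repo_path):
        """Return {YYYY-MM-DD: [lines_added, lines_deleted]} across all commits."""
        out = subprocess.run(
            ["git", "-C", repo_path, "log", "--numstat",
             "--pretty=format:@%ad", "--date=short"],
            capture_output=True, text=True, check=True,
        ).stdout
        totals = defaultdict(lambda: [0, 0])
        day = None
        for line in out.splitlines():
            if line.startswith("@"):        # commit separator we injected via --pretty
                day = line[1:]
            elif line.strip() and day:
                added, deleted, _path = line.split("\t", 2)
                if added != "-":            # "-" marks binary files; skip them
                    totals[day][0] += int(added)
                    totals[day][1] += int(deleted)
        return totals

    if __name__ == "__main__":
        stats = daily_add_delete(".")
        for day in sorted(stats):
            added, deleted = stats[day]
            churn = deleted / added if added else 0.0
            print(f"{day}  +{added:>7} -{deleted:>7}  churn={churn:.0%}")

Run from inside the clone; the per-day ratios make a claim like 37,000 lines per day easy to sanity-check against how much of that code was later thrown away.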
ADVICE

Use LLMs For Audits, Not Whole-Cloth Production

  • Use LLMs as reviewers or auditors rather than sole creators, exposing issues while keeping human judgment in the loop.
  • Greg had Claude parse a HAR capture and saved site output to produce a 500-line audit revealing accessibility and legal problems; a sketch of the mechanical half of such a pass follows below.
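
For a sense of what the mechanical half of such an audit looks like, here is a small Python triage pass over a HAR file (HTTP Archive, the JSON capture format browsers export). It is an illustrative stand-in, not the Claude-driven review from the episode: the specific checks (error statuses, oversized payloads, third-party hosts, missing CSP headers) are assumptions about what is worth flagging, and the accessibility and legal findings Greg describes still require a model or a human reading the saved pages.

    #!/usr/bin/env python3
    """Tiny HAR triage: surface entries worth a closer (human or LLM) look.

    A minimal sketch; HAR is plain JSON, so this needs only the stdlib.
    """
    import json
    import sys
    from urllib.parse import urlparse

    def triage(har_path, first_party):
        """Yield one-line findings from a browser-exported HAR capture."""
        with open(har_path, encoding="utf-8") as f:
            entries = json.load(f)["log"]["entries"]
        for e in entries:
            url = e["request"]["url"]
            resp = e["response"]
            host = urlparse(url).hostname or ""
            headers = {h["name"].lower(): h["value"]
                       for h in resp.get("headers", [])}
            if resp["status"] >= 400:
                yield f"{resp['status']} response: {url}"
            if resp.get("content", {}).get("size", 0) > 1_000_000:
                yield f"payload over 1 MB: {url}"
            if not host.endswith(first_party):
                yield f"third-party request: {host}"
            elif "content-security-policy" not in headers:
                yield f"no CSP header on first-party response: {url}"

    if __name__ == "__main__":
        # usage: python har_triage.py capture.har example.com
        for finding in triage(sys.argv[1], sys.argv[2]):
            print(finding)

Feeding the raw HAR and the saved page sources to a model, as Greg did with Claude, is what turns a mechanical list like this into a full written audit.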