Check Your Fucking Sources, People


TLDR

  • An AI tool running unattended fabricated specific code-review statistics and attributed them to a real SmartBear/Cisco study; the actual paper contains none of the claimed numbers.

Key Takeaways

  • A viral LinkedIn post misrepresented a one-time Swedish pilot that trained crows to pick up litter as a national smart-city program; the misrepresentation was detectable with a single search.
  • A cited “SmartBear/Cisco study” claim – “defect detection drops from 87% for PRs under 100 lines to 28% for PRs over 1,000 lines” – does not exist in the actual paper.
  • The real study measures inspection rate (LOC/hour) and defect density (defects per KLOC), not defect detection percentages; the specific figures were hallucinated (see the sketch after this list).
  • Citation chains that launder AI-generated claims are self-reinforcing: future LLMs will ingest the fabricated article as a credible source, compounding the drift.
  • AI hallucination risk is highest at the fringes – narrow research domains with few training examples, exactly where precise citation matters most.
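
For concreteness, here is a minimal sketch of the two metrics the study actually reports. All numbers are hypothetical, chosen only to show the units involved; none of them come from the paper.

    # Inspection rate and defect density: the metrics the SmartBear/Cisco
    # study actually measures. Every value below is made up for illustration.

    def inspection_rate(loc_reviewed: int, hours: float) -> float:
        """Lines of code reviewed per hour."""
        return loc_reviewed / hours

    def defect_density(defects_found: int, loc_reviewed: int) -> float:
        """Defects found per thousand lines of code (KLOC)."""
        return 1000 * defects_found / loc_reviewed

    # Hypothetical review session: 400 LOC reviewed in 2 hours, 6 defects found.
    print(inspection_rate(400, 2.0))  # 200.0 LOC/hour
    print(defect_density(6, 400))     # 15.0 defects per KLOC

Note that neither metric is a "defect detection percentage"; computing one would require knowing the total number of defects present, which a review alone cannot observe. That unobservable quantity is exactly the gap the hallucinated 87%/28% figures filled.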

Hacker News Comment Review

  • Commenters were skeptical that source-checking is a solved problem: LLMs handle broad discovery well but hallucinate when starting from a specific claim with thin training coverage.
  • A thread noted that link credibility was already broken before AI – the crow story is a symptom of incentive structures, not a new failure mode.
  • Discussion drifted into fact-checker bias and political post-truth examples, suggesting readers see this as a systemic epistemics problem, not just an AI tool misuse.

Notable Comments

  • @wing-_-nuts: “Ironically, ‘source checking’ is something AI is quite good at” – disputed by a reply noting the distinction between broad discovery and claim-first hallucination.
  • @flail: Draws the precise failure boundary: ask AI for reading suggestions and it performs well; feed it a specific claim to validate and it hallucinates into the gap.
