AI labs repeatedly declare their latest models too dangerous to release publicly, then release them anyway, while deflecting scrutiny from present-day harms.
Key Takeaways
Anthropic claims Claude Mythos surpasses human experts at finding high-severity cybersecurity vulnerabilities, yet disclosed no false positive rate, the standard benchmark for any security tool (a worked sketch of why this matters follows this list).
Heidy Khlaaf (AI Now Institute): the false positive rate is “the largest indicator of how useful your tool is,” and Mythos was never benchmarked against the decades-old security analysis tools already in use.
GPT-2 precedent: OpenAI declared the model too dangerous to release in 2019, then released it in full months later; Altman later admitted the fears were “misplaced,” yet in 2024 criticized Anthropic’s “fear-based marketing.”
OpenAI dropped its red lines on AI weapons; Anthropic abandoned its flagship pledge to never train a model it couldn’t guarantee was safe; both are now pursuing public stock listings.
Shannon Vallor (Edinburgh): framing AI as “almost supernatural in danger” makes regulators feel outmatched and positions the companies themselves as the only credible safety authority.
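For context, the sketch below uses entirely hypothetical numbers (nothing comes from Anthropic’s disclosures) to show why the false positive rate dominates a vulnerability scanner’s usefulness: benign code vastly outnumbers vulnerable code, so even a small FPR buries real findings in noise.

```python
# Hypothetical triage math for a vulnerability scanner; all counts are made up.

def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN): the fraction of benign code flagged as vulnerable."""
    return fp / (fp + tn)

def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP): the fraction of alerts worth an analyst's time."""
    return tp / (tp + fp)

# Suppose the scanner reviews 10,000 benign functions and 100 vulnerable ones,
# catches 90 real bugs (true positives) but raises 500 spurious alerts.
tp, fp, tn = 90, 500, 9_500

print(f"FPR:       {false_positive_rate(fp, tn):.1%}")  # 5.0%
print(f"Precision: {precision(tp, fp):.1%}")            # 15.3%
```

At these made-up rates, a seemingly tolerable 5% FPR means roughly five of every six alerts are noise, which is why Khlaaf calls the figure “the largest indicator of how useful your tool is.”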
Hacker News Comment Review
Security researcher tptacek directly contradicts the article’s framing: the vulnerability research community broadly expects frontier models to uncover a deluge of critical vulnerabilities; the live debate is over timing and magnitude, not whether it will happen.
Several commenters argue the apocalypse narrative serves internal purposes too: it attracts x-risk-focused talent, justifies safety compute pledges, and preempts employee demands to see measurable productivity gains from AI tooling.
A separate thread adds a geopolitical dimension the article missed: the same “existential threat” framing lobbies simultaneously for deregulation of US AI labs and tighter regulation of any competitor, functioning as a two-sided political instrument.
Notable Comments
@boh: “AI is just software” and inert without intention; Claude Code still wiped a production database despite explicit instructions not to, illustrating the gap between hype and actual autonomous capability.
@firefoxd: apocalypse framing stops employees from asking why sprint velocity hasn’t increased tenfold with AI; “They are afraid that you’ll find out that AI is just another useful tool.”
@deepsquirrelnet: lays out the dual lobbying logic. Deregulate US labs so China doesn’t win, then heavily regulate anyone not following rules that happen to favor the incumbents.