Project Glasswing: what Mythos showed us

· coding security ai · Source ↗

TLDR

  • Cloudflare ran Anthropic’s Mythos Preview and other security LLMs against 50+ internal repos, and documented how the models construct exploit chains and generate proofs, plus the harness architecture needed to run them at scale.

Key Takeaways

  • Mythos Preview advances beyond prior frontier models by chaining low-severity bugs into working exploits and auto-generating, compiling, and iterating on proof-of-concept code.
  • Organic model refusals are inconsistent: identical tasks framed differently or run at different times produce opposite outcomes, so refusals alone are not a sufficient safety boundary.
  • Signal-to-noise is the dominant operational problem; C/C++ codebases and model hedging inflate false positives, but findings that Mythos delivers with a PoC attached cut triage time meaningfully.
  • Generic coding agents fail on coverage: context window limits mean a single-agent session covers roughly 0.1% of a large repo’s surface before compaction discards earlier findings.
  • A multi-stage harness fixes this: narrow scoped hunt tasks, adversarial second-agent review, split chain reasoning, and parallel agents with deduplication all measurably improve output quality.
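The harness stages in the last takeaway can be sketched in a few lines. This is a minimal illustration, not Cloudflare's implementation: the `Finding` type, the function names, and the `fake_hunt` / `fake_review` stand-ins for the LLM agents are all hypothetical, and the split chain-reasoning stage is left out. It shows only the shape of the pipeline: narrow scoped tasks, parallel hunters, deduplication, then a second reviewer.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one candidate vulnerability."""
    path: str
    bug_class: str
    line: int
    has_poc: bool = False


def scope_tasks(paths: list[str], max_files: int) -> list[list[str]]:
    """Split the repo into narrow hunt tasks of at most max_files files."""
    if max_files < 1:
        raise ValueError("max_files must be at least 1")
    return [paths[i:i + max_files] for i in range(0, len(paths), max_files)]


def dedupe(findings: Iterable[Finding]) -> list[Finding]:
    """Keep one finding per (path, bug_class, line), preferring one with a PoC."""
    best: dict[tuple[str, str, int], Finding] = {}
    for f in findings:
        key = (f.path, f.bug_class, f.line)
        if key not in best or (f.has_poc and not best[key].has_poc):
            best[key] = f
    return list(best.values())


def run_harness(
    paths: list[str],
    hunt: Callable[[list[str]], list[Finding]],
    review: Callable[[Finding], bool],
    max_files: int = 2,
    workers: int = 4,
) -> list[Finding]:
    """Run scoped hunts in parallel, dedupe, then apply a second-agent review."""
    tasks = scope_tasks(paths, max_files)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(hunt, tasks))
    candidates = [f for batch in batches for f in batch]
    # Deduplicate before review so the reviewer sees each issue only once.
    confirmed = [f for f in dedupe(candidates) if review(f)]
    return sorted(confirmed, key=lambda f: (f.path, f.line, f.bug_class))


def fake_hunt(task: list[str]) -> list[Finding]:
    """Stand-in for a hunting agent: reports each .c file twice to mimic overlap."""
    out = []
    for p in task:
        if p.endswith(".c"):
            out.append(Finding(p, "overflow", 10))
            out.append(Finding(p, "overflow", 10, has_poc=True))
    return out


def fake_review(finding: Finding) -> bool:
    """Stand-in for the adversarial reviewer: accept only PoC-backed findings."""
    return finding.has_poc
```

Calling `run_harness(paths, fake_hunt, fake_review)` with a list of file paths returns the deduplicated, reviewed findings. Deduplicating before review is a design choice made here for illustration; it keeps the reviewer from spending calls on repeats of the same issue.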

Hacker News Comment Review

  • Commenters most wanted hard numbers: how many vulns were found, how many were real, and how severe were the worst ones. The post provides none of this.
  • The prospect of Mythos-style exploit capability reaching open or near-frontier models was flagged as the bigger systemic risk, with guardrails viewed as a temporary measure rather than a durable defense.
  • Some commenters suspected the post itself was LLM-generated; others pushed back, noting that Cloudflare’s engineering blog has a strong pre-AI track record.

Notable Comments

  • @unethical_ban: argues the real fix is writing code on the assumption that attackers will run LLMs against it, rather than relying on model-side guardrails.
  • @dataflow: calls the severity of the vulns found “really the most interesting and important bit”, and notes that the post leaves it unaddressed.
