An update on recent Claude Code quality reports


TLDR

  • Three separate Claude Code changes degraded quality from March 4 to April 20; all resolved in v2.1.116 with usage limits reset for all subscribers.

Key Takeaways

  • The reasoning-effort default was silently lowered to medium on March 4 and reverted on April 7; Opus 4.7 now defaults to xhigh, all other models to high.
  • A caching bug introduced March 26 cleared thinking history on every turn once a session had been idle for over an hour, causing compounding forgetfulness and faster usage-limit drain; fixed April 10.
  • A system-prompt change limiting responses to ≤25 words between tool calls shipped April 16; it triggered a 3% drop on evals and was reverted April 20.
  • API layer was unaffected throughout; all three issues were isolated to Claude Code, Agent SDK, and Cowork product surfaces.
  • In a back-tested code review, Opus 4.7 caught the caching bug while Opus 4.6 missed it. Process fix: adding multi-repo context to Code Review.
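The report doesn't say how the caching bug was implemented, but its observable behavior (history cleared on *every* turn after one long idle gap, not just once) is consistent with a stale last-access timestamp. The sketch below is purely hypothetical; `SessionCache`, `IDLE_TTL`, and both methods are invented names used to illustrate that failure mode under this assumption:

```python
IDLE_TTL = 3600.0  # hypothetical 1 hr idle threshold, per the summary


class SessionCache:
    """Illustrative only: thinking history guarded by an idle-expiry check."""

    def __init__(self, now: float):
        self.thinking_history: list[str] = []
        self.last_access = now

    def add_turn_buggy(self, thought: str, now: float) -> int:
        # Hypothesized bug: last_access is never refreshed, so after one
        # idle gap > TTL, *every* later turn re-clears the history
        # ("compounding forgetfulness"); each cleared turn also re-sends
        # context, draining usage limits faster.
        if now - self.last_access > IDLE_TTL:
            self.thinking_history.clear()
        self.thinking_history.append(thought)
        return len(self.thinking_history)

    def add_turn_fixed(self, thought: str, now: float) -> int:
        if now - self.last_access > IDLE_TTL:
            self.thinking_history.clear()
        self.last_access = now  # fix: refresh the timestamp on every access
        self.thinking_history.append(thought)
        return len(self.thinking_history)


cache = SessionCache(now=0.0)
cache.add_turn_buggy("plan step 1", now=10.0)    # history grows normally
cache.add_turn_buggy("plan step 2", now=20.0)
cache.add_turn_buggy("resume", now=4000.0)       # idle > 1 hr: cleared once
cache.add_turn_buggy("next step", now=4010.0)    # cleared again, every turn
```

With the fix, only the first post-idle turn starts fresh and history accumulates again afterward, which matches the "corner case" framing disputed in the HN thread.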

Hacker News Comment Review

  • Anthropic’s “corner case” label for 1hr idle sessions was widely contested; many builders described multi-hour planning sessions as their normal workflow, not an edge case.
  • The deeper trust issue was billing opacity: usage limits draining faster with no explanation while paying for a degraded product landed harder than the technical bugs alone.
  • Competitive pressure surfaced directly: at least one enterprise account reported evaluating GPT-5.4 after OpenAI offered unlimited tokens through summer, citing consistent reasoning trace quality.

Notable Comments

  • @everdrive: Reported Claude verbalizing its own anti-injection directives mid-session, a distinct symptom not covered by any of the three disclosed bugs.
  • @bauerd: “Instead of fixing the UI they lowered the default reasoning effort parameter” – frames the March 4 effort change as a product cop-out rather than an engineering fix.
