How Claude Code works in large codebases: Best practices and where to start


TLDR

  • Anthropic documents the harness patterns (CLAUDE.md, hooks, skills, plugins, MCP servers, LSP, subagents) that determine Claude Code's performance in monorepos and legacy systems at scale.

Key Takeaways

  • Claude Code navigates codebases via live filesystem traversal and grep, not RAG embeddings, avoiding stale-index failures common in large active repos.
  • The “harness” (CLAUDE.md files, hooks, skills, plugins, MCP servers) shapes real-world performance more than model benchmarks alone.
  • CLAUDE.md files should be lean and layered: root file for critical gotchas, subdirectory files for local conventions; bloated files degrade session performance.
  • LSP integrations give Claude symbol-level precision (go-to-definition, find-all-references), which is especially critical for C, C++, and multi-language monorepos.
  • Subagents split exploration from editing: a read-only subagent maps a subsystem and writes findings to a file before the main agent edits, preserving context.
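
The lean-and-layered CLAUDE.md pattern above might look like the following sketch. The paths, project names, and conventions here are illustrative assumptions, not examples from the article:

```markdown
<!-- /CLAUDE.md (repo root): critical gotchas only, kept short -->
# Repo notes
- Build with `make check`, never `make test` (the latter skips integration tests).
- Code under `gen/` is generated; never edit it by hand.

<!-- /services/billing/CLAUDE.md: local conventions for this subsystem -->
# Billing service
- Money amounts are integer cents; never use floats.
- New endpoints require a migration in `migrations/` first.
```

The root file carries only what every session must know, while subdirectory files add conventions that load when Claude works in that part of the tree, keeping each session's context lean.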

Hacker News Comment Review

  • Commenters broadly agree that the harness taxonomy is real but incomplete in practice: rule-following via CLAUDE.md and skills is unreliable, with models routinely ignoring conditional instructions mid-session, making investment in rules feel risky.
  • The agentic-search-vs-RAG framing drew skepticism. Several noted JetBrains IDEs like PHPStorm maintain accurate, fast indexes without stale-result problems, undercutting the article’s dismissal of indexing approaches.
  • A recurring practical complaint: Claude Code reads only the first ~40 lines of files by default for context preservation, which degrades quality on large files; one user’s Claude self-corrected to AST-based analysis only after detecting its own poor results.
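
The "AST-based analysis" that one commenter's session fell back to can be sketched in a few lines. This is a hypothetical helper, not code from Claude Code itself: it parses a Python file and emits a symbol map with line ranges, so an agent can choose which regions to read in full rather than peeking at only the first ~40 lines.

```python
import ast

def skim(source: str) -> list[str]:
    """Summarize a Python file's top-level symbols with their line ranges,
    giving an agent a whole-file map before it reads any region in full."""
    tree = ast.parse(source)
    summary = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "def"
            summary.append(f"{kind} {node.name}: lines {node.lineno}-{node.end_lineno}")
    return summary

# Toy input standing in for a large source file.
example = '''\
def load(path):
    return open(path).read()

class Parser:
    def parse(self, text):
        return text.split()
'''
for line in skim(example):
    print(line)
# → def load: lines 1-2
# → class Parser: lines 4-6
```

A real skimming subagent would run something like this per file and hand the symbol map back to the main agent, which then requests only the relevant line ranges.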

Notable Comments

  • @MaxikCZ: Rules-based reliability remains unsolved – “you just cant trust that the time you spend building rules will actually pay off.”
  • @mikepurvis: Proposes file-skimming subagents as the right fix for the peephole file-reading problem, summarizing files and flagging relevant sections before the main agent acts.
