EvanFlow – A TDD driven feedback loop for Claude Code

· devtools design · Source ↗

TLDR

  • 16-skill Claude Code plugin that runs a checkpointed brainstorm-plan-execute-tdd-iterate loop, never committing or staging without explicit user direction.

Key Takeaways

  • Single entry point: say “let’s evanflow this” and evanflow-go orchestrates the full loop with mandatory checkpoints at design and plan approval.
  • Parallel mode forks into per-unit coder subagents plus read-only overseer subagents; named integration tests act as the executable contract between units.
  • Hard rules baked into every skill: never invent file paths, env vars, or function names; if unsure, stop and ask. Cites action-hallucination as the most dangerous agent failure mode.
  • Bundled block-dangerous-git.sh PreToolUse hook blocks git reset --hard, git push, git clean -f, and git branch -D at the tool layer, not just in prompt text.
  • Ships with data points from 2025-2026 agentic coding research: 62% of LLM-generated test assertions are incorrect; ~65% of enterprise AI coding failures trace to context drift, not token exhaustion.

Hacker News Comment Review

  • Commenters pushed back on EvanFlow’s TDD claim: the built-in superpowers/brainstorming skill already covers TDD, and tdd-guard (github.com/nizos/tdd-guard) enforces the constraint via hooks rather than prompt instructions that can context-rot away.
  • The missing refactor step in the red-green cycle drew pointed criticism: once a test is green, Claude optimizes for the next test rather than cleaning the implementation, and a cold iterate pass at the end is not the same as refactoring with the freshly-written test as a safety net.
  • Mixed reception on the self-named branding; no technical objections to the plugin architecture or install paths.

Notable Comments

  • @conception: points to tdd-guard as the only project that enforces TDD via hooks rather than prompts, which degrade with context.
  • @dmitry_dv: “The refactor step is the silent casualty in AI-assisted TDD” – refactoring cold code after an iterate pass is structurally different from refactoring with the freshly-passing test as a live safety net.

Original | Discuss on HN