Redis array type: short story of a long development


TLDR

  • antirez built a new Redis Array data type over four months, using AI (Opus, GPT 5.x, Codex) for spec writing, code generation, and testing at production quality.

Key Takeaways

  • The Array type uses a three-tier structure: a sparse representation, a directory of slices, and a super-directory of dense, sliced directories (4096 elements per slice), so that large non-contiguous indices do not force huge allocations.
  • ARSCAN and ARPOP scan in time proportional to existing elements, not range span, which is the key performance property distinguishing this from ZSET.
  • ARGREP adds regex search built on TRE (with an antirez-patched OR-pattern optimization and security fixes); it was added after antirez started storing markdown files in Redis arrays as a knowledge base.
  • AI provided leverage on two specific tasks: exhausting boilerplate like 32-bit support and fuzz-style coverage of complicated algorithm edge cases. Core design decisions remained human-driven.
  • The full implementation is ~5000 lines (2000 sparse array, 2000 command layer, ~500 AOF/RDB); PR is open at github.com/redis/redis/pull/15162.
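
To make the slicing idea and the scan property concrete, here is a minimal Python sketch. It is not the Redis implementation (which is in C and has three tiers); it assumes a simplified two-level layout in which a dictionary plays the role of the directory and each slice is a dense list of 4096 slots. The class name `SlicedArray` and its methods are hypothetical, and `None` is reserved to mean "empty slot".

```python
SLICE_SIZE = 4096  # elements per slice, the figure quoted in the summary


class SlicedArray:
    """Toy sparse array: a directory maps slice number -> dense slice.

    Hypothetical illustration only; not the Redis Array data structure.
    """

    def __init__(self):
        self.directory = {}  # slice number -> list of SLICE_SIZE slots

    def set(self, index, value):
        slice_no, offset = divmod(index, SLICE_SIZE)
        slots = self.directory.get(slice_no)
        if slots is None:
            # Allocate one slice only; the gap before it costs nothing.
            slots = [None] * SLICE_SIZE
            self.directory[slice_no] = slots
        slots[offset] = value

    def get(self, index):
        slice_no, offset = divmod(index, SLICE_SIZE)
        slots = self.directory.get(slice_no)
        return None if slots is None else slots[offset]

    def scan(self, start, end):
        """Yield (index, value) for set elements in [start, end], ascending.

        Only allocated slices are visited, so the work depends on how much
        data exists, not on how wide the requested range is.
        """
        first, last = start // SLICE_SIZE, end // SLICE_SIZE
        live = sorted(s for s in self.directory if first <= s <= last)
        for slice_no in live:
            base = slice_no * SLICE_SIZE
            lo = max(start, base) - base
            hi = min(end, base + SLICE_SIZE - 1) - base
            slots = self.directory[slice_no]
            for offset in range(lo, hi + 1):
                if slots[offset] is not None:
                    yield base + offset, slots[offset]


arr = SlicedArray()
arr.set(5, "a")
arr.set(10**12, "b")             # a very distant index
print(len(arr.directory))        # two slices allocated, not a trillion slots
print(list(arr.scan(0, 10**12)))
```

In this sketch the scan over a range spanning a trillion indices touches just two slices, which is the kind of behavior the ARSCAN/ARPOP takeaway describes. The real type additionally has a sparse tier for very small or scattered arrays and a super-directory above the directory level, both omitted here.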

Hacker News Comment Review

  • Commenters debated whether Array overlaps too much with ZSET; the distinction is that numerical index is semantic in Array, and dense/sparse auto-promotion is built-in rather than bolted onto ZSET internals.
  • The AI workflow antirez describes – spec-first, line-by-line review, AI as safety net – matches what several senior practitioners reported independently, pushing back on “vibe coding” framing.
  • Some commenters suspected that the AI praise is partly Redis marketing aimed at the vector/AI-use-case market, and argued that the 22k-line PR size makes community review difficult compared to incremental mailing-list development (the Postgres model).

Notable Comments

  • @tibbar: Flags that a 22,000-line PR with a complex feature set is hard to review; contrasts it with the Postgres incremental mailing-list approach.
  • @antirez: Clarifies actual code is ~5000 lines; the rest is tests, JSON descriptors, and the TRE dependency.
