Branimir Lambov from IBM on Cassandra

TLDR

  • Cassandra committer Branimir Lambov traces a decade of trie-based storage work: BTI format, trie memtables in Cassandra 5, and the ongoing CEP-57 generalization.

Key Takeaways

  • The trie memtable and BTI SSTable format both landed first in closed-source DSE 6 (2017) and were then contributed to Apache Cassandra 5 (2024) – by those dates, a gap of about seven years.
  • Unified Compaction Strategy in Cassandra 5 handles data densities an order of magnitude higher than legacy strategies and reduces manual tuning.
  • A compaction parallelization bug caused data loss in production; the assertion that guarded against it was disabled in the release build, a reminder that correctness checks you can switch off offer false confidence.
  • Accord will bring full cross-partition ACID transactions to Cassandra at performance that scales like eventually-consistent operations – not yet merged but theoretically validated.
  • Java 11/17 support constrains the main codebase; subprojects like JVector already use Java 21 vector APIs, signaling where performance work is heading.
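
The point about switchable assertions can be made concrete with a minimal Java sketch. It assumes the disabled check was a language-level `assert`, which the JVM skips unless started with `-ea`; the summary does not say which mechanism was involved. The class, the method names, and the sorted-keys invariant are hypothetical illustrations, not Cassandra code.

```java
import java.util.List;

public class AlwaysOnChecks {
    // Hypothetical invariant: keys emitted by a merge must be in ascending order.
    static boolean isSorted(List<Integer> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (keys.get(i - 1) > keys.get(i)) {
                return false;
            }
        }
        return true;
    }

    // Switchable check: the JVM evaluates `assert` only when run with -ea,
    // so a default production launch skips it entirely.
    static void verifyWithAssert(List<Integer> keys) {
        assert isSorted(keys) : "keys out of order";
    }

    // Always-on check: runs regardless of JVM flags.
    static void verifyAlways(List<Integer> keys) {
        if (!isSorted(keys)) {
            throw new IllegalStateException("keys out of order");
        }
    }

    public static void main(String[] args) {
        List<Integer> corrupted = List.of(1, 3, 2);

        try {
            verifyWithAssert(corrupted);
            System.out.println("assert-based check did not fire (assertions disabled)");
        } catch (AssertionError e) {
            System.out.println("assert-based check fired (JVM run with -ea)");
        }

        try {
            verifyAlways(corrupted);
        } catch (IllegalStateException e) {
            System.out.println("always-on check fired: " + e.getMessage());
        }
    }
}
```

Running this with plain `java AlwaysOnChecks` and again with `java -ea AlwaysOnChecks` shows the difference: only the explicit check behaves the same in both modes.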

Hacker News Comment Review

  • Commenters focused on LLM skepticism: Lambov’s “industrious fool” framing resonated, but others flagged that any specific capability claim about LLMs is stale by publication time, creating a no-win interview dynamic.
  • The Accord/ACID announcement drew genuine interest as a potential shift in where Cassandra fits architecturally – commenters want implementation details.
  • A recurring thread: Cassandra and similar specialized databases get abused by practitioners who reach for them without understanding the consistency and partition tradeoffs they were designed around.

Notable Comments

  • @jbellis: peer signal from inside the ecosystem – “Branimir is an engineer’s engineer.”
  • @dzonga: warns of a generation of practitioners who won’t understand tools like Cassandra or FoundationDB, “used with precision by missionaries” but abused by resume-driven hiring.
