Show HN: Rocky – Rust SQL engine with branches, replay, column lineage


TLDR

  • Rocky is a Rust-based control plane for warehouse pipelines. It adds branches, column-level lineage, compile-time data contracts, and per-model cost attribution on top of existing Databricks or Snowflake stacks.

Key Takeaways

  • Ships as a CLI built from a 20-crate Cargo workspace; the playground runs on local DuckDB, needs no credentials, and installs in 60 seconds.
  • Compile-time contract enforcement catches missing required columns, protected column removal, and unsafe type changes before any rows are written (diagnostic codes E010, E013).
  • Named branches create isolated schemas for experiments; column-level lineage shows downstream blast radius before promotion, not after.
  • Schema drift detection diffs source vs. target on each run and recreates the target table on type change, blocking silent corruption.
  • AI-assisted model generation runs a compile-validate retry loop: Rocky generates DSL, compiles it, and automatically retries on parse failure.
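
To make the contract takeaway concrete, here is a minimal Python sketch of a pre-write check covering the three failure classes named above: missing required columns, protected column removal, and unsafe type changes. It is not Rocky's implementation or DSL. The `Contract` class, `check_contract` function, and `SAFE_WIDENINGS` table are hypothetical names invented for illustration, and the sketch does not attempt to map violations to the E010 and E013 diagnostic codes, since the source does not say which code covers which case.

```python
from dataclasses import dataclass, field

# Hypothetical widening rules (assumption): the source does not describe
# which type changes Rocky treats as safe.
SAFE_WIDENINGS = {("INT", "BIGINT"), ("FLOAT", "DOUBLE")}


@dataclass
class Contract:
    """Illustrative contract: columns that must exist and columns that may not be dropped."""
    required: set = field(default_factory=set)
    protected: set = field(default_factory=set)


def check_contract(contract, old_schema, new_schema):
    """Compare two {column: type} mappings and return a list of violations.

    An empty list means the change passes; nothing here touches any rows.
    """
    old_cols, new_cols = set(old_schema), set(new_schema)
    errors = []

    # 1. Every required column must be present in the new schema.
    for col in sorted(contract.required - new_cols):
        errors.append(f"missing required column: {col}")

    # 2. A protected column that existed before must not disappear.
    for col in sorted((contract.protected & old_cols) - new_cols):
        errors.append(f"protected column removed: {col}")

    # 3. Type changes are allowed only if listed as a safe widening.
    for col in sorted(old_cols & new_cols):
        old_t, new_t = old_schema[col], new_schema[col]
        if old_t != new_t and (old_t, new_t) not in SAFE_WIDENINGS:
            errors.append(f"unsafe type change on {col}: {old_t} -> {new_t}")

    return errors


if __name__ == "__main__":
    old = {"id": "INT", "email": "STRING", "amount": "DOUBLE"}
    new = {"id": "BIGINT", "amount": "STRING"}
    contract = Contract(required={"id", "email"}, protected={"email"})
    for problem in check_contract(contract, old, new):
        print(problem)
```

The point of the design is that the check needs only schemas, not data, which is what allows it to run at compile time before any rows are written.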

Hacker News Comment Review

  • The compile-time lineage angle drew the most interest: commenters see it as fundamentally different from log-archaeology tools like OpenLineage, particularly for refactors and masking policy changes.
  • Positioning against Databricks drew pushback – Databricks owns its own DAG via Jobs and Pipelines, making the “your stack can’t own the DAG” framing inaccurate for a significant portion of the target audience.
  • The dbt team showed up: Anders from dbt flagged that dbt-fusion (Rust-based, nearing GA) covers overlapping ground and noted that dbt’s own roadmap includes branching and budgeting, so the competitive gap is narrowing fast.

Notable Comments

  • @Xiaoher-C: Asks whether Rocky exposes a “lineage diff” between branches to show downstream impact of a PR before merge – a concrete feature gap worth watching.
  • @jtagliabuetooso: Points to a 2023 paper covering native branches, immutability, and lineage, suggesting the “distinctive” claims need tighter scoping against prior art.
  • @ramon156: “If your introduction message already includes a bunch of uncurated claims and LLM smells” – flags README quality as a trust signal for the codebase itself.
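
The "lineage diff" that @Xiaoher-C asks about can be sketched in a few lines of Python. This is a hypothetical illustration, not a Rocky feature or API: it assumes column lineage is available as a set of (upstream column, downstream column) edges per branch, and the function names `downstream` and `lineage_diff`, along with the sample column names, are invented here.

```python
from collections import deque


def downstream(edges, start):
    """Return every column reachable from `start` along upstream -> downstream edges."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, set()).add(dst)

    seen, queue = set(), deque(start)
    while queue:
        col = queue.popleft()
        for nxt in children.get(col, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lineage_diff(main_edges, branch_edges):
    """Diff two lineage graphs and list the columns a merge could affect."""
    added = branch_edges - main_edges
    removed = main_edges - branch_edges

    # Columns whose direct inputs changed in either direction.
    touched = {dst for _, dst in added | removed}

    # Walk the union graph so columns downstream of a removed edge still count.
    impacted = touched | downstream(main_edges | branch_edges, touched)
    return {"added": added, "removed": removed, "impacted": impacted}


if __name__ == "__main__":
    main = {
        ("raw.orders.amount", "stg.orders.amount"),
        ("stg.orders.amount", "mart.revenue.total"),
        ("raw.orders.id", "stg.orders.id"),
    }
    branch = {
        ("raw.orders.amount_cents", "stg.orders.amount"),
        ("stg.orders.amount", "mart.revenue.total"),
        ("raw.orders.id", "stg.orders.id"),
    }
    print(sorted(lineage_diff(main, branch)["impacted"]))
```

In this sample, swapping the source of `stg.orders.amount` marks that column and `mart.revenue.total` as impacted, while `stg.orders.id` is left out because nothing upstream of it changed.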
