Rocky is a Rust-based control plane for warehouse pipelines that adds branches, column-level lineage, compile-time data contracts, and per-model cost attribution on top of existing Databricks or Snowflake stacks.
Key Takeaways
Ships as a CLI built from a 20-crate Cargo workspace; the playground runs on local DuckDB with no credentials and is installable in 60 seconds.
Compile-time contract enforcement catches missing required columns, protected column removal, and unsafe type changes before any rows are written (diagnostic codes E010, E013).
Named branches create isolated schemas for experiments; column-level lineage shows downstream blast radius before promotion, not after.
Schema drift detection diffs source vs. target on each run and recreates the target table on type change, blocking silent corruption.
AI model generation includes a compile-validate retry loop: Rocky generates DSL, compiles, and auto-retries on parse failure.
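The contract checks described above (missing required columns, protected column removal, unsafe type changes) can be illustrated with a small schema-only comparison. This is a minimal sketch in Python, not Rocky's implementation: the `Contract` class, the `check_contract` function, and the `SAFE_WIDENINGS` table are all hypothetical names, and the violation messages are plain strings because the summary does not say which diagnostic code maps to which rule.

```python
from dataclasses import dataclass

# Assumed widening rules: (old_type, new_type) pairs treated as safe.
SAFE_WIDENINGS = {("INT", "BIGINT"), ("FLOAT", "DOUBLE")}


@dataclass(frozen=True)
class Contract:
    required: frozenset   # columns that must exist in the output
    protected: frozenset  # columns that may never be dropped


def check_contract(contract, old_schema, new_schema):
    """Compare schemas (column -> type) and return a sorted list of violations.

    Only declared schemas are inspected, so the check can run before any
    rows are written.
    """
    violations = []
    old_cols, new_cols = set(old_schema), set(new_schema)
    for col in contract.required - new_cols:
        violations.append(f"missing required column: {col}")
    for col in (contract.protected & old_cols) - new_cols:
        violations.append(f"protected column removed: {col}")
    for col in old_cols & new_cols:
        old_t, new_t = old_schema[col], new_schema[col]
        if old_t != new_t and (old_t, new_t) not in SAFE_WIDENINGS:
            violations.append(f"unsafe type change on {col}: {old_t} -> {new_t}")
    return sorted(violations)


contract = Contract(required=frozenset({"order_id", "amount"}),
                    protected=frozenset({"customer_email"}))
old = {"order_id": "BIGINT", "amount": "DOUBLE", "customer_email": "STRING"}
new = {"order_id": "INT", "total": "DOUBLE"}
violations = check_contract(contract, old, new)
```

Here the proposed schema drops `amount` and `customer_email` and narrows `order_id`, so all three rules fire; a run would be blocked until the list is empty.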
Hacker News Comment Review
The compile-time lineage angle drew the most interest: commenters see it as fundamentally different from log-archaeology tools like OpenLineage, particularly for refactors and masking policy changes.
Positioning against Databricks drew pushback – Databricks owns its own DAG via Jobs and Pipelines, making the “your stack can’t own the DAG” framing inaccurate for a significant portion of the target audience.
The dbt team showed up: Anders from dbt flagged that dbt-fusion (Rust-based, going GA imminently) covers overlapping ground, and noted that dbt’s own roadmap includes branching and budgeting; the competitive gap is narrowing fast.
Notable Comments
@Xiaoher-C: Asks whether Rocky exposes a “lineage diff” between branches to show downstream impact of a PR before merge – a concrete feature gap worth watching.
@jtagliabuetooso: Points to a 2023 paper covering native branches, immutability, and lineage, suggesting the “distinctive” claims need tighter scoping against prior art.
@ramon156: “If your introduction message already includes a bunch of uncurated claims and LLM smells” – flags README quality as a trust signal for the codebase itself.
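The "lineage diff" that @Xiaoher-C asks about, together with the blast-radius idea from the takeaways, can be sketched as a set difference over column-level lineage edges followed by a downstream traversal. This is an illustrative Python sketch under stated assumptions: lineage is modeled as a set of (upstream column, downstream column) pairs, and `downstream`, `lineage_diff`, and the column names are hypothetical, not anything Rocky is known to expose.

```python
from collections import deque


def downstream(edges, start):
    """Return every column reachable from `start` by following lineage edges."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, set()).add(dst)
    seen, queue = set(), deque(start)
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def lineage_diff(main_edges, branch_edges):
    """Diff two lineage graphs and list the columns a merge would affect."""
    added = branch_edges - main_edges
    removed = main_edges - branch_edges
    # A column is touched when one of its inputs was added or removed.
    touched = {dst for _, dst in added | removed}
    # Impact is followed through the branch graph (an assumption of this sketch).
    impacted = touched | downstream(branch_edges, touched)
    return {"added": added, "removed": removed, "impacted": impacted}


main = {
    ("raw.orders.amount", "stg.orders.amount"),
    ("stg.orders.amount", "mart.revenue.total"),
    ("raw.orders.id", "stg.orders.id"),
}
branch = {
    ("raw.orders.amount_cents", "stg.orders.amount"),
    ("stg.orders.amount", "mart.revenue.total"),
    ("raw.orders.id", "stg.orders.id"),
}
diff = lineage_diff(main, branch)
```

In this example the branch swaps the source of `stg.orders.amount`, so the diff reports that column and `mart.revenue.total` as impacted while `stg.orders.id` is left out.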