Railway Is Having a Major Outage


TLDR

  • Google Cloud blocked Railway’s account on May 19, taking down its dashboard, API, edge network, and all customer workloads with no ETA for restoration.

Key Takeaways

  • Errors reported: “no healthy upstream”, “unconditional drop overload”, login failures, dashboard inaccessible.
  • Root cause: a Google Cloud account block, not an internal Railway infrastructure fault. Upstream access was restored, but workloads remained offline.
  • Railway’s dashboard, API, and internal network control plane all depend on the same GCP infrastructure with no apparent fallback.
  • As of May 20 01:23 UTC, the team was evaluating “alternative paths” to restore services while in direct contact with Google Cloud support.
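
The "no apparent fallback" point can be made concrete with a small sketch. This is not Railway's architecture or code: the `Endpoint` type, the `pick_endpoint` helper, the provider labels, and the URLs are all hypothetical, and the health probe is simulated. It only illustrates the general idea of trying control-plane endpoints on independent providers in order, so that a block at one provider does not remove every option.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Endpoint:
    provider: str  # hypothetical label, e.g. "gcp" or "own-metal"
    url: str       # hypothetical control-plane URL


def pick_endpoint(
    endpoints: Iterable[Endpoint],
    is_healthy: Callable[[Endpoint], bool],
) -> Optional[Endpoint]:
    """Return the first endpoint whose health probe passes, else None."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A probe that raises is treated the same as an unhealthy endpoint.
            continue
    return None


# Hypothetical endpoints hosted on two independent providers.
ENDPOINTS = [
    Endpoint("gcp", "https://control.gcp.example.invalid"),
    Endpoint("own-metal", "https://control.metal.example.invalid"),
]


def simulate_gcp_block(endpoint: Endpoint) -> bool:
    """Stand-in probe: pretend every GCP-hosted endpoint is unreachable."""
    return endpoint.provider != "gcp"


chosen = pick_endpoint(ENDPOINTS, simulate_gcp_block)
print(chosen.provider if chosen else "no healthy upstream")
```

With only the first entry in `ENDPOINTS`, the same simulated block leaves `pick_endpoint` returning `None`, which is the total-failure mode described above. A real deployment would also need replicated state and independent DNS, which this sketch does not attempt.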

Hacker News Comment Review

  • Commenters flagged a contradiction: Railway’s own blog claimed “you can’t build a cloud on another cloud” and touted running their own metal, yet the outage exposed full GCP dependency.
  • Opinion is split on who is at fault: some blame Google’s account-blocking pattern (also seen with a Korean government org), while others argue Railway should have architected blast-radius isolation so that a single provider block cannot kill the entire control plane.
  • Trust damage was immediate and concrete: multiple commenters who were active or prospective Railway customers said the outage changed their evaluation, less because of the downtime itself than because of the total-service failure mode it revealed.

Notable Comments

  • @Avicebron: References a prior Railway incident involving backup keys stored in the prod database, citing it as part of a pattern of single-point-of-failure decisions.
  • @miniman1337: Quotes Railway’s blog verbatim: “you can’t build a cloud on another cloud”, a direct contradiction of today’s GCP-only failure.
