Google Cloud incorrectly suspended Railway’s production GCP account on May 19, 2026, causing an ~8-hour platform-wide outage across GCP, Railway Metal, and AWS.
Key Takeaways
GCP automated action suspended Railway’s account without proactive notice, taking down the dashboard, API, control plane, and all GCP-hosted compute.
Railway’s edge proxies cache routing tables from a GCP-hosted control plane; once caches expired, Metal and AWS workloads also returned 404s despite remaining online.
Recovery was staged and slow: persistent disks, compute instances, and networking each required separate restoration, extending the outage several hours past account reactivation.
GitHub rate-limited Railway’s OAuth and webhook integrations during recovery due to burst retry volume, blocking logins and builds as a secondary cascade.
Immediate fixes: decouple route discoverability from GCP control plane to form a true mesh, spread HA database shards across AWS and Metal, and move GCP off the data plane hot path.