AWS North Virginia data center outage – recovery to take hours


An overheating failure in AWS us-east-1 knocked out EC2 capacity Thursday night, disrupting Coinbase trading and FanDuel sports betting into Friday, with full recovery taking longer than AWS anticipated.

What Matters

  • Root cause: cooling system failure in a single Availability Zone within us-east-1 (Northern Virginia); AWS was bringing additional cooling capacity online as of 9:51 a.m. ET Friday.
  • AWS’s 3:29 p.m. ET Friday update said recovery was “slower than previously anticipated” and still hours away at that point.
  • FanDuel users lost bets because they could not cash out; Coinbase reported an outage of “core trading services” across multiple AWS zones.
  • Coinbase claimed multiple AZs failed, while AWS stated only one AZ was affected; the discrepancy remains unresolved.
  • AWS holds ~33% of the cloud infrastructure market, so a single-region failure carries an outsized blast radius across the internet.
  • [HN: @dlenski] IAM and account-level identity services for the entire non-China AWS partition are centralized in us-east-1, making true regional isolation structurally impossible.
  • [HN: @Andys] A DC with fully redundant chillers and floor-level coolers still suffered a 24-hour total cooling failure when its non-redundant inter-floor water pipes failed.
