A major outage in Amazon’s cloud-computing network Tuesday severely disrupted services at a wide range of U.S. companies for more than five hours, the latest sign of just how concentrated the business of keeping the internet running has become.
The incident at Amazon Web Services mostly affected the eastern U.S., but still impacted everything from airline reservations and auto dealerships to payment apps and video streaming services to Amazon’s own massive e-commerce operation. Those affected included The Associated Press, whose publishing system was inoperable for much of the day, greatly limiting its ability to publish its news report.
Amazon has still said nothing about what, exactly, went wrong. In fact, the company limited its communications Tuesday to terse technical explanations on an AWS dashboard and a brief statement delivered via spokesperson Richard Rocha that acknowledged the outage had affected Amazon’s own warehouse and delivery operation but said the company was “working to resolve the issue as quickly as possible.”
Roughly five hours after numerous companies and other organizations began reporting issues, Amazon said in a post on the AWS status page that it had “mitigated” the underlying problem responsible for the outage, which it did not describe. It took some affected companies hours more to thoroughly check their systems and restart their own services.
Amazon Web Services was formerly run by Amazon CEO Andy Jassy, who succeeded founder Jeff Bezos in July. The cloud-service operation is a huge profit center for Amazon. It holds roughly a third of the $152 billion market for cloud services, according to a report by Synergy Research — a larger share than its closest rivals, Microsoft and Google, combined.
Widespread and often lengthy outages resulting from single-point failures appear increasingly common. In June, the behind-the-scenes content distributor Fastly suffered a failure that briefly took down dozens of major internet sites including CNN, The New York Times and Britain’s government home page.
Then, in October, Facebook (now known as Meta Platforms) blamed a “faulty configuration change” for an hours-long worldwide outage that took down Instagram and WhatsApp in addition to its namesake platform.
This time, problems began midmorning on the U.S. East Coast, said Doug Madory, director of internet analysis at Kentik Inc., a network intelligence firm.
Customers trying to book or change trips with Delta Air Lines had trouble connecting to the airline. “Delta is working quickly to restore functionality to our AWS-supported phone lines,” said spokesperson Morgan Durrant. The airline apologized and encouraged customers to use its website or mobile app instead.
Dallas-based Southwest Airlines said it switched to West Coast servers after some airport-based systems were affected by the outage. Customers were still reporting outages to DownDetector, a popular clearinghouse for user outage reports, more than three hours after the problems started. Southwest spokesman Brian Parrish said there were no major disruptions to flights.
Toyota spokesman Scott Vazin said the company’s U.S. East Region for dealer services went down. The company has apps that access inventory data, monthly payment calculators, service bulletins and other items. More than 20 apps were affected.
Madory said he saw no reason to suspect nefarious activity. He said the recent cluster of major outages reflects how complex the networking industry has become. “More and more these outages end up being the product of automation and centralization of administration,” he said. “This ends up leading to outages that are hard to completely avoid due to operational complexity but are very impactful when they happen.”
It was unclear how, or whether, the outage was affecting the federal government. The U.S. Cybersecurity and Infrastructure Security Agency said in an email response to questions that it was working with Amazon “to understand any potential impacts this outage may have for federal agencies or other partners.”