Lessons From the AWS Outage: Don’t Put All Your Eggs In One Basket

October 23, 2025

Amazon says that the infamous AWS outage earlier this week is now fully remediated, but organizations are still working through the backlogs and knock-on issues it caused. More importantly, the incident should serve as a warning about putting critical services in the hands of a single cloud provider without an adequate contingency plan.

Roughly 12-hour AWS outage hit thousands of major businesses & government organizations

For anyone who somehow went untouched by the globe-spanning incident, a recap: the AWS outage lasted about 12 hours on October 20 and disrupted thousands of businesses that rely on Amazon’s cloud services. Those affected included fellow tech titan Google, several international airlines, an assortment of online financial services, the apps of telecommunications providers, and even government offices and court systems.

The issue traces back to Amazon’s oldest and largest AWS region: US-EAST-1, located in Virginia. Half a day may not seem like a massively damaging outage, but the total cost of the incident is very likely to run into the billions of dollars because of the number of major businesses and organizations that rely on this region. It is the default for Amazon’s web services and likely the most heavily used on a daily basis.

The costs will come in business disruption, lost productivity, and making things right with upset customers. There may also be legal costs. Delta and United both reported flight delays as a result of the AWS outage, schools saw classes delayed as the popular teaching platform Canvas went down, and even Amazon’s own Prime service experienced shipping delays. One of the most critical disruptions was seen in the California court systems, where two of the most widely used case filing platforms had their upload tools and phone systems go down during work hours.

AWS outage not the first, demonstrates need for resiliency measures

The AWS outage is reminiscent of the CrowdStrike incident of a little over a year ago, though the circumstances of the two cases differ. Amazon has not released much detail yet, but the outage appears to have been some sort of catastrophic failure of load balancing systems, whereas CrowdStrike pushed a broken security update that in turn broke client systems. This is also far from the first AWS outage to have a serious global impact, with at least a dozen recorded since the launch of the service nearly 20 years ago.

Organizations have to plan not just for security breaches, but also for simple outages like this one that can take out critical portions of their business and rack up a substantial bill. As the AWS outage demonstrates, it does not necessarily take a complex technical issue to cause this level of chaos. The simplest and most direct lesson is that organizations need reasonable redundant systems in place, along with an up-to-date disaster recovery plan that holds up in an uncontrollable extended outage in which a full business day or more is lost.
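
One practical starting point for that kind of planning is a simple audit of which critical services have no fallback at all. The Python sketch below is only an illustration of the idea: the `Service` type, the `single_provider_services` helper, and the inventory entries are all hypothetical and not drawn from any affected organization.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A business-critical service and the providers able to run it."""
    name: str
    providers: set[str] = field(default_factory=set)


def single_provider_services(services: list[Service]) -> list[str]:
    """Return the names of services that depend on at most one provider."""
    return [s.name for s in services if len(s.providers) <= 1]


# Hypothetical inventory, for illustration only.
inventory = [
    Service("checkout", {"aws:us-east-1"}),
    Service("case-filing-uploads", {"aws:us-east-1", "other-cloud:region-a"}),
    Service("phone-system", {"aws:us-east-1"}),
]

at_risk = single_provider_services(inventory)
print(at_risk)  # ['checkout', 'phone-system']
```

Anything the audit flags is a candidate for a redundant deployment or, at minimum, a documented manual workaround in the recovery plan.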

These plans also cannot be centered on an expected quick recovery time. Twelve hours may seem like a long outage by the usual standards of cloud uptime, but consider that this was the well-resourced Amazon addressing the issue. A smaller and less-resourced provider, which describes essentially every other player in the cloud space when compared to Amazon, could very well take days rather than hours to fully recover from a similar issue. This means not just keeping regular backups, but also considering a multi-cloud plan to ensure that a secondary system of some sort is available when catastrophe strikes.
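
At the application level, the secondary-system idea can be as simple as trying providers in a fixed order and falling back when one fails. The following Python sketch shows that pattern under stated assumptions: `call_with_failover`, the provider names, and the stand-in `primary` and `secondary` functions are hypothetical, and a real deployment would also need health checks, timeouts, and data replication between providers.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every configured provider fails."""


def call_with_failover(
    providers: Sequence[tuple[str, Callable[[], str]]],
) -> tuple[str, str]:
    """Try each provider in order; return (provider name, result) on first success."""
    errors = []
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception as exc:  # any failure triggers a fallback attempt
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-ins for real provider calls (hypothetical, for illustration only).
def primary() -> str:
    raise ConnectionError("region unavailable")


def secondary() -> str:
    return "ok"


used, result = call_with_failover(
    [("primary-cloud", primary), ("secondary-cloud", secondary)]
)
print(used, result)  # secondary-cloud ok
```

The ordering of the list is the whole policy here: the secondary is only contacted after the primary has failed, which keeps normal-day behavior unchanged while still giving the business somewhere to go during an outage.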