When the cloud crashes: what the AWS outage means for UK retail
When AWS went down last week, retailers felt it immediately: websites froze, payments stalled, and delivery systems ground to a halt. The outage exposed the fragility of cloud-dependent operations and served as a wake-up call about resilience, multi-cloud strategies, and customer trust in a hyper-connected retail world.

At 08:09 BST on Monday 20 October, thousands of retail websites, payments systems and logistics platforms began to stutter. Shoppers couldn’t check out and delivery apps wouldn’t load. Even customer service dashboards went dark. The culprit was Amazon Web Services (AWS), the backbone of much of the world’s online infrastructure. The cloud giant suffered a large-scale outage in its US-East-1 region in Northern Virginia, one of its oldest and busiest data hubs. While the incident lasted just over six hours, its effects were global, disrupting key systems used by British retailers and service providers and exposing just how intertwined the industry has become with a handful of cloud operators.
Amazon confirmed in a detailed post-incident statement, published on 24 October, that the root cause of the outage was a fault in its internal automation system responsible for updating DNS (Domain Name System) records for Amazon DynamoDB, one of its core database services.
An automated update inadvertently created an “empty DNS record” in the US-East-1 region, effectively removing the directory that allows other AWS services to locate DynamoDB. As dependent systems failed to resolve that address, widespread service disruptions began to cascade across networks.
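To see why one missing DNS record ripples so widely, consider a minimal sketch (in Python, and not AWS's actual tooling) of how a dependent service experiences the failure: every lookup of the DynamoDB endpoint errors out, retries burn time, and the problem surfaces to callers as timeouts rather than as a DNS fault.

```python
# Illustrative only: how a dependent service "sees" an empty DNS record.
import socket
import time

ENDPOINT = "dynamodb.us-east-1.amazonaws.com"  # the regional endpoint name

def resolve_with_retries(host: str, attempts: int = 3, backoff: float = 2.0):
    """Try to resolve a hostname, backing off between attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return socket.getaddrinfo(host, 443)
        except socket.gaierror as exc:
            # With the record missing, every attempt fails the same way,
            # and callers further up the stack start timing out too.
            print(f"attempt {attempt}: could not resolve {host}: {exc}")
            time.sleep(backoff * attempt)
    raise RuntimeError(f"{host} is unreachable; dependent requests will fail")
```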
Amazon’s engineers initially believed the fault would self-correct, but when automated recovery tools failed, manual intervention was required. By then, large portions of dependent infrastructure – including S3 storage, EC2 instances, and authentication services – had degraded.
In its statement, AWS said: “The outage was caused by an internal process error, not an external attack. Our automated DNS update tool introduced an invalid configuration that propagated incorrectly. We have disabled the automation responsible and introduced additional validation layers to prevent similar issues.”
The company added that it is implementing a new “multi-stage verification framework” for future updates and deploying additional regional isolation boundaries designed to stop a single fault from propagating globally.
For the world’s largest cloud provider, serving millions of organisations across 190 countries, it was a rare and humbling failure.
While the incident originated in the US, the global nature of cloud dependencies meant the UK felt the shock almost immediately. Several major retailers reported payment and website issues during the Monday morning rush, a peak period for online grocery and click-and-collect orders. E-commerce sites built on AWS infrastructure, including portions of Shopify’s UK platform and Amazon-hosted components of Tesco’s and Sainsbury’s online systems, reported intermittent errors. Users attempting to log in to loyalty schemes, process online returns or access digital receipts faced timeout messages or failed authentication requests.
Payment service providers such as Stripe, Worldpay and Square also saw temporary slowdowns. Because these gateways often route traffic through AWS-hosted servers, checkout failures cascaded to retailers even when their own websites remained functional.
Parcel and courier integrations were also affected. Several logistics platforms, including ShipStation and Parcelhub, rely on AWS for real-time tracking and order updates. Their temporary downtime forced fulfilment teams to revert to manual batch processing, delaying dispatches by several hours.
Customer support platforms such as Zendesk, also built on AWS, reported partial outages, leaving retailers unable to respond to service tickets or live chats. For many multichannel retailers, this meant visibility was lost across every stage of the customer journey, from browsing to purchase to after-sales service.
David Jinks M.I.L.T., head of consumer research at ParcelHero, says the incident highlights how invisible infrastructure failures can paralyse frontline operations. “The average consumer sees a spinning wheel on a checkout page and assumes it’s a website problem, but what we witnessed was the collapse of an entire digital supply chain,” he explains. “Retailers discovered that the smooth functioning of payments, tracking, and even customer communication depends on a single data region thousands of miles away.”
Downdetector recorded more than eight million reports of service degradation worldwide over the course of the day, and analysts estimate more than 2,000 major companies were directly affected. ParcelHero said the outage could cost businesses and retailers billions in lost revenue and service disruption, drawing parallels to last year’s CrowdStrike incident, which caused $5.4bn (£4.05bn) in losses for Fortune 500 firms.
For UK retailers, the cost is not only transactional. The outage has reignited debate about how prepared they really are for cloud disruption. While most large retailers claim to operate redundant systems, few have tested failovers that extend across cloud providers or regions.
Jake Madders, co-founder of Hyve Managed Hosting, says the incident demonstrates the limits of even the most trusted providers. “AWS has built an incredible reputation for uptime, but the reality is that no system is immune. Retailers have to think in terms of distributed risk. Hosting critical workloads in multiple regions, or even across multiple clouds, should no longer be considered a luxury – it’s basic resilience planning.”
Madders adds that while larger enterprises can afford multi-cloud strategies, smaller retailers remain vulnerable: “For SMEs that depend on AWS via third-party SaaS platforms – like e-commerce plugins or analytics dashboards – recovery can be painfully slow. Those businesses often don’t have direct control over where their data is hosted or how quickly it can be restored.”
Edvards Margevics, co-CEO of payments infrastructure firm Concryt, says payment providers are also reassessing their own redundancy models: “Even a few minutes of downtime can create settlement backlogs and reconciliation errors. What this outage showed is that automation can introduce new failure points if validation isn’t watertight. We’re now reviewing all our internal dependencies and re-prioritising geographic distribution of data to avoid cross-regional cascading.”
Likewise, security leaders have warned of opportunistic threats during downtime. Vonny Gamot, head of EMEA at McAfee, notes that “when trusted services like AWS go offline, criminals seize the moment”.
“We saw a surge in phishing campaigns mimicking retailer ‘support’ messages, urging users to re-enter payment details,” Gamot says. “Retailers must treat every outage not just as a technical risk, but a customer-trust and cybersecurity event.”
The incident has underscored a truth that many retailers were reluctant to confront: outsourcing infrastructure does not outsource responsibility. Cloud reliability is a shared responsibility, and retailers must understand what sits within their control.
In the wake of the outage, cloud-sovereignty debates have resurfaced across Europe. Regulators and trade bodies are pressing for clearer accountability when overseas infrastructure failures affect domestic operations. The UK’s Department for Science, Innovation and Technology (DSIT) has indicated that cloud resilience and systemic dependency will form part of its 2026 Digital Infrastructure Review.
Retailers, meanwhile, are beginning to act pre-emptively. Several large chains are reportedly exploring “multi-region mirrored hosting” across AWS Europe (London) and AWS Europe (Frankfurt), while others are evaluating hybrid deployments that integrate Microsoft Azure or Google Cloud for failover capacity.
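In practice, mirrored hosting of this kind usually comes down to health-checked failover: traffic is routed to a primary region and shifted to a standby when it stops responding. The sketch below is a simplified, application-level illustration with hypothetical URLs and health-check paths; real deployments more often rely on DNS-level failover records and managed health checks.

```python
# Simplified sketch of regional failover between London and Frankfurt.
# The endpoints below are hypothetical.
import urllib.request

REGIONS = [
    "https://checkout.eu-west-2.example-retailer.com",    # primary: AWS Europe (London)
    "https://checkout.eu-central-1.example-retailer.com", # secondary: AWS Europe (Frankfurt)
]

def healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the region answers its health-check endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def pick_region() -> str:
    """Route traffic to the first healthy region, in priority order."""
    for region in REGIONS:
        if healthy(region):
            return region
    raise RuntimeError("no healthy region available")
```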
For Margevics, this represents a shift in mindset. “Cloud diversification used to be a technical consideration. It’s now a board-level risk conversation,” he explains. “CFOs and CIOs alike are asking not just, ‘Is our site online?’ but, ‘Can we still take payments and fulfil orders if our primary provider disappears for six hours?’”
Beyond redundancy, retailers are re-evaluating edge computing: processing more data locally, closer to where transactions occur. By reducing dependence on centralised data centres, edge networks can maintain key functions such as payment authorisation or inventory updates even when cloud connectivity drops.
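The underlying pattern is typically some form of store-and-forward: transactions are captured and authorised locally, queued while the central platform is unreachable, and reconciled once connectivity returns. The sketch below is purely illustrative, with hypothetical function names, of how that local queue might work.

```python
# Illustrative store-and-forward pattern: the store keeps trading locally
# and syncs queued transactions once the cloud is reachable again.
import json
import queue

local_queue: "queue.Queue[dict]" = queue.Queue()

def push_to_cloud(sale: dict) -> None:
    # Placeholder for the real call to the central inventory/payment system.
    print("synced:", json.dumps(sale))

def record_sale(sale: dict, cloud_available: bool) -> None:
    """Record the sale locally first; defer the cloud update if it is down."""
    if cloud_available:
        push_to_cloud(sale)
    else:
        local_queue.put(sale)  # keep the till running; sync later

def flush_queue(cloud_available: bool) -> int:
    """Replay deferred transactions once connectivity returns."""
    sent = 0
    while cloud_available and not local_queue.empty():
        push_to_cloud(local_queue.get())
        sent += 1
    return sent
```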
This October incident is unlikely to dent long-term customer loyalty for Amazon, but it has prompted difficult questions about the concentration of critical infrastructure. Analysts note that AWS still commands roughly one-third of the global cloud market, well ahead of any single competitor.
Cloud computing has become the nervous system of commerce, and when that system falters, the symptoms are immediate for the retailers that rely on it: empty baskets, delayed deliveries, and frustrated customers.
“For organisations that prioritise data sovereignty, it should also be a key consideration, with local failover options and replication to trusted jurisdictions built into their continuity strategy,” Madders advises. “Effective mitigation also includes regular backup and recovery testing, automated failover processes, and a well-documented, frequently reviewed incident response plan.”
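The recovery testing Madders describes is usually scheduled and automated rather than ad hoc: restore the latest backup into a scratch environment and verify it, so the plan is exercised before it is ever needed. The sketch below is a hedged illustration of such a check; all names and the manifest format are hypothetical.

```python
# Hypothetical restore-verification step for a scheduled recovery test.
import hashlib
from pathlib import Path

def checksum(path: Path) -> str:
    """Hash a restored file so it can be compared with the source of truth."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_restore(restored_dir: Path, manifest: dict[str, str]) -> bool:
    """Compare every restored file against checksums recorded at backup time."""
    for name, expected in manifest.items():
        if checksum(restored_dir / name) != expected:
            print(f"restore check failed for {name}")
            return False
    return True
```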
AWS may have restored its servers within hours, but the wider retail industry will take far longer to restore confidence in the cloud’s infallibility. The October outage was a stress test of modern retail’s digital foundations, and many discovered those foundations were thinner than they thought.





