AWS outage - Amazon confirms recovery after Zoom, Slack, Canva among work apps hit

AWS outage took down many major work apps

AWS re:Invent 2024
(Image: © Future / Mike Moore)

If you were struggling to connect to your work apps, you aren't alone - Amazon Web Services (AWS) suffered a major outage.

After an issue in one of its US regions overnight, users were unable to access services such as Zoom, Slack and Canva - all of which run on AWS systems - before the problems reached users in the US.

Welcome to our live coverage of this major AWS outage.

We've since seen major outages for a number of consumer-focused services, along with work-focused tools - outage tracker site Downdetector is showing the following...

A Downdetector graph showing outages relating to Amazon Web Services

(Image credit: Downdetector)

Slack is one of the hardest-hit services, with issues across the board.

Slack outage AWS issues

(Image credit: Future / Mike Moore)

According to AWS's own status page, the issue seems to stem from Amazon DynamoDB, which is the company's managed NoSQL database platform - an important building block for many customers and apps.

It's not just Slack and Zoom - Downdetector is also showing issues for other workplace tools, with Asana, Atlassian, Xero and Jora all affected (although reports do seem to be falling now).

Over at Zoom, it seems several parts of the platform have been affected, with its status page reporting several issues.

My Slack access has just totally collapsed, meaning I can't contact my team or find out what they're working on - will no-one think of the poor editors?

Outage reports are now falling from their peak at both Slack and Zoom, but it seems like issues still persist across the board.

Zoom Slack outage Aws

(Image credit: Future / Mike Moore)

A new update from AWS - "Oct 20 2:22 AM PDT We have applied initial mitigations and we are observing early signs of recovery for some impacted AWS Services. During this time, requests may continue to fail as we work toward full resolution. We recommend customers retry failed requests."
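For developers wondering what "retry failed requests" might look like in practice, here's a minimal Python sketch of retrying with exponential backoff and jitter. This is our own illustration, not anything AWS has published: the function name `retry_with_backoff` and its parameters are hypothetical, and real AWS SDKs ship their own built-in retry behaviour that you would normally rely on instead.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Hypothetical helper for illustration only. `operation` is any
    zero-argument callable that raises an exception on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            # Double the delay ceiling on each attempt, capped at max_delay,
            # then pick a random wait below that ceiling ("full jitter") so
            # many clients do not all retry at the same moment.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

The random jitter matters during a large outage: if every client retried on the same fixed schedule, the synchronised waves of requests could slow the recovery further.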

And just like that - another update from AWS, and it's good news for all of us wanting to get on with work.

Slack and Zoom are both still reporting issues on their respective status pages, but both have promised an update within the next 30 minutes.

Downdetector AWS outage

(Image credit: Downdetector)

Some expert insight from James Capell, our Editor on Web Hosting here at TechRadar Pro...

"The outage appears to be caused by a DNS resolution error for DynamoDB in the US-EAST-1 region. The DynamoDB database is used for many core AWS services including IAM which is used for permissions. The DNS error means that this database service cannot be accessed by the services that require it to function. Since most AWS services rely on this service somewhere in the chain we’re seeing a lot of problems."
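To make the DNS point a little more concrete, here's a rough Python sketch of checking whether a hostname resolves at all. It's only an illustration of the failure mode described above, not a diagnostic tool from AWS: the helper name `can_resolve` is our own, and we're assuming the public regional endpoint `dynamodb.us-east-1.amazonaws.com` as the example hostname.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if DNS returns at least one address for hostname.

    Hypothetical helper for illustration. `resolver` defaults to the
    standard library's socket.getaddrinfo and can be swapped out in tests.
    """
    try:
        # Port 443 because the endpoint is served over HTTPS.
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved to an address.
        return False


if __name__ == "__main__":
    # Assumed example endpoint; the result depends on your network and DNS.
    print(can_resolve("dynamodb.us-east-1.amazonaws.com"))
```

If a check like this returns False, anything that depends on that endpoint fails before a single request reaches the database, which is why a DNS problem can ripple out across so many services.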

We're not sure exactly what happened at Slack - but it's suddenly just had another major spike in outage reports.

Downdetector slack outage

(Image credit: Downdetector)

AWS has updated the severity status of the issues to "degraded" on its status page - which again could mean a solution is imminent...

Good news - AWS now thinks it has solved this issue, and services should be returning to normal very soon.

Outage reports for Slack, Zoom, Canva and Xero have all basically fallen to nothing, although the status pages for the first two are still showing some issues, so we'll stay tuned for anything happening there...

Here's a seemingly-final update from AWS - the company is pretty satisfied the issue is now over, but is still urging caution for users...

The good news is that this incident doesn't seem to have been a cyberattack - but instead, a case of AWS' own systems suffering under their own weight.

AWS is now tying up all the loose ends - its latest update notes, "We are continuing to work towards full recovery for EC2 launch errors (which may manifest as an Insufficient Capacity Error). Additionally, we continue to work toward mitigation for elevated polling delays for Lambda, specifically for Lambda Event Source Mappings for SQS."

And in more "good" news - Zoom is now showing zero issues or problems, so you can connect and chat to your heart's content!

Zoom status page

(Image credit: Zoom)

As everything now seems to be back in order, we're going to take a quick break - but we'll keep monitoring for any further updates or issues, particularly as the US comes online over the next few hours - fingers crossed this doesn't cause any more issues!

Welcome back - we're still monitoring for any knock-on effects of this morning's AWS outage as the US comes online.

Fortunately though, all key customers depending on AWS do seem to be up and running again, with Slack, Zoom, Canva and more all working as expected.

What we feared might happen...has happened.

Belay that warning - AWS has already updated to say, "We are seeing early signs of recovery for the connectivity issues and are continuing to investigate the root cause."

We may be edging closer to a solution - AWS notes, "We have identified that the issue originated from within the EC2 internal network."

More from AWS on the root causes - "We have narrowed down the source of the network connectivity issues that impacted AWS Services. The root cause is an underlying internal subsystem responsible for monitoring the health of our network load balancers. We are throttling requests for new EC2 instance launches to aid recovery and actively working on mitigations."

And another quick update - it's great to have such transparency from AWS on this.

The journey towards a fix is continuing, with another significant step forward.

Elsewhere, Amazon has released a statement on its public press page - however there's no new information as far as we can see, just a truncated summary of what's happened already today, and what the cause was.

Amazon promised a further technical update - and here it is...

Things are continuing to improve, with the latest AWS update reading as follows:

And that (as they say) should be that. AWS' latest update confirms, "We continue to observe recovery across all AWS services, and instance launches are succeeding across multiple Availability Zones in the US-EAST-1 Region."