Stay strong, @awscloud friendos. As outages go, this one is pretty mild / funny. Customers are hurting but not THAT much. And it's not like I have anything else to do since billing single-tracks through us-east-1.
Wait, how can this be? The Status Page was very clear that it was just the management console that was impacted...
An Amazon server outage caused problems for Alexa, Ring, Disney Plus, and deliveries
Your Amazon delivery might be delayed.
Okay, we're into yikes territory now.
See, the trouble with being @awscloud is let's pretend you use a third party for all of this stuff. Do *THEY* or *their vendors* rely on AWS for anything? Are you SURE?…
"affecting our incident response tooling" Is this that old adage where you don't host your status page on your core infrastructure?
I think you need a physical fallback for these eventualities.
It's no use, it's AWS all the way down.
Each cloud vendor should run their status page on another cloud vendor? Seems like one way to be confident there isn't any cross-dependency. It also feels ridiculous, but here we are.
Ha, the M.A.D. of status pages.
that would be a surreal contractual term to see coming from AWS... "Your service must not rely in any way, directly or indirectly, on AWS".
Also, isn’t reliability one of their “Pillars of Excellence”? Yet the AWS console login page for the entire planet lands in US-EAST-1?
Well, at least I don't rely on those Microsoft losers for anything critical! [Checks github account that contains all my source code.]
This bit me today. Our notifications to our customers, which we sent throughout the day, all got sent as the outage was remediated. 🤦
It's not like us Australians *need* sleep. We'll get by on spite alone.
The ole S3 outage with the status check marks stored on S3 🤣
Anecdotally, setting up Amazon returns is broken.
pouring one out now for the poor bastards who'll have to write up that COE
My roomba is having issues.🤖
That's on you. Making your home smart without any local/offline backup (for example having Zigbee remotes for your IoT lightbulbs) is plain stupid.
Very fun way of finding out the @McDonalds USA app runs on AWS
yikes is right. the post mortem is going to be rather interesting.
Sucks more that it shows Console issues in all regions - painful if you're in Brazil or the Middle East and can't log in to do things due to US-East.
We've been in yikes territory for 3 hours now.
Impact is increasing
you've been in ops so you should know that getting insight into the impact of a large scale outage is not an easy task
True, but The Verge is picking this up at the same pace as the AWS Status Dashboard ;-)
Yes but Amazon has had many outages and should know that its status page is always filled with lies. If they cared at all about it, they would have a different treatment.
from the article "some of the Amazon Web Services cloud servers"... so they just need to reboot a couple of servers, right? 😁
If it was me, I would simply check whether each service was working for most people
Organizations is also not working
Wonder how those mainframe migrations are going
EKS cannot pull images from ECR and they have not acknowledged this yet
The initial incident report misled us as we thought it was our fault
Yeah it's definitely not just that. Dynamo seems to be entirely down and a couple of our services are impacted but without access to cloudwatch we don't know exactly what's failing.
It was way more than that. EC2 was badly affected…hence a ton of their services wouldn’t function correctly.
Hopefully it doesn't AWS Snowball into something worse.
Ahhhh... I see what you did there! :D
I’m not sure I’d agree with you. Amazon’s shopping cart appears to be showing errors too…
It's not that mild/small. Pretty impactful. :(
This is a major outage #awsdown. Customers are deeply impacted.
Everything is a minor outage when your status page says everything is fine!
dynamodb is not working, it is not just the console
All day, couldn't get into Workspace. Our accounts are configured so they're ONLY accessible from Workspace. Even once we could log in to Workspace, couldn't fetch STS tokens. :(
“Mild/funny” hmm? Spoken like someone not trying to get white noise to play in the baby’s room for nap time 😛
With Connect down in the region, all our call centers are completely down.
I knew it was a good idea to not get out of bed this morning
We’ll be using Blind to communicate for outage resolution before much longer
The only time choosing Ohio pays off
Assuming you're not using any services with hidden dependencies on us-east-1.
I fixed that for you: assuming you're not using any services
Wait, is that true? Is that on record somewhere? 🤨
us-east-1 being down is like a snow day for programmers.
Today's status report: "day lost due to AWS outage". Wife asked if this meant I couldn't bill for today.
For programmers, a Github outage is a snow day. They're mostly unaffected by cloud outages unless they're also on call.
You can bet your sweet NAT gateway not a single billing metric will be lost.
Local news thread says delivery drivers and warehouse are stopped too.
What's fascinating about an AWS outage, rare as they are, is that you hear about it immediately. Brings out all the vendors and analysts: see, see, told ya. Other clouds fail, not a peep. Begs the question: why are so many internet-critical apps only designed for one AZ?
Today appears to have transcended AZs and that’s scary.
AZs share the same network, and as the network was the issue, it affects an entire region. More scary to me: the fact that Amazon itself was impacted; apparently Amazon truck drivers were unable to deliver packets (you'd assume Amazon follows AWS best practices).
There’s multiple AZs in a region, but I guess that’s not good enough if a regional outage is possible.
Time for them to consider multi-cloud ;-)
I can’t listen to my music, Corey.