Convopage : @QuinnyPig : Stay strong, @awscloud friendos. As outages go, this one is pretty mild / funny. Customers are hurting but not THAT much. And it's not like I have anything else to do since billing single-tracks through us-east-1.

Stay strong, @awscloud friendos. As outages go, this one is pretty mild / funny. Customers are hurting but not THAT much. And it's not like I have anything else to do since billing single-tracks through us-east-1.

67 replies and sub-replies as of Dec 08 2021

Corey Quinn@QuinnyPig

Wait, how can this be? The Status Page was very clear that it was just the management console that was impacted...

An Amazon server outage caused problems for Alexa, Ring, Disney Plus, and deliveries

Your Amazon delivery might be delayed.

theverge.com

Corey Quinn@QuinnyPig

Okay, we're into yikes territory now.

Corey Quinn@QuinnyPig

See, the trouble with being @awscloud is let's pretend you use a third party for all of this stuff. Do *THEY* or *their vendors* rely on AWS for anything? Are you SURE? twitter.com/magicaltrout/s…

Tom Barber - Div 174 Massive@magicaltrout

"affecting our incident response tooling" Is this that old adage where you don't host your status page on your core infrastructure?

Russell Howe@rhowe212

I think you need a physical fallback for these eventualities.

Jamie@91jme

It's no use, it's AWS all the way down.

Richard Seroter@rseroter

Each cloud vendor should run their status page on another cloud vendor? Seems like one way to be confident there isn't any cross-dependency. It also feels ridiculous, but here we are.

Aashish Koirala@aashishkoirala

Ha, the M.A.D. of status pages.

Matt Palmer@tobermatt

that would be a surreal contractual term to see coming from AWS... "Your service must not rely in any way, directly or indirectly, on AWS".

Tim Harkin@harkin_tim

Also, isn’t reliability one of their “Pillars of Excellence”? Yet the AWS console login page for the entire planet lands in US-EAST-1?

Michael Gat@michaelgat

Well, at least I don't rely on those Microsoft losers for anything critical! [Checks github account that contains all my source code.]

Karl Katzke@kkatzke

This bit me today. Our status.io notifications to our customers, which we sent throughout the day, all got sent as the outage was remediated. 🤦

Michael Pearson@mipearson

It's not like us Australians *need* sleep. We'll get by on spite alone.

mb@BrawnVivant

lol

Tristan Payne@aka_tpayne

The ole S3 outage with the status check marks stored on S3 🤣

Look up Knuffelberen@poiThePoi

Anecdotally, setting up Amazon returns is broken.

Mike Jackson@volkadav

pouring one out now for the poor bastards who'll have to write up that COE

James@my_fugue_state

My roomba is having issues.🤖

Abhay Shah@_abhayshah

I CANNOT TURN ON A FUCKING LIGHT IN MY HOME AND SHITS GONNA GET REAL IF WE CAN'T TURN ON OUR CHRISTMAS TREE IN A FEW HOURS

Korporátní správce sítě@WillCisco4Food

That's on you. Making your home smart without any local/offline backup (for example having Zigbee remotes for your IoT lightbulbs) is plain stupid.

CommunistJack@CommunistJack

Very fun way of finding out @McDonalds USA app runs on AWS

Abhay Shah@_abhayshah

yikes is right. the post mortem is going to be rather interesting.

Steve Mushero@stevemushero

Sucks more that it shows Console issues in all regions - painful if you're in Brazil or the Middle East and can't log in to do things due to US-East.

Geoff H@SQLCatt

We've been in yikes territory for 3 hours now.

David Chayer@david_chayer

Impact is increasing

Ricardo Kustner@rkustner

you've been in ops so you should know that getting insight into the impact of a large scale outage is not an easy task

marek kuczyński@marekq

True, but The Verge is picking this up at the same pace as the AWS Status Dashboard ;-)

Joe Emison@JoeEmison

Yes but Amazon has had many outages and should know that its status page is always filled with lies. If they cared at all about it, they would have a different treatment.

Ricardo Kustner@rkustner

from the article "some of the Amazon Web Services cloud servers"... so they just need to reboot a couple of servers, right? 😁

chamomillionaire@jgoldschrafe

If it was me, I would simply check whether each service was working for most people

DAKN@DAKNHH

Organizations is also not working

Martin Price@pull_gs

Wonder how those mainframe migrations are going

hugoShaka 💉💉@Hugo_Shaka

EKS cannot pull images from ECR and they have not acknowledged this yet

hugoShaka 💉💉@Hugo_Shaka

The initial incident report misled us as we thought it was our fault

mb@BrawnVivant

Yeah it's definitely not just that. Dynamo seems to be entirely down and a couple of our services are impacted but without access to cloudwatch we don't know exactly what's failing.

David E. Cross@dcrosstech

🤣

RJ@DickJim3

It was way more than that. EC2 was badly affected…hence a ton of their services wouldn’t function correctly.

Kevin Boyd@Beryllium9

Hopefully it doesn't AWS Snowball into something worse.

shindig99@brufkaki

Ahhhh... I see what you did there! :D

John Tipper@john_tipper

I’m not sure I’d agree with you. Amazon’s shopping cart appears to be showing errors too…

Ray Terrill@Rayterrill

It's not that mild/small. Pretty impactful. :(

ok.thisisfine.ok@brahimdagher

This is a major outage #awsdown. Customers are deeply impacted.

mb@BrawnVivant

Everything is a minor outage when your status page says everything is fine!

Tony Benbrahim@tonybenbrahim

dynamodb is not working, it is not just the console

Thomas Jones@ferricoxide

All day, couldn't get into Workspace. Our accounts are configured so they're ONLY accessible from Workspace. Even once I could log in to Workspace, I couldn't fetch STS tokens. :(

Bethany Quinn 💉💉💉@bequinning

“Mild/funny” hmm? Spoken like someone not trying to get white noise to play in the baby’s room for nap time 😛

Jesse Gardner@jesse_gardner

With Connect down in the region, all our call centers are completely down.

Diego Rivera@diriver63

Same

Vincent Janelle@randomfrequency

I knew it was a good idea to not get out of bed this morning

bw2tdrpj@bw2tdrpj

We’ll be using Blind to communicate for outage resolution before much longer

Definitely not a musician - My Cat@acedrew

The only time choosing Ohio pays off

Thomas Jones@ferricoxide

Assuming you're not using any services with hidden dependencies on us-east-1.

Definitely not a musician - My Cat@acedrew

I fixed that for you: assuming you're not using any services

Vikram Pillai@vikramkpillai

Wait, is that true? Is that on record somewhere? 🤨

arthur johnston@the_ajohnston

us-east-1 being down is like a snow day for programmers.

Thomas Jones@ferricoxide

Today's status report: "day lost due to AWS outage". Wife asked if this meant I couldn't bill for today.

Brad Walker@BradWalker743

For programmers, a Github outage is a snow day. They're mostly unaffected by cloud outages unless they're also on call.

Tim Yocum@tkyocum

You can bet your sweet NAT gateway not a single billing metric will be lost.

Josh Bartley@joshabartley

Local news thread says delivery drivers and warehouse are stopped too.

Michael Liebow@cloudDay_2

What's fascinating about an AWS outage, rare as they are, is that you hear about it immediately. Brings out all the vendors and analysts: see, see, told ya. Other clouds fail, not a peep. Begs the question: why are so many internet-critical apps only designed for one AZ?

Corey Quinn@QuinnyPig

Today appears to have transcended AZs and that’s scary.

Lukas Tribus@LukasTribus

AZs share the same network, and as the network was the issue, it affects an entire region. More scary to me: the fact that Amazon itself was impacted; apparently Amazon truck drivers were unable to deliver packets (you'd assume Amazon follows AWS best practices).

RichWhy@RichWhy

There’s multiple AZs in a region, but I guess that’s not good enough if a regional outage is possible.

Brian Clark@deepthoughts10

Time for them to consider multi-cloud ;-)

Adam Roach@adambroach

I can’t listen to my music, Corey.