Opinion I've been watching AWS explain away outages for the better part of a decade. And this is hard!
The most common thing computers do is break, and being forthcoming and transparent about that reality while not making your platform sound like an incoherent pile of bricks teetering on a cliff above a playground is a delicate balancing act. AWS's reliability is the stuff of legend, and on the rare occasion that it fails, they walk the messaging tightrope very well. So I was surprised to learn all you have to do to sweep away twenty years of excellence and make them sound like frothing insecure zealots is sprinkle a bit of "perhaps AWS is bad at AI" narrative on it. Then, they lose their minds.
The first sign that AWS has once again blown a gasket is when they post a defensive blog post strongly insinuating that a journalist at a major publication is an idiot. (I hope to one day receive one myself; perhaps I need to get louder?) Sure enough, a post titled Correcting the Financial Times report about AWS, Kiro, and AI, which does NOT show up on their list of blog posts, strikes that blended Amazon tone of "salty, defensive, and more than a little insulting."
Let's back up
Let me set the scene: Kiro launched in July 2025 as Amazon's answer to the agentic AI coding tools flooding the market. And to its credit, for a month or two it was great! It had a "spec" approach that was less "ready, fire, aim" than most other tooling at that point, and due to its surprise popularity, it was very hard to get access to it. As an aside, it hasn't changed much since then, and has been lapped by a number of competitors.
Piecing together Reddit posts, comments on the incident, and AWS's weirdly defensive blog post, what seems to have happened is this: someone was using the tool, and it fired off a CloudFormation teardown-and-replace (which is what CloudFormation often does, because … CloudFormation) while the user was mistakenly in a production environment.
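For those who haven't been bitten by this yet: many CloudFormation resource properties are immutable, and "updating" one means CloudFormation deletes the existing resource and creates a new one. A minimal sketch of how innocuous that looks in a template (the bucket names here are hypothetical, not from the incident):

```yaml
Resources:
  ReportsBucket:
    Type: AWS::S3::Bucket
    Properties:
      # Renaming a bucket looks like a one-line edit, but BucketName is
      # immutable. CloudFormation satisfies the update by REPLACING the
      # resource: create the new bucket, then delete the old one, along
      # with whatever was depending on it.
      BucketName: cost-reports-v2   # was: cost-reports-v1
```

Running `aws cloudformation create-change-set` and inspecting the result with `describe-change-set` surfaces this ahead of time (the change shows `Replacement: True`), which is exactly the kind of preview step an eager agent can skip.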
Whoops.
This took down Cost Explorer in the Mainland China partition. AWS goes to great pains to highlight that it was "only one of their 39 geographic regions" without mentioning that Cost Explorer is only deployed in one of those regions per partition (in this case, the partition was "Mainland China"). We have all been there. Let the engineer who has never experienced the "wait, am I in production?" sinking sense of dread cast the first stone.
But Amazon's official response reads like a hostage note written by someone protecting their captor. The incident was a "coincidence that AI tools were involved." The same issue could occur with "any developer tool." The engineer involved had "broader permissions than expected." Yes! This is what the messy business of building things looks like, and it's pretty clear that some controls were skipped in this case. Claude Code periodically likes to do that in my test environment as well, and is only hampered by the grim reality that even after being trained on the sum total of human knowledge, it still can't figure out how the hell the AWS CLI parameters and arguments work together. Neither can I. This is probably fine.
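"Broader permissions than expected" is the actual lesson here, and it's a fixable one. A sketch of the kind of guardrail that prevents this class of whoopsie, as an IAM deny statement; the `prod-*` naming convention is an assumption for illustration (note the `aws-cn` partition in the ARN, since this happened in Mainland China):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BlockStackTeardownInProd",
      "Effect": "Deny",
      "Action": [
        "cloudformation:DeleteStack",
        "cloudformation:ExecuteChangeSet"
      ],
      "Resource": "arn:aws-cn:cloudformation:*:*:stack/prod-*/*"
    }
  ]
}
```

An explicit deny wins over any allow, so even a role with otherwise generous permissions can't execute a destructive change set against production stacks without someone deliberately lifting the guardrail.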
But think about this "fists of ham" communications strategy for a second! Their AI is implicated in deleting production infrastructure, and their crisis communications team's first instinct was to find a human and hurl them under the closest bus. The tweet practically writes itself: "AI deletes the database. AWS spokesperson arrives at press conference: It was me! I did it! Don't blame the AI, I'm just incompetent!"
This isn't a coverup; it's a massive insecurity that's extremely cringey to witness. AWS would rather have the world believe their engineers are incompetent than admit their artificial intelligence made a mistake. That's not just a messaging choice. That's a company so desperate not to look behind in the AI race that they'd torch their own employees' reputations to protect their robot's feelings. What does it say about AWS's strategic position that defending the AI's reputation takes priority over protecting their humans? When did "don't hurt the algorithm's feelings" become corporate policy?
I challenge anyone to cite the Amazonian "Strive to be the Earth's Best Employer" leadership principle without sounding sarcastic. You can't; it's impossible.
The post-incident fix their blog post alludes to? Mandatory peer reviews for AI-generated changes. Let me translate: the solution to "AI made unsupervised changes that broke everything" is "add a human to supervise." The same humans they're laying off by the thousands. The same humans they'll throw under the bus when the next AI incident happens.
What actually matters
Things break. Code has bugs. AI will make mistakes. This is the natural order of building complex systems, and anyone who's been in this business longer than a funding cycle understands that. The problem isn't that Kiro decided production was due for a surprise deletion. The problem is that when faced with their first major AI failure, AWS's instinct wasn't transparency or accountability. It was to protect the AI's reputation at all costs.
If your cloud provider would rather look incompetent than admit its AI is fallible, sit with that for a second. Not because this particular outage was the end of the world. It wasn't. It's Cost Explorer, for God's sake; I spend meaningful chunks of my life with that service, and it being down for a few hours just means I'll do something else for a bit. But we are at the exact moment where every cloud vendor is asking you to hand agentic AI the keys to your production environment. When the first real test case showed up, AWS's communications instinct was to protect the robot and throw the human under the bus.
AWS will figure out AI eventually. They always do, even if "eventually" means half a decade of the community screaming into the void first. But they won't get there by pretending their tools can't make mistakes, and they definitely won't get there by publicly kneecapping their own engineers every time one does.
The company that built its empire on "everything fails all the time" has apparently found the one thing it refuses to let fail: the narrative that it's good at AI. ®