Amazon Broke Its Own Site with AI Code

This Is What Your Team Should Fix Before It Happens to You.

Amazon had a six-hour outage on March 5. Checkout down. Prices not loading. Orders disappearing. Tens of thousands of users hitting the wall at the exact moment they tried to buy something.

The cause, according to its own internal memo: “a defective code deployment.” The context, according to the Financial Times: a pattern of incidents with a “high blast radius” linked to “GenAI-assisted changes.”

This is not a story about how dangerous AI is. It is a story about what happens when adoption outpaces governance — and it is a story that could repeat itself at your company if you are not paying attention.


What Actually Happened at Amazon

Amazon’s ecommerce outage was the most visible incident, but it was not the first. According to internal documents obtained by the FT, there was an “incident trend” over months — including two separate AWS outages in late 2025, where engineers allowed Kiro, Amazon’s AI coding tool, to make changes without oversight. In one case, the tool deleted and recreated an entire AWS environment, causing a 13-hour outage in a China region.

Dave Treadwell, SVP of Amazon ecommerce — the same executive who in November 2025 ordered 80% weekly use of Kiro across all engineering teams — called engineers to a normally optional meeting and made attendance mandatory. The briefing note described the problems as “novel use of GenAI for which best practices and safeguards are not yet fully established.”

The announced measure: junior and mid-level engineers must get approval from a senior engineer on any AI-assisted code change before it reaches production.

Amazon publicly pushed back on some of the details. A spokesperson said the meeting was “part of normal business,” that only one incident was AI-related, and that no AI-written code was involved. The internal documents suggest otherwise.


The Real Problem Is Not AI. It Is Speed Asymmetry.

The most important observation to come out of this entire situation is buried in a dev community post:

“The problem is not that AI writes broken code. It is that AI writes code that seems plausible and passes a quick review.”

That is it. That is the entire problem.

When a developer writes code, the pace is slow enough that review processes — pull requests, code review, staging, QA — can more or less keep up. When an AI assistant generates code 10 times faster, or when you give an agentic tool broad access to your production systems, existing review processes become the bottleneck. And teams bypass them.

Amazon’s existing protocol for Kiro required two-person approval for production changes. In practice, that safeguard was bypassed or not enforced. The tool moved faster than the process.


The Problem with the Kiro Mandate

There is a second layer here worth naming directly: what happens when adoption is dictated from above without the governance infrastructure to sustain it.

Treadwell pushed for 80% weekly use of Kiro across engineering teams. Approximately 1,500 engineers protested through internal forums, arguing that external tools like Claude Code performed better on key tasks like multi-language refactoring. The mandate went through anyway.

When you mandate a tool before the team has established best practices, before safeguards are built, before there is shared understanding of what the tool should and should not touch — you are creating exactly the conditions for high-impact incidents.

The briefing note says it clearly: “novel use of GenAI for which best practices and safeguards are not yet fully established.” That is not an AI problem. It is a process problem. Amazon built the speed before building the controls.


What Every Development Team Should Review Right Now

This is not a cautionary tale about not using AI coding tools. I use them. You probably do too. The question is whether your team has thought through the governance layer.

These are things worth auditing today:

1. Does your review pipeline treat AI-assisted code differently?

There is a reasonable argument that AI-generated code deserves higher scrutiny — not just because it can be subtly wrong, but because it can be confidently wrong in ways that look clean. If your current process does not differentiate, start by labeling PRs that contain a significant amount of AI-generated code.
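If you want that label to be enforceable rather than honorary, a small CI gate can check it. Here is a minimal sketch in Python against the GitHub REST API; the "ai-assisted" and "ai-reviewed" label names are conventions invented for this example, and PR_NUMBER is assumed to be exported by your CI job:

```python
"""CI gate: block AI-assisted PRs that skipped the extra review step.

A sketch, not a drop-in: the label names are team conventions, not
anything GitHub defines.
"""
import os
import sys

import requests

GITHUB_API = "https://api.github.com"


def pr_labels(repo: str, pr_number: int, token: str) -> set[str]:
    # In the GitHub REST API, a PR's labels live on its underlying issue.
    resp = requests.get(
        f"{GITHUB_API}/repos/{repo}/issues/{pr_number}/labels",
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    resp.raise_for_status()
    return {label["name"] for label in resp.json()}


if __name__ == "__main__":
    repo = os.environ["GITHUB_REPOSITORY"]   # e.g. "acme/storefront"
    number = int(os.environ["PR_NUMBER"])    # exported by your CI job
    labels = pr_labels(repo, number, os.environ["GITHUB_TOKEN"])

    if "ai-assisted" in labels and "ai-reviewed" not in labels:
        print("AI-assisted change without the extra review sign-off; blocking.")
        sys.exit(1)
    print("Label check passed.")
```

The gate is deliberately trivial: the hard part is the team habit of applying the label honestly, not the enforcement code.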

2. Do your security controls hold up when velocity increases?

The two-person rule exists for a reason. Staging environments exist for a reason. Integration tests exist for a reason. If you are using AI tools to ship faster, make sure you have not quietly deprioritized those controls in the name of speed. Amazon had the rules. The rules simply were not enforced.
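The cheapest way to keep the two-person rule from eroding is to move it out of the wiki and into configuration that cannot be skipped under deadline pressure. A sketch using GitHub's branch protection endpoint, with placeholder repo and branch names:

```python
"""Make the two-person rule a machine-enforced setting, not a team norm.

A sketch against GitHub's branch protection API; REPO is a placeholder.
"""
import os

import requests

REPO = "acme/storefront"  # placeholder
BRANCH = "main"

protection = {
    # Two approvals before anything merges, including changes an AI tool
    # authored and a human rubber-stamped.
    "required_pull_request_reviews": {
        "required_approving_review_count": 2,
        "dismiss_stale_reviews": True,  # new commits invalidate old approvals
    },
    "enforce_admins": True,  # no bypass for senior people in a hurry
    "required_status_checks": None,
    "restrictions": None,
}

resp = requests.put(
    f"https://api.github.com/repos/{REPO}/branches/{BRANCH}/protection",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json=protection,
    timeout=10,
)
resp.raise_for_status()
print(f"Branch protection applied to {REPO}@{BRANCH}")
```

Amazon had the equivalent of this rule on paper. The difference is that a setting like enforce_admins does not care how busy the team is.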

3. Is anyone thinking about blast radius?

When an AI agent has broad permissions — read/write across the entire codebase, access to deployment pipelines, ability to run migrations — a single misguided action can have cascading effects. The question is not whether the AI will make a mistake (it will, eventually), but whether your setup limits the damage when it does. The principle of least privilege applies to AI tools the same way it applies to service accounts.
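In practice, that means the credentials an agentic tool runs under should look like your narrowest service account, not like a developer's. A minimal sketch with boto3; the bucket, prefix, and policy names are all hypothetical:

```python
"""Least-privilege policy for an AI agent's cloud credentials.

A sketch: read-only access to one artifact prefix, plus an explicit deny
on destructive infrastructure calls. All names here are hypothetical.
"""
import json

import boto3

AGENT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # The agent can read the artifacts it needs to do its job...
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::acme-build-artifacts",
                "arn:aws:s3:::acme-build-artifacts/agent-workspace/*",
            ],
        },
        {
            # ...and is explicitly denied the calls that delete or recreate
            # environments, the failure mode behind the 13-hour outage above.
            "Effect": "Deny",
            "Action": [
                "cloudformation:DeleteStack",
                "ec2:TerminateInstances",
                "rds:DeleteDBInstance",
            ],
            "Resource": "*",
        },
    ],
}

iam = boto3.client("iam")
iam.create_policy(
    PolicyName="ai-agent-least-privilege",
    PolicyDocument=json.dumps(AGENT_POLICY),
)
```

An explicit deny wins over any allow the agent might pick up elsewhere, which is exactly the property you want when the caller is a tool that occasionally decides to recreate an environment.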

4. Are you mandating adoption or enabling it?

This point is for engineering leaders. There is a significant difference between giving your team access to good tools and the conditions to succeed with them, and setting a usage target and calling it strategy. The first builds competence. The second builds pressure to skip process.


Senior Approval Is Not a Long-Term Solution

Amazon’s measure — requiring senior approval for AI-assisted changes — is a reasonable short-term response. It is also incomplete.

It creates a bottleneck exactly in the wrong place. Senior engineers are already the scarcest resource on most teams. Routing every AI-assisted PR through them erases the speed gains that were the whole point of adopting AI tools. And it assumes senior engineers can reliably catch bugs introduced by AI in review — which is not guaranteed, particularly when the code looks clean but fails in production at scale.

The right long-term direction is to build a process that matches the pace of the tools: automated test controls that catch AI-generated regressions, staging environments with real production-scale traffic, clear permission boundaries for agentic tools, and team competence to review AI output rather than simply accept it.
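To make that concrete, one such control is a promotion gate: a change only moves from staging to production if the canary's error rate stays within tolerance of the baseline, regardless of who or what wrote the diff. A sketch, with a hypothetical metrics endpoint standing in for your observability stack:

```python
"""Promotion gate: the canary must not regress against baseline.

A sketch of the shape such a control takes; the metrics URL and the
response format are hypothetical stand-ins for your own stack.
"""
import sys

import requests

METRICS_URL = "https://metrics.internal.example/api/error-rate"  # hypothetical


def error_rate(deployment: str) -> float:
    # Assumed to return a JSON body like {"rate": 0.0012}.
    resp = requests.get(METRICS_URL, params={"deployment": deployment}, timeout=10)
    resp.raise_for_status()
    return resp.json()["rate"]


def gate(max_relative_regression: float = 0.10) -> bool:
    baseline = error_rate("production")
    canary = error_rate("canary")
    # Allow at most a 10% relative increase in error rate over baseline.
    allowed = baseline * (1 + max_relative_regression)
    print(f"baseline={baseline:.4%} canary={canary:.4%} allowed={allowed:.4%}")
    return canary <= allowed


if __name__ == "__main__":
    sys.exit(0 if gate() else 1)
```

A gate like this scales with tool speed in a way senior review does not: it costs the same whether the team ships five PRs a week or fifty.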

Senior approval is a patch. Building a governance layer is the solution.


The Industry Will Learn This. The Question Is How.

Amazon is not the only company running into this. It is simply the most visible example, because its outages affect tens of millions of users and get covered by the Financial Times. Every engineering team adopting AI tools right now is running the same experiment at its own scale. Most won’t have a six-hour public outage that forces reflection. Some will have quieter versions of the same problem — bugs that shipped, incidents that were explained away, speed gains that came with reliability costs nobody officially measured.

The teams that come out best positioned from this period won’t be the ones that moved fastest. They’ll be the ones that moved with judgment — building governance alongside adoption, rather than rushing to bolt it on retroactively after the blast radius hits.

Amazon had to learn it with a $100M+ ecommerce platform and a mandatory all-hands meeting. You have the option to learn it more cheaply.