yandie a day ago

I was with Amazon but wasn't part of Alexa. I was working closely with the Alexa team however.

I remember vividly the challenge of building centralized infra for ML at Amazon: we had to align with our organization's "success metrics", our central team got ping-ponged around, and our goals constantly changed. This was exhausting when you're trying to build infra to support scientists across multiple organizations while your VP is saying the team isn't doing enough for his organization.

Sadly our team got disbanded eventually since Amazon just can't justify funding a team to build infra for their ML.

  • artyom 4 hours ago

    I was at Amazon but wasn't part of Alexa. I remember taking a look at their code. It was an endless spaghetti of if statements. The top one started like this

      if(tenant == "spotify") { ...
    
    and everything else was downhill from there.

    The rest of the description on how Amazon operates is quite accurate. Impossible for anyone to do anything meaningful anymore.
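
The anti-pattern described above is worth making concrete. A minimal hypothetical sketch (not Alexa's actual code; handler names invented) of the tenant if-chain versus the dispatch-table refactor that usually replaces it:

```python
def handle_spotify(request):
    return f"spotify: {request}"

def handle_default(request):
    return f"default: {request}"

# The if-chain style described in the comment above:
def route_ifchain(tenant, request):
    if tenant == "spotify":
        return handle_spotify(request)
    # ... hundreds more tenant branches ...
    return handle_default(request)

# Equivalent dispatch-table style: adding a tenant is one registry entry,
# not another branch in an ever-growing conditional.
HANDLERS = {"spotify": handle_spotify}

def route(tenant, request):
    return HANDLERS.get(tenant, handle_default)(request)
```

The two routers behave identically; the difference is that the registry version keeps per-tenant logic out of the routing code.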

  • giancarlostoro a day ago

    > Amazon just can't justify funding a team to build infra for their ML.

    Sounds like they didn't plan it out correctly. It should have been done in phases, one team at a time, starting with the Alexa team or whichever team would take the least effort as a test bed, while keeping the other teams informed in case they had suggestions or feedback for when their turn came along.

  • didip a day ago

    Isn't this a challenge for any big tech company? Success metrics tend to be attached to a particular product, and yet central infra is a necessary foundational block for everyone while lacking a specific success metric.

    • yandie a day ago

      We were working with folks from other central ML infra teams at other companies. Amazon is the one that doesn't have any funding for central ML infra (it does have central infra for software). Having interacted with many companies after leaving Amazon, I can say Amazon is just really bad in terms of investing in central ML.

      • foota a day ago

        Is this unique to ML, or does Amazon have an issue funding central infra?

        • Jensson 17 hours ago

          Amazon did AWS, so they can do central infra. Possibly the massive success of AWS makes them expect the same level of self funding for other infra projects even if those would be better served as internal only with internal funding.

  • laidoffamazon a day ago

    For me the big problem was how hard Hoverboard was to use - they built nicer tooling around it eventually but getting onboarded took weeks to months, getting GPUs usable enough to do LLM training would be nigh-impossible, an _extremely rigorous_ dedication to customer security meant transferring any data into and out of it for analysis was a major pain....

    I remember being in the office when GPT2 dropped and thinking the entire Alexa Engine/skill routing codebase became outmoded overnight. That didn't really happen, but now that MCP servers are so easy to build I'm surprised Alexa doesn't just use tool-calling (unless it does in Alexa+?)
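
For context on what "just use tool-calling" would look like here: a minimal hypothetical sketch in which a model names a tool and the host dispatches it, replacing hand-written skill routing. The tool names and the `call_llm` stub are invented for illustration; a real system would send the utterance plus tool schemas to an actual LLM (e.g. via MCP) rather than this hard-coded stand-in.

```python
import json

# Hypothetical "skills" exposed as callable tools.
def set_timer(minutes: int) -> str:
    return f"timer set for {minutes} minutes"

def play_music(artist: str) -> str:
    return f"playing {artist}"

TOOLS = {"set_timer": set_timer, "play_music": play_music}

def call_llm(utterance: str) -> str:
    # Stub standing in for a real model call: a real system would ask
    # an LLM to pick a tool and arguments given the utterance and the
    # tool schemas. Here we fake one response for illustration.
    return json.dumps({"tool": "set_timer", "args": {"minutes": 10}})

def handle(utterance: str) -> str:
    decision = json.loads(call_llm(utterance))
    return TOOLS[decision["tool"]](**decision["args"])
```

The point is that routing lives in the model's tool choice, not in a bespoke intent-matching codebase.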

PaulHoule a day ago

Some of it is the rapid progress in fundamental research.

If in Q2 2025 a company like AAPL or AMZN decides to invest in a current top-of-the-line neural network model and spends 18 months developing a product, whatever it develops might be obsolete by the time it is released. The same holds for OpenAI or any incumbent -- first-mover advantage may be neutralized.

Secondly, there are a lot of hard problems in ambient computing. Back in the early 00's I often talked with an HCI expert about ideas like "your phone (pre-iPhone) could know it is in your backpack" or "a camera captures an image of your whole room reflected in a mirror and 'knows what is going on'", and she would usually point out missing context that would make the result more like a corporation giving bad customer service than a loyal butler. Some of Alexa's problems are fundamental to what it is trying to do and won't improve with better models, which is part of why AMZN gave up on Alexa at one point.

  • mrweasel a day ago

    It's also not entirely clear what Alexa was supposed to do, nor Siri for that matter. Being a personal digital assistant turned out to be much less useful than many imagined, and being a voice-controlled Bluetooth speaker is mostly a gimmick outside the car or kitchen.

    That's not to say that Alexa and others can't be useful, but just not to enough people that it justifies the R&D cost.

    • cle a day ago

      Meanwhile multiple non-technical people that I know pay $20/mo to OpenAI and have long, verbal conversations with ChatGPT every day to learn new things, explore ideas, reflect, etc.

      This is obviously what voice assistants should do; the research was just not there. Amazon was unwilling to invest in the long-term research to make it a reality because of a myopic focus on easy-to-measure KPIs, even after pouring billions of dollars into Alexa. A catastrophic management failure.

      • mrweasel a day ago

        Are they talking to ChatGPT, or are they typing? More and more we're seeing that users don't even want to use a phone for phone calls, so maybe a voice interface really isn't the way to go.

        Edit: Oh, you wrote "verbal" that seems weird to me. Most people I know certainly don't want to talk to their devices.

        • ljf a day ago

          My wife paid for ChatGPT and is loving it - she only types to it so far (and sends it images and screenshots), but I've had a go at talking to it and it was much better than I thought.

          If I'm alone I don't mind talking if it is faster, but there is no way I'm talking to AI in the office or on the train (yet...)

          • throwaway314155 a day ago

            > If I'm alone I don't mind talking if it is faster

            When is talking faster than typing? I only ever use it when my hands are occupied (usually for looking up how to do things while playing a video game).

            • Retric a day ago

              When you can talk at your normal pace?

              People naturally talk at about 120-160 WPM; few can type that fast, which is why stenographers use a special keyboard and notation.

              • PaulHoule a day ago

                I feel tiredness in my throat when I talk to bots like Alexa as you have to enunciate in a special way to get across to them.

                • Retric a day ago

                  Sure, it definitely doesn't work for everyone. I think it's accent-dependent or something, as some people's natural voice comes across fine.

              • throwaway314155 a day ago

                I struggle to have naturally flowing conversation with an AI for much the same reason people don't use most of Siri's features - it's awkward and feels strange.

                As such I can maintain about five minutes of slow pace before giving up and typing. I have to believe others have similar experiences. But perhaps I'm an outlier.

        • conception a day ago

          I know quite a few folks who chat with the GPTs, especially while commuting in the car. There are also niche uses like language practice.

    • taeric a day ago

      I continue to be baffled that they are going to cannibalize the "voice controlled radio and timer" market in the chase of some "magic assistant" one.

      It would be one thing if they were just adding extra "smart home" features to connect new terminals. I can see benefit of some of the smart screen calendar and weather things. No, they seem dead set to completely kill what they had.

  • com2kid a day ago

    IMHO we are at the point now where ambient computing is fully possible using locally self hosted AI and LLMs.

    I explore the ideas more in a post at https://meanderingthoughts.hashnode.dev/lets-do-some-actual-... but the tl;dr is ~24-32GB of VRAM in a server shoved in a basement can do a lot. Imagine a machine that is fully owned by the user with no corporate spying. It can listen 24/7 to everything in the house, using technology like wifi location sensing it knows what room people are in (it is a thing now!) and can relay messages for a person to the closest speaker.

    Even 8B parameter LLMs are great at handling ambiguous inputs, and new TTS models are weighing in at under 1B parameters.

    Connecting the LLM up to everything a person wants to do is the real issue. HomeKit integrations exist, but HomeKit isn't exactly a mass-consumer technology. What RabbitOS aims to do is IMHO the proper path: drive an Android phone to accomplish tasks.
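
To make the "relay messages to the closest speaker" idea concrete, here is a tiny hypothetical sketch. The presence map (standing in for wifi location sensing) and the speaker IDs are invented for illustration; a real system would update these continuously from sensor data.

```python
# Hypothetical state: where wifi sensing last placed each person,
# and which speaker serves each room.
PRESENCE = {"alice": "kitchen", "bob": "office"}
SPEAKERS = {"kitchen": "speaker-kitchen", "office": "speaker-office"}

def deliver(person: str, message: str) -> str:
    """Relay a message to the speaker in the recipient's current room,
    falling back to a default speaker if the person isn't located."""
    room = PRESENCE.get(person)
    speaker = SPEAKERS.get(room, "speaker-default")
    return f"{speaker}: {message}"
```

Everything here can run on that basement server; no cloud round-trip is required.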

coredog64 a day ago

Not this article, but the one that it references:

> And most importantly, there was no immediate story for the team’s PM to make a promotion case through fixing this issue other than “it’s scientifically the right thing to do and could lead to better models for some other team.” No incentive meant no action taken.

Oof!

  • simplesimon890 a day ago

    Unfortunately it aligns with internal pressure. If you are not working on something that has clear quantifiable/promotional benefits that can be realized within 2-3 quarters, you are at risk of the PIP train. Having buy in from senior management can help but in a company that re-orgs regularly, your manager can change, so the risk is higher.

    • gloryjulio a day ago

      ^This. If this is only about less pay, lots of people would actually do it instead of chasing bonuses and promotions.

      But in a cut throat environment, you can't afford to not move the corporate metrics in quarterly reviews. Otherwise you will get pipped or fired

  • wnevets a day ago

    > No incentive meant no action taken.

    sounds like late stage capitalism

    • SoftTalker a day ago

      Sounds like simple poor management to me. If you're going to do R&D work or invent something you hope to be able to sell, you have to accept that value may not be realized immediately and there may be some paths explored that don't end up leading to anything.

      If you're going to sell T shirts, then sure, have quarterly goals.

      • wnevets a day ago

        > If you're going to sell T shirts, then sure, have quarterly goals.

        The issue provided in the example is about helping other teams out without an expectation of a reward, not about short term vs long term gains.

      • nine_zeros 18 hours ago

        Yep. Fundamentally, tech R&D requires management that will be ok with trial and error, and ambiguous timelines.

        The predictable unit production/sales way of management works for duplicatable, repeatable goods.

    • reliabilityguy a day ago

      > sounds like late stage capitalism

      Yes. Because with socialism and communism the management cares about everyone’s benefits and not their own skin?

awsthrowaway1 a day ago

(Throwaway account because I work at Amazon)

Everyone at Amazon is focused on AI right now. Internal and external demand for GPU resources and model access is off the charts. The company's trying to provide enough resources to do research, innovate, and improve business functions, while at the same time keeping AWS customers happy who want us to shut up and take their money so they can run their own GPUs. It's a hard problem to solve that all the hyperscalers share.

  • xigency a day ago

    I used to be at Amazon on the B2B retail side and at the tail end we got automated A.I. spamming our TT's with completely wrong summaries. Some great "progress." Internal search had similar "improvements" that tipped the balance of 'good enough' toward 'non-functional.'

  • carbocation a day ago

    For whatever it's worth, I have found the AI summary of customer reviews helpful.

  • belter a day ago

    > Everyone at Amazon is focused on AI right now.

    That explains the lack of progress on anything else on the other services....

taormina a day ago

Even if the word AI was not involved, this just sounds like Amazon's standard, well documented toxicity.

jnaina 20 hours ago

Worked at AWS in the past. This resonates. Lots of missed chances. For example, the Alexa for Automotive team initially impressed with a well-produced demo video that painted a compelling vision of in-car voice integration. However, that vision unfortunately never translated into real-world execution. Despite having a major opportunity to collaborate with one of Asia’s largest automotive manufacturers—an ideal partner to bring Alexa into the vehicle ecosystem—there was little to no follow-through. The opportunity was essentially ignored.

From my perspective, one of the core issues was cultural. The Alexa teams were often staffed by long-timers with substantial RSU grants, many of whom appeared more focused on preserving internal influence and career security than driving bold external partnerships or innovation. It indeed felt less like a team pushing the envelope, and more like a collection of fiefdoms guarding their territory.

In the end, it was a missed opportunity—not just for Alexa, but for Amazon to play a central role in the connected car revolution.

AnotherGoodName a day ago

I remember Google getting rid of a large part of its Assistant team at the same time Amazon laid off a large part of its Alexa team, ~18 months ago. You can even find the "is Amazon closing down Alexa?" rumours from that time.

It's pretty clear that LLMs with action hooks were going to take over from the old bespoke request->response methods, so I guess they were trying to make sure there was no old guard holding on during the changeover. Only now are the Alexa+ and Google Home Gemini integrations becoming available to pick up where they dropped off in 2023.

Apple had a few Siri layoffs but it seemed to keep the ship steady. It'll be interesting to see which was the better long term approach though.

  • bitpush a day ago

    > Apple had a few Siri layoffs but it seemed to keep the ship steady.

    I'm curious how you came to this conclusion against Assistant and Alexa teams. It isnt like Assistant & Alexa shutdown completely, and Siri was uniquely left alone.

    Siri's missteps are well documented. If anything, from the quality & speed with which the product is evolving, it seems like Siri might be more resource constrained.

alex-mohr a day ago

And you could write a similar blog post about why Google "failed" at AI productization (at least as of a year ago). For some of the same and some completely different reasons.

  - two competing orgs via Brain and DeepMind.

  - members of those orgs were promoted based on ...? Whatever it was, it wasn't for developing consumer or enterprise products, and definitely not for cloud.

  - Nvidia is a Very Big Market Cap company based on selling AI accelerators. Google sells USB Coral sticks. And rents accelerators via Cloud. But somehow those are not valued at Very Big Market Cap.

Of course, they're fixing some of those problems: Brain and DeepMind merged, and Gemini 2.5 Pro is a very credible frontier model. But it's also a cautionary tale about unfettered research focus insufficiently grounded in customer focus.

  • keeda a day ago

    > But it's also a cautionary tale about unfettered research focus insufficiently grounded in customer focus.

    I got the exact opposite takeaway: despite Amazon and Google being pioneers in related areas, both failed to capitalize on their headstarts and kickstart the modern AI revolution because they were hobbled by being grounded in customer focus.

BryanLegend a day ago

The new Alexa+ is super great, if you've been invited to it.

The voice recognition is at a whole nother level, much much faster. Controlling lights is easily an entire second faster.

The TTS upgrade is a trip. She sounds younger and speaks faster.

  • diggan a day ago

    > Controlling lights is easily an entire second faster

    How long did it use to take vs. how long does it take now? I'm not sure whether "an entire second faster" is sarcasm, a big improvement, or something else.

    • BryanLegend a day ago

      There used to be a second or two between finishing a command and it getting executed. We easily use Alexa 20x a day at my house. Announcements, kitchen timers, lights & music.

      I think the voice recognition is async now. It's streaming the data to a model. Before it would wait until the command was finished then send the .wav file off to a model.
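
A toy model of why streaming recognition feels faster than the batch approach described above (the numbers are invented for illustration; real ASR pipelines are far more involved):

```python
def batch_latency(utterance_s: float, process_rate: float = 0.3) -> float:
    # Batch: transcription starts only after the full utterance is
    # recorded, so the user waits while all of it is processed.
    return utterance_s * process_rate

def streaming_latency(utterance_s: float, chunk_s: float = 0.2,
                      process_rate: float = 0.3) -> float:
    # Streaming: audio chunks are processed while the user is still
    # speaking, so only the final chunk's processing remains at the end.
    return chunk_s * process_rate
```

The perceived wait after you stop talking shrinks from "proportional to the whole utterance" to "proportional to one chunk", which matches the roughly one-second improvement described.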

  • WalterBright a day ago

    > Controlling lights is easily an entire second faster

    I just use the Clapper from the 1970s.

    https://www.amazon.com/Clapper-Activated-Detection-Appliance...

    • BryanLegend a day ago

      We easily use Alexa 20x a day at my house. Announcements, kitchen timers, lights & music. Even checking store hours or weather.

      Replaces a phone in many cases.

      • WalterBright a day ago

        More like the phone replaces Alexa, as Alexa cannot do phone calls.

        • BryanLegend 21 hours ago

          Alexa can do calls. We usually only use that to find a lost phone.

    • otterley a day ago

      That works great if you're in the room, but not so much if you're not home (or aren't even in the same room), or want to control landscape lighting and want it to automatically adapt to the seasons.

      • WalterBright a day ago

        1. I have no trouble at all hitting the light switch when I enter a room, and hitting it again when I leave. Even when the power goes out for a few days, I still reflexively hit the switch.

        2. I tried landscape lighting once. The local fauna chewed it to bits. I decided I didn't need it.

        3. I don't need to control the lights when I'm not home.

        4. I have seriously no need to light according to the seasons.

        Sure, a more automated system would be great for disabled people. But I'm not disabled, and intend to lift my sorry heiny out of the chair as long as I am able to.

  • nsonha a day ago

    Is it "ChatGPT in a box" level yet? If not, why has that not been a thing?

    • BryanLegend a day ago

      It is. It will make up a story if you ask it to, and it answers questions with its own knowledge now instead of using the Amazon answers site.

    • PartiallyTyped a day ago

      The problem with ChatGPT-in-a-box and similar is that they are not made for live interactions. If you've tried ChatGPT or Claude on your phone with voice conversations, you will see that it takes a while to think.

      Humans on the other hand start processing the moment there's a response and will [usually] respond immediately without "thinking", or if they are thinking, they will say as much, but still respond quite quickly.

      • nsonha 19 hours ago

        Not unlike voice mode on the ChatGPT mobile app, pretty responsive to me. Existing voice assistants do not set a high bar.

      • throwaway314155 a day ago

        Surely whatever solution they came up with amounts to "ChatGPT in a box" (using an llm with fewer parameters for speed).

mensetmanusman a day ago

A good CEO would have forced Alexa to become the leading conversational AI in the world once GPT-4 dropped.

  • joot82 a day ago

    Sure, long term that would probably drive or lock a lot of customers to them and make them not bother about ChatGPT anymore. However, Amazon might have a lot of compute, but I'm not sure if it's enough to enable every Alexa speaker with state of the art conversational AI. OpenAI and Google constantly have to screw with the rate limits to satisfy demand.

ywxdcgnz a day ago

Does anyone know what we learned? It looks like the second problem (decentralization) is the antidote to the first (centralization), which is rather amusing.

Probably one of those things that are actually perfectly fine trade-offs and operational challenges, and in no way causal to the demise. If Alexa had found a market, the same article would probably be called "AI at Amazon: a case study of <insert management buzzword>" and explain to us how the same processes paved the path to success.

mwkaufma a day ago

Indeed, how does a "customer obsessed" organization cram features down their customers' throats that they don't want?

  • gilmore606 a day ago

    They're customer-obsessed the way Dracula is a neck enthusiast.

dangus a day ago

The elephant in the room is that Amazon sucks to work at.

Every ex-Amazon employee I’ve worked with has talked about the burnout culture.

They under-compensate compared to their peers and as this article touches on with the discussion about customer focus, their corporate culture is the most draconian and abnormal.

I did an early stage Amazon interview and they basically wanted me to memorize every detail of their company culture and relate every single one of those aspects to a piece of work I did in the past. They wanted me to demonstrate that I had essentially joined their cult before I even had the opportunity to join it!

I have no idea how someone is supposed to honestly complete that interview process without outright lying.

  • captain_coffee 3 hours ago

    "Amazon sucks to work at" is the conclusion I reached about 8 years ago, after reading testimonial after testimonial from current or former employees at the company.

adolph a day ago

    This introduced an almost Darwinian flavor to org dynamics where teams 
    scrambled to get their work done to avoid getting reorged and subsumed into 
    a competing team.
    
To the extent that an organization is so wealthy and vast that it can fund redundant efforts, isn't getting reorged into the "winning" team a good thing?

  • Symmetry a day ago

    If a successful research program would take longer than it would take to be absorbed that can be a problem.

  • saratogacx a day ago

    If you are at Amazon and you get re-org'd into a winning team, then unless you are a clear-cut rock star, that team will use the absorbed bodies as the sacrifice once review season comes.

  • SpicyLemonZest a day ago

    It's not the worst thing in the world, but it's usually much better for your career to have other people reorged into your team. Even if the reorg doesn't reset promotion progress for everyone who moves (which it often does!) that way you get to demonstrate leadership ramping the new folks up on how things work in your neck of the woods.

aaroninsf a day ago

There is a small pleasure to be had in the fact that the relentless descent into dystopian surveillance capitalism was apparently momentarily retarded by such venal banalities as employee recognition and compensation schemes having entirely devolved at FAANG into "career hacking" gimmicks.

aaroninsf a day ago

What's most striking to me in this article is how perfectly this sums up the ongoing collapse of America's political system and civil society:

"In the paper Basic Patterns in How Adaptive Systems Fail, the researchers David Woods and Matthieu Branlat note that brittle systems tend to suffer from the following three patterns:

- Decompensation: exhausting capacity to adapt as challenges cascade

- Working at cross-purposes: behavior that is locally adaptive but globally maladaptive

- Getting stuck in outdated behaviors: the world changes but the system remains stuck in what were previously adaptive strategies (over-relying on past successes)"

Painfully apt.

  • mensetmanusman a day ago

    In the context of "demographics is destiny," replace America with "world", as every country is dropping the ball on confronting its ongoing collapse.

    • gsf_emergency 16 hours ago

      Not making babies requires only working at cross-purposes :)

      For the other two, it's not clear whether the symptoms are universal across economies.