Engineering Economics

AI Token Spend Is Getting Out of Control

OpenClaw burns $1.3 million a month on tokens. Uber torched its 2026 AI budget in four months. Anthropic spends more on AWS than it makes in revenue. The math behind the new normal, and where it leaves the rest of us.

Yacine Kahlerras, Software Engineer, Platform & UX at TurboDocx
May 22, 2026 · 11 min read

TL;DR

The short version

After reading every spend-disclosure story I could find from the last six months.

  • 01

    Token spend is no longer a story about API line items. It is a story about whole engineering budgets, whole quarters, and whole company P&Ls. Uber blew its yearly AI budget in four months. OpenClaw spends more on tokens in 30 days than most engineering orgs spend on infrastructure in a year.

  • 02

    The model vendors are not making this back. Anthropic spent about $2.66B on AWS in the first nine months of 2025, more than its estimated revenue for the same period. OpenAI is reportedly spending $2.25 for every $1 it earns. Microsoft's filings imply an OpenAI loss of roughly $12B in a single quarter.

  • 03

    Whether this is the next dotcom bubble or the new normal is genuinely unclear. What is clear is that the engineering teams who do not get a grip on token spend right now are about to have a very expensive conversation with finance.

The OpenClaw experiment

Peter Steinberger, the founder of the open-source project OpenClaw, posted a screenshot in May 2026 that turned a lot of heads.[1] His three-person team had an OpenAI API bill of $1.3M over 30 days, covering 603 billion tokens and 7.6 million requests, almost all of it on GPT-5.5.[2]
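
To put those headline numbers on a per-unit footing, here is a small sketch that divides the bill by the token and request counts quoted above. The function and variable names are my own, and the result is a blended rate across whatever mix of input, output, and cached tokens the team used. It is not a quote from any OpenAI price list.

```python
# Figures as reported for OpenClaw's 30-day OpenAI bill.
BILL_USD = 1_300_000
TOKENS = 603_000_000_000
REQUESTS = 7_600_000


def unit_costs(bill_usd: float, tokens: int, requests: int) -> dict[str, float]:
    """Derive blended unit costs from a bill, a token count, and a request count."""
    return {
        "usd_per_million_tokens": bill_usd / (tokens / 1_000_000),
        "usd_per_request": bill_usd / requests,
        "tokens_per_request": tokens / requests,
    }


costs = unit_costs(BILL_USD, TOKENS, REQUESTS)
for name, value in costs.items():
    print(f"{name}: {value:,.2f}")
```

On these figures the blended rate comes to a little over $2 per million tokens, about 17 cents per request, and roughly 79,000 tokens per request. The last number is the interesting one: these are long-context agent calls, not chat messages.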

The setup is striking. They keep roughly 100 Codex instances running in the cloud. The agents review PRs, find security holes in commits, deduplicate issues, and write fixes. Some agents open new PRs based on the project vision. Others monitor benchmarks and report regressions in Discord. Agents even listen in on team meetings and start PRs for features the humans discussed.[1]

Steinberger framed it as a research question. He said he is exploring how software would be built if token costs did not matter at all. He noted that disabling Fast Mode would cut the bill by about 70%, which would bring it down to a more pedestrian figure somewhere between $300K and $400K (70% off $1.3M is roughly $390K).[3] On the ROI question, he answered “pretty high,” pointing out that everything they build is open source and works against multiple model providers.

There is one detail buried in the story that matters more than the headline number. Steinberger joined OpenAI in February 2026. OpenAI is paying the bill.[4] It is not a small business burning through a Series A. It is an internal research project sponsored by the lab that sells the tokens. The interesting question is not whether $1.3M a month is sustainable for OpenClaw. It is whether the experiment shows anything that generalizes to teams who have to pay their own way.

Uber, four months, gone

The OpenClaw story is interesting because it is voluntary. The Uber story is interesting because it is not. According to reporting from The Information, with downstream coverage in Briefs and AI Magazine, Uber spent its entire 2026 AI budget by April. Four months in. Claude Code was the dominant driver, with Cursor a secondary tool whose adoption plateaued.[5]

The shape of the curve is worth pausing on. Uber started rolling out Claude Code in December 2025.[6] By March, 84% of Uber engineers were on the tool. By the time the budget ran dry in April, around 95% of Uber engineers were using AI tools monthly and an estimated 70% of committed code was originating from AI in some form.[5]

Heavy users were running up between $500 and $2,000 a month in API costs. The average across the engineering org settled in around $150 to $250 per engineer per month.[6] Multiply that across the org and the line item moves fast. Uber CTO Praveen Neppalli Naga reportedly said, “The budget I thought I would need is blown away already.”[5]
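
A rough sketch of that multiplication, using the reported per-engineer averages. The headcounts are placeholders I picked for illustration, not Uber's actual engineering headcount, and the function name is my own.

```python
def annual_spend_range(
    engineers: int, low_per_month: float, high_per_month: float
) -> tuple[float, float]:
    """Return (low, high) annual spend in USD for a per-engineer monthly range."""
    return engineers * low_per_month * 12, engineers * high_per_month * 12


# Per-engineer averages reported for Uber: $150 to $250 a month.
AVG_LOW, AVG_HIGH = 150, 250

# ASSUMPTION: illustrative headcounts only, not Uber's real numbers.
for engineers in (500, 2_000, 5_000):
    low, high = annual_spend_range(engineers, AVG_LOW, AVG_HIGH)
    print(f"{engineers:>5} engineers: ${low:,.0f} to ${high:,.0f} per year")
```

At a hypothetical 5,000 engineers, the average alone lands between $9M and $15M a year, before counting the heavy users at $500 to $2,000 a month.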

The pattern is not unique to Uber. It is what happens any time a useful, addictive, per-token-priced tool gets put in front of every engineer with no enforced budget. Adoption curves are convex. Bills follow the same shape.

The scoreboard

Five datapoints from the last six months. Together they tell a story that is bigger than any one company.

  1. OpenClaw (Peter Steinberger)

    $1.3M per month

    603 billion tokens, 7.6 million requests, 100 Codex instances, three-person team. OpenAI is picking up the bill.

  2. Uber engineering

    Entire 2026 budget in 4 months

    Heavy Claude Code users were running $500 to $2,000 per month. 95% of engineers on AI tools monthly by the time the budget hit zero. Source: The Information, April 2026.

  3. Anthropic

    $2.66B on AWS in 9 months

    Monthly AWS bill grew from $188M in January 2025 to $519M by September. The nine-month total is more than the company is estimated to have made in revenue over the same period.

  4. Cursor (Anysphere)

    $70M on AWS in 9 months

    Jumped from $1.5M in January 2025 to $13M by September. Shipped Composer to claw back margins; still loses money on individual Pro subs.

  5. OpenAI itself

    $14.1B projected for 2026

    Inference cost alone. Deutsche Bank pegs OpenAI at $2.25 spent per $1 of revenue. Microsoft's filings imply a roughly $12B OpenAI loss in a single quarter.

Tokenmaxxing as a productivity metric

Somewhere in the last six months, large tech companies started keeping leaderboards of who burns the most tokens. Kevin Roose first covered the tokenmaxxing trend in the New York Times in March 2026,[7] and The Information broke the specific Meta “Claudeonomics” dashboard story in April.[8] The employee-built dashboard tracked token consumption across 85,000+ Meta employees. The top performer at the time of the report had averaged 281 billion tokens over 30 days. The internal title for the top of the chart is “Token Legend.”[9]

OpenAI employees reportedly compete on informal token-consumption leaderboards as well, with one engineer logging 210 billion tokens in a single week.[7] Nvidia's Jensen Huang told the All-In podcast at GTC 2026 that he would be “deeply alarmed” by any engineer on a $500,000 salary who did not consume at least $250,000 in tokens a year.[10]

Not everyone is sold. Jon Chu at Khosla Ventures called it “an absolutely stupid policy” on X, pointing to reports of Meta engineers writing bots that burn tokens on autopilot to climb the leaderboard.[11] The signal-to-noise problem in tokenmaxxing is obvious to anyone who has ever measured a developer team by lines of code.

Faros AI's April 2026 study, which TechCrunch reported on, is the most useful pushback I have read.[12] Across 22,000 developers in 4,000 teams, the data showed real gains and real costs side by side.

Metric | Change | What it means
Task completion | +34% | Developers ship more discrete tasks per week with AI assistants.
Epics shipped | +66% | Larger units of work make it through the pipeline faster.
Bugs introduced | +54% | Bug volume is up at roughly the same rate as throughput.
Median review time | 5x slower | Reviewers spend longer per PR. Some of the productivity is being borrowed from reviewers.

Throughput goes up. So do bug volume and code churn. Median review time gets five times longer. If you only measure tokens consumed, you do not see the second half. If your performance review system rewards tokens consumed, you actively select for the wrong thing.

Two sides of the argument

I read a lot of takes putting this together. The defenders and the skeptics tend to talk past each other, so it helps to put their quotes next to each other.

Defenders

"Exploring how software would be built if token costs did not matter."

Peter Steinberger, OpenClaw

On running 100 agents at $1.3M a month. Bill is paid by OpenAI.

"If an engineer on a $500,000 salary did not consume at least $250,000 worth of tokens, I am going to be deeply alarmed."

Jensen Huang, Nvidia

On the All-In podcast. The CEO framing of tokenmaxxing.

Skeptics

"AI is not designed to solve the complex problems that would justify the costs."

Jim Covello, Goldman Sachs

Head of Global Equity Research. The bear case from inside Wall Street.

"Absolutely stupid."

Jon Chu, Khosla Ventures

On engineers building bots to burn tokens on autopilot for the Meta leaderboard.

Why the vendors are not getting rich

Here is the part that surprised me. The conventional read on a hype cycle is that the picks-and-shovels companies make a fortune while the buyers lose their shirts. The token economy is not behaving that way. Both sides are bleeding.

Ed Zitron published a numbers-heavy piece in October 2025 that has been holding up well.[13] According to his reporting, Anthropic spent $1.36B on AWS in 2024 against an estimated $600M of revenue. Through the first nine months of 2025, the company spent $2.66B on AWS against an estimated $2.55B of revenue. Its monthly AWS bill went from $188M in January 2025 to $519M in September. None of this includes Google Cloud, which Anthropic also uses.

OpenAI's picture is structurally the same. Deutsche Bank has been cited as estimating the company spends $2.25 for every $1 of revenue.[14] Microsoft, which held the largest external stake, absorbed a roughly $4.1B equity-method loss in its FY26 Q1 filing, implying an OpenAI quarterly loss of approximately $12B when grossed up by Microsoft's then ~32.5% ownership.[15] Inference cost alone is projected at $14.1B for 2026.[16] Sam Altman has publicly acknowledged that the $200 per month ChatGPT Pro tier loses money on heavy users.[17]
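
The ratios behind the last two paragraphs are simple enough to check by hand. The sketch below does the division on the cited figures; the function names are mine, and the Anthropic ratios cover AWS spend only, so they understate total cost.

```python
def spend_per_revenue_dollar(spend: float, revenue: float) -> float:
    """Dollars spent for every dollar of revenue."""
    return spend / revenue


def gross_up(investor_share_of_loss: float, ownership: float) -> float:
    """Scale an investor's equity-method loss up to the investee's implied total loss."""
    return investor_share_of_loss / ownership


# All dollar inputs in billions of USD, taken from the figures cited above.
anthropic_2024 = spend_per_revenue_dollar(1.36, 0.60)
anthropic_2025_9mo = spend_per_revenue_dollar(2.66, 2.55)
implied_openai_loss = gross_up(4.1, 0.325)

# Monthly AWS bill in millions of USD: September 2025 vs January 2025.
aws_bill_growth = 519 / 188

print(f"Anthropic 2024, AWS per revenue dollar: {anthropic_2024:.2f}")
print(f"Anthropic 2025 (9 months), AWS per revenue dollar: {anthropic_2025_9mo:.2f}")
print(f"Anthropic monthly AWS bill growth, Jan to Sep: {aws_bill_growth:.2f}x")
print(f"Implied OpenAI quarterly loss: ${implied_openai_loss:.1f}B")
```

The gross-up comes out nearer $12.6B than $12B, which is consistent with the rounded figure in the coverage. The Anthropic ratio improved a lot between 2024 and 2025, but AWS alone still ate slightly more than every revenue dollar.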

If you take the disclosures at face value, the model vendors are not making a profit on each request you send. They are charging less than the request costs them to run. That is not a sustainable equilibrium. Either prices go up, the models get cheaper to serve, or some of the businesses on the supply side end up restructured.

The implication for buyers is uncomfortable. The cheap-API era is being subsidized. The current per-token price is not the equilibrium price. If you build a product whose unit economics only work at today's API rates, you are exposed to a price hike you do not control.
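
One way to size that exposure is to ask how much the token price can rise before a feature stops paying for itself. This is a minimal sketch with made-up unit economics: the $2.00 price, $0.60 token cost, and $0.40 of other costs are assumptions for illustration, and the function names are my own.

```python
def gross_margin(
    price: float, token_cost: float, other_cost: float, token_price_multiplier: float = 1.0
) -> float:
    """Gross margin per task if token prices are scaled by the given multiplier."""
    cost = token_cost * token_price_multiplier + other_cost
    return (price - cost) / price


def breakeven_multiplier(price: float, token_cost: float, other_cost: float) -> float:
    """Token price multiplier at which gross margin per task reaches zero."""
    return (price - other_cost) / token_cost


# ASSUMPTION: hypothetical per-task numbers, in USD.
PRICE, TOKEN_COST, OTHER_COST = 2.00, 0.60, 0.40

for multiplier in (1.0, 1.5, 2.0, 2.5):
    margin = gross_margin(PRICE, TOKEN_COST, OTHER_COST, multiplier)
    print(f"token price x{multiplier}: gross margin {margin:.0%}")

print(f"breakeven at x{breakeven_multiplier(PRICE, TOKEN_COST, OTHER_COST):.2f}")
```

With these placeholder numbers a 50% margin today drops to 20% if token prices double, and the feature goes underwater a little past 2.6x. The point is not the specific values but having the breakeven number written down before the vendor changes the price.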

What to actually do about it

I do not think the answer is “use less AI.” The tools are real, the productivity gains are real, and the teams that step back will get out-shipped by the teams that learn to use them well. The answer is to treat tokens the way a mature engineering org treats every other unit of consumption: as a budget item with a dashboard and an owner.

# A pragmatic checklist before you ship an AI workflow to production
# Not a config. The questions you should be able to answer in a meeting.
1. What is the per-task cost budget?
- Pick a number. $0.10 per code review. $2 per generated SOW.
- Anything that exceeds the budget needs a human approval or a fallback.
2. Which model is doing which job?
- Stop sending every request to your frontier model.
- Cheap classifier first, expensive reasoner only when needed.
3. Are you caching prompts?
- System prompts, tool definitions, codebase context.
- Anthropic needs a cache_control marker on the cacheable blocks; OpenAI caches long repeated prefixes automatically. Either way, cached input tokens are billed at a steep discount.
4. What happens when an agent loops?
- A hard step budget. A wall-clock budget. A token budget.
- Without limits, a single broken loop can spend more than a quarter's bill.
5. Who watches the bill?
- One person owns the dashboard. Weekly review.
- Anomalies get paged. Surprises do not happen.
6. What is the fallback when the API is down?
- It will go down. Plan for it.
- Cached responses, smaller local models, or a graceful degradation path.
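
Item 4 is the one most worth turning into code. Below is a minimal sketch of a run budget with the three limits from the checklist: steps, wall-clock time, and tokens. Everything here is hypothetical scaffolding, not any vendor's SDK: `RunBudget`, `run_agent`, and the `step_fn` contract (a callable returning `(done, tokens_used)`) are names I made up for illustration.

```python
import time
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when an agent run crosses one of its hard limits."""


@dataclass
class RunBudget:
    max_steps: int
    max_seconds: float
    max_tokens: int
    steps: int = 0
    tokens: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def charge(self, tokens_used: int) -> None:
        """Record one completed step and stop the run if any limit is crossed."""
        self.steps += 1
        self.tokens += tokens_used
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step budget exceeded: {self.steps} > {self.max_steps}")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token budget exceeded: {self.tokens} > {self.max_tokens}")
        if time.monotonic() - self.started_at > self.max_seconds:
            raise BudgetExceeded(f"wall-clock budget exceeded: {self.max_seconds}s")


def run_agent(step_fn, budget: RunBudget) -> str:
    """Call step_fn until it reports done or the budget trips."""
    try:
        while True:
            done, tokens_used = step_fn()
            budget.charge(tokens_used)
            if done:
                return "done"
    except BudgetExceeded as exc:
        return f"stopped: {exc}"


def stuck_step() -> tuple[bool, int]:
    """A deliberately broken step that never finishes, standing in for a looping agent."""
    return False, 4_000


result = run_agent(stuck_step, RunBudget(max_steps=25, max_seconds=60.0, max_tokens=50_000))
print(result)
```

The check runs after each step, so a run can overshoot its token limit by at most one step's worth of tokens. If that matters, size the limit one step below the real ceiling or estimate the cost before making the call.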

Most of the checklist above is boring on purpose. The teams getting hurt by token spend right now are not the ones running exotic agents. They are the ones who shipped an AI feature, forgot about it, and let a bug in a loop scale to the size of an OpenAI bill. None of that requires sophisticated tooling to prevent. It requires somebody whose job it is to look at the dashboard.
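
Looking at the dashboard can be partly automated. Here is a minimal sketch of the kind of check that could page someone, assuming you can export daily spend totals from your provider's billing data. The function name, the 3x threshold, and the sample numbers are all my own assumptions, not a recommended standard.

```python
from statistics import median


def is_spend_anomaly(
    history: list[float], today: float, factor: float = 3.0, min_days: int = 7
) -> bool:
    """Flag today's spend if it exceeds `factor` times the trailing median.

    Returns False when there is not yet enough history to form a baseline.
    """
    if len(history) < min_days:
        return False
    baseline = median(history[-min_days:])
    return today > factor * baseline


# ASSUMPTION: made-up daily spend in USD for the previous seven days.
last_week = [410.0, 395.0, 430.0, 420.0, 405.0, 415.0, 400.0]

print(is_spend_anomaly(last_week, today=520.0))    # a normal busy day
print(is_spend_anomaly(last_week, today=1_900.0))  # a runaway loop
```

A trailing median is used instead of a mean so that one earlier bad day does not raise the baseline and hide the next one.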

Is this a bubble?

I think the honest answer is “maybe, but the bubble framing is a trap.” Goldman Sachs projects roughly $7.6 trillion in cumulative AI capex between 2026 and 2031.[18] Gartner's January 2026 forecast had worldwide AI spending hitting $2.52 trillion in 2026, a 44% jump year over year; the firm revised that up to $2.59 trillion and 47% growth in May.[19] Those are numbers that make the dotcom build-out look modest.

At the same time, MIT economist Daron Acemoglu's analysis estimates only about 4.6% of all tasks will be cost-effective to automate inside a 10-year window, because only a fraction of AI-exposed tasks clear the cost-benefit threshold.[20] Goldman's Jim Covello, the loudest internal skeptic, has argued AI “isn't designed to solve the complex problems” that would justify the cost.[21] A September 2025 Harvard Business Review piece, based on BetterUp Labs and Stanford Social Media Lab research, pegs the productivity hit from low-quality AI “workslop” at over $9M annually for a 10,000-person org.[22]

The dotcom comparison is interesting because both readings turned out to be right. The bubble was a real bubble. A lot of capital got destroyed. The internet was also a real revolution. The companies that survived to the other side became the largest businesses in history. The framing “is it a bubble” misses that both can be true at once. You can be in a bubble that is paying for the infrastructure of the next decade.

The job for the rest of us, the people who are not running a $1.3M-a-month OpenAI bill, is to use the tools well, watch the meter, and not bet our business on a per-token price that the vendor is currently subsidizing. Our backend team's comparison of Cursor, Claude Code, and OpenCode goes into the day-to-day workflow patterns. The piece we wrote on the junior developer crisis in AI-native teams talks about who actually benefits when the tools scale. And if you are setting up an engineering org with these tools, our CLAUDE.md guide goes into the briefing discipline that determines whether you spend $50 a developer or $5,000.

Sources and citations

  1. The Decoder, “For $1.3 million a month, OpenClaw founder Peter Steinberger runs 100 AI agents,” May 2026. the-decoder.com
  2. Tom's Hardware, “OpenClaw creator burns through $1.3 million in OpenAI API tokens in a single month,” May 18, 2026. tomshardware.com
  3. The Next Web, “OpenClaw founder on the $1.3 million OpenAI token bill,” May 2026. thenextweb.com
  4. TechCrunch, “OpenClaw creator Peter Steinberger joins OpenAI,” February 15, 2026. techcrunch.com
  5. The Information, “Uber CTO Shows How Claude Code Can Blow Up AI Budgets,” April 2026. theinformation.com
  6. Briefs, “Uber torches entire 2026 AI budget on Claude Code in four months,” April 17, 2026. briefs.co
  7. Kevin Roose, The New York Times, tokenmaxxing column, March 21, 2026. techmeme.com archive
  8. The Information, “Meta shutters internal AI token leaderboard,” April 2026. theinformation.com
  9. Fortune, “Meta killed its employee AI token dashboard,” April 9, 2026. fortune.com
  10. Tom's Hardware on Jensen Huang's All-In Podcast remarks at GTC 2026. tomshardware.com
  11. Kingy AI, “Tokenmaxxing: Silicon Valley's most controversial new status game,” with Jon Chu X post excerpts. kingy.ai
  12. Faros AI, “AI Acceleration Whiplash” research (22,000 developers, 4,000 teams), April 2026. faros.ai / TechCrunch coverage
  13. Ed Zitron, “This Is How Much Anthropic and Cursor Spend On Amazon Web Services,” Where's Your Ed At, October 20, 2025. wheresyoured.at
  14. AI Insights News summary of Deutsche Bank's OpenAI cost-to-revenue analysis. aiinsightsnews.net
  15. Thurrott, “OpenAI lost $12 billion in the previous quarter” (Microsoft FY26 Q1 equity-method disclosure, grossed up by Bernstein). thurrott.com
  16. Where's Your Ed At, leaked OpenAI financial documents on projected inference costs. wheresyoured.at/oai_docs
  17. Fortune, “Sam Altman says OpenAI is losing money on ChatGPT Pro subscriptions,” January 7, 2025. fortune.com
  18. Goldman Sachs Insights, “Tracking trillions: The assumptions shaping the scale of the AI build-out.” goldmansachs.com
  19. Gartner, “Worldwide AI Spending,” January 2026 forecast ($2.52T/44%) and May 2026 revision ($2.59T/47%).
  20. Daron Acemoglu, “The Simple Macroeconomics of AI,” NBER Working Paper 32487. nber.org
  21. Goldman Sachs Research, “Gen AI: Too Much Spend, Too Little Benefit?” June 2024. goldmansachs.com (PDF)
  22. Harvard Business Review, “AI-Generated ‘Workslop’ Is Destroying Productivity,” September 2025 (BetterUp Labs and Stanford Social Media Lab research). hbr.org

Stop burning tokens on boilerplate

The TurboDocx API handles document generation, e-signatures, and template workflows with deterministic primitives — variables, loops, conditionals — instead of LLM calls. Reach for the model when you need reasoning, not when you need to render a DOCX.
