AI token bills blow through corporate budgets, metered pricing replaces flat plans as agent tools multiply usage, Linux Foundation launches Tokenomics standards push

Images

Rebecca Bellan techcrunch.com

Uber’s internal AI coding budget for 2026 was gone by April, according to TechCrunch, as companies that once treated “unlimited” model access as a perk begin putting hard limits on who can prompt what. The same story describes developers losing access when Microsoft revoked Claude Code licenses months after enabling them, and one large travel company seeing a routine Cursor renewal jump several-fold.

The shift is not that tokens have become expensive in isolation. TechCrunch reports that per-token prices have fallen, but consumption has risen faster as teams move from occasional chat to agent-style tools that run longer, call more models, and generate more intermediate text. Newer flagship models released late last year—TechCrunch names Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro—made these workflows more capable, which in practice meant more output and more billable input. The result is a budgeting problem that looks familiar from the early cloud era: executives pushed for adoption first, then asked finance to explain the invoice.

A small ecosystem is forming to make those invoices legible. The Linux Foundation has announced a Tokenomics Foundation, pitched as a standards effort for AI spend tracking akin to FinOps for cloud costs. OpenAI’s enterprise head Alexander Embricos tells TechCrunch that customer conversations have moved away from raw capability and toward auditability, usage controls, and efficiency. The anecdotes in the piece point to why: without guardrails, one engineer can burn a month’s budget in days, and a team can discover too late that “productivity” is being measured in generated text rather than shipped value.

Even the companies that can afford the bills are struggling to price what they are buying. TechCrunch cites surveys suggesting output rises alongside bugs and rewrites, and reports that heavy token users can appear twice as productive while consuming vastly more tokens than their peers. That gap matters because most firms cannot tie model usage to revenue impact; they can count tokens, but not the value of the code those tokens helped produce. Vendors, meanwhile, have an obvious reason to prefer metered billing: it turns uncertain demand into a customer problem, and it makes runaway usage a line item rather than a platform outage.

The industry’s new discipline is being built out of throttles, quotas, and accounting dashboards, not out of cheaper intelligence. Companies that sold “all you can eat” are now selling the measuring cup.