The era of cheap, unlimited AI is ending — and a two-tier market for intelligence is emerging.
Coinbase Chief Executive Officer Brian Armstrong predicted that 80% of artificial intelligence workloads will shift to models costing 99% less than today's frontier systems within 12 to 18 months, as the industry confronts the unsustainability of subsidized pricing.
"The limiting factor will be energy and compute, not better models," Armstrong wrote on X on Sunday, responding to a post by investor Tommy Shaughnessy that outlined how metered API pricing is driving enterprise AI spending far beyond what flat-rate subscriptions led companies to expect. Armstrong said Coinbase is already routing prompts to cheaper models where appropriate, keeping its AI costs "roughly flat" even as token usage grows exponentially.
The Coinbase CEO's forecast comes days after Microsoft's GitHub Copilot switched from a flat subscription to token-based billing on June 1, triggering bill increases of as much as 1,700% for some users. One subscriber posted an internal cost estimate showing their monthly fee jumping to $754.29 from $44.68, while another projected a bill of $847. The pricing overhaul reflects a broader reckoning: OpenAI's operating margin is near negative 122%, according to Shaughnessy, meaning the company relies entirely on external capital to subsidize GPU purchases and inference costs.
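As a quick check on the figures above, the cited subscriber's jump can be turned into a percentage. The sketch below uses only the two dollar amounts quoted in this article; the function name is illustrative.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


# Monthly fee cited above: $44.68 before, $754.29 after.
increase = percent_increase(44.68, 754.29)
print(f"{increase:.0f}%")  # roughly 1,588%
```

That single example works out to an increase of about 1,588%, in line with the "as much as 1,700%" ceiling reported for some users.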
The Two-Tier Intelligence Market
Armstrong's framework divides AI usage into two categories. The 20% of workloads that require peak performance (scientific research, agent orchestration, and what he called "IQ maxing") will continue running on frontier models like Anthropic's Opus 4.8 or OpenAI's GPT-5.5. The other 80% will shift to cheaper alternatives, a dynamic he compared to consumer hardware, where most buyers skip maxed-out specs on MacBooks and gaming PCs.
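To see what that split could mean for a single company's bill, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every workload consumes the same number of tokens and that "99% less" means the cheaper model costs 1% of the frontier price per token; neither assumption comes from Armstrong's post.

```python
def blended_cost(cheap_share: float, discount: float) -> float:
    """Average cost per workload relative to an all-frontier baseline of 1.0.

    cheap_share: fraction of workloads moved to the cheaper model (0.8 here).
    discount:    how much less the cheaper model costs (0.99 means 99% less).
    Assumes equal token volume per workload, which is a simplification.
    """
    frontier_share = 1.0 - cheap_share
    cheap_price = 1.0 - discount
    return frontier_share * 1.0 + cheap_share * cheap_price


relative = blended_cost(cheap_share=0.8, discount=0.99)
print(f"Spend relative to all-frontier: {relative:.3f}")
print(f"Implied saving: {(1.0 - relative) * 100:.1f}%")
```

Under those assumptions, total spend falls to roughly one-fifth of the all-frontier baseline, and almost all of what remains is the 20% of workloads still running on frontier models.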
The economics already support this divergence. DeepSeek V4 performs within range of Anthropic's Claude Opus on the SWE-bench coding benchmark at roughly one-thirtieth the cost, according to Shaughnessy. Hugging Face Chief Executive Officer Clement Delangue cited Stanford research showing local model accuracy on real-world conversation and reasoning queries rose to 71.3% from 23.2% in 2023, at a fraction of the energy and cost of API calls.
Box CEO Aaron Levie called Armstrong's 99% figure "a bit extreme" but agreed that AI use will stratify, with high-end work going to leading models and high-volume tasks to cheap ones. "Intelligence allocation is going to be extremely important," Harvey co-founder Winston Weinberg wrote. Glean co-founder Tony Gentilcore called Armstrong's analysis "spot on," adding that "the financial markets are the only ones extrapolating out Opus prices to infinite scale."
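In practice, "intelligence allocation" could be as simple as a rule that decides which tier handles each request. The sketch below is a toy illustration, not a description of how Coinbase or any other company actually routes prompts; the tier names and task labels are hypothetical placeholders rather than real model identifiers.

```python
from dataclasses import dataclass

# Hypothetical tier labels, not real model names or any vendor's API.
FRONTIER = "frontier-model"
BUDGET = "budget-model"

# Task types treated as needing peak performance, borrowed from the
# examples Armstrong gave; the label strings themselves are assumptions.
FRONTIER_TASKS = {"scientific_research", "agent_orchestration"}


@dataclass
class Prompt:
    text: str
    task_type: str


def choose_tier(prompt: Prompt) -> str:
    """Send high-stakes task types to the frontier tier, everything else to budget."""
    if prompt.task_type in FRONTIER_TASKS:
        return FRONTIER
    return BUDGET


print(choose_tier(Prompt("Summarize this support ticket", "summarization")))
print(choose_tier(Prompt("Plan a multi-step agent workflow", "agent_orchestration")))
```

A production router would likely weigh more than a task label, such as prompt length, latency targets, or the cost of a wrong answer, but the principle is the same: default to the cheap tier and escalate only when the task demands it.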
The Investment Angle
The shift toward cheaper models threatens the revenue models of premium AI providers including OpenAI, Microsoft, and Anthropic, which have relied on subsidized subscriptions to build market share. If 80% of workloads migrate to low-cost alternatives, the addressable market for frontier models shrinks dramatically. Companies enabling cost-efficient inference, including providers of open-source models and routing infrastructure, stand to benefit. Nvidia, whose H100 and B200 GPUs power most frontier training, faces a more complex outlook: demand for compute may grow, but pricing power could erode as cheaper alternatives proliferate.
This article is for informational purposes only and does not constitute investment advice.