Build It and They Will Come (Probably) — The hyperscalers are spending like there's no tomorrow
Hyperscalers are spending at an unprecedented pace. Wall Street is increasingly divided on whether the returns will ever justify the bill.
The Numbers Are Getting Ridiculous
In the first three months of 2026, the nine biggest spenders in AI infrastructure burned through $163.2 billion in capital expenditure. That is an 86% jump from the same quarter last year. For the full year, the projection now sits at $836 billion. To put that in perspective, that is roughly the size of Switzerland's entire annual economic output. It is a number that would have sounded like a typo when ChatGPT launched in 2022. These figures come from the NextGig AI CapEx Tracker, which tracks hyperscaler spending in real time from company filings and guidance.
Amazon is the biggest spender at a projected $200 billion for the year, followed by Microsoft at $190 billion and Alphabet at $185 billion. Those three alone account for nearly 70% of the $836 billion tracked total. Meta has carved out fourth place at $135 billion, and then there is a steep drop to Oracle at $50 billion, OpenAI at $30 billion, NVIDIA at $22 billion, and Anthropic at $15 billion. Apple brings up the rear at a modest $8.7 billion, which in any other context would be an enormous number but here looks like pocket change.
FY 2026 CapEx by Company
Company | Q1 2026 (Filed) | FY 2026 (Projected) |
Amazon | $44.2B | $200B |
Microsoft | $30.9B | $190B |
Alphabet | $35.7B | $185B |
Meta | $19.8B | $135B |
Oracle | $18.6B | $50B |
OpenAI | $8.5B | $30B |
NVIDIA | — | $22B |
Anthropic | $3.5B | $15B |
Apple | $2.0B | $8.7B |
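The headline totals can be re-derived from the table. The short Python sketch below is only an illustration: the figures are copied from the table, the variable names are invented for this example, and NVIDIA's blank Q1 cell is treated as missing rather than zero (an assumption about what the dash means).

```python
# Figures in billions of US dollars, copied from the table above.
# None marks a quarter with no filed figure (NVIDIA); treating it as
# "missing" rather than zero is an assumption of this sketch.
capex = {
    "Amazon": (44.2, 200.0),
    "Microsoft": (30.9, 190.0),
    "Alphabet": (35.7, 185.0),
    "Meta": (19.8, 135.0),
    "Oracle": (18.6, 50.0),
    "OpenAI": (8.5, 30.0),
    "NVIDIA": (None, 22.0),
    "Anthropic": (3.5, 15.0),
    "Apple": (2.0, 8.7),
}

# Sum the filed Q1 figures, skipping companies with no filed number.
q1_total = sum(q1 for q1, _ in capex.values() if q1 is not None)

# Sum the full-year projections for all nine companies.
fy_total = sum(fy for _, fy in capex.values())

# Share of the projected total held by the three largest spenders.
top_three = sum(capex[name][1] for name in ("Amazon", "Microsoft", "Alphabet"))
top_three_share = top_three / fy_total

print(f"Q1 2026 filed total: ${q1_total:.1f}B")
print(f"FY 2026 projected total: ${fy_total:.1f}B")
print(f"Top-three share of projected total: {top_three_share:.1%}")
```

On these inputs the Q1 sum reproduces the $163.2 billion headline from the eight companies with a filed figure, and the full-year sum rounds to $836 billion.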
The physical infrastructure behind these numbers is equally staggering. The NextGig AI CapEx Tracker follows 39 major data center facilities across the United States. Twenty-three are already live and 16 more are under construction, together pulling 7.1 gigawatts of power. The biggest single site is the Anthropic-Amazon campus in New Carlisle, Indiana, which draws 1.1 GW on its own. The xAI Colossus complex in Memphis spans two buildings totaling 727 MW. OpenAI's Stargate facility in Abilene, Texas runs at 295 MW with room to expand. Meta is building a single 2 GW facility in Louisiana that will be the largest standalone data center ever constructed.
And the commitments keep growing. Amazon has pledged $100 billion for multiyear US data center expansion. Microsoft has earmarked $80 billion. Google committed $75 billion. Meta dropped $65 billion on that Louisiana megaproject alone. These are not quarterly budget decisions that can be walked back next earnings call. These are generational bets that will either look brilliant or catastrophic a decade from now, with very little middle ground.
Why Everyone Keeps Writing Bigger Checks
If you talk to the analysts who track this space daily, a clear consensus emerges: the hyperscalers are not building ahead of demand. They are chasing demand that already exists and cannot be served with current capacity.
The phrase you hear over and over is "compute constrained, not demand constrained." Amazon CEO Andy Jassy has said it. OpenAI's Sam Altman has said it. Meta and Google have said versions of it. The argument is simple: every major cloud provider would be growing faster right now if they had more GPUs to sell. Customers are waiting in line. The constraint is not finding people who want to use AI. The constraint is having enough hardware to let them.
The usage data backs this up. OpenRouter, which tracks token consumption across hundreds of models and providers, shows total weekly inference volume at roughly 24.7 trillion tokens in late May 2026. That number has roughly doubled in six months. More importantly, the composition of that demand is shifting toward applications that consume far more compute per user. Coding assistants, autonomous agents, and enterprise automation tools now represent 79% of the top 50 apps by token volume. A single agentic coding session can consume more tokens than a hundred casual ChatGPT queries. As these use cases scale, the demand curve steepens rather than flattens.
NVIDIA sits at the center of all of this with a 78% market share in GPU shipments, but analysts are increasingly focused on the networking layer, not the silicon. The release of NVLink5, NVIDIA's switched fabric that lets GPUs talk to each other at unprecedented speeds, has been described by multiple tracked analysts as "more important than Blackwell" and potentially "more important than CUDA." The logic is straightforward: as AI workloads move toward mixture-of-experts architectures that require every GPU to communicate with every other GPU simultaneously, the networking fabric becomes the real bottleneck on performance. A faster chip connected by a slow network is a slow system. NVIDIA controls both layers, and that integration creates a moat that a faster chip alone cannot breach.
But here is where the narrative gets interesting. GetDeploying data shows that B200 GPUs, NVIDIA's current-generation offering, are widely available. Zero providers report B200 as "unavailable," meaning anyone who wants B200 capacity can get it. Prices are still firming: on-demand pricing sits at $6.38 an hour, up 5.5% week over week, and spot pricing jumped 12.5%. On availability, though, this is the mirror image of what happened with H100s in 2023 and 2024, when waitlists stretched for months and spot prices spiked to multiples of on-demand rates. The B200 market is liquid. Supply has caught up to demand for the current generation, and that is a new development that deserves attention.
Not Everyone Is Cheering
A counter narrative has been building among a subset of tracked analysts, and it deserves to be taken seriously. The core question is uncomfortable: what if the hyperscalers are building capacity for an exponential demand curve that is already shifting to linear?
The first crack in the bull case is the deceleration in inference token growth. OpenRouter data shows week-over-week token growth slowing to just 0.9% in late May, down from the double-digit rates that characterized the earlier phase of the cycle. In absolute terms, 24.7 trillion tokens a week is enormous. But the hyperscalers are building capacity for a world where that number keeps doubling every few months. If growth settles into a steadier, more predictable trajectory, the capacity overhang could be severe.
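To see how far 0.9% a week is from a doubling pace, the two growth figures can be put on the same footing. The Python sketch below is illustrative and rests on two assumptions that are not in the source data: "six months" is treated as 26 weeks, and the 0.9% reading is held constant, which a single week of data cannot establish. The function names are invented for this example.

```python
import math


def weekly_rate_for_doubling(weeks: float) -> float:
    """Constant weekly growth rate that doubles volume in `weeks` weeks."""
    return 2 ** (1 / weeks) - 1


def doubling_time_weeks(weekly_rate: float) -> float:
    """Weeks needed to double at a constant weekly growth rate."""
    return math.log(2) / math.log(1 + weekly_rate)


# Assumption: "doubled in six months" means doubled in 26 weeks.
implied_rate = weekly_rate_for_doubling(26)

# Assumption: the late-May 0.9% weekly reading persists unchanged.
slow_doubling = doubling_time_weeks(0.009)

print(f"Weekly growth implied by doubling in 26 weeks: {implied_rate:.2%}")
print(f"Doubling time at 0.9% per week: {slow_doubling:.0f} weeks")
```

Under those assumptions, a six-month doubling works out to about 2.7% a week on average, while a sustained 0.9% a week would stretch the doubling time to roughly a year and a half.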
The second crack is GPU rental pricing on older hardware. GetDeploying shows H200 spot instances fell 8% week over week to $3.35 an hour. H100 on-demand pricing declined 4% to $3.80 an hour. These are not catastrophic numbers, but they are moving in the wrong direction at a time when the installed base of last-generation hardware is expanding rapidly. Several analysts have pointed to GPU rental prices as the single best real-time indicator of supply and demand balance, and the current trajectory is not reassuring.
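The weekly moves quoted for B200, H200 and H100 pricing can be converted back into implied prior-week prices, which makes the size of each move easier to compare in dollars. This is a minimal sketch, assuming each percentage is measured against the previous week's price; `prior_price` and `quotes` are names chosen for this example, not part of any GetDeploying API.

```python
def prior_price(current: float, pct_change: float) -> float:
    """Back out last week's price from this week's price and the % change."""
    return current / (1 + pct_change / 100)


# (current $/hour, week-over-week % change) as quoted in the article.
quotes = {
    "B200 on-demand": (6.38, 5.5),
    "H200 spot": (3.35, -8.0),
    "H100 on-demand": (3.80, -4.0),
}

# Implied price one week earlier, under the stated assumption.
implied = {
    name: prior_price(price, change) for name, (price, change) in quotes.items()
}

for name, (price, change) in quotes.items():
    print(f"{name}: ${implied[name]:.2f} -> ${price:.2f} ({change:+.1f}%)")
```

On these inputs the implied prior-week prices are roughly $6.05, $3.64 and $3.96 an hour, so the H200 spot decline of about 29 cents is the largest dollar move on the older hardware.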
The third crack is the capital cycle framework itself. The SwiftAlerts Capital Cycle Monitor, which aggregates credit market conditions, GPU pricing, inference demand, physical capacity data from Epoch.ai, and sentiment signals from tracked analysts, has been flashing caution. The combined stance improved from RED to AMBER in late May, but the system still flags six warning signals. These include softening GPU prices, accelerating chip stockpiles (Google's H100-equivalent GPU count surged 221% quarter over quarter, an eye-popping number), and bearish pressure building on several portfolio holdings. The action recommendation shifted from "reduce AI complex by 30%" to "no new AI buys, tighten trailing stops." Less aggressive, but hardly a buying signal.
Then there is the Jim Cramer indicator, which the Capital Cycle Monitor tracks as a contrarian mainstreaming signal. In the 14 days through late May, Cramer posted 88 AI and technology related tweets, surging from effectively zero. The system flags this as RED. When Jim Cramer is tweeting about AI infrastructure around the clock, the narrative has reached every living room in America, and historically, that is when the smart money starts heading for the exits.
The Real Bottleneck Nobody Can Fix Quickly
While the public conversation focuses on GPU supply and model capabilities, a quieter but potentially more consequential problem is brewing. The American power grid was not designed to absorb 7.1 gigawatts of new data center load in three years, and the interconnection queues for new projects now stretch three to five years in key markets like Northern Virginia.
This is already a binding constraint, not a future one. The xAI Colossus complex in Memphis draws 727 megawatts, roughly the peak demand of a midsized city. Meta's Louisiana project will draw over 2 gigawatts. Every single one of the 39 major AI data centers tracked by NextGig is in the United States, concentrated in a handful of regions where power is available and tax treatment is favorable. As those regions hit capacity, new projects face escalating delays.
A growing number of analysts have started arguing that power availability, not GPU supply or model capability or even access to capital, will be the binding constraint on AI infrastructure in the 2027 to 2028 window. If interconnection queues force data centers into secondary markets with less developed fiber and cooling infrastructure, both cost efficiency and deployment speed will suffer. The hyperscalers are effectively in a land grab for gigawatt scale power access, and the winners of that race will be determined as much by their utility negotiation teams as by their technology organizations.
Reading the Actual Tea Leaves
Beyond the analyst commentary, several real time data sources offer an unfiltered view of where this trade is actually heading.
South Korean semiconductor exports, the most reliable leading indicator for global chip demand, hit $31.9 billion in April 2026, up 173.5% year over year. The three-month average sits at $29.95 billion, a 53.4% increase from the prior three-month period. April came in just below March 2026's all-time high of $32.8 billion, a dip too small to call a peak in the cycle. These figures are published monthly by the Korea Customs Service. As long as Korean exports hold near record levels, physical chips are flowing into data centers somewhere, and that somewhere is overwhelmingly the American hyperscaler complex.
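The export figures above imply a few numbers that are not stated directly. The Python sketch below backs them out. It is only an illustration, and it assumes the three-month window is February through April 2026, which the article does not say outright. All variable names are invented for this example.

```python
# Figures in billions of US dollars, as quoted in the article.
april_2026 = 31.9
march_2026 = 32.8
three_month_avg = 29.95

# Assumption: the three-month average covers February, March and April 2026.
implied_february = 3 * three_month_avg - march_2026 - april_2026

# Month-over-month change from the March high to April.
mom_change = april_2026 / march_2026 - 1

# April 2025 level implied by the 173.5% year-over-year increase.
implied_april_2025 = april_2026 / (1 + 1.735)

# Prior three-month average implied by the 53.4% increase.
implied_prior_avg = three_month_avg / (1 + 0.534)

print(f"Implied February 2026: ${implied_february:.2f}B")
print(f"March to April change: {mom_change:.1%}")
print(f"Implied April 2025: ${implied_april_2025:.2f}B")
print(f"Implied prior three-month average: ${implied_prior_avg:.2f}B")
```

If the window assumption holds, February comes out near $25 billion, which would put both March and April well above the start of the window even though April slipped a little under 3% from March.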
The equity options market tells a more divided story. NVIDIA options flow is extraordinary: 1.6 million calls traded on May 27, the highest in the entire semiconductor complex, with a put-to-call ratio of 0.46 and nearly twice as many calls outstanding as puts. This is institutional conviction expressed through derivatives. But the rest of the semi complex looks very different. Micron shows 945,000 puts outstanding versus only 518,000 calls, a structural bearish position despite the overwhelmingly bullish daily narrative around HBM memory and AI demand. Marvell saw puts surge by 38,000 contracts in a single day while calls actually declined. Ciena, up 600% from its lows, shows a put-to-call ratio of 1.72 and deteriorating flow. The options market is sending a clear message: NVIDIA is the only semiconductor stock you need to own for the AI infrastructure thesis. Everything else in the complex looks risky. Options data sourced from Yahoo Finance.
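The Micron open-interest figures can be put on the same scale as the quoted ratios for NVIDIA and Ciena. This is a minimal sketch using only numbers from the paragraph above. The helper name `put_call_ratio` is invented for this example, and the article does not say whether the NVIDIA and Ciena ratios are based on volume or open interest, so the comparison is indicative only.

```python
def put_call_ratio(puts: float, calls: float) -> float:
    """Puts divided by calls; a value above 1.0 means more puts than calls."""
    if calls <= 0:
        raise ValueError("calls must be positive")
    return puts / calls


# Micron open interest as quoted in the article.
micron_ratio = put_call_ratio(puts=945_000, calls=518_000)

# Ratios quoted directly in the article (basis not specified there).
quoted_ratios = {"NVIDIA": 0.46, "Ciena": 1.72}

# Combine and rank from most call-heavy to most put-heavy.
all_ratios = {**quoted_ratios, "Micron": micron_ratio}
ranked = sorted(all_ratios.items(), key=lambda item: item[1])

for ticker, ratio in ranked:
    print(f"{ticker}: put-to-call {ratio:.2f}")
```

On these inputs Micron's ratio comes out near 1.82, which is more put-heavy than Ciena's quoted 1.72 and about four times NVIDIA's 0.46.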
Polymarket prediction markets add a macro overlay. The probability of the US economy being in an overheating state by the end of 2026 rose to 56%, up 2.5 points in a week. Fed rate cut expectations have consolidated around two to three cuts for the year. A soft landing, the Goldilocks scenario for risk assets, saw its probability drop 14.5 points in a week to 22%. The market is pricing in continued growth with persistent inflation pressure, which is great for nominal revenues but terrible for the long-duration assets that data centers represent when they are financed against a 30-year Treasury yield of 5%.
Where This Leaves Us
The AI infrastructure buildout is the defining capital allocation event of this decade. At $836 billion annually and still accelerating, it represents a simultaneous bet by the world's largest and most sophisticated technology companies that artificial intelligence will generate returns commensurate with the biggest infrastructure deployment in business history.
The bull case is well funded and well argued. AI demand is real. The hyperscalers keep telling us they are constrained by compute, not by customers. NVIDIA's networking advantage deepens with every generation. The Korean export data confirms that chips are still flowing at record volumes.
The bear case is less popular but arguably more urgent. Inference growth is decelerating. Last-generation GPU pricing is softening. The capital cycle framework says wait. Jim Cramer will not stop tweeting about it. And the options market is betting that NVIDIA is the only semi worth owning for this thesis, which is a remarkably narrow expression of a supposedly broad-based industrial transformation.
The question that will define the next two years is not whether AI infrastructure spending will continue. It will. The commitments are locked in. The question is whether the $836 billion being deployed in 2026 will generate returns that justify the spend, or whether we are watching the largest capital misallocation since the dot-com bubble play out in slow motion. The Korean export data says the chips are being bought. The OpenRouter data says they are being used. The GPU rental data says some of them are already getting cheaper. The answer probably lies somewhere between those three signals, and the analysts who study this full time are watching all of them intently.
Sources
AI CapEx & Data Center Data: NextGig AI CapEx Tracker
GPU Rental Pricing: GetDeploying
Inference Token Volume: OpenRouter
AI Physical Capacity Data: Epoch.ai
Korea Semiconductor Exports: Korea Customs Service / EIEC
Prediction Markets: Polymarket
Equity Options Data: Yahoo Finance
Analyst Commentary: SwiftAlerts Database