The Memory Wall: AI's Hidden Bottleneck

HBM is sold out. All three manufacturers (Samsung, SK Hynix, and Micron) have committed their production capacity to hyperscalers willing to pay premiums that make consumer electronics look like an afterthought. According to industry reports, their 2026 output is fully sold, and the shortage is expected to persist through at least 2027. Data centers now consume an estimated 70% of all memory chips produced worldwide, leaving laptop and smartphone makers to compete for scraps.

The numbers tell the story. The five largest hyperscalers (Amazon, Alphabet, Microsoft, Meta, and Oracle) are projected to spend around $700 billion on capital expenditure in 2026, with approximately 75% directly tied to AI infrastructure. Goldman Sachs estimates $765 billion in annual AI capex in 2026, growing to $1.6 trillion by 2031. But capital alone won't solve the problem. A single Nvidia B300 GPU requires eight HBM stacks containing a total of 96 DRAM dies, and a fully configured DGX B300 system with eight GPUs needs 768 DRAM dies just for its HBM, not counting system memory.
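Both sets of figures in this paragraph can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the 12 dies per HBM device is inferred from the 96 dies spread across eight devices per GPU rather than stated in the source, and the function names are made up for this example.

```python
def hbm_dies(gpus: int, stacks_per_gpu: int = 8, dies_per_stack: int = 12) -> int:
    """DRAM dies consumed by HBM alone for a given number of GPUs.

    dies_per_stack=12 is an inference from the article (96 dies / 8 per GPU).
    """
    return gpus * stacks_per_gpu * dies_per_stack


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by a start value and an end value."""
    return (end / start) ** (1 / years) - 1


print(hbm_dies(1))  # 96  (one B300 GPU)
print(hbm_dies(8))  # 768 (one eight-GPU DGX B300 system)

# Goldman Sachs estimate: $765B in 2026 growing to $1.6T by 2031 (five years).
print(f"Implied AI capex growth: {cagr(765, 1600, 2031 - 2026):.1%} per year")
```

On these inputs the implied growth rate comes out at roughly 16% a year, which is far slower than the jump the hyperscalers are making in 2026 alone.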

The economics are brutal. HBM revenue per wafer is estimated to be three to five times higher than for conventional DDR5, which gives memory makers every incentive to steer wafers toward AI accelerators. IDC describes it as a zero-sum game: every wafer allocated to an HBM stack for an Nvidia GPU is a wafer denied to the LPDDR5X module of a mid-range smartphone or the SSD of a consumer laptop. The result? TrendForce projects that average DRAM prices will rise between 50% and 55% this quarter versus Q4 2025, an increase analyst Tom Hsu called "unprecedented."

Can Chipmakers Build Their Way Out?

Not quickly. SK Hynix has committed over $30 billion to new advanced packaging and fabrication plants in the U.S. and South Korea, while Micron is investing $20 billion in its Idaho mega-fabs and $7 billion in a new facility in Singapore. Because of long lead times, however, this new capacity is not expected to significantly alleviate the shortage before 2026-2027. Micron's new fabs in Boise, Idaho, will start producing memory in 2027 and 2028, with a Clay, New York, facility expected online in 2030.

The manufacturing complexity compounds the problem. HBM is produced in a complicated process in which Micron stacks 12 to 16 layers of memory on a single chip, turning it into a "cube." A single silicon wafer yields three times as much commodity DRAM as HBM, and fab processing time for HBM is significantly longer, so producing more HBM means producing fewer memory chips overall.
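The trade-off can be made concrete with a toy model. The sketch below assumes a fixed wafer budget, takes the 3-to-1 output ratio at face value, and reads the revenue estimate quoted earlier in the article as a 3x to 5x multiple per wafer. The wafer count, the normalized units, and the function names are hypothetical.

```python
def total_memory_units(wafers: float, hbm_share: float,
                       dram_per_wafer: float = 3.0,
                       hbm_per_wafer: float = 1.0) -> float:
    """Memory output, in normalized units, for a wafer budget split between HBM and DRAM."""
    hbm_wafers = wafers * hbm_share
    dram_wafers = wafers - hbm_wafers
    return dram_wafers * dram_per_wafer + hbm_wafers * hbm_per_wafer


def relative_revenue(hbm_share: float, hbm_revenue_multiple: float) -> float:
    """Average revenue per wafer relative to an all-DRAM line (all-DRAM = 1.0)."""
    return (1 - hbm_share) + hbm_share * hbm_revenue_multiple


baseline = total_memory_units(100, 0.0)  # hypothetical budget of 100 wafers
for share in (0.0, 0.2, 0.4):
    units = total_memory_units(100, share)
    low = relative_revenue(share, 3.0)
    high = relative_revenue(share, 5.0)
    print(f"HBM share {share:.0%}: output {units / baseline:.0%} of baseline, "
          f"revenue {low:.2f}x to {high:.2f}x")
```

Under these assumptions, every step toward HBM lowers total memory output while raising revenue per wafer, which is the squeeze on consumer supply the article describes.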

Meanwhile, TSMC is printing money. The foundry reported consolidated revenue of NT$1,134.10 billion and net income of NT$572.48 billion for Q1 2026, with revenue up 35.1% year-over-year and net income up 58.3%. In the first quarter, 3-nanometer accounted for 25% of total wafer revenue and 5-nanometer for 36%; advanced technologies, defined as 7-nanometer and more advanced, accounted for 74%. TSMC's May 2026 revenue was approximately NT$416.98 billion, up 30.1% from May 2025.
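The quarterly figures imply a net margin and a prior-year baseline that the article does not state. A quick Python sketch derives them from the reported numbers; the derived values are back-of-the-envelope estimates from the quoted growth rates, not figures reported by TSMC.

```python
# Reported Q1 2026 figures, in NT$ billions (from the article).
revenue = 1134.10
net_income = 572.48
revenue_growth = 0.351      # year-over-year
net_income_growth = 0.583   # year-over-year

net_margin = net_income / revenue

# Back out the year-ago quarter from the growth rates (derived, not reported).
prior_revenue = revenue / (1 + revenue_growth)
prior_net_income = net_income / (1 + net_income_growth)
prior_margin = prior_net_income / prior_revenue

print(f"Q1 2026 net margin: {net_margin:.1%}")                        # 50.5%
print(f"Implied Q1 2025 revenue: NT${prior_revenue:.1f}B")            # NT$839.5B
print(f"Implied Q1 2025 net income: NT${prior_net_income:.1f}B")      # NT$361.6B
print(f"Implied Q1 2025 net margin: {prior_margin:.1%}")
```

Net income growing faster than revenue means the margin widened over the year, which is what "printing money" amounts to in practice.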

The foundry is racing to keep pace. TSMC's 3-nanometer technology has become the industry standard for high-performance computing, artificial intelligence, and mobile devices, while its 1.6-nanometer process will be ready for commercial production in the second half of 2026. TSMC is manufacturing the world's largest CoWoS, at 5.5-reticle size, with greater than 98% yield in 2026.

What About the Chip Wars?

Nvidia still dominates, but the landscape is shifting. Starting this fall, Nvidia's new RTX Spark Superchip will debut in laptop and desktop computers from Dell and Lenovo. The chip combines a microprocessor and a graphics chip, was built with help from Taiwan's MediaTek, and will run Microsoft's Windows for Arm operating system. Nvidia's sales in its most recent quarter were roughly equal to Intel and AMD's annual totals for last year.

But AMD is gaining ground. AMD has signed lucrative contracts with OpenAI and Meta Platforms to deploy a combined 12 gigawatts of chips in their AI data centers, and consensus estimates project a 76% increase in AMD's 2026 earnings, to $7.33 per share. AMD's Q1 2026 data center revenue surged 57% year-over-year to a record $5.8 billion, and its Q2 guidance of $11.2 billion implies growth accelerating to 46% year-over-year.

Intel, meanwhile, is fighting for relevance. Intel disclosed at the VLSI Symposium on June 16, 2026, that its next-generation 18A-P node has entered risk production. Intel 18A-P delivers 9% higher performance at iso-power or 18% lower power at iso-performance compared to Intel 18A, alongside enhanced thermal characteristics. President Trump announced on June 18, 2026, that Apple has agreed to work with Intel to design and build chips domestically, though neither company has issued formal confirmation.

The GPU hierarchy is clear. The H200 is the direct H100 successor in the Hopper line, using the same GH100 compute die but upgrading the memory subsystem from 80 GB of HBM3 to 141 GB of HBM3e at 4.8 TB/s of bandwidth, which delivers 37-90% faster LLM inference on 70B+ parameter models. In single-GPU comparisons, the Blackwell B200 delivers approximately 2.5 times the tokens per second of an H200. As of June 2026, H100 SXM5 on-demand pricing starts at $2.53/hr on Spheron and H100 PCIe at $2.01/hr, with spot pricing on H100 SXM5 as low as $1.43/hr.
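The rental prices translate into job costs with simple multiplication. The Python sketch below uses the H100 SXM5 rates quoted above; the eight-GPU, 24-hour job is a hypothetical workload chosen for illustration, and the function names are made up for this example.

```python
H100_SXM5_ON_DEMAND = 2.53  # $/GPU-hour, from the article
H100_SXM5_SPOT = 1.43       # $/GPU-hour, lowest spot price cited


def job_cost(gpus: int, hours: float, rate_per_gpu_hour: float) -> float:
    """Total rental cost in dollars for a job on `gpus` GPUs lasting `hours` hours."""
    return gpus * hours * rate_per_gpu_hour


def spot_discount(on_demand: float, spot: float) -> float:
    """Fraction saved by paying the spot rate instead of the on-demand rate."""
    return 1 - spot / on_demand


# Hypothetical workload: one eight-GPU node for 24 hours.
on_demand_cost = job_cost(8, 24, H100_SXM5_ON_DEMAND)
spot_cost = job_cost(8, 24, H100_SXM5_SPOT)

print(f"On-demand: ${on_demand_cost:.2f}")  # $485.76
print(f"Spot:      ${spot_cost:.2f}")       # $274.56
print(f"Spot discount: {spot_discount(H100_SXM5_ON_DEMAND, H100_SXM5_SPOT):.1%}")  # 43.5%
```

The spot figure is a floor, not a guarantee: spot capacity can be reclaimed, so the saving applies only to workloads that tolerate interruption.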

Where Does the Money Go?

Into infrastructure at a scale that makes previous buildouts look quaint. McKinsey estimates global spending on data centers could reach $7 trillion by 2030. Capital expenditure by the 14 largest publicly owned data center operators globally is expected to approach $750 billion this year, up from a little less than $450 billion last year. Over 23 gigawatts of data center capacity was under construction globally at the end of September 2025, about three quarters of it in the US. More than 3.8 GW of new capacity entered construction in Q3 2025, up 58% on the quarterly average so far this decade.

The shift from training to inference is accelerating. Deloitte estimates that inference made up half of all AI compute in 2025 and will grow to two-thirds in 2026, and Brookfield projects it will account for 75% of all AI compute needs by 2030. Ram Nagappan, vice president of AI infrastructure at Oracle Cloud Infrastructure, said operators must now design for two fundamentally different AI patterns: large-scale training and distributed inference.

Power, not capital, is the constraint. Sean James, distinguished engineer for energy systems at Nvidia, said power availability, not compute, is emerging as the limiting factor, with operators increasingly relying on on-site generation to accelerate deployment. Approximately 70% of the US grid is approaching the end of its life cycle, and unprecedented load growth is exposing its age.

What Changed This Week

The memory shortage moved from technical concern to economic crisis. IDC called the memory chip crunch "a crisis like no other," with big tech companies on track to spend a staggering $650 billion in 2026, up about 80% from last year's record. Even if chipmakers ramp up production, relief from the shortage is more than a year away. Intel's 18A-P node entering risk production offers a glimmer of hope for foundry diversification, but TSMC's dominance remains unchallenged. Nvidia's PC chip announcement signals the company's ambition to own every layer of the AI stack, from data center to edge.

What to Watch

TSMC's Q2 earnings call in mid-July will reveal whether the foundry can sustain its blistering growth pace. Intel's earnings on July 23, 2026, should clarify whether the Apple foundry deal is real or another headline that fades. AMD's Advancing AI conference on July 22-23 in San Francisco will detail its OpenAI and Meta deployments scheduled for H2 2026. And for memory prices, watch TrendForce's monthly DRAM contract price updates. If prices keep climbing at 50%+ per quarter, consumer electronics makers will face impossible choices: raise prices, cut specs, or delay launches. The memory wall isn't coming. It's here.
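To see why sustained increases of that size would force hard choices, it helps to compound them. The Python sketch below extends the article's hypothetical scenario over several quarters. It is a what-if calculation, not a forecast: TrendForce's projection covers a single quarter, and the function name is made up for this example.

```python
def price_multiple(quarterly_increase: float, quarters: int) -> float:
    """Price relative to the starting level after compounding a quarterly increase."""
    return (1 + quarterly_increase) ** quarters


# What-if: the 50% to 55% quarterly increase repeats for up to a year.
for quarters in (1, 2, 4):
    low = price_multiple(0.50, quarters)
    high = price_multiple(0.55, quarters)
    print(f"After {quarters} quarter(s): {low:.2f}x to {high:.2f}x the starting price")
```

Four consecutive quarters at 50% would leave prices at roughly five times their starting level, which is why a run of that length is hard to absorb in a consumer bill of materials.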

More from Stake & Paper

Mining claims intelligence — from query to report, in minutes.
