When Memory Becomes the Bottleneck

Nvidia's Jensen Huang walked onto the Computex show floor in Taipei this week, picked up a marker, and wrote three words on an HBM4E wafer: "Please Make More."

The world's most valuable chipmaker was publicly begging its top memory supplier to accelerate production. Hours earlier, SK Group Chairman Chey Tae-won had told the same audience that the global shortage of high-bandwidth memory would persist through 2030, according to TechTimes. The exchange captured the defining constraint of the AI buildout: not chips, not power, but memory. Specifically, the stacked DRAM modules that sit alongside every GPU and determine how much data an AI model can process at once.

HBM is sold out through 2026, with allocations for 2027 already being negotiated, and the real power in AI infrastructure is shifting to the memory makers: Samsung, SK Hynix, and Micron.

Speaking at Computex 2026, Chey attributed the shortage, which he expects to last through at least 2030, to AI systems that require far more wafer capacity per chip than conventional DRAM.

Why Is Memory the New Chokepoint?

Nvidia's Blackwell platform lets organizations build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor, according to Nvidia.

For LLM inference workloads, the GB200 NVL72 delivers up to 30x the performance of the same number of Nvidia H100 Tensor Core GPUs. But those gains come with a catch: Blackwell's appetite for memory is staggering.

Nvidia's B300 GPU requires eight HBM chips, each containing 12 individual DRAM dies. A single B300 GPU therefore consumes 96 DRAM dies, and a fully configured DGX B300 system with eight GPUs requires 768 DRAM dies for the HBM modules alone, not counting system memory.
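The die counts follow from simple multiplication. A minimal Python sketch of that arithmetic, using only the figures quoted in the paragraph above (the function and constant names are illustrative, not from any Nvidia tooling):

```python
# Figures quoted in the article; names are illustrative assumptions.
HBM_STACKS_PER_GPU = 8   # HBM chips per B300 GPU
DIES_PER_STACK = 12      # DRAM dies inside each HBM chip
GPUS_PER_SYSTEM = 8      # GPUs in a fully configured DGX B300


def dies_per_gpu(stacks: int = HBM_STACKS_PER_GPU,
                 dies: int = DIES_PER_STACK) -> int:
    """DRAM dies consumed by the HBM on one GPU."""
    return stacks * dies


def dies_per_system(gpus: int = GPUS_PER_SYSTEM) -> int:
    """DRAM dies consumed by the HBM across one multi-GPU system."""
    return gpus * dies_per_gpu()


if __name__ == "__main__":
    print(dies_per_gpu())     # 96
    print(dies_per_system())  # 768
```

Changing the stack height or the GPU count shows how quickly the die requirement scales with each new configuration.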

The economics are brutal. HBM commands significantly higher margins than the standard DRAM modules used in consumer devices, and revenue per wafer for HBM is estimated to be three to five times higher than for conventional DDR5. Samsung, SK Hynix, and Micron have all been aggressively converting production lines to HBM as a result.

Voracious demand for HBM from hyperscalers such as Microsoft, Google, Meta and Amazon has forced the three biggest memory manufacturers to pivot their limited cleanroom space and capital expenditure toward higher-margin, enterprise-grade components. Every wafer allocated to an HBM stack for an Nvidia GPU is a wafer denied to the LPDDR5X module of a mid-range smartphone or the SSD of a consumer laptop.

The result? TrendForce said it expects average DRAM prices to rise between 50% and 55% this quarter versus the fourth quarter of 2025, an increase that analyst Tom Hsu told CNBC was "unprecedented."

Lenovo, Dell, HP, Acer and ASUS have warned clients of tougher conditions ahead, confirming price hikes of 15-20% and contract resets as an industry-wide response.

Can AMD and Intel Break Nvidia's Grip?

AMD is making its move. AMD's data center revenue surged 57% year-over-year to $5.8 billion, with segment operating income of $1.6 billion, reflecting rapid adoption of MI300 and anticipation for MI400 chips.

The latest MI350 series, launched in June 2025, features 288 GB of HBM3E memory and offers day-zero support for major AI frameworks, libraries, and cutting-edge models.

The company is positioning itself as the cost-effective alternative. The MI300X and MI350 series are widely seen as direct competitors to Nvidia's H100 and H200, often offering a lower cost per unit of memory bandwidth. For enterprise buyers, the total cost of ownership may be more favorable with AMD, especially for long-term workloads that benefit from large HBM3/HBM3E memory pools.

Intel, meanwhile, is scrambling to stay relevant. Intel Xeon 6 was selected as the host CPU for Nvidia's DGX Rubin NVL8 systems, reinforcing Intel's continued role at the center of leading AI infrastructure deployments. But its Gaudi accelerator line remains a distant third. Nvidia's $20 billion licensing deal with Groq underscores the pivot to inference, while AMD has acquired Untether AI's engineering team, and Intel is pursuing a SambaNova acquisition reportedly valued at about $1.6 billion.

The real battleground is shifting from training to inference. Nvidia CEO Jensen Huang said inference already accounts for more than 40% of AI-related revenue—and predicted that it is "about to go up by a billion times."

Inference is projected to account for 65% of AI compute by 2029 and 80-90% of lifetime AI system costs.

What About the Power Problem?

The memory shortage is compounded by an energy crisis. The International Energy Agency now projects that global data center electricity consumption will exceed 1,000 TWh by the end of 2026, an amount equivalent to Japan's entire annual electricity usage.

Driven by data center investments, the capital expenditure of five large technology companies surged to more than $400 billion in 2025 and is set to increase by a further 75% in 2026, while electricity demand from data centers soared by 17% in 2025, and that of AI-focused data centers climbed even faster.

"Very soon, maybe even later this year, we'll be producing more chips than we can turn on," Tesla CEO Elon Musk said earlier this year. The constraint is real. Lawrence Berkeley National Laboratory predicts that U.S. data center demand will grow from 176 terawatt hours in 2023 (about 4.4% of total U.S. electricity consumption) to between 325 and 580 TWh (6.7-12.0%) by 2028.
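To put the Lawrence Berkeley range in annual terms, here is a rough Python sketch that converts the 2023 and 2028 figures quoted above into an implied compound annual growth rate. The helper name is hypothetical, and the calculation assumes smooth year-over-year growth, which the laboratory's projection does not itself claim.

```python
def implied_annual_growth(start_twh: float, end_twh: float, years: int) -> float:
    """Compound annual growth rate that takes start_twh to end_twh in `years` years."""
    return (end_twh / start_twh) ** (1 / years) - 1


# Figures quoted in the article: 176 TWh in 2023, 325-580 TWh by 2028.
START_TWH = 176.0
YEARS = 2028 - 2023

low = implied_annual_growth(START_TWH, 325.0, YEARS)
high = implied_annual_growth(START_TWH, 580.0, YEARS)

if __name__ == "__main__":
    print(f"Low scenario:  {low:.1%} per year")
    print(f"High scenario: {high:.1%} per year")
```

Under that assumption, the low end of the range works out to roughly 13% growth a year and the high end to roughly 27%.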

Liquid cooling is emerging as the solution. Nvidia's Blackwell chip increased processing capacity while using the same amount of energy as its predecessor, but it also generated far more heat, too much for traditional air cooling systems. Because constantly running the air cooling cycle requires a lot of energy, companies developed a direct-to-chip liquid cooling method, which can increase energy efficiency in a data center by 15%, according to a study by Nvidia and power equipment maker Vertiv Holdings.

Direct-to-chip cooling is rapidly becoming the most common form of liquid cooling deployed in production environments, and by removing 70%-80% of heat loads directly at the chip, it reduces the burden on facility-level cooling infrastructure.

An industry survey found that 59% of data centers plan to implement liquid cooling within five years, and the share running exclusively on air continues to shrink.

What Changed This Week

Nvidia's public plea for more memory at Computex crystallized what the industry has known for months: the AI buildout is constrained not by chip design but by manufacturing capacity for specialized memory. SK Group's warning that shortages will persist through 2030 means every hyperscaler, every AI lab, and every enterprise deploying models is now competing for a finite pool of HBM. Meanwhile, AMD's surging data center revenue and Intel's acquisition spree signal that the inference market, where models run rather than train, is becoming the new profit center. The economics have flipped: training is a one-time capital event, but inference is a recurring cost that scales with every query.
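A toy cost model shows why a recurring per-query cost can come to dominate a one-time training bill. The Python sketch below is illustrative only: the function name and every dollar figure and query count are hypothetical inputs chosen to make the arithmetic easy to follow, not numbers reported by any company.

```python
def inference_share(training_cost: float,
                    cost_per_query: float,
                    queries: float) -> float:
    """Fraction of lifetime cost spent on inference.

    training_cost  -- one-time cost of training the model (hypothetical)
    cost_per_query -- recurring cost of serving one query (hypothetical)
    queries        -- total queries served over the model's lifetime
    """
    inference_cost = cost_per_query * queries
    return inference_cost / (training_cost + inference_cost)


if __name__ == "__main__":
    # Hypothetical model: $100M to train, $0.01 to serve each query.
    for lifetime_queries in (1e9, 10e9, 40e9, 90e9):
        share = inference_share(100e6, 0.01, lifetime_queries)
        print(f"{lifetime_queries:.0e} queries -> inference is {share:.0%} of lifetime cost")
```

With these made-up inputs, inference passes 80% of lifetime cost once serving costs reach four times the training bill, which illustrates how a share in the 80-90% range can arise.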

What to Watch

TSMC's 3nm capacity expansion is critical. The company's 3nm monthly capacity is highly likely to reach 180,000 to 200,000 wafers by the end of 2026, and including incremental capacity, its total 3nm monthly capacity in Taiwan is expected to exceed 200,000 wafers, potentially reaching an above-expectation 220,000. Watch for Micron's new Idaho and Singapore fabs, which won't come online until 2027-2028 but represent the only meaningful HBM capacity additions on the horizon. And monitor pricing: if DRAM prices keep climbing by 50% or more per quarter, expect consumer electronics to get significantly more expensive through 2027.
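Quarterly increases of that size compound quickly. As a rough illustration, the Python sketch below tracks a price index that starts at 1.0 and rises by a fixed percentage each quarter. The scenario of a sustained 50% quarterly rise is a hypothetical extrapolation for illustration, not a forecast from TrendForce or anyone else, and the function name is an assumption.

```python
def price_index(quarterly_rise: float, quarters: int, start: float = 1.0) -> float:
    """Price index after `quarters` of compounding at `quarterly_rise` per quarter."""
    return start * (1 + quarterly_rise) ** quarters


if __name__ == "__main__":
    # Hypothetical scenario: the 50% quarterly rise repeats each quarter.
    for q in range(1, 5):
        print(f"After {q} quarter(s): {price_index(0.50, q):.2f}x the starting price")
```

Four consecutive quarters at 50% would leave prices at about five times their starting level, which is why a sustained climb would show up in consumer device pricing.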
