The Memory Makers Take Control

High-bandwidth memory is sold out through 2026. Not scarce. Not tight. Sold out.

SK Group Chairman Chey Tae-won stated at Computex 2026 that the global shortage of high-bandwidth memory chips will persist through at least 2030, according to TechTimes. Days later, Nvidia CEO Jensen Huang walked to the SK Hynix booth on the Computex exhibition floor, picked up a marker, and wrote "Please Make More" on an HBM4E wafer on display. The world's most valuable company was publicly begging its supplier to move faster. The supplier's chairman had just said he needed until the end of the decade.

That exchange captures where AI infrastructure stands in mid-2026: demand so intense that even Nvidia (which controls an estimated 80-85 percent of the data center AI accelerator market by revenue, per Presenc AI) cannot secure enough of the memory its chips require. The bottleneck is not silicon logic or packaging capacity. It is memory bandwidth. And the three companies that control it (Samsung, SK Hynix, and Micron) are now the gatekeepers of the entire AI buildout.

Why Can't They Just Make More?

The economics are brutal. Micron disclosed in earnings materials that producing one unit of HBM3E consumes approximately three times the wafer capacity required to produce the same number of bits in DDR5, TechTimes reported. TrendForce puts the ratio even higher: one gigabyte of HBM requires the equivalent of four gigabytes of standard DRAM in wafer area. Every wafer diverted to AI accelerators is a wafer that cannot produce memory for laptops, smartphones, or cars.
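The wafer trade-off can be made concrete with a little arithmetic. The sketch below uses the two ratios quoted above (roughly 3x per Micron, 4x per TrendForce) to estimate how total bit output shrinks as wafer capacity shifts to HBM. The function name, the normalized baseline of 1.0, and the wafer shares are illustrative assumptions, not reported industry figures.

```python
def relative_bit_output(hbm_wafer_share: float, wafer_ratio: float) -> float:
    """Total memory bits produced, relative to an all-standard-DRAM baseline of 1.0.

    hbm_wafer_share: fraction of wafer capacity moved to HBM (0.0 to 1.0).
    wafer_ratio: wafer capacity HBM needs per bit, relative to standard DRAM.
    """
    if not 0.0 <= hbm_wafer_share <= 1.0:
        raise ValueError("hbm_wafer_share must be between 0 and 1")
    if wafer_ratio <= 0:
        raise ValueError("wafer_ratio must be positive")
    standard_bits = 1.0 - hbm_wafer_share      # wafers still making standard DRAM
    hbm_bits = hbm_wafer_share / wafer_ratio   # HBM wafers yield fewer bits each
    return standard_bits + hbm_bits


# Ratios quoted in the article; the wafer shares are hypothetical scenarios.
QUOTED_RATIOS = {"Micron, HBM3E vs DDR5": 3.0, "TrendForce, HBM vs standard DRAM": 4.0}
HYPOTHETICAL_SHARES = (0.1, 0.2, 0.3)

for source, ratio in QUOTED_RATIOS.items():
    for share in HYPOTHETICAL_SHARES:
        output = relative_bit_output(share, ratio)
        print(f"{source}: {share:.0%} of wafers to HBM -> {output:.1%} of baseline bits")
```

Under these assumptions, every increase in the HBM share lowers the total number of bits the industry ships, which is why consumer memory tightens even when fabs run at full capacity.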

The result is a zero-sum reallocation. Data centers now consume an estimated 70% of all memory chips produced worldwide, according to IDC, leaving consumer electronics manufacturers scrambling for scraps. TrendForce said it expects average DRAM prices to rise between 50% and 55% this quarter versus the fourth quarter of 2025, CNBC reported. TrendForce analyst Tom Hsu told CNBC that an increase of that size in memory prices was "unprecedented."

The supply cannot scale quickly. A single B300 GPU consumes 96 DRAM dies, and a fully configured DGX B300 system with eight GPUs requires 768 DRAM dies for its HBM modules alone, per Tech Insider. TSMC's advanced packaging capacity, the CoWoS process that bonds HBM stacks to GPU dies, remains a choke point as well. TSMC has been expanding CoWoS capacity from roughly 35,000 wafer starts per month in late 2024 toward a projected 120,000 to 130,000 per month by the end of 2026, TechTimes noted, but demand still outpaces it.
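The figures in this paragraph are easy to check and extend. The sketch below reproduces the 768-die count and works out the implied growth multiple for CoWoS capacity. The helper names and the 1,000-system fleet are hypothetical, chosen only to show how quickly die demand compounds.

```python
DIES_PER_GPU = 96        # DRAM dies per B300 GPU, as quoted
GPUS_PER_SYSTEM = 8      # GPUs in a fully configured DGX B300, as quoted


def dram_dies(systems: int,
              gpus_per_system: int = GPUS_PER_SYSTEM,
              dies_per_gpu: int = DIES_PER_GPU) -> int:
    """DRAM dies needed for the HBM in a given number of systems."""
    if systems < 0:
        raise ValueError("systems must be non-negative")
    return systems * gpus_per_system * dies_per_gpu


def growth_multiple(start: float, end: float) -> float:
    """How many times larger `end` is than `start`."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


print(dram_dies(1))        # one DGX B300: 8 * 96 = 768
print(dram_dies(1_000))    # hypothetical fleet of 1,000 systems

# CoWoS wafer starts per month: late 2024 versus projected end of 2026.
low = growth_multiple(35_000, 120_000)
high = growth_multiple(35_000, 130_000)
print(f"CoWoS capacity grows roughly {low:.1f}x to {high:.1f}x")
```

Even a capacity increase of more than threefold in two years leaves packaging behind demand, according to the reporting cited above.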

Who Wins When Memory Becomes the Constraint?

The memory manufacturers, obviously. Samsung, SK Hynix, and Micron are enjoying significantly higher margins as DRAM prices surge. SK Hynix, which has been the leading supplier of HBM chips, has seen its revenue from AI-related memory products more than triple since 2024, according to Tech Insider. SK Hynix is considering a U.S. listing as its stock price in South Korea surges, and in October, the company said it had secured demand for its entire 2026 RAM production capacity, CNBC reported.

But the winners extend beyond memory makers. SoftBank has announced a commitment to develop and operate 5 GW of AI data center capacity in France, representing an investment of up to €75 billion. The first phase comprises an initial €45 billion investment to deliver 3.1 GW of AI data center capacity in the Hauts-de-France region. SoftBank is betting that whoever controls the physical infrastructure (power, cooling, real estate) will capture value even if chip supply remains constrained. SoftBank founder and CEO Masayoshi Son said France's position as a major energy producer and exporter was "absolutely decisive" in the decision, Data Center Knowledge reported.

The losers are clear. Lenovo, Dell, HP, Acer and ASUS have warned clients of tougher conditions ahead, confirming price hikes of 15-20% and contract resets as an industry-wide response to memory shortages, IDC noted. Leaders at tech companies including Apple Inc., Alphabet Inc., and Tesla Inc. have been speaking about the impact of the shortage on profitability and even timelines for AI progress. Google DeepMind's Demis Hassabis called it a "choke point" for the industry. On Tesla's earnings call in late January, Chief Executive Officer Elon Musk even raised the idea of producing his own memory chips, Bloomberg reported.

Can Inference Save the Economics?

The industry's hope is that inference (running trained models to generate output) will prove less memory-hungry than training. The math is compelling. Inference is projected to reach 65% of AI compute by 2029 and to represent 80-90% of lifetime AI system costs, according to Introl. Training AI models is a cost center, while inference is a "profit center" that directly generates revenue, Data Center Knowledge noted.

But inference is not a panacea. OpenAI CEO Sam Altman noted that he had never seen usage grow this fast, openly stating that OpenAI's GPU resources are now fully saturated. As a result, large models like GPT-4.5 must be released in stages, initially limited to Pro users due to the sheer scale of required compute, per TSPA Semiconductor. The surge in generative AI applications (image generation, video synthesis, real-time agents) is pushing inference demand to levels that rival training.

AMD is gaining ground in inference workloads, where its MI300X chip's 192GB memory capacity versus the H100's 80GB offers an advantage. AMD's market share rose to approximately 5-7 percent on the strength of MI300X and MI325X inference adoption; Microsoft and Meta are the largest deployers, Presenc AI reported. But even AMD cannot escape the memory constraint. The same HBM shortage that limits Nvidia limits everyone.
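One way to see why per-chip memory capacity matters for inference is to count how many accelerators are needed just to hold a model's weights. The sketch below is a rough estimate under loud assumptions: a hypothetical 70-billion-parameter model, 16-bit weights (2 bytes each), 1 GB taken as one billion bytes, and no allowance for the KV cache, activations, or framework overhead, all of which raise real requirements. Only the 192GB and 80GB capacities come from the text above.

```python
import math


def accelerators_for_weights(params_billion: float,
                             bytes_per_param: float,
                             hbm_gb: float) -> int:
    """Minimum accelerators needed to hold the model weights alone.

    Ignores KV cache, activations, and runtime overhead (an assumption),
    so real deployments need more headroom than this lower bound.
    """
    if params_billion <= 0 or bytes_per_param <= 0 or hbm_gb <= 0:
        raise ValueError("all arguments must be positive")
    weight_gb = params_billion * bytes_per_param  # billions of params * bytes each
    return math.ceil(weight_gb / hbm_gb)


# Hypothetical model: 70 billion parameters at 2 bytes per weight = 140 GB.
MODEL_PARAMS_BILLION = 70
BYTES_PER_PARAM = 2

for chip, capacity_gb in {"H100": 80, "MI300X": 192}.items():
    count = accelerators_for_weights(MODEL_PARAMS_BILLION, BYTES_PER_PARAM, capacity_gb)
    print(f"{chip} ({capacity_gb} GB): at least {count} accelerator(s) for weights")
```

Under these assumptions the weights fit on a single 192GB part but must be split across two 80GB parts, which adds inter-chip communication to every request.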

What Changed This Week

SoftBank's France announcement shifted the narrative from chip supply to power supply. The €75 billion commitment—larger than any prior European AI infrastructure deal—signals that the constraint is no longer just semiconductors but the physical infrastructure to run them. At the same time, Computex 2026 made clear that memory supply will not catch up to demand before 2027 at the earliest, and possibly not until 2030. The industry is adjusting to a new reality: AI scaling is now gated by memory bandwidth, not logic performance.

What to Watch

TSMC's June 4 annual general meeting may provide updated guidance on CoWoS packaging capacity expansion timelines. Micron's fiscal Q3 2026 earnings, expected in late June, will offer the first detailed look at HBM pricing power and 2027 supply commitments. And watch whether Microsoft, Google, or Amazon announce their own memory fab investments—the ultimate signal that hyperscalers believe the shortage is structural, not cyclical.
