Nvidia's Jensen Huang walked onto the Computex show floor in Taipei this week, picked up a marker, and wrote three words on an HBM4E wafer: "Please Make More."
The world's most valuable chipmaker was publicly begging its top memory supplier to accelerate production. Hours earlier, SK Group Chairman Chey Tae-won had told the same audience that the global shortage of high-bandwidth memory would persist through 2030, according to TechTimes. The exchange captured the defining constraint of the AI buildout: not chips, not power, but memory. Specifically, the stacked DRAM modules that sit alongside every GPU and determine how much data an AI model can process at once.
HBM is sold out through 2026, with allocations for 2027 already being negotiated, and the real power in AI infrastructure is shifting to the memory makers: Samsung, SK Hynix, and Micron.
Speaking at Computex 2026, Chey attributed the shortage, which he expects to last through at least 2030, to AI systems that require far more wafer capacity per chip than conventional DRAM.
Why Is Memory the New Chokepoint?
Nvidia says its Blackwell platform lets organizations build and run real-time generative AI on trillion-parameter large language models at up to 25x lower cost and energy consumption than its predecessor.
The GB200 NVL72 provides up to a 30x performance increase over the same number of Nvidia H100 Tensor Core GPUs for LLM inference workloads. But those gains come with a catch: Blackwell's appetite for memory is staggering.
Nvidia's B300 GPU requires eight HBM stacks, each containing 12 individual DRAM dies, so a single B300 GPU consumes 96 DRAM dies. A fully configured DGX B300 system with eight GPUs requires 768 DRAM dies for the HBM alone, not counting system memory.
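The die count follows from simple multiplication. A minimal sketch of that arithmetic, using only the figures quoted above (eight HBM chips, or stacks, per GPU, 12 dies per stack, eight GPUs per system); the function and constant names are illustrative and do not come from Nvidia:

```python
# Figures as quoted in the article; names are hypothetical, for illustration only.
HBM_STACKS_PER_GPU = 8   # HBM stacks on one B300 GPU
DIES_PER_STACK = 12      # DRAM dies in each HBM stack
GPUS_PER_SYSTEM = 8      # GPUs in one DGX B300 system


def hbm_dies(gpus: int,
             stacks_per_gpu: int = HBM_STACKS_PER_GPU,
             dies_per_stack: int = DIES_PER_STACK) -> int:
    """Return the number of DRAM dies used by the HBM of `gpus` GPUs."""
    return gpus * stacks_per_gpu * dies_per_stack


print(hbm_dies(1))                # one GPU: 96
print(hbm_dies(GPUS_PER_SYSTEM))  # one eight-GPU system: 768
```

Scaling the same function to a rack or a data center is a matter of changing the GPU count, which is why even modest deployments translate into very large die orders.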
The economics are brutal. HBM commands significantly higher margins than standard DRAM modules used in consumer devices, and Samsung, SK Hynix, and Micron have all been aggressively converting production lines to HBM, as the revenue per wafer for HBM is estimated to be three to five times higher than conventional DDR5.
Voracious demand for HBM from hyperscalers such as Microsoft, Google, Meta and Amazon has forced the three biggest memory manufacturers to pivot their limited cleanroom space and capital expenditure toward higher-margin enterprise-grade components. Every wafer allocated to an HBM stack for an Nvidia GPU is a wafer denied to the LPDDR5X module of a mid-range smartphone or the SSD of a consumer laptop.
The result? TrendForce said it expects average DRAM memory prices to rise between 50% and 55% this quarter versus the fourth quarter of 2025, an increase that analyst Tom Hsu told CNBC was "unprecedented."
Lenovo, Dell, HP, Acer and ASUS have warned clients of tougher conditions ahead, confirming price hikes of 15-20% and contract resets as an industry-wide response.
Can AMD and Intel Break Nvidia's Grip?
AMD is making its move. AMD's data center revenue surged 34% quarter-over-quarter to $4.3 billion, with operating income up 793% year-over-year, reflecting rapid adoption of MI300 and anticipation for MI400 chips.
The latest MI350 series, launched in June 2025, features 288 GB of HBM3E memory and offers day-zero support for major AI frameworks, libraries, and cutting-edge models.
The company is positioning itself as the cost-effective alternative. The MI300X and MI350 series are widely seen as direct competitors to Nvidia's H100 and H200, often offering a lower cost per unit of memory bandwidth. For enterprise buyers, total cost of ownership may be more favorable with AMD, especially for long-term workloads that benefit from large HBM3/HBM3E memory pools.
Intel, meanwhile, is scrambling to stay relevant. Intel Xeon 6 was selected as the host CPU for Nvidia's DGX Rubin NVL8 systems, reinforcing Intel's continued role at the center of leading AI infrastructure deployments. But its Gaudi accelerator line remains a distant third. Nvidia's $20 billion licensing deal with Groq underscores the pivot to inference, while AMD has acquired Untether AI's engineering team, and Intel is pursuing a SambaNova acquisition reportedly valued at about $1.6 billion.
The real battleground is shifting from training to inference. Nvidia CEO Jensen Huang said inference already accounts for more than 40% of AI-related revenue and predicted that it is "about to go up by a billion times."
Inference is projected to reach 65% of AI compute by 2029 and to represent 80-90% of lifetime AI system costs.



