Qualcomm has agreed to pay $3.9 billion in stock for a company most developers have never heard of.
The chipmaker agreed to acquire Modular, an AI software firm, in an all-stock deal announced Wednesday, according to MarketScreener. Modular builds software that allows AI models to run across different hardware architectures, including chips from multiple vendors, without requiring developers to rewrite code for each processor, the company said. The timing is not subtle. The acquisition moves Qualcomm into territory long dominated by Nvidia's CUDA platform, which has cemented developer loyalty to Nvidia's hardware, Reuters reported. NewKerala likewise described the deal as giving Qualcomm a foothold in the AI software segment. For Qualcomm, buying Modular is a direct challenge to that lock-in.
The deal matters because it signals where the AI infrastructure fight is heading: not just faster chips, but the software layer that determines which chips developers actually use. Qualcomm CEO Cristiano Amon called the industry's shift toward "disaggregated, multi-vendor architectures" a moment that demands "a more open and modern software foundation," per Investing.com. Translation: Nvidia's moat is software, and Qualcomm just bought a shovel.
Can You Make AI Cheaper Without Making It Worse?
A different bet landed the same week. AI memory startup Engram raised $98 million from investors including General Catalyst, Kleiner Perkins, Sequoia, and OpenAI co-founder Andrej Karpathy, CNBC reported. The round valued the eight-month-old company at $600 million, according to Calcalist. Engram has 13 employees.
The pitch is straightforward. Engram claims its models can match or outperform those of frontier labs using up to 100 times fewer tokens, CNBC noted. Tokens are the billing unit for AI queries: the more you use, the more you pay. Engram counts Microsoft, Notion, and legal AI startup Harvey among its customers, the company said. For context, new and more sophisticated AI models are proving pricier than previous iterations, challenging the conventional view that greater scale would lead to lower costs, per CNBC.
Engram trains models to study an organization's world and anticipate its questions in advance, forming a compact, continuously improving memory unique to each customer, the company said in a statement. The neuroscience metaphor is deliberate. CEO Dan Biderman said his fascination with memory started in childhood, when his grandmother began losing her memory and he would try to prompt her to remember small details, an experience that shaped his academic path, according to Tech Startups.
The cost problem is real. Kleiner partner Leigh Marie Braswell described the market as facing an "explosion of data, explosion of cost," CNBC reported. Engram's answer is to stop treating every query like the first one. If the AI already knows your organization's workflows, it doesn't need to reread the same documents every time. That compression is where the 100x claim lives.
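To see what a 100x token reduction would mean on a bill, here is a rough back-of-the-envelope sketch. The price and token counts are assumptions chosen for illustration; they are not Engram's figures or any vendor's published rates, and 100x is the upper bound of the company's claim, not a measured result.

```python
def query_cost(tokens: int, price_per_million_tokens: float) -> float:
    """Dollar cost of one query that consumes `tokens` tokens."""
    return tokens / 1_000_000 * price_per_million_tokens


# Hypothetical inputs, for illustration only.
PRICE_PER_MILLION = 10.0   # assumed dollars per million tokens
BASELINE_TOKENS = 200_000  # assumed tokens when documents are reread on every query
REDUCTION_FACTOR = 100     # upper bound of the "up to 100x" claim

baseline = query_cost(BASELINE_TOKENS, PRICE_PER_MILLION)
compact = query_cost(BASELINE_TOKENS // REDUCTION_FACTOR, PRICE_PER_MILLION)

print(f"baseline: ${baseline:.2f} per query")
print(f"compact:  ${compact:.2f} per query")
```

Because cost scales linearly with tokens in this simple model, any reduction in tokens shows up as the same proportional reduction in spend. Real pricing often differs for input and output tokens, which this sketch ignores.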
Why GitHub Rewired Copilot's Pricing
While startups chase efficiency, GitHub made a structural change that reshapes how millions of developers pay for AI. All GitHub Copilot plans transitioned to usage-based billing on June 1, 2026, replacing premium request counts with monthly allotments of GitHub AI Credits and the option for paid plans to purchase additional usage, the company announced. GitHub said the change reflects that Copilot now powers far more complex, agentic workflows that consume far more compute, and is designed to deliver a more sustainable and reliable product experience by aligning pricing to actual usage and costs, according to a GitHub community discussion.
The shift is more than accounting. Subscribers receive a monthly included allocation of GitHub AI Credits, and usage beyond that is billed at the end of the month, Developers Digest reported. Copilot code review moved to an agentic architecture that runs on GitHub Actions, and starting June 1, reviewing a pull request with Copilot counts against included Actions minutes, GitHub noted.
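The included-allocation-plus-overage structure described above can be sketched in a few lines. This is a generic model of metered billing, not GitHub's actual formula: the function name, the base fee, the credit allotment, and the per-credit overage price are all hypothetical values picked for illustration.

```python
def monthly_bill(base_fee: float, included_credits: int,
                 used_credits: int, overage_price: float) -> float:
    """Base subscription fee plus a charge for credits used beyond the allotment."""
    overage_credits = max(0, used_credits - included_credits)
    return base_fee + overage_credits * overage_price


# Hypothetical plan: $10 base fee, 1,000 included credits, $0.01 per extra credit.
light_user = monthly_bill(10.0, 1_000, 800, 0.01)    # stays within the allotment
heavy_user = monthly_bill(10.0, 1_000, 1_500, 0.01)  # 500 credits over the allotment

print(f"light user: ${light_user:.2f}")
print(f"heavy user: ${heavy_user:.2f}")
```

Under these assumed numbers the light user pays only the base fee, while the heavy user's bill grows with every credit past the allotment. That is the reason consumption monitoring starts to matter once billing is metered.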
For developers on individual plans, the math changed overnight. The billing change fundamentally alters the economics of using Copilot: developers who previously used it freely now need to monitor consumption, and teams need to forecast AI costs the same way they forecast cloud compute, with heavy users potentially seeing effective costs increase significantly, according to ClickUp's analysis. GitHub reopened sign-ups for Copilot Student, Pro, Pro+, and Max plans gradually starting June 16, 2026, after pausing new individual subscriptions, the company said.
The broader pattern is clear: AI coding tools are moving from flat subscriptions to metered usage. That aligns incentives — vendors get paid for what you actually use — but it also means developers need to think about token budgets the way they think about cloud spend.



