What NVDA Investors Need To Know About Nvidia's Biggest Challenge Yet
Daragh Thomas
Wed, March 18, 2026 at 3:31 PM UTC
Nvidia Corporation has built a $4.4 trillion empire selling chips for training AI models, but the AI business, long defined by massive training runs, may soon no longer need chips in the same quantities.
Hyperscalers are still spending heavily on training, but the priority has shifted to inference, the real-time computing that actually delivers AI to end users.
Jensen Huang has been calling 2026 the year inference takes over, and the numbers back him up.
OpenAI and Anthropic are producing thousands of times more inference tokens than a year ago as agentic AI workloads explode.
But Nvidia’s bestselling Grace Blackwell servers may not be the right hardware for the job.
Users say the systems consume too much energy and lack the memory for efficient inference.
‘There Is No Moat In Inference’
Cerebras CEO Andrew Feldman is leading the charge.
He told The Wall Street Journal that Nvidia’s proprietary CUDA software ecosystem, the moat that locked in developers for training, simply doesn’t apply to inference.
Cerebras recently signed Amazon Web Services as a customer and landed a deal with OpenAI reportedly worth over $10 billion for 750 megawatts of compute through 2028.
In a recent Benzinga interview, Feldman called the $20 billion Groq deal a strategic admission that GPU dominance is ending.
Cerebras shelved a previous IPO attempt in October 2025 and has since refiled confidentially, with a public listing possible as early as April.
Nvidia Isn’t Standing Still
At GTC on Monday, Huang unveiled Nvidia’s answer to the inference threat.
The Groq 3 LPU is a new type of chip built specifically for inference, not repurposed from training hardware.
It came out of the $20 billion licensing deal Nvidia struck with AI chip startup Groq in December, and is designed to process user queries at higher speeds and lower cost per token than Nvidia’s existing GPUs.
Meta Platforms is already buying in.
The company announced a long-term infrastructure partnership that includes deploying thousands of Nvidia's Vera CPUs without any GPUs attached, a first for Meta.
Deepwater’s Gene Munster recently told Benzinga he expects Nvidia’s revenue growth in calendar 2027 to hit 40%, well above the Street’s 28% estimate, driven largely by inference demand.
Nvidia CFO Colette Kress said the company remains confident. “Right now, we’re the king of inference.” Bettors agree, for now.
On Polymarket, traders still give Nvidia a 70% chance of remaining the world’s largest company at year-end.
This article What NVDA Investors Need To Know About Nvidia's Biggest Challenge Yet originally appeared on Benzinga.com
© 2026 Benzinga.com. Benzinga does not provide investment advice. All rights reserved.
Source: “AOL Money”