Chips 2 min read

NVIDIA's B200 and the insatiable demand for

Jensen Huang held up the B200 at GTC and the audience reacted the way audiences react at concerts when the lights come up on the headliner.

Cheering. For a chip.

I was there. I cheered too. I’m not proud of it. But 208 billion transistors and up to 20 petaflops of FP4 compute in a single package? That’s worth a reaction.

The numbers

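The B200 is NVIDIA’s next-generation GPU for AI training and inference. Compared to the H100 (which is already the most sought-after chip in the world), NVIDIA claims up to a 30x improvement in inference performance for large language models. That headline figure is NVIDIA’s own, and it was quoted for the rack-scale GB200 NVL72 system rather than for a single chip. Still: not 30%. 30x.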

The improvement comes from several architectural changes: faster memory, better interconnects, and native support for lower-precision formats (FP4 and FP6) that are accurate enough for many inference workloads but need far less memory and bandwidth per value. The chip can run a GPT-4-class model at speeds that make real-time interaction feel instantaneous.
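To put the lower-precision point in numbers, here is a rough back-of-the-envelope sketch. It is not taken from NVIDIA’s documentation. The 70-billion-parameter model size is a hypothetical example, and the function counts weights only, ignoring the KV cache, activations, and the scaling metadata that real low-precision schemes carry.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical model size, chosen only for illustration.
PARAMS = 70e9

# Bits per weight for each format; FP16 and FP8 are included for comparison.
for name, bits in [("FP16", 16), ("FP8", 8), ("FP6", 6), ("FP4", 4)]:
    print(f"{name}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions, the same weights that take 140 GB in FP16 take 35 GB in FP4. Every byte that doesn’t have to be stored is also a byte that doesn’t have to be moved, which is a big part of why lower precision helps inference.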

208 billion transistors, spread across two dies that operate as one GPU. Built on TSMC’s custom 4NP process, a 4nm-class node. Packaged using CoWoS (Chip-on-Wafer-on-Substrate) advanced packaging. The physical artifact is a marvel of engineering.

The demand problem

Every B200 NVIDIA manufactures is already sold. Literally pre-sold. The waiting lists for NVIDIA’s AI chips are measured in quarters, not weeks. Microsoft, Google, Amazon, Meta, Oracle, and every other hyperscaler on Earth are fighting for allocation.

The demand for AI compute is growing faster than the supply can ramp. Training runs for frontier models require thousands of GPUs running for months. Inference for billions of users requires even more. Every chatbot query, every AI-generated image, every coding assistant suggestion burns GPU cycles.

SemiAnalysis estimates that the total demand for AI training compute in 2025 will exceed available supply by roughly 2x. That means even with every chip manufacturer running at full capacity, there aren’t enough GPUs to train all the models that companies want to train.

This is a new kind of shortage. Not a supply chain disruption (like the 2021 chip shortage caused by COVID). A structural demand overshoot. The appetite for AI compute is growing exponentially while manufacturing capacity grows linearly.
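Here is a toy model of what exponential-versus-linear looks like. The numbers are invented purely for illustration: demand doubles every year, supply adds a fixed 50 units a year, and both start at 100. None of this is a forecast, and the function names and parameters are my own.

```python
def demand(year: int, start: float = 100.0, growth: float = 2.0) -> float:
    """Exponential growth: multiply by `growth` every year (assumed rate)."""
    return start * growth ** year


def supply(year: int, start: float = 100.0, added_per_year: float = 50.0) -> float:
    """Linear growth: add a fixed amount of capacity every year (assumed rate)."""
    return start + added_per_year * year


for year in range(6):
    d, s = demand(year), supply(year)
    print(f"year {year}: demand={d:.0f} supply={s:.0f} ratio={d / s:.2f}")
```

With these made-up rates, demand and supply start level, demand is twice supply by year 2, and it is more than nine times supply by year 5. The exact figures don’t matter. The shape does: a linear ramp never catches an exponential one.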

The GTC experience

GTC felt like a technology conference and a revival meeting combined. Jensen Huang’s keynote was two hours of product announcements delivered with the energy of someone who genuinely believes he’s building the future (because he probably is). The audience was engineers, researchers, and executives from every major technology company.

The leather jacket. The way he holds up the chip and turns it slowly, letting the light catch the surface. The cadence of his speech, building to each announcement like movements in a symphony. Say what you want about Jensen, the man knows how to present silicon as something sacred.

And maybe it is, in a way. These chips are the substrate on which the most important technology of our era runs. Every AI breakthrough, every language model, every autonomous vehicle perception system, runs on NVIDIA hardware. The chip isn’t just a product. It’s the foundation.

What concerns me

The concentration. NVIDIA has approximately 80% market share in AI training chips. That’s not a market. That’s a monopoly with competition at the edges.

AMD is making progress. Their MI300 series is competitive on some workloads. Intel’s Gaudi chips are finding niches. Custom silicon from Google (TPUs) and Amazon (Trainium) serves those companies’ internal needs.

But for the broader market, for every AI startup, every research lab, every enterprise that wants to run AI workloads, the only real option is NVIDIA. And NVIDIA’s prices reflect that: the B200 is expensive and the demand ensures they never have to discount.

The risk is dependency. The entire AI industry depends on one company’s chip roadmap. If NVIDIA stumbles, if a manufacturing defect delays the next generation, if their software stack (CUDA) fails to keep pace, the impact ripples through every AI project on Earth.

I’d feel better if there were three companies competing at the frontier instead of one. Competition makes everything better: prices, performance, innovation, reliability. Monopolies make everything fragile.

NVIDIA knows this. Jensen acknowledges the competition and encourages it. But knowing the risk and mitigating it are different things. And right now, every AI company’s roadmap has “NVIDIA” written at the foundation layer.

The B200 is extraordinary. The demand is insatiable. The concentration is concerning. All three of these things are true at the same time.


astro

Thinking about AI, robots, space, and the future. Writing it down so I don't forget.