NVIDIA GTC 2026: $1 Trillion in Orders, Vera Rubin Ships H2, and the Feynman GPU Preview
Jensen Huang's keynote revealed $1T in Blackwell and Vera Rubin orders, 10x efficiency gains, Kyber rack architecture, and a 2028 Feynman GPU preview.
Jeff Brook
AI Researcher — Founder, AI Daily News
Jensen Huang's GTC 2026 keynote laid out NVIDIA's infrastructure roadmap through 2028: $1 trillion in combined Blackwell and Vera Rubin orders, a new rack-scale architecture called Kyber, a preview of the Feynman GPU generation, and commercial licensing for the GR00T N1.7 humanoid robotics platform.
For practitioners, the numbers matter less than the trajectory. Every announcement points in the same direction: AI compute is becoming a utility at data centre scale, and the unit economics are about to shift dramatically.
What does Vera Rubin change when it ships?
Vera Rubin is NVIDIA's next-generation GPU architecture, scheduled for the second half of 2026. The headline figure is a 10x performance-per-watt improvement over Blackwell. That is not 10x raw performance. It is 10x efficiency, meaning the same inference workload consumes one-tenth the power.
For teams running production AI workloads, power consumption is increasingly the binding constraint. Data centres are power-limited, not space-limited. A 10x efficiency gain means either 10x more inference from the same power envelope, or the same inference at dramatically lower operating cost.
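The two readings of a 10x efficiency gain can be checked with a few lines of arithmetic. The sketch below is illustrative only: the baseline figure of 2 tokens per joule, the 1M tokens/s workload, and the 500 kW envelope are hypothetical values chosen to make the numbers easy to follow, not published specifications for Blackwell or Vera Rubin.

```python
# Illustrative numbers only: the baseline efficiency, workload, and power
# envelope are hypothetical, not published Blackwell or Vera Rubin figures.

def power_needed_kw(tokens_per_second: float, tokens_per_joule: float) -> float:
    """Power draw (kW) required to sustain a given token throughput."""
    watts = tokens_per_second / tokens_per_joule  # (tokens/s) / (tokens/J) = J/s = W
    return watts / 1000.0


def throughput_at_power(power_kw: float, tokens_per_joule: float) -> float:
    """Token throughput (tokens/s) achievable inside a fixed power envelope."""
    return power_kw * 1000.0 * tokens_per_joule


BASELINE_TOKENS_PER_JOULE = 2.0  # hypothetical baseline efficiency
EFFICIENCY_GAIN = 10.0           # the headline performance-per-watt figure
improved_tokens_per_joule = BASELINE_TOKENS_PER_JOULE * EFFICIENCY_GAIN

# Reading 1: same workload, less power.
workload_tps = 1_000_000.0
before_kw = power_needed_kw(workload_tps, BASELINE_TOKENS_PER_JOULE)
after_kw = power_needed_kw(workload_tps, improved_tokens_per_joule)

# Reading 2: same power envelope, more throughput.
envelope_kw = 500.0
before_tps = throughput_at_power(envelope_kw, BASELINE_TOKENS_PER_JOULE)
after_tps = throughput_at_power(envelope_kw, improved_tokens_per_joule)

print(f"Same workload: {before_kw:.0f} kW -> {after_kw:.0f} kW")
print(f"Same envelope: {before_tps:,.0f} tokens/s -> {after_tps:,.0f} tokens/s")
```

Either way the ratio is the same factor of ten. Which reading applies depends on whether a site is constrained by demand or by its power budget.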
The practical impact flows through to API pricing. When the energy needed to serve a given workload drops by an order of magnitude, the providers who deploy Vera Rubin first can undercut competitors on price while maintaining margins. Expect a pricing war in the inference market starting late 2026.
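How much of an efficiency gain reaches the per-token price depends on how much of a provider's cost actually scales with power. The sketch below is a simplified model under stated assumptions: `power_scaled_share` is a hypothetical parameter for the fraction of today's cost that scales with energy used (electricity, cooling, power-limited facility capacity), and everything else (hardware, staff, networking) is assumed to stay flat. The $1.00 starting cost is a placeholder, not a real price.

```python
# All figures are hypothetical and chosen only to illustrate the arithmetic.

def cost_after_efficiency_gain(
    cost_per_million_tokens: float,
    power_scaled_share: float,
    efficiency_gain: float,
) -> float:
    """Per-million-token cost after a performance-per-watt improvement.

    power_scaled_share is the assumed fraction of today's cost that scales
    with energy used. The remaining cost is assumed to be unchanged.
    """
    if not 0.0 <= power_scaled_share <= 1.0:
        raise ValueError("power_scaled_share must be between 0 and 1")
    if efficiency_gain <= 0:
        raise ValueError("efficiency_gain must be positive")
    power_part = cost_per_million_tokens * power_scaled_share
    fixed_part = cost_per_million_tokens - power_part
    return power_part / efficiency_gain + fixed_part


current_cost = 1.00  # hypothetical dollars per million tokens
scenarios = {
    share: cost_after_efficiency_gain(current_cost, share, 10.0)
    for share in (0.3, 0.6, 1.0)
}
for share, new_cost in scenarios.items():
    print(f"power-scaled share {share:.0%}: ${current_cost:.2f} -> ${new_cost:.2f}")
```

Under these assumptions the full tenfold drop only appears when the entire cost scales with power. At lower shares the saving is smaller but still large enough to move pricing.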
What is the Kyber rack architecture?
Kyber represents a shift from thinking about GPUs as individual cards to thinking about them as rack-scale systems. Instead of buying servers with GPUs installed, Kyber treats an entire rack as a single compute unit with unified memory, networking, and power management.
This matters for large model training and inference because it targets the communication bottlenecks between individual GPUs. Current architectures lose significant performance to inter-GPU data transfer, and Kyber's rack-scale design reduces that overhead by treating the rack as one coherent system.
For most practitioners, Kyber is not something you will buy directly. Cloud providers and large enterprises will deploy Kyber racks, and the benefits will surface as higher throughput and lower latency on hosted inference services. But understanding the architecture matters for capacity planning — rack-scale compute changes the economics of model serving in ways that affect build-vs-buy decisions.
What does $1 trillion in orders signal?
The $1 trillion figure across Blackwell and Vera Rubin orders represents commitments from cloud providers, enterprises, and sovereign AI initiatives. It is a demand signal, not revenue — these are orders that will be fulfilled over 2026 and 2027.
The scale tells us three things:
Compute demand is not peaking. Despite periodic claims that AI scaling is hitting limits, the companies actually deploying AI infrastructure are ordering more compute, not less. The demand is driven by inference at scale — running agents, serving enterprise workloads, powering consumer products — not just training.
Sovereign AI is a real market. A meaningful portion of these orders comes from governments and regional cloud providers building domestic AI infrastructure. The geopolitical dimension of AI compute — countries ensuring they have independent access to frontier capabilities — is now a major demand driver.
The capex cycle has years to run. Teams building AI products can plan on compute availability improving and costs declining through at least 2028. The infrastructure investment pipeline is deep enough to sustain continued scaling.
What is Feynman and why preview it now?
Feynman is NVIDIA's GPU architecture planned for 2028, two generations ahead of the current Blackwell generation. Previewing it at GTC 2026 serves a strategic purpose: it signals to customers and competitors that NVIDIA's roadmap extends beyond the current generation, and it discourages customers from holding off on purchases while they wait for a competitor to catch up.
No detailed specifications were released, which is standard for a two-year-out preview. The announcement is a commitment signal, not a product specification. It tells the market that the pace of architectural improvement will continue, which influences purchasing decisions today — customers are more willing to commit to Blackwell and Vera Rubin knowing that the upgrade path is defined.
What does GR00T N1.7 commercial licensing mean?
GR00T is NVIDIA's foundation model platform for humanoid robots. The N1.7 release comes with commercial licensing terms, meaning companies building robot products can deploy GR00T-based control systems without negotiating custom agreements.
This is NVIDIA's bet that the robotics market is about to follow the same trajectory as the AI software market: foundation models provide the base capability, and companies compete on application-specific fine-tuning and hardware integration. Commercial licensing lowers the barrier to entry for robotics startups and established manufacturers alike.
What should practitioners take away?
The GTC 2026 announcements collectively say one thing: the infrastructure layer beneath AI is about to get dramatically cheaper, faster, and more efficient. Teams making architecture decisions today should factor in that inference costs will drop significantly in 2027 as Vera Rubin deploys. Designs that are cost-prohibitive at current per-token pricing may become viable within 18 months. Plan for abundance, not scarcity.
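One way to pressure-test the 18-month framing for a specific design is to ask how long it would take for per-token cost to reach the level at which the design pays for itself. The sketch below assumes a steady compound annual decline in cost, which is a simplification, and every number in it is a hypothetical placeholder rather than a forecast. The function name `months_until_viable` is illustrative.

```python
import math


def months_until_viable(
    current_cost: float,
    target_cost: float,
    annual_decline: float,
) -> float:
    """Months until cost falls to target_cost, assuming a steady compound decline.

    annual_decline is the assumed fraction by which cost falls each year
    (0.5 means cost halves every 12 months).
    """
    if current_cost <= 0 or target_cost <= 0:
        raise ValueError("costs must be positive")
    if not 0.0 < annual_decline < 1.0:
        raise ValueError("annual_decline must be between 0 and 1 (exclusive)")
    if current_cost <= target_cost:
        return 0.0  # already viable at today's cost
    # Solve current_cost * (1 - annual_decline) ** years == target_cost for years.
    years = math.log(target_cost / current_cost) / math.log(1.0 - annual_decline)
    return 12.0 * years


# Hypothetical design: costs $4.00 per million tokens today, viable at $1.00.
fast = months_until_viable(4.0, 1.0, annual_decline=0.75)
slow = months_until_viable(4.0, 1.0, annual_decline=0.50)

for label, months in (("75% annual decline", fast), ("50% annual decline", slow)):
    verdict = "within" if months <= 18 else "beyond"
    print(f"{label}: about {months:.0f} months, {verdict} an 18-month horizon")
```

The result is sensitive to the assumed rate of decline, so it is worth running a design against a pessimistic rate as well as an optimistic one before committing to it.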