The Real Winners of the AI Boom Aren't the AI Companies: They're the Infrastructure Layer

Every gold rush makes millionaires out of shovel sellers. The 2024–2026 AI supercycle is following the same pattern — while OpenAI, Anthropic, and Google argue over which model is smartest, the companies selling GPUs, networking, power, cooling, and cloud infrastructure are posting record revenue with fat margins. Here's the infrastructure layer that's actually printing money.

Nvidia's Sustained Dominance

Nvidia's data center revenue hit $35.6B in a single quarter (Q4 FY2025), a number that would have seemed fictional five years ago. The transition from H100/H200 to the Blackwell architecture is well underway, with GB200 NVL72 rack systems commanding $3M+ per rack and still backordered by quarters.

The key engineering leap in Blackwell is NVLink 5, which delivers 1.8 TB/s GPU-to-GPU bandwidth. This allows a 72-GPU inference cluster to behave like a single massive accelerator — critical for large-model serving where inter-GPU communication latency was previously a hard wall. Demand continues to exceed supply by a wide margin, giving Nvidia extraordinary pricing power.

AMD's MI300X is a genuine competitor — it has cleared a $5B+ annual revenue run rate and is winning meaningful deployment at hyperscalers and cloud providers. But Nvidia's CUDA ecosystem, built over 15 years of developer tooling, framework integration, and software libraries, remains the primary moat. Switching from CUDA to ROCm requires real engineering investment, and most AI teams aren't making that trade unless the cost savings are overwhelming.

The Networking Play — Ethernet vs InfiniBand

Networking is the invisible bottleneck in AI cluster builds. InfiniBand, originally developed for HPC and now controlled by Nvidia (via the 2020 Mellanox acquisition), dominates high-performance AI training clusters. Its low-latency, high-throughput fabric is purpose-built for the all-reduce operations that distributed training depends on.
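To make the all-reduce step concrete, here is a minimal sketch of one common variant, the ring all-reduce, simulated in plain Python. It is an illustration under stated assumptions, not how any particular InfiniBand or Ethernet stack implements it: the function name ring_all_reduce and the toy gradients are hypothetical, the "workers" are just lists in one process, and each buffer length must be divisible by the worker count.

```python
def ring_all_reduce(buffers):
    """Sum equal-length lists across simulated workers arranged in a ring.

    Returns (per-worker results, elements sent by each worker).
    Hypothetical helper for illustration only.
    """
    n = len(buffers)
    size = len(buffers[0])
    if any(len(b) != size for b in buffers) or size % n != 0:
        raise ValueError("buffers must share a length divisible by worker count")
    chunk = size // n
    data = [list(b) for b in buffers]  # work on copies
    sent = 0

    def span(c):
        # Index range covered by chunk c.
        return range(c * chunk, (c + 1) * chunk)

    def exchange(chunk_for_worker, combine):
        # Snapshot payloads first so all sends in a step are simultaneous.
        outgoing = []
        for i in range(n):
            c = chunk_for_worker(i)
            outgoing.append((c, [data[i][k] for k in span(c)]))
        for i, (c, payload) in enumerate(outgoing):
            dst = (i + 1) % n  # each worker sends to its right-hand neighbour
            for k, v in zip(span(c), payload):
                data[dst][k] = combine(data[dst][k], v)

    # Phase 1, reduce-scatter: after n-1 steps, worker i owns the full sum
    # of chunk (i + 1) % n.
    for step in range(n - 1):
        exchange(lambda i: (i - step) % n, lambda old, new: old + new)
        sent += chunk

    # Phase 2, all-gather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        exchange(lambda i: (i + 1 - step) % n, lambda old, new: new)
        sent += chunk

    return data, sent


# Toy gradients for three workers (made-up numbers).
grads = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
reduced, sent_per_worker = ring_all_reduce(grads)
print(reduced[0])        # [111, 222, 333]
print(sent_per_worker)   # 4
```

Each worker sends 2(N-1)/N times the buffer size no matter how many workers join the ring, but it does so in 2(N-1) sequential steps. That is why per-hop latency and sustained link bandwidth, the properties the fabric is sold on, dominate at cluster scale.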

But hyperscalers are pushing back. Google, Microsoft, and Meta are building out Ethernet-based AI networks using the Ultra Ethernet Consortium (UEC) specification — a collaborative effort to bring InfiniBand-level performance to standard Ethernet at lower cost and without vendor lock-in. This creates a major opportunity for Arista Networks (high-radix switches), Broadcom (Tomahawk 5 ASIC, which pushes 51.2 Tbps per chip), and Cisco.

Broadcom has pointed to a serviceable market of $60B or more for its AI silicon (custom accelerators plus networking) in fiscal 2027, a figure that reflects both organic growth and the hyperscaler Ethernet transition. Custom silicon is accelerating the same trend: Google's TPUs, AWS Trainium 2, and Microsoft's Maia 100 are all reducing dependence on Nvidia for training workloads, while funneling spend toward their own silicon and the networking vendors that connect it.

Power and Cooling — The Overlooked Bottleneck

A single GB200 NVL72 rack draws 120 kW of power. A 1,000-GPU cluster sustains 1.67 MW continuously — roughly the power draw of 1,400 average US homes, running 24/7. At that density, the constraint isn't GPUs anymore. It's power delivery and thermal management.
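The arithmetic behind those figures can be checked with a short sketch. The 120 kW per rack and 72 GPUs per rack come from the text; the 1.2 kW average household draw is an assumption (roughly 10,500 kWh per year), and the function names are hypothetical.

```python
import math

RACK_POWER_KW = 120.0   # per GB200 NVL72 rack, from the text
GPUS_PER_RACK = 72      # NVL72 means 72 GPUs per rack
AVG_HOME_KW = 1.2       # assumption: ~10,500 kWh/year per US home


def cluster_power_kw(num_gpus: int) -> float:
    """Pro-rata power draw, allowing a partially filled last rack."""
    return num_gpus / GPUS_PER_RACK * RACK_POWER_KW


def racks_needed(num_gpus: int) -> int:
    """Whole racks required to house num_gpus."""
    return math.ceil(num_gpus / GPUS_PER_RACK)


def homes_equivalent(power_kw: float) -> float:
    """Number of average homes drawing the same continuous power."""
    return power_kw / AVG_HOME_KW


power_kw = cluster_power_kw(1000)
print(f"{power_kw / 1000:.2f} MW")        # 1.67 MW
print(racks_needed(1000))                 # 14
print(round(homes_equivalent(power_kw)))  # 1389
```

The estimate covers IT load only. Cooling and power-conversion overhead would push the facility-level draw higher.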

Traditional air-cooled data centers top out around 20–30 kW per rack. AI-optimized facilities need direct liquid cooling (DLC): coolant pipes run directly to the server chassis and remove heat at the source. Vertiv is one of the primary beneficiaries, supplying liquid cooling systems and precision cooling infrastructure globally. Eaton provides uninterruptible power supplies (UPSs) and power distribution units (PDUs) at the rack and row level. Bloom Energy's on-site fuel cells are increasingly deployed to supplement grid power at AI-scale facilities.

Data center REITs are benefiting from the structural supply shortage. Equinix and Digital Realty are building AI-optimized campuses from the ground up — designed for 50-100 kW per rack densities with DLC baked in from the foundation. The key dynamic: a new 100 MW AI-ready data center takes 18–24 months to permit, design, and construct. That backlog creates significant pricing power for existing operators with available capacity today.

The Cloud Hyperscalers' Capex Surge

The scale of hyperscaler spending is difficult to overstate. Microsoft committed to roughly $80B in capital expenditure for its fiscal 2025, primarily for AI infrastructure. Google guided to $75B for 2025. Amazon's figure is the largest at roughly $105B. These aren't marketing commitments; they show up in quarterly earnings as concrete construction and equipment spend.

The money flows downstream to a concentrated set of suppliers. TSMC manufactures the H20 and B200 chips and handles the advanced CoWoS (Chip-on-Wafer-on-Substrate) packaging that places HBM stacks alongside the GPU die on a shared silicon interposer. CoWoS capacity has been a reported bottleneck on GB200 production: TSMC has been operating at full utilization and expanding capacity at a pace that still trails demand.

SK hynix and Samsung supply HBM3 and HBM3E, the high-bandwidth memory that makes modern AI accelerators possible. ASML provides the EUV lithography machines without which cutting-edge chips cannot be manufactured. The entire supply chain is running hot — and because each link takes years to expand, the pricing power of constrained suppliers will persist well into 2027.

The Startup Layer — Infrastructure Picks and Shovels

Below the hyperscaler tier, a wave of infrastructure startups is capturing the demand that Amazon, Microsoft, and Google can't or won't serve:

CoreWeave: The most-watched GPU cloud startup. Went public in March 2025 at roughly a $23B valuation, shortly after signing an $11.9B compute contract with OpenAI. Built on a fleet of Nvidia H100s rented to AI companies at premium rates. Profitable on a per-GPU basis and expanding aggressively into Blackwell hardware.
Lambda Labs: AI-focused GPU cloud with $320M raised. Targets researchers and mid-market AI teams who can't get hyperscaler quotas — a real problem given AWS and Azure waitlists.
Together AI: Inference API startup specializing in optimized multi-model serving. Offers access to open-weight models (Llama, Mistral, etc.) at competitive per-token pricing with a focus on throughput.
Modal: Serverless GPU compute for developers. Abstracts away cluster management — you write Python, Modal handles provisioning, scaling, and billing per second of actual GPU use.
Groq: Built the LPU (Language Processing Unit), a purpose-designed chip for inference. Claims 500+ tokens/second on Llama-class models — significantly faster than GPU-based inference at equivalent cost for certain workloads.
Cerebras: Wafer-scale chip architecture that turns an entire silicon wafer into a single processor. Recently filed for an IPO. Strong positioning for training workloads where model size exceeds single-GPU memory limits.

The Valuation Math

Infrastructure companies in the AI cycle are trading at revenue multiples 2–3x higher than historical software benchmarks — and for defensible reasons. AI infrastructure is scarce (supply-constrained), capital-intensive (high barriers to entry), and sticky (switching costs are real). These are the conditions that justify premium multiples.

CoreWeave's $23B valuation on approximately $4B ARR implies a 5–6x revenue multiple. That sounds rich until you compare it to AWS, which is not separately listed but carries an implied multiple of roughly 7x on a more diversified, mature business. The infrastructure layer may actually be better positioned than the model layer over a 3–5 year horizon: models commoditize as open-weight alternatives close the gap, but compute doesn't. The cost of a GPU-hour doesn't fall just because a new LLM arrives.
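The multiple quoted above is simple division, sketched below with the article's own round numbers ($23B valuation, about $4B ARR, and the roughly 7x figure used for AWS). The function names are hypothetical, and the inputs are the approximations from the text, not audited figures.

```python
def revenue_multiple(valuation_usd_b: float, annual_revenue_usd_b: float) -> float:
    """Valuation divided by annual revenue, both in billions of USD."""
    if annual_revenue_usd_b <= 0:
        raise ValueError("annual revenue must be positive")
    return valuation_usd_b / annual_revenue_usd_b


def implied_valuation(annual_revenue_usd_b: float, multiple: float) -> float:
    """Valuation implied by applying a revenue multiple, in billions of USD."""
    return annual_revenue_usd_b * multiple


coreweave_multiple = revenue_multiple(23.0, 4.0)
print(f"{coreweave_multiple:.2f}x")      # 5.75x

# What the same ARR would be worth at the ~7x comparison multiple.
at_comparison_multiple = implied_valuation(4.0, 7.0)
print(f"${at_comparison_multiple:.0f}B")  # $28B
```

On these inputs, the same revenue valued at the comparison multiple would be about $28B against $23B, which is the gap the comparison rests on.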

Conclusion

The AI boom is real, and the capex wave is just beginning — hyperscaler spend is accelerating, not plateauing. But the safest bets in this cycle aren't on which LLM wins the next benchmark. They're on the companies that get paid regardless of who wins.

Power infrastructure, AI networking ASICs, custom silicon, HBM memory, EUV lithography, and specialized GPU clouds all benefit from the capex wave whether GPT-5, Claude 4, or Gemini Ultra dominates in 2027. The model companies are burning capital to differentiate. The infrastructure layer is collecting rent.