9
STRUCTURAL SHIFT

The Scale-Up Fabric Wars

The fight over how GPUs talk to each other inside a pod is the single most important standards war in AI infrastructure.

Status: UALink 2.0 spec released April 2026 → First products Q4 2026 → Structural by 2028
1

What's Technically Happening

AI training requires two distinct layers of networking. "Scale-up" is the intra-pod fabric connecting GPUs that must share memory and synchronize state at extremely low latency — typically 16 to 1,024 accelerators operating as one logical unit. "Scale-out" is the inter-pod fabric connecting many pods into a full cluster, typically Ethernet-based.
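The toy model below shows why the two layers are engineered so differently. All bandwidth and latency figures in it are illustrative assumptions, not vendor specs: it estimates the time for a ring all-reduce over one gradient tensor at scale-up-like speeds versus scale-out-like speeds.

```python
# Toy model: time to all-reduce one gradient tensor at each network layer.
# Every bandwidth/latency figure below is an illustrative assumption.

def allreduce_time_s(tensor_bytes, n_devices, link_bw_bytes_s, link_latency_s):
    """Ring all-reduce: roughly 2*(n-1)/n of the tensor crosses each link,
    plus 2*(n-1) latency hops for the reduce-scatter + all-gather phases."""
    volume = 2 * (n_devices - 1) / n_devices * tensor_bytes
    return volume / link_bw_bytes_s + 2 * (n_devices - 1) * link_latency_s

GRAD = 10 * 2**30  # 10 GiB of gradients (assumed model size)

# Scale-up: memory-semantic fabric inside a 72-GPU pod (assumed figures)
scale_up = allreduce_time_s(GRAD, 72, 900e9, 1e-6)   # ~900 GB/s links, ~1 us
# Scale-out: Ethernet between pods (assumed figures)
scale_out = allreduce_time_s(GRAD, 72, 50e9, 10e-6)  # ~50 GB/s links, ~10 us

print(f"intra-pod: {scale_up*1e3:.1f} ms   inter-pod: {scale_out*1e3:.1f} ms")
```

Under these assumed numbers the same collective runs well over an order of magnitude slower at scale-out speeds, which is why synchronization-heavy traffic stays inside the pod.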

NVIDIA dominates scale-up via NVLink, a proprietary memory-semantic fabric that lets GPUs access each other's memory with near-zero software overhead. The latest NVLink generation delivers 1.8 TB/s of bidirectional bandwidth per GPU. NVLink Switch enables fully connected pods of up to 72 GPUs today (Blackwell NVL72), expanding to 576 GPUs with Rubin Ultra Kyber. And NVLink is closed: you cannot buy a non-NVIDIA chip that talks NVLink.
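The per-GPU figure compounds at the pod level. A quick check using only the numbers quoted above:

```python
# Aggregate scale-up bandwidth of a fully connected NVLink pod,
# using the per-GPU figure quoted in the text (1.8 TB/s bidirectional).
per_gpu_tb_s = 1.8
gpus_per_pod = 72  # Blackwell NVL72
aggregate = per_gpu_tb_s * gpus_per_pod
print(f"{aggregate:.1f} TB/s aggregate")  # prints 129.6 TB/s aggregate
```

That roughly 130 TB/s of aggregate pod bandwidth is the bar any open-standard challenger has to clear.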

UALink (Ultra Accelerator Link) is the open-standard counter-attack. Backed by AMD, Intel, Google, Amazon, Microsoft, Meta, Apple, Alibaba, Cisco, HPE, Astera Labs, Synopsys, and others. UALink 1.0 supports 1,024 endpoints per fabric at 200 Gb/s per lane. The UALink 2.0 specification was released in April 2026. Upscale AI is targeting Q4 2026 for the first UALink scale-up switch to ship. UALink is memory-semantic like NVLink — direct load, store, and atomic operations between accelerators.
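A rough sketch of what the quoted lane rate implies per accelerator. The four-lane (x4) port width here is an assumption for illustration, one of the port configurations the 1.0 spec is described as allowing:

```python
# Per-accelerator UALink bandwidth implied by the 1.0 spec's 200 Gb/s lanes.
# The x4 port width is an assumed configuration, used for illustration.
lane_gbps = 200
lanes_per_port = 4                      # assumed x4 port
port_gbps = lane_gbps * lanes_per_port  # Gb/s per direction
port_gbytes = port_gbps / 8             # GB/s per direction
print(port_gbps, port_gbytes)           # prints 800 100.0
```

Even at x4, a single UALink port lands well below NVLink's quoted per-GPU figure, so UALink's pitch rests on openness and endpoint count rather than raw per-port speed.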

Broadcom initially joined UALink, then withdrew to pursue its own Scale-Up Ethernet (SUE) standard. Broadcom's rationale: leverage its Tomahawk switch-silicon dominance to own both scale-up and scale-out on the same Ethernet fabric, with custom extensions for GPU-to-GPU memory semantics. It is a direct bet that Ethernet plus extensions can displace both NVLink and UALink.

Scale-out is a separate protocol battle: Ultra Ethernet (UEC), an industry-consortium standard, is emerging as the dominant choice. Arista, Cisco, and Broadcom are all shipping Ultra Ethernet silicon and systems. InfiniBand (owned by NVIDIA via Mellanox) retains share in legacy HPC but is losing ground to Ethernet on cost and ecosystem grounds.

The stakes: whichever scale-up fabric wins becomes the next industry software moat. If UALink succeeds, every hyperscaler's custom ASIC becomes interoperable at the scale-up layer — breaking NVIDIA's closed-fabric advantage. If SUE wins, Broadcom captures both layers of the fabric on one silicon line. If NVLink holds, NVIDIA's moat survives the custom silicon wave. All three outcomes are plausible in 2026.

2

In Plain English

Inside every big AI training cluster, there are actually two different computer networks. One is short-range — it connects GPUs that live in the same rack or row and need to talk to each other thousands of times per second. That's called "scale-up." The other is long-range — it connects entire rows or buildings full of equipment into one logical cluster. That's called "scale-out."

NVIDIA owns the scale-up layer today. Their version is called NVLink, and it's a proprietary protocol — you can only use it if you buy NVIDIA chips and NVIDIA switches. It's one of the main reasons NVIDIA is so hard to displace. Even if you build your own AI chip that's just as fast as an NVIDIA GPU, you still can't stitch your chips together into a large pod the way NVLink lets NVIDIA do. You're stuck running smaller clusters, which is a big disadvantage for training the biggest models.

So every major hyperscaler — Google, Amazon, Microsoft, Meta, plus AMD and Intel — got together and said: "We need an open standard that does what NVLink does, but works with any chip." That standard is called UALink. The 2.0 specification was published this month (April 2026). The first actual switch products that support it are scheduled to ship in late 2026. If UALink works as advertised, it breaks NVIDIA's fabric monopoly: any hyperscaler's custom chip can suddenly be wired into a large pod using standard switches from any vendor, not just NVIDIA.

There's a twist. Broadcom, one of the biggest networking chip companies in the world, originally signed on to UALink and then quietly walked away. Broadcom had a different idea: instead of creating a whole new protocol, just take Ethernet — the universal standard that already runs the internet — and add some extra features to make it work as a scale-up fabric too. Broadcom calls this Scale-Up Ethernet, or SUE. If it works, Broadcom owns both the short-range and long-range fabric with the same switch chips, which would be an enormous commercial position.

So now there are three horses in the race for scale-up: NVIDIA's NVLink (closed, already shipping, works today), UALink (open, almost shipping, backed by everyone except NVIDIA and Broadcom), and Broadcom's SUE (also almost shipping, trying to turn Ethernet into a universal fabric). Whoever wins determines whether NVIDIA's moat survives into the next decade — and it determines which networking companies get a much bigger slice of future AI infrastructure spend.

3

Who Benefits Most

Beneficiaries are ranked by the directness of their exposure. Tickers that exist in our explorer link to the company brief.

Primary beneficiaries

Direct, first-order exposure. If the trend plays out, these are the names that capture the majority of the value.

AVGO · Broadcom

Broadcom. Hedged position: shipping Tomahawk for scale-out Ethernet AND pushing SUE for scale-up. Wins in almost every outcome except a pure NVIDIA victory. Its custom-ASIC design-services business also benefits from the UALink ecosystem. Best positioned overall.

ANET · Arista Networks

Arista Networks. Dominant pure-play data center networking. Ultra Ethernet beneficiary for scale-out. If UALink wins scale-up, Arista is positioned to ship UALink switches alongside its existing fabric.

ALAB · Astera Labs

Astera Labs. Small-cap pure-play. Makes connectivity silicon (retimers, fabric chips) that sit inside both UALink and Ultra Ethernet switches. Highest beta to the trend.

MRVL · Marvell Technology

Marvell. Custom silicon partner and fabric chip supplier. DPU and interconnect exposure across all three protocol outcomes.

Secondary beneficiaries

Real exposure but competing with alternatives or dependent on adjacent calls.

CRDO · Credo Technology

Credo Technology. Active electrical cables and DSPs for short-reach fabric links. Wins on both UALink and SUE — connectivity is protocol-neutral.

CSCO · Cisco Systems

Cisco. UALink consortium member. Positioned to ship UALink gear, but its execution history in AI fabric is weaker than Arista's.

COHR · Coherent Corp

Coherent. Indirect beneficiary. More pods and more fabric means more optical transceivers feeding both scale-up and scale-out.

LITE · Lumentum Holdings

Lumentum. Similar to Coherent — optical content grows regardless of which scale-up protocol wins.

Picks and shovels

Enabling suppliers whose revenue scales with the trend regardless of which frontline vendor wins.

HPE · Hewlett Packard Enterprise

Hewlett Packard Enterprise. UALink consortium member. Systems integrator positioned to ship UALink-enabled reference designs to enterprise buyers.

SNPS · Synopsys

Synopsys. UALink consortium member. EDA IP blocks for UALink PHY and controller designs used inside every compliant chip.