THE ERA OF GIGAWATT‑SCALE AI DATA CENTERS (2026)
A Complete Research Report with Diagrams & Company Positioning
1. Executive Summary
By 2026, AI data centers have crossed into gigawatt‑scale industrial infrastructure. Clusters that once held 2,000–8,000 GPUs now exceed 100,000 GPUs per site, with only 5–7 such clusters operational globally.
This report explains:
- Why building an AI data center is far more complex than “buy GPUs”
- The rise of 100k+ GPU clusters
- The networking revolution (AEC, optics, CXL, PCIe 6)
- Cooling and energy constraints
- The global construction boom (831 sites, 23.1 GW)
- Where Celestica (CLS), Astera Labs, and Vertiv fit in the stack
2. Why Building an AI Data Center Is Hard
At first glance, it seems simple:
“Buy NVIDIA GPUs and plug them in.”
In reality, a hyperscale AI cluster requires:
- GPUs
- HBM memory
- PCIe/CXL connectivity
- AEC/DAC/optical cables
- Switches (800G → 1.6T)
- Racks, power distribution, cooling loops
- Substation‑scale electrical infrastructure
Here is the real architecture:
+------------------------------------------------------------+
|                    AI DATA CENTER STACK                    |
+------------------------------------------------------------+
| AI MODELS / TRAINING FRAMEWORKS (PyTorch, JAX, Megatron)   |
+------------------------------------------------------------+
| GPU SERVERS (H100, B200, MI450, Rubin)                     |
|   - HBM3E / HBM4 memory                                    |
+------------------------------------------------------------+
| IN-RACK CONNECTIVITY                                       |
|   - PCIe 5/6                                               |
|   - CXL 3.0 memory pooling                                 |
|   - AEC / DAC / Optics                                     |
+------------------------------------------------------------+
| FABRIC SWITCHING                                           |
|   - 800G → 1.6T Ethernet                                   |
|   - Leaf / Spine / Super-Spine                             |
+------------------------------------------------------------+
| POWER + COOLING                                            |
|   - Liquid cooling (DLC, immersion)                        |
|   - CDUs, pumps, heat exchangers                           |
|   - UPS, switchgear, substations                           |
+------------------------------------------------------------+
| CAMPUS INFRASTRUCTURE                                      |
|   - 100–1,500 MW power                                     |
|   - Water, fiber, buildings                                |
+------------------------------------------------------------+
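For readers who prefer the stack in machine-readable form, here is a minimal Python sketch that mirrors the diagram above. The layer names and components are taken straight from the diagram; this is an illustration, not a real deployment or configuration schema.

```python
# The AI data center stack from the diagram above, expressed as a simple
# ordered structure. Illustrative only; not a real configuration format.

AI_DC_STACK = [
    ("AI models / frameworks", ["PyTorch", "JAX", "Megatron"]),
    ("GPU servers",            ["H100 / B200 / MI450 / Rubin", "HBM3E / HBM4"]),
    ("In-rack connectivity",   ["PCIe 5/6", "CXL 3.0 pooling", "AEC / DAC / optics"]),
    ("Fabric switching",       ["800G -> 1.6T Ethernet", "leaf / spine / super-spine"]),
    ("Power + cooling",        ["DLC / immersion", "CDUs", "UPS / switchgear / substations"]),
    ("Campus infrastructure",  ["100-1,500 MW power", "water", "fiber", "buildings"]),
]

# Walk the stack top-down -- the order in which a training job depends on it.
for layer, components in AI_DC_STACK:
    print(f"{layer:<24} {', '.join(components)}")
```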
3. The World’s Largest AI Clusters (as of April 2026)
xAI “Colossus” — Memphis, TN
- 100,000 NVIDIA Hopper GPUs
- Fully liquid‑cooled (Supermicro)
- Massive AEC deployment (Credo)
Meta “Grand Teton”
- Two 100k clusters (US + EU)
- Target: 600,000 H100‑equivalent GPUs by end of 2026
Microsoft/OpenAI “Phase 4”
- Multiple 100k “Eagle” clusters
- Built for GPT‑5 and successors
Tesla Dojo + H100
- Dojo rebooted in 2026
- 50k H100 cluster → expanding to 100k+
4. The Next Wave Under Construction
Microsoft “Stargate”
- $100B project
- Three campuses: Abilene, West Virginia, Wisconsin
- Each: 1.2 GW
- Will house millions of GPUs
Nscale (Norway + West Virginia)
- 1.35 GW deal with Microsoft (April 2026)
- Deploying NVIDIA Vera Rubin
- Requires 1.6T AEC connectivity
Meta “Helios”
- $100B AMD MI450 deal
- Open‑standard racks (OCP)
5. Global Construction Pipeline
+-------------------------------+
|  GLOBAL AI DC PIPELINE 2026   |
+-------------------------------+
| Sites under construction: 831 |
| Total power: 23.1 GW          |
| US share: 15.9 GW             |
+-------------------------------+
| 23 GW = power for 17M homes   |
+-------------------------------+
This is the largest infrastructure buildout since the creation of the electrical grid.
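As a sanity check on the homes comparison, the arithmetic works out. The 23.1 GW and 831-site figures come from the table above; the ~1.2 kW average household draw is an assumed typical US value, not a number from this report.

```python
# Back-of-envelope check on the "23 GW = power for 17M homes" line.
# PIPELINE_GW and SITES come from the table above; AVG_HOME_KW is an
# assumed average US household load (~10,500 kWh/year), not a report figure.

PIPELINE_GW = 23.1
SITES = 831
AVG_HOME_KW = 1.2

homes_powered = PIPELINE_GW * 1e6 / AVG_HOME_KW   # GW -> kW, then divide per home
avg_site_mw = PIPELINE_GW * 1e3 / SITES           # average site size under construction

print(f"Implied homes served: {homes_powered / 1e6:.1f} M")      # ~19 M, near the 17 M quoted
print(f"Average site under construction: {avg_site_mw:.0f} MW")  # ~28 MW
```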
6. Memory: The Real Bottleneck
HBM is now the limiting reagent of AI compute.
- HBM3E: 5–8 TB/s
- HBM4 (2027): 10+ TB/s
- Supply fully sold out through 2026
- 40–60% of GPU BOM cost
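To put the per-GPU numbers in cluster terms, a rough calculation using the HBM3E range above (illustrative arithmetic only, assuming a 100k-GPU cluster as described in Section 3):

```python
# Aggregate HBM bandwidth for a 100,000-GPU cluster, using the per-GPU
# HBM3E range quoted above (5-8 TB/s). Illustrative arithmetic only.

GPUS = 100_000
HBM3E_TBPS = (5, 8)   # per-GPU bandwidth range from this section

low, high = (GPUS * b for b in HBM3E_TBPS)
print(f"Cluster-wide HBM bandwidth: {low/1e6:.1f}-{high/1e6:.1f} EB/s")  # 0.5-0.8 EB/s
```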
7. Networking & Interconnects
7.1 Why Copper Still Dominates Inside the Rack
| Technology | Material | Distance | Notes |
|---|---|---|---|
| Passive DAC | Copper | < 2 m | Too short at 800G and above |
| Optical (AOC/Fiber) | Glass + lasers | 10 m – 10 km | Expensive, power-hungry, runs hot |
| AEC (Active Copper) | Copper + DSP | 3 – 7 m | The sweet spot for rack-scale links |
Why AEC exploded: hyperscalers hit a wall at 800G and beyond.
- Passive copper: too short (< 2 m)
- Optics: too expensive and too hot
- AEC: the only option that covers 3–7 m rack-scale runs at acceptable cost and power (see the sketch below)
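That decision reduces to a reach-based rule of thumb. The sketch below uses the ballpark distances from the table above; the thresholds are illustrative, not vendor specifications.

```python
# Toy link-media chooser based on the reach figures in the table above
# (passive DAC < ~2 m, AEC ~3-7 m, optics beyond that). Thresholds are the
# report's ballpark numbers, not vendor specs.

def pick_link_medium(reach_m: float) -> str:
    """Pick the cheapest viable 800G link medium for a given reach."""
    if reach_m <= 2:
        return "passive DAC"           # cheapest, lowest power, but very short
    if reach_m <= 7:
        return "AEC (active copper)"   # DSP-retimed copper covers rack-scale runs
    return "optics (AOC/transceiver)"  # only option for row- and campus-scale links

for reach in (1, 3, 5, 10, 500):
    print(f"{reach:>4} m -> {pick_link_medium(reach)}")
```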
Digital vs Analog AEC
| Feature | Credo (Digital AEC) | Semtech (Analog ACC) |
|---|---|---|
| Signal processing | Full DSP + re‑clocking | Analog equalization |
| Max distance | ~7 m | ~3–4 m |
| Power consumption | Higher | Lower |
| Cable thickness | Thinner | Thicker |
8. Switches (800G → 1.6T)
Switches are the central nervous system of a 100k GPU cluster.
- Vendors: NVIDIA, Broadcom, Arista, Cisco
- Power per switch: 5–15 kW
- A 100k GPU cluster may require 10,000+ switches
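Those two figures compound quickly. Using the report's own ranges (10,000+ switches at 5–15 kW each), the fabric alone draws tens of megawatts:

```python
# Fabric power, using the figures above: ~10,000 switches at 5-15 kW each.
# Pure arithmetic on the report's ranges; real counts depend on radix and topology.

SWITCHES = 10_000
KW_PER_SWITCH = (5, 15)

low_mw, high_mw = (SWITCHES * kw / 1_000 for kw in KW_PER_SWITCH)
print(f"Switching power alone: {low_mw:.0f}-{high_mw:.0f} MW")  # 50-150 MW
```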
9. Cooling
GPUs now draw:
- H100: ~700W
- B200: ~1000W
- Rubin: ~1200W+
Air cooling is effectively dead at this density.
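A rack-level calculation shows why. The per-GPU wattages are the figures above; the 72-GPU rack and the ~20 kW air-cooling ceiling are assumptions for illustration, not figures from this report.

```python
# Why air cooling breaks down: rack-level heat load at the GPU wattages above.
# GPUS_PER_RACK and AIR_LIMIT_KW are assumed values for illustration only.

GPU_WATTS = {"H100": 700, "B200": 1000, "Rubin": 1200}
GPUS_PER_RACK = 72     # assumed dense rack configuration
AIR_LIMIT_KW = 20      # assumed practical ceiling for an air-cooled rack

for gpu, watts in GPU_WATTS.items():
    rack_kw = watts * GPUS_PER_RACK / 1_000
    print(f"{gpu:>6}: {rack_kw:>5.0f} kW per rack "
          f"({rack_kw / AIR_LIMIT_KW:.0f}x the assumed air-cooling limit)")
```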
Cooling Methods
| Method | Notes |
|---|---|
| Direct Liquid Cooling | Standard for 2025–2026 |
| Rear‑Door Heat Exchangers | Retrofit option |
| Immersion Cooling | Highest efficiency, more complex |
| Two‑Phase Cooling | Future standard, still early |
Cooling is now 30–40% of total data center cost.
10. Energy Consumption
A single 100k GPU cluster consumes:
- 150–300 MW continuous load
Gigawatt campuses:
- 1.2–1.5 GW each
PUE for AI data centers:
- 1.1–1.2 (thanks to liquid cooling)
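Combining the report's numbers, total facility draw follows directly from IT load × PUE:

```python
# Facility power implied by the figures above: a 100k-GPU cluster at
# 150-300 MW of IT load, with a PUE of 1.1-1.2 from liquid cooling.

IT_LOAD_MW = (150, 300)
PUE = (1.1, 1.2)

low = IT_LOAD_MW[0] * PUE[0]
high = IT_LOAD_MW[1] * PUE[1]
print(f"Total facility draw: {low:.0f}-{high:.0f} MW")                 # 165-360 MW
print(f"Overhead (cooling, power conversion): "
      f"{low - IT_LOAD_MW[0]:.0f}-{high - IT_LOAD_MW[1]:.0f} MW")      # 15-60 MW
```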
11. The Infrastructure Supercycle (2025–2030)
2025–2026 → TRAINING ERA
- 10 mega-clusters (100k GPUs each)
2027–2030 → INFERENCE ERA
- Thousands of smaller clusters
- Deployed in hundreds of cities
Total expected spend: $3 trillion by 2030.
12. Where CLS, Astera Labs, and Vertiv Fit
12.1 Position in the AI Data Center Stack
| Layer / Function | Celestica (CLS) | Astera Labs | Vertiv |
|---|---|---|---|
| GPUs / ASICs | | | |
| Memory / HBM | (x) | | |
| In-rack connectivity (PCIe/CXL/AEC) | (x) | X | |
| Switches / Fabric | X | (x) | |
| Racks / Servers / Integration | X | (x) | |
| Power & Cooling | | | X |
| Data center construction | | | X |
X = primary exposure, (x) = secondary / indirect exposure.
Interpretation:
- Astera Labs lives inside the rack (PCIe/CXL retimers, CXL memory pooling).
- Celestica (CLS) sits at the fabric and server layer (switches, servers, ODM integration).
- Vertiv anchors power, cooling, and prefabricated data center modules.
12.2 Leverage to the AI Supercycle
| Company | Segment Focus | Tie‑in to 100k GPU Clusters | Sensitivity to GPU Shipments | Capital Intensity | Moat Type |
|---|---|---|---|---|---|
| Celestica | Switches, servers, ODM integration | High (fabric + racks) | High | Medium–High | Scale, integration, relationships |
| Astera Labs | PCIe/CXL retimers, memory pooling | Very high (per‑link scaling) | High | Low–Medium | IP, DSP know‑how, design‑ins |
| Vertiv | Power, cooling, prefabs | Medium (MW‑driven) | Medium | High | Installed base, service network |
13. Final Summary
AI data centers have become power plants for computation and the backbone of a new global infrastructure buildout.
- Astera Labs wins inside the rack (PCIe/CXL, memory connectivity).
- Celestica wins in switches, servers, and integration for hyperscale fabrics.
- Vertiv wins in power, cooling, and gigawatt‑scale campuses.
Together, they form the hidden infrastructure layer powering the AI revolution.