- Switching costs measured in years of engineering, not dollars
- 79-88% gross margins enabled by software lock-in + supply scarcity
- Market share peaked at ~87% in 2024, declining to ~75% by 2026 as custom silicon gains
NVIDIA-Groq Acquisition
NVIDIA struck a ~$20 billion deal for Groq's assets in December 2025, its largest deal ever, structured as a "non-exclusive licensing agreement" rather than a conventional acquisition. GroqCloud continues operating independently.
AMD Competition
| Chip | Memory | TDP | Status |
|------|--------|-----|--------|
| MI300X | 192 GB HBM3 | 750W | Shipping |
| MI325X | 256 GB HBM3e | 1000W | Shipping Q4 2024 |
| MI350/MI355X | HBM3e | TBD | 2025 (35x inference perf vs MI300) |
| MI400 | HBM4 | TBD | 2026 |
- ROCm: open-source, biweekly updates, supports 2M+ Hugging Face models. Still behind CUDA in maturity.
- Data center AI revenue: $5B+ annually, targeting "tens of billions"
- OpenAI planning to use AMD GPUs for training and inference in H2 2026
- $10B AI infrastructure partnership with Saudi Arabia's HUMAIN
- Helios reference design (2026): EPYC Venice + MI400 + Pensando Vulcano, up to 72 MI400 GPUs
Custom Silicon (Hyperscaler Chips)
| Chip | Owner | Performance | Status |
|------|-------|-------------|--------|
| TPU Ironwood (v7) | Google | 4.6 PFLOPS FP8/chip, 7.7 TFLOPS/watt | Shipping 2025. Superpods: 9,216 chips = 42.5 ExaFLOPS at 10MW |
| TPU Trillium (v6e) | Google | 4.7x peak compute per chip vs v5e | GA, 100k+ chip deployments |
| Trainium3 | Amazon/AWS | 2.52 PFLOPS FP8, 144 GB HBM3e | 2025 (TSMC 3nm). $10B+ annual run-rate |
| Maia 200 | Microsoft | 3x FP4 vs Trainium3 (claimed) | TSMC 3nm, 140B+ transistors |
| MTIA v3-v5 | Meta | Up to 25x compute gains (RISC-V) | 2026 (TSMC N3, HBM3e). Inference-first, training to follow |
| Baltra | Apple | Co-designed with Broadcom | Mass production H2 2026. DC construction 2027 |
Custom ASICs: 50-70% lower cost per billion tokens vs H100 for training. Anthropic secured access to up to 1M Google TPU chips.
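The Ironwood superpod figure above is internally consistent: per-chip FP8 throughput times pod size reproduces the quoted aggregate. A minimal arithmetic sketch (the system-level efficiency line is our inference from the quoted 10MW, which includes networking, cooling, and host overhead, hence the gap vs the 7.7 TFLOPS/W per-chip figure):

```python
# Sanity check: Ironwood superpod aggregate compute from the table above.
chips_per_pod = 9_216
pflops_per_chip = 4.6            # PFLOPS FP8 per chip (quoted)

pod_exaflops = chips_per_pod * pflops_per_chip / 1_000
print(f"Pod compute: {pod_exaflops:.1f} EFLOPS")     # ~42.4, matching the ~42.5 quoted

# At the quoted 10 MW pod power, implied system-level efficiency:
pod_watts = 10e6
tflops_per_watt = chips_per_pod * pflops_per_chip * 1_000 / pod_watts
print(f"System-level: {tflops_per_watt:.1f} TFLOPS/W")  # ~4.2 vs 7.7 per chip
```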
AI Chip Startups
| Company | Approach | Status | Valuation/Funding |
|---------|----------|--------|-------------------|
| Cerebras | Wafer-scale (WSE-3: 4T transistors, 900K cores) | Re-filing for Q2 2026 IPO | $23B valuation, $1B raised |
| Groq | LPU (deterministic inference) | Acquired by NVIDIA for $20B | GroqCloud continues independently |
| SambaNova | Reconfigurable dataflow (SN50 chip) | Intel acquisition fell through; $350M Series E | Multi-year Intel partnership |
| Graphcore | IPU | Acquired by SoftBank (July 2024) | Part of SoftBank AI strategy |
| Tenstorrent | RISC-V + AI (Jim Keller) | Active across DC, auto, robotics, edge | $6.9B valuation, $1B+ raised |
| Etched | Transformer-only ASIC (Sohu) | TSMC-supported, HBM3e | $5B valuation, $620M raised |
| d-Matrix | Digital in-memory compute (Corsair/Raptor) | Raptor: first 3D-stacked DRAM accelerator | $2B valuation, $450M raised |
Networking: InfiniBand vs Ethernet
| Technology | Latency | Speed | Best For |
|------------|---------|-------|----------|
| InfiniBand | ~1-2 µs | Quantum-X800: 800Gbps x 144 ports = 115.2 Tbps | Large-scale training. NVIDIA-controlled |
| NVIDIA Spectrum-X | Low | 800G Ethernet. 760% YoY revenue growth to $1.46B | Ethernet for AI. Customers: Meta, Oracle |
| UEC 1.0 | Near-IB | Rebuilt from the ground up (not just RoCE v2) | Open standard. AMD, Arista, Broadcom, Cisco, HPE, Intel, Meta, Microsoft |
Trend: Ethernet gaining ground. InfiniBand retains edge for largest training runs. By mid-2025, Ethernet leads for many AI back-end deployments.
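The Quantum-X800 aggregate in the table above is simply port count times per-port line rate; a minimal arithmetic check:

```python
# Aggregate switch bandwidth = port count x per-port line rate.
ports = 144
gbps_per_port = 800

total_tbps = ports * gbps_per_port / 1_000
print(f"Quantum-X800 aggregate: {total_tbps:.1f} Tbps")  # 115.2 Tbps, as quoted
```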
Memory & Storage
HBM Market Share (Q2 2025)
| Manufacturer | Share | Status |
|--------------|-------|--------|
| SK Hynix | 62% | All 2026 supply sold out |
| Micron | 21% | Meeting only 55-60% of demand |
| Samsung | 17% | Catching up, ramping HBM4 |
Supply Crisis
- HBM is the single tightest component in the AI stack
- SK Hynix: "We have already sold out our entire 2026 HBM supply"
- Micron: capacity crunch "likely to persist beyond 2026"
- 6-12 month lead times; prices rising ~20% for 2026 contracts
- 2026 HBM revenue mix: ~55% HBM4, ~45% HBM3e
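Taken together, the bullets above sketch the 2026 procurement picture: 40-45% of demand goes unmet at Micron's fulfillment rate, and what is supplied costs ~20% more. A minimal arithmetic sketch (the normalized baseline price is a placeholder, not a quoted figure):

```python
# Rough 2026 HBM procurement math from the supply-crisis bullets above.
fulfillment_low, fulfillment_high = 0.55, 0.60   # Micron meeting 55-60% of demand
price_increase = 0.20                            # ~20% rise on 2026 contracts

unmet_low = 1 - fulfillment_high
unmet_high = 1 - fulfillment_low
print(f"Unmet demand: {unmet_low:.0%}-{unmet_high:.0%}")             # 40%-45%

baseline = 1.0                                   # placeholder: normalized 2025 contract price
print(f"2026 unit price: {baseline * (1 + price_increase):.2f}x 2025")  # 1.20x
```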
Server OEMs
| Rank | OEM | Revenue/Share | Notes |
|------|-----|---------------|-------|
| 1 | Dell | $12.5B, 20% share | Leading at neoclouds (CoreWeave, Tesla, xAI) |
| 2 | Supermicro | $11.7B, ~9.5% | Speed-to-market, liquid cooling expertise |
| 3 | HPE | 15% share | Enterprise-focused |
| 4 | Inspur | 12% share | Dominant in China |
| 5 | Lenovo | 11% share | Growing AI server portfolio |
Enterprise AI server market: $245B in 2025, projected to reach $524B by 2030. ODMs (Quanta, Wiwynn, FII) handle the bulk of hyperscale volume at ~2-3% margins.
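The 2025-to-2030 projection implies a compound annual growth rate of roughly 16%; a minimal sketch of that calculation:

```python
# Implied CAGR from the $245B (2025) -> $524B (2030) projection above.
start, end, years = 245, 524, 5
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~16.4% per year
```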
Power per Chip
| Chip | TDP | Performance | Perf/Watt Trend |
|------|-----|-------------|-----------------|
| H100 | 700W | ~3,958 TFLOPS FP8 | Baseline |
| B200 (full-spec) | 1000-1200W | ~5x H100 FP4 | ~3-4x improvement |
| GB200 Superchip | 2700W | Massive training throughput | Best perf/watt at scale |
| Google Ironwood | 600W | 4.6 PFLOPS FP8 | 7.7 TFLOPS/watt (2x Trillium) |
| AMD MI300X | 750W | Competitive with H100 | Similar |
Each generation delivers roughly 2-3x better perf/watt. A GB200 NVL72 rack draws 120kW and requires liquid cooling (20 L/min). See Power: Rack Density.
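The perf/watt column follows directly from the quoted throughput and TDP figures. A minimal sketch using the H100 and Ironwood rows (note the H100 FP8 figure is commonly quoted with sparsity, so cross-vendor comparisons are rough):

```python
# Perf/watt from the table rows above: TFLOPS FP8 divided by TDP in watts.
chips = {
    "H100":            (3_958, 700),  # ~3,958 TFLOPS FP8 (with sparsity), 700 W
    "Google Ironwood": (4_600, 600),  # 4.6 PFLOPS FP8, 600 W
}
for name, (tflops, watts) in chips.items():
    print(f"{name}: {tflops / watts:.1f} TFLOPS/W")
# H100 ~5.7, Ironwood ~7.7 -- matching the 7.7 TFLOPS/W quoted for Ironwood
```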
Supply Chain
TSMC Dependency & CoWoS Bottleneck
TSMC manufactures virtually all leading-edge AI chips. CoWoS advanced packaging is the critical bottleneck.
| Period | Capacity (wafers/month) | Status |
|--------|-------------------------|--------|
| Late 2024 | ~35,000 | Fully booked |
| End 2025 | ~65,000 (doubled) | Still sold out (demand up 113% YoY) |
| End 2026 | ~130,000 (target) | NVIDIA has 50%+ booked |
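The table shows capacity nearly doubling each year, yet staying sold out; the reason is that quoted demand growth outpaces the supply ramp. A minimal sketch comparing the two:

```python
# CoWoS supply growth vs. quoted demand growth, from the table above.
capacity = {"late 2024": 35_000, "end 2025": 65_000, "end 2026": 130_000}  # wafers/month

periods = list(capacity.items())
for (p0, c0), (p1, c1) in zip(periods, periods[1:]):
    print(f"{p0} -> {p1}: {c1 / c0 - 1:+.0%} capacity")
# late 2024 -> end 2025: +86%; end 2025 -> end 2026: +100%
# Demand reportedly grew 113% YoY into 2025, faster than the +86% supply ramp,
# which is why capacity stayed sold out even as it nearly doubled.
```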
Geopolitical Risks
- Taiwan concentration: the vast majority of CoWoS capacity sits in Taiwan, the single largest strategic vulnerability
- TSMC accelerating Arizona advanced packaging; also building in Japan (Kumamoto) and Germany (Dresden)
- US CHIPS Act funding supports domestic capacity but won't match Taiwan for years