Custom AI ASICs are projected to capture 27.8% of server shipments in 2026, signaling a structural shift away from merchant GPUs as hyperscalers prioritize workload-specific silicon.
Market Dynamics
Nvidia retains roughly 70% of the AI chip market, but this share is eroding as Google, Amazon, Meta, Microsoft, and OpenAI invest billions in purpose-built ASICs. Custom ASIC shipments are forecast to grow 44.6% year-over-year in 2026, nearly triple the 16.1% growth rate for merchant GPUs. TSMC underpins this ecosystem, fabricating chips for all five of these companies as well as for Broadcom, the dominant custom AI chip architect.
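The growth comparison above can be checked with a few lines of arithmetic. The sketch below uses only the two forecast growth rates quoted here. The `next_share` helper and its 25% starting share are hypothetical, and the helper assumes a simplified market made up of only these two segments, which the forecast itself does not state.

```python
# Forecast 2026 year-over-year shipment growth, as quoted in the text.
ASIC_GROWTH = 0.446
GPU_GROWTH = 0.161

# How many times faster custom ASIC shipments grow than merchant GPUs.
growth_ratio = ASIC_GROWTH / GPU_GROWTH


def next_share(share: float, growth_a: float, growth_b: float) -> float:
    """Share of segment A after one year in a two-segment market.

    Hypothetical helper: assumes the market contains only segments A and B.
    """
    a = share * (1 + growth_a)
    b = (1 - share) * (1 + growth_b)
    return a / (a + b)


# Illustrative starting share only; 25% is NOT a figure from the forecast.
example_share = next_share(0.25, ASIC_GROWTH, GPU_GROWTH)

print(f"ASIC growth is {growth_ratio:.2f}x GPU growth")
print(f"Hypothetical 25% share becomes {example_share:.1%}")
```

A ratio of about 2.8 is consistent with the "nearly triple" description. The share calculation only illustrates how a growth gap of this size moves share in a single year.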
Broadcom carries a $73 billion AI backlog and targets $100 billion in annual AI chip revenue by 2027. Marvell, partnered with Amazon on Trainium and Microsoft on Maia, projects up to $11 billion in AI ASIC revenue for 2026. Together, Broadcom and Marvell control roughly 95% of the custom AI ASIC co-design market.
Broadcom and the XPU Ecosystem
Broadcom reported $8.4 billion in AI semiconductor revenue for Q1 FY2026, up 106% year-over-year, with Q2 guidance of $10.7 billion. CEO Hock Tan cited “line of sight” to more than $100 billion in AI chip revenue by 2027, backed by the $73 billion backlog. The company has six major XPU customers, including Google, its longest-standing partner, with seven generations of co-designed TPUs since 2014.
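The quarterly figures above imply a few numbers that are not stated directly. The sketch below derives them from the reported values only. The annualized run rate is a naive assumption (Q2 guidance multiplied by four), not a company forecast, and all variable names are illustrative.

```python
# Figures quoted in the text, in billions of US dollars.
Q1_FY2026_REVENUE = 8.4
YOY_GROWTH = 1.06          # "up 106% year-over-year"
Q2_GUIDANCE = 10.7
REVENUE_TARGET = 100.0     # "line of sight" figure for 2027

# Revenue in the year-ago quarter implied by the stated growth rate.
implied_year_ago = Q1_FY2026_REVENUE / (1 + YOY_GROWTH)

# Quarter-over-quarter growth implied by the Q2 guidance.
sequential_growth = Q2_GUIDANCE / Q1_FY2026_REVENUE - 1

# Naive annualization: assumes four quarters at the Q2 guidance level.
annualized_run_rate = Q2_GUIDANCE * 4

# How far the target sits above that naive run rate.
target_multiple = REVENUE_TARGET / annualized_run_rate

print(f"Implied year-ago quarter: ${implied_year_ago:.2f}B")
print(f"Implied sequential growth: {sequential_growth:.1%}")
print(f"Naive annualized run rate: ${annualized_run_rate:.1f}B")
print(f"Target vs. run rate: {target_multiple:.2f}x")
```

Under these assumptions, the $100 billion figure would require AI revenue to more than double from the annualized Q2 guidance level.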
OpenAI signed a multi-year collaboration in October 2025 for 10 gigawatts of custom accelerators, with first deployment targeting the second half of 2026 using 3nm and 2nm designs. Meta, ByteDance, and Fujitsu are among the other confirmed customers; analysts identify Apple and Arm/SoftBank as potential future engagements. Broadcom’s 3.5D XDSiP platform uses TSMC’s SoIC and CoWoS processes, enabling packages exceeding 6,000 mm² of silicon with up to 12 HBM stacks.
Hyperscaler Silicon Roadmaps
Google’s TPU v7, codenamed Ironwood, delivers 4,614 FP8 TFLOPS per chip with 192 GB HBM3E on TSMC’s N3P process. Its 9,216-chip superpod achieves 42.5 FP8 exaflops. SemiAnalysis estimates that TPUs sustain roughly 90% model FLOP utilization versus 70-80% for GPUs, which narrows the real-world performance gap. Google claims a 44% lower total cost of ownership per Ironwood chip versus a GB200 server.
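The superpod figure follows directly from the per-chip specification, and the utilization estimate can be applied to it in the same way. The sketch below is a back-of-the-envelope check. It treats the SemiAnalysis utilization figures as given and assumes sustained throughput is simply peak throughput times utilization. The variable names are illustrative.

```python
# Per-chip and pod figures quoted in the text.
CHIPS_PER_POD = 9_216
FP8_TFLOPS_PER_CHIP = 4_614
HBM_GB_PER_CHIP = 192

# Aggregate peak compute: 1 exaflop = 1,000,000 teraflops.
pod_exaflops = CHIPS_PER_POD * FP8_TFLOPS_PER_CHIP / 1e6

# Aggregate HBM capacity in decimal petabytes (1 PB = 1,000,000 GB).
pod_hbm_pb = CHIPS_PER_POD * HBM_GB_PER_CHIP / 1e6

# Utilization estimates attributed to SemiAnalysis in the text.
TPU_MFU = 0.90
GPU_MFU_RANGE = (0.70, 0.80)

# Assumption: sustained throughput = peak throughput * utilization.
sustained_exaflops = pod_exaflops * TPU_MFU

# Peak-FLOPS advantage a GPU would need to match a TPU's sustained
# throughput at each end of the quoted GPU utilization range.
required_peak_ratio = tuple(TPU_MFU / mfu for mfu in GPU_MFU_RANGE)

print(f"Pod peak: {pod_exaflops:.1f} FP8 exaflops")
print(f"Pod HBM: {pod_hbm_pb:.2f} PB")
print(f"Sustained at 90% utilization: {sustained_exaflops:.1f} exaflops")
print("GPU peak needed to match: "
      f"{required_peak_ratio[1]:.2f}x to {required_peak_ratio[0]:.2f}x")
```

Under this simple model, a GPU would need roughly 13% to 29% more peak FLOPS than a TPU to deliver the same sustained throughput, which is one way to read the claim that utilization narrows the gap.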
Amazon’s Trainium3, AWS’s first 3nm chip, delivers 2.517 PFLOPS FP8 with 144 GB HBM3E. The Trn3 UltraServer packs 144 chips for 362 FP8 petaflops, a 4.4x improvement over its predecessor. AWS has deployed over 1 million Trainium processors. Trainium4 promises three times the FP8 performance and four times the memory bandwidth of Trainium3, plus support for Nvidia NVLink Fusion.
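The UltraServer total can be reproduced from the per-chip numbers. The sketch below also derives two values the text does not state: the predecessor's implied throughput and a per-chip Trainium4 estimate. Both are assumptions, since the text does not say exactly how the 4.4x and 3x multipliers were measured. The variable names are illustrative.

```python
# Trainium3 figures quoted in the text.
CHIPS_PER_ULTRASERVER = 144
FP8_PFLOPS_PER_CHIP = 2.517
HBM_GB_PER_CHIP = 144
GENERATIONAL_GAIN = 4.4      # UltraServer vs. its predecessor
TRAINIUM4_FP8_GAIN = 3.0     # Trainium4 vs. Trainium3

# Aggregate FP8 compute and HBM capacity for one Trn3 UltraServer.
server_pflops = CHIPS_PER_ULTRASERVER * FP8_PFLOPS_PER_CHIP
server_hbm_tb = CHIPS_PER_ULTRASERVER * HBM_GB_PER_CHIP / 1000

# Assumption: the 4.4x gain applies to aggregate FP8 petaflops.
implied_predecessor_pflops = server_pflops / GENERATIONAL_GAIN

# Assumption: the 3x FP8 gain is measured per chip.
implied_trainium4_pflops = FP8_PFLOPS_PER_CHIP * TRAINIUM4_FP8_GAIN

print(f"Trn3 UltraServer: {server_pflops:.0f} FP8 petaflops")
print(f"Trn3 UltraServer HBM: {server_hbm_tb:.1f} TB")
print(f"Implied predecessor: {implied_predecessor_pflops:.1f} petaflops")
print(f"Implied Trainium4 chip: {implied_trainium4_pflops:.2f} PFLOPS FP8")
```

The product of 144 chips and 2.517 PFLOPS rounds to the 362 petaflops quoted above, so the per-chip and per-server figures are consistent with each other.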
Meta disclosed four new MTIA generations (300 through 500) for deployment through 2027. The MTIA 500, scheduled for 2027, scales to 10 PFLOPS FP8 and 30 PFLOPS MX4 with up to 512 GB HBM. Meta has deployed hundreds of thousands of MTIA chips for inference but emphasizes that MTIA is not a replacement for Nvidia GPUs; it has expanded its Nvidia partnership to cover “millions of AI chips.”
Forward Outlook
The custom AI ASIC market is maturing quickly, driven by hyperscaler demand for workload-optimized silicon and Broadcom’s lead in co-design. As ASICs approach 28% of server shipments, the balance between merchant GPUs and purpose-built accelerators will shape the next phase of AI infrastructure investment. Enterprises should track these roadmaps, since they will influence cost structures, performance benchmarks, and supply chain dynamics through 2027 and beyond.
