OpenAI's first custom chip cuts inference costs by roughly 50%
OpenAI unveiled its first custom chip, built with Broadcom in just 9 months. Early results show roughly 50% lower inference costs than typical AI GPUs.

Broadcom CEO Hock Tan and President Charlie Kawwas deliver the first Jalapeño chip sample to OpenAI's Sam Altman
- 50% cheaper inference, early data shows – Broadcom's CEO says Jalapeño shows roughly 50% cost savings versus typical AI GPUs in early testing.
- Built in 9 months flat – OpenAI and Broadcom call it the fastest ASIC development cycle ever achieved in high-performance semiconductors.
- Microsoft may buy 40% of supply – Broadcom has reportedly asked Microsoft to guarantee it purchases 40% of Jalapeño's first production run.
OpenAI just delivered on something it's been quietly building toward for eight months: its own silicon. On Wednesday, OpenAI and Broadcom unveiled Jalapeño, OpenAI's first custom AI accelerator, and according to Broadcom CEO Hock Tan, early testing shows roughly 50% cost savings compared with typical AI GPUs running the same inference workloads.
The chip was delivered in person: Broadcom's Hock Tan and Charlie Kawwas handed a physical sample to Sam Altman and Greg Brockman on Wednesday, eight months after the two companies first announced a deal to co-develop 10 gigawatts of custom AI accelerators. What makes the timeline notable is that Jalapeño went from initial design to manufacturing tape-out in roughly nine months, which both companies describe as the fastest ASIC development cycle ever achieved in high-performance semiconductors.
Jalapeño is built specifically for inference (running already-trained AI models in response to user requests) rather than for the training process that builds those models in the first place. That's a deliberate, narrow target. OpenAI emphasized the chip's low operating cost specifically when running real-time coding models, and pre-training will likely keep relying on Nvidia hardware for now. The split matters because inference, not training, is where OpenAI spends money every single day serving its products to users, and unlike training cost, inference cost scales directly with usage.
"We have a deep understanding of the workload. We've really been looking for specific workloads that are underserved, and asking how can we build something that will be able to accelerate what's possible." - Greg Brockman, President, OpenAI
What this means: every additional ChatGPT or Codex query currently costs OpenAI money to serve on rented or purchased GPU capacity. If Jalapeño actually delivers anywhere close to the 50% cost reduction Broadcom is claiming, OpenAI's largest recurring expense gets meaningfully cheaper, and that's before accounting for the leverage of not depending entirely on Nvidia for every chip in its stack.
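To make the scaling argument concrete, here is a minimal back-of-the-envelope sketch. It assumes a simple linear model in which serving cost equals query volume times a flat cost per query. The function names, the query volume, and the per-query cost are hypothetical placeholders chosen for illustration, not figures from OpenAI or Broadcom; only the 50% reduction comes from the claim reported above.

```python
def serving_cost(queries: int, cost_per_query: float, reduction: float = 0.0) -> float:
    """Total inference cost under a linear model (an assumption, not OpenAI's accounting).

    reduction is the fractional cost cut, e.g. 0.5 for the claimed ~50%.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return queries * cost_per_query * (1.0 - reduction)


def savings(queries: int, cost_per_query: float, reduction: float) -> float:
    """Absolute amount saved at a given query volume and fractional reduction."""
    baseline = serving_cost(queries, cost_per_query)
    return baseline - serving_cost(queries, cost_per_query, reduction)


if __name__ == "__main__":
    # Hypothetical inputs: neither number is reported anywhere in the article.
    hypothetical_cost_per_query = 0.01  # dollars, placeholder
    for hypothetical_queries in (1_000_000, 10_000_000, 100_000_000):
        saved = savings(hypothetical_queries, hypothetical_cost_per_query, 0.5)
        print(f"{hypothetical_queries:>11,} queries -> ${saved:,.2f} saved")
```

Under this simplified model the percentage saved stays fixed while the absolute amount saved grows in step with usage, which is why a per-query reduction matters more for inference than a one-time saving on a training run would.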
The business stakes extend well beyond OpenAI. Broadcom has reportedly asked Microsoft to guarantee it will purchase 40% of the chips to secure the first production phase, tying Microsoft's own AI infrastructure plans directly to Jalapeño's success. Hock Tan told CNBC that demand from Broadcom's six largest customers is "simply insatiable," with elevated demand expected to extend beyond 2026 and through 2028. Initial deployment is targeted for late 2026 at gigawatt scale, expanding across multiple chip generations after that.
The numbers being shared right now come with a real caveat. The performance-per-watt figures are self-reported by OpenAI and haven't been independently verified. It's not yet public which competing chips Jalapeño was benchmarked against, on what specific tasks, or under what conditions. A detailed technical report is promised in the coming months, which will be the actual test of whether the 50% cost claim holds up outside a press release.
What happens next matters more than the announcement itself. Engineering samples are already running real workloads in the lab, including OpenAI's GPT-5.3-Codex-Spark model, but "engineering sample" and "shipped at gigawatt scale" are very different milestones. The real signal to watch for is whether Microsoft actually commits to the 40% purchase guarantee Broadcom is asking for. If the company that runs the largest share of OpenAI's existing infrastructure backs Jalapeño at volume, that's the strongest evidence yet that custom silicon, not just rented Nvidia capacity, is where the AI infrastructure race is heading.

