This article is from the free weekly Barron’s Tech email newsletter.
GPU Supremacy. Hi everyone. When Taiwan Semiconductor Manufacturing reported earnings last month, the world’s largest contract chip manufacturer said demand for nearly every product category had weakened except for one: AI chips.
TSMC was primarily talking about the graphics processing units [GPUs] it makes for Nvidia (ticker: NVDA), which dominates the market for semiconductors used for AI applications. The rising excitement over generative artificial intelligence has created a shortage of Nvidia’s high-end H100 GPUs, which are best suited for the parallel computations needed to train AI models and serve customers.
Nvidia’s products have now become the technology industry’s most precious resource. Corporations and start-ups are frantically shifting budget priorities to new artificial intelligence projects and clamoring for GPUs. “Demand is outstripping supply [for Nvidia GPUs],” Amazon Web Services CEO Adam Selipsky said during an interview with The Verge this week, while emphasizing that AWS was the best place for customers to get continuous access to Nvidia GPU capacity.
While AWS has excellent visibility into the world of cloud computing, there’s another company that may actually have a better view of what’s happening on the front lines of AI: cloud service provider CoreWeave.
CoreWeave, which was founded in 2017, provides large scale GPU capacity via the cloud to start-ups and larger enterprises. The company has a close partnership with Nvidia and was one of the first vendors to bring its H100 GPUs to market. Nvidia invested in CoreWeave in April through the start-up’s $221 million Series B funding round.
Barron’s Tech recently spoke to CoreWeave co-founder and CTO Brian Venturo to discuss GPU technology, the state of the market, Nvidia’s chip ecosystem, and the most significant risk for AI start-ups.
Here are edited highlights from our conversation with Venturo:
Barron’s: Describe CoreWeave’s business. Who is your typical customer?
Venturo: CoreWeave is a specialized cloud provider. We’re not here to host your website. We are built to serve GPU-accelerated compute to large-scale users of artificial intelligence, machine learning, real-time rendering, visual effects, and life sciences work. It’s typically for customers who have significant parallel computation tasks that they need to run.
What is happening in the GPU market? When exactly did the AI demand boom start?
In the first quarter of this year, it was still pretty easy to secure [GPU] allocation and capacity in the supply chain. Starting in early April is when the market got incredibly tight. The lead times went from reasonable to the end of the year. And [that shift] happened in one week. It wasn’t just cloud service providers; they already had their allocations. This was all incremental demand, from large enterprises and AI labs.
Nvidia’s top-of-the-line H100 is reportedly nearly impossible to purchase in the current environment. When can customers get access to H100 GPUs and capacity if they buy today?
Anybody who is being reasonable in their logistics and resource planning is looking at Q1 2024 to Q2 2024 now. We’re starting to make purchases for our deployments in Q2 and Q3 of next year.
Why are customers clamoring for Nvidia AI chips versus offerings from Advanced Micro Devices and cloud vendors?
Nvidia’s moat is twofold. First is on the hardware side. Nobody makes chips as well as Nvidia does. Second is the software. Time to market is incredibly important for start-ups. If you are required to retool your entire tech stack to use AMD or a TPU [Tensor Processing Unit from Google], that is valuable time where you could lose your market opportunity.
Nvidia was incredibly prescient when they invested so heavily in the CUDA [software programming platform] ecosystem. They are basically 10 years ahead of everyone else now. It’s not 10 years of just Nvidia, but their customers and developers building on that ecosystem [with software tools and libraries], leveraging everyone else’s prior work. I don’t see anyone else overcoming Nvidia in the short term or even medium term.
The problem with using Google’s TPU and AWS Trainium accelerators is being locked into a vendor with a very specific technical solution. It is probably not the best choice to make as a start-up. You want to have vendor flexibility knowing you can get the same thing at multiple places.
[Nvidia’s proprietary networking] InfiniBand also offers the best solution today to minimize latency. Other offerings don’t have the congestion control and features to make workloads perform the best.
What are AI start-ups telling you about their growth plans?
There’s a lot of worry from AI start-ups that there may not be enough GPUs available to serve inference [the process of generating answers from AI models] when they find commercial success. To me, it’s exciting as an infrastructure provider. But from a start-up strategy perspective, getting guaranteed access to that compute becomes almost a binary business risk.
Thanks for your time, Brian.
Write to Tae Kim at [email protected] or follow him on Twitter at @firstadopter.