ClusterMax, InferenceMax & the Token Efficiency Race | Dylan Patel at Aria Networks Launc
SemiAnalysis founder Dylan Patel breaks down the explosive growth of AI infrastructure, the rise of ClusterMax and InferenceMax ...
Founder, CEO, and Chief Analyst, SemiAnalysis
Search every verified Dylan Patel interview, podcast appearance, and on-the-record quote; each transcript is cross-checked by AI and human review to confirm speaker identity.
Dylan Patel, founder and CEO of SemiAnalysis, spoke at several industry events in early 2026 about AI infrastructure, benchmarking, and market dynamics.
At an Aria Networks launch event in April, Patel said that AI inference demand has grown so rapidly that rental prices for three-year-old H100 GPUs on one-year deals rose from around $1.60-1.70 per GPU-hour to over $2.40 per GPU-hour in six months, with no spare capacity available. He also discussed the InferenceX project, which he described as a free and open-source benchmarking effort with over a thousand GPUs donated by companies including OpenAI, Microsoft, and Nvidia.
In a March interview at the Daytona Compute Conference, Patel said that hyperscalers such as Google, Amazon, and Microsoft were slow to move into AI, creating an opportunity for "NeoClouds" that could skip complex legacy software. He also noted that the entire cloud market had run out of CPUs, with Amazon's CPU server installations tripling year-over-year.
In an April interview with Patrick O'Shaughnessy, Patel said his firm's AI token spend had risen from tens of thousands of dollars annually to $7 million, driven by non-technical staff using AI for coding. He stated that "ideas are cheap and plentiful but execution is very easy," and warned that people who do not use more tokens, generate value from them, and capture that value will "never escape the permanent underclass." Patel also predicted a "large scale protest against Anthropic and AI," citing a Pew survey that he said showed AI is less popular than politicians.
In a panel at the Beyond Summit, Patel asserted that vendor benchmark claims are "lies, impossible to achieve," and that "if you're not pissing off people with your benchmark, then you're not testing something useful."
“the AI market growing so fast and inference demand growing so fast that the price of 3-year-old GPUs is soaring. Right? In the just in the last 6 months it's gone from, you know, deals for 1 year transacting at 170, 160 an hour for H100s to now 240 plus. And and you know, in reality there's actually no spare capacity t...”
“network performance drives a not just a hey 20 30% it's actually multiple like X's, right? It's 5X 10X performance difference if you have really good networking versus not.”
“as we've seen recently with Nvidia and their acquisition of Groq, um there is a tremendous focus on that super race car side of the uh fence. And part of that race car is that you need to have incredible networking performance.”
“In the age of AI the hyperscalers were a bit slow to move right Google Amazon Microsoft bit slow to move into AI and so a whole new crop of companies popped up and there was a new low bar right there's no need for a lot of the complex software that Amazon Microsoft Google had built up and a lot of this in fact slowed d...”
Dylan Patel (SemiAnalysis) & Ivan Burazin (Daytona) Live from Daytona Compute Conference, Chase Center SF, March 9, 2026 ...
From the Beyond Summit 2026 stage, David Kanter, Founder of MLPerf, Dylan Patel, Founder of SemiAnalysis, Micah Hill-Smith, Co-Founder & CEO of Artificial Analysis, and Jeff Tatarchuk, Co-Founder & CGO of TensorWave, discuss the evolving and often controversial landscape of AI benchmarking. The panel explores the critical role of independent, third-party evaluations in challenging vendor claims and providing developers with reliable data on cost, intelligence, and real-world performance. From the early focus on training accuracy and architectural neutrality in MLPerf to the rapid, daily perf…
Original video: The Supply and Demand of AI Tokens | Dylan... Patrick O'Shaughnessy sits down with Dylan Patel, founder of SemiAnalysis, to explore the explosive supply and demand dynamics of the AI revolution. Dylan shares how his firm's token spend skyrocketed to $7 million a year, completely transforming their productivity and highlighting a new era where execution is cheap, but high-quality ideas are at a premium. They dive into the implications of Anthropic’s frontier models like Opus 4.7 and "Mythos," the hidden bottlenecks in the semiconductor supply chain (including memory, TSMC, and CP…
Patrick O'Shaughnessy sits down with Dylan Patel, founder of SemiAnalysis, to explore the explosive supply and demand dynamics of the AI revolution. Dylan shares how his firm's token spend skyrocketed to $7 million a year, completely transforming their productivity and highlighting a new era where execution is cheap, but high-quality ideas are at a premium. They dive into the implications of Anthropic’s frontier models like Opus 4.7 and "Mythos," the hidden bottlenecks in the semiconductor supply chain (including memory, TSMC, and CPUs), and the economic phenomenon of "phantom GDP." Finally, D…
0:00 Intro
1:22 What is codesign?
2:49 Codesign example: Swish vs ReLU
4:22 Are DeepSeek papers codesign?
6:45 Predicting where ML research will go
8:06 Should researchers hate your chips?
9:34 Can you codesign too much?
13:23 Picking the right grain size for specialization
16:22 How much hardware flexibility for The Age of Research?
20:05 Did reasoning and RL disrupt hardware roadmaps?
23:09 Cerebras/Groq: unexpected wins on reasoning and RL
25:34 Disaggregating MLP and attention
29:06 The right metrics for quantization and codesign papers
Full episode: https://youtu.be/mDG_Hx3BSUE Me on x: https://x.com/dwarkesh_sp Dylan Patel, founder of SemiAnalysis, provides ...
Dylan Patel breaks down the current chaos inside the world's top AI companies. Dylan is the founder and CEO of SemiAnalysis, ...
In this episode, Dylan sits down with Waleed Atallah to talk GPU kernel optimization, the growing complexity of deploying AI across diverse hardware, and why automated kernel generation is becoming essential infrastructure for the next era of compute. Waleed is the CEO and Co-founder of Makora (formerly Mako), an AI infrastructure company building the hardware-agnostic performance layer for GPU compute. Makora automates GPU kernel generation and tuning, helping developers deploy models faster with better price-performance across NVIDIA, AMD, and custom accelerators — no rewrites required. Bef…
Patrick Moorhead, CEO of Moor Insights and Strategy, and Dylan Patel, CEO of SemiAnalysis, join 'Closing Bell' to discuss their takes on Nvidia's recent backlog comments, whether the company can dominate its next phase, and much more.