Ion Stoica on AI code generation

From Distributed systems to AI platforms with Mark Russinovich & Ion Stoica | BRK227 · May 28, 2026 · Microsoft Developer

“I think that, from the theoretical perspective, it is very hard for me to see how it's ever going to be solved. And fundamentally, because we know that no axiomatic system can be complete. So in some sense that kind of tells you something. I think that when we look at other techniques to do it, for instance, assume that you have a formal specification. You can write it in Lean or Coq and so forth. So from that one, if I have the formal specification, then you can generate the code and then you can generate the proof that the code satisfies the specification. So that is kind of good. And you can see that the problem is that obviously you then need to write that specification, which is not easy. And even when you write the specification, it can be incomplete, right? You don't specify a property because, you know, like in this case, there are specifications for load balancers, but they don't write, oh, you shouldn't drop the packets, right?”

Ion Stoica

Cofounder, Databricks

AI code generation · formal verification · specification completeness

On May 28, 2026, Ion Stoica, Cofounder at Databricks, spoke about AI code generation during Distributed systems to AI platforms with Mark Russinovich & Ion Stoica | BRK227 on Microsoft Developer.

Watch on YouTube at 41:21
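
The load-balancer point in the quote can be illustrated with a small Lean 4 sketch. Everything here is a hypothetical toy model written for this page, not code from the talk: the names Packet, Backend, Balancer, RoutesToValidBackend, NeverDrops, and dropAll are assumptions, and a balancer is modeled simply as a function from a packet to an optional backend, where none means the packet was dropped. The sketch shows a specification that says forwarded packets must reach a valid backend, and a balancer that drops everything yet provably satisfies it.

```lean
-- Toy model (assumption): packets and backends are just natural numbers.
abbrev Packet := Nat
abbrev Backend := Nat

-- A balancer maps each packet to a backend, or to `none` if it drops the packet.
abbrev Balancer := Packet → Option Backend

-- Incomplete spec: every packet that IS forwarded goes to a valid backend (index < n).
-- Nothing here says that packets must be forwarded at all.
def RoutesToValidBackend (n : Nat) (lb : Balancer) : Prop :=
  ∀ p b, lb p = some b → b < n

-- The missing property: no packet is ever dropped.
def NeverDrops (lb : Balancer) : Prop :=
  ∀ p, (lb p).isSome

-- A degenerate balancer that drops every packet.
def dropAll : Balancer := fun _ => none

-- The degenerate balancer provably meets the incomplete spec.
theorem dropAll_meets_spec (n : Nat) : RoutesToValidBackend n dropAll := by
  intro p b h
  -- `dropAll p` is `none`, so `h : none = some b` is impossible.
  simp [dropAll] at h

-- It fails the property the spec forgot to state.
theorem dropAll_drops : ¬ NeverDrops dropAll := by
  intro h
  have h0 := h 0
  simp [dropAll] at h0
```

Under these assumptions, a proof that generated code satisfies RoutesToValidBackend says nothing about dropped packets; the guarantee only becomes meaningful once a property like NeverDrops is added to the specification.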

Distributed systems to AI platforms with Mark Russinovich & Ion Stoica | BRK227

June 04, 2026 Microsoft Developer

What will it take to build AI platforms for the agent era? Join Azure CTO Mark Russinovich and UC Berkeley professor Ion Stoica (co-creator of Apache Spark and co-founder of Databricks and Anyscale) to explore how AI infrastructure must evolve as systems become agentic, multimodal, and globally distributed. Get practical insights on next-generation architectures, from training to real-time serving, and why open source, security, and governance are now core platform concerns.

Seating for this session is first-come, first-served. Add it to your schedule to plan your day and arrive early to secure a spot.

To learn more, please check out these resources: https://aka.ms/build26/BRK227

Speakers: Mark Russinovich, Ion Stoica

Session Information: This is one of many sessions from the Microsoft Build 2026 event. View even more sessions on-demand and learn about Microsoft Build at https://build.microsoft.com

BRK227 | English (US) | Cloud platform & data Breakout | (200) Intermediate

#MSBuild

Chapters:
00:00:00 - Introduction and session overview at Build conference
00:00:36 - Speaker backgrounds and history in AI systems such as Spark, Ray, and vLLM
00:04:07 - Discussion on fundamentals of distributed systems applied to AI infrastructure
00:10:19 - Evolution of data centers and rise of large-scale AI supercomputing regions
00:13:33 - Modern serverless computing and its role in AI workloads
00:14:25 - Emergence of agentic AI systems and architectural implications
00:19:31 - Optimization layers in AI infrastructure: algorithms, hardware, and architecture
00:22:03 - Open source AI infrastructure stack and challenges of cross-layer optimization
00:29:30 - Security, confidential computing, and protecting sensitive AI data
00:34:48 - Developer experience, code verification challenges, and discussion on future automation limits

About Ion Stoica

Cofounder · Databricks

Ion Stoica, cofounder of Databricks and executive chair of Anyscale, spoke at a June 2026 conference about reliability as a major barrier to enterprise AI adoption. He argued that AI systems, particularly those using large language models, lack clear specifications and are difficult to debug because they function as black boxes. Stoica noted that moving an AI feature from prototype to production requires 10 to 50 times more resources than prototyping, and he cited a paper from Stanford and comments by Dario Amodei to support his view that ensuring AI agents are safe, reliable, and predictable is a central challenge. He also discussed LM Arena, a platform he helped create that has hosted over 250 million conversations and evaluates more than 700 models, and presented research showing that excessive use of emojis in AI responses is negatively correlated with user preference.

In a May 2026 conversation with UC Berkeley professor Shankar Sastry, Stoica reflected on the founding of Databricks and the current state of AI. He described the proprietary model landscape as inefficient in its use of human capital, with engineers having little incentive to share knowledge. Stoica also commented on humanoid robots, saying the premise is "huge" but that it will take time for the technology to become profitable. He called for universities to adapt to the changing AI landscape and argued that federal taxpayer money should serve as "patient capital" for long-term research, rather than forcing the capitalist system to be more patient.

Profile compiled from Ion Stoica's verified public interviews and appearances.