Rita Klug 1:02:06
Hello everyone, my name is Rita Klug. I'm the VP of product for Cloudflare's developer platform. It's the year 2025, so obviously we're going to talk about AI. And I promise I will mention the hot word right now, agents, as well. But before we get into all of that, I'd like to start with Cloudflare's vision for developers, which I think is really inspiring. Cloudflare's vision for developers, in a nutshell, is to make it possible for every single developer out there to bring their ideas to life, and to make that as easy as possible, from the moment that they write their first line of code to deploying it to the first user and the millions of users to come after that. But here's the thing, making something easy is actually in and of itself really hard. And to do this, when we started on the developer platform eight years ago, we had to take a really big, really bold, and even non-consensus bet. And that was to build our architecture for the platform on top of something called isolates. Now Matthew talked about this before. All code running on the server up until this point had been running on a virtual machine or container. And the problem with that model is that it makes it fundamentally impossible to abstract away infrastructure in a way that really allows developers not to worry about it. We know this actually because Cloudflare's entire first act, when you think about it, has been about bridging that gap between the way that the centralized cloud running VMs and containers works and the way that users actually interact with the internet. Now with isolates, isolates compared to VMs or containers are really, really lightweight, which means that they can spin up really, really quickly and reliably. And they give you that platform that you need in order to be able to run fast. When we made this decision, we knew that we were making a tradeoff, that it would be harder to lift and shift existing applications onto Cloudflare's platform. 
But when you're making a bet, you want to bet on the future and not on the past. And so we wanted to build the best platform for building greenfield applications. And what's really incredible is that today, if you're a developer, Cloudflare offers you an entire full-stack platform of offerings. Everything from compute to storage to AI and media, literally all of the different pieces that you need in order to build an application. And every single line of code that you write gets instantly deployed to Region Earth, which means that you never have to think about scaling it. It's just fast, it's performant, it's reliable, all of these things from the very beginning. Now, a few clever developers clued in to Cloudflare being their secret weapon for running ahead of everyone really, really early on. And for those developers, well, I have a little bit of bad news. And it's that, well, the secret is out. This platform is not so secret anymore. Today there are over 3 million developers that are building on Cloudflare, and we're continuing to see that number grow and accelerate. And part of this is due to this large paradigm shift that we're seeing with AI. We started to talk about this shift a year ago, and how it represented something as big as the cloud, mobile, and social before it. And that statement is truer today than it actually was a year ago. Now I realize that I'm talking to a finance audience, and you all get held accountable to your predictions all the time, so I thought I would give you a chance to hold me accountable to some of my predictions from a year ago. So a year ago, we talked about how 44% of developers, no surprise, were already using AI to help them write code. And analysts were predicting that by the year 2030, about 50% of knowledge workers would be using AI to help them augment their day-to-day tasks. And I remember seeing this figure and thinking, well, they're being a bit conservative. And guess what? They were. 
Today, not in five years but already, we've surpassed that figure, with 75% and not 50% of knowledge workers already using AI. And the number of developers, well, that's almost doubled as well, from 44% to 76%. But there was another, actually more interesting prediction that we made, and it was that workloads in AI were going to shift from training to inference. And guess what? We're starting to see that play out in real time as well. We're seeing that with OpenAI reporting that their new o1 model is now spending more time in inference than in pre- and post-training. We're seeing a similar thing with DeepSeek, which is optimizing training so much that, again, they're starting to spend more of their time in inference. But this is all a year ago. This is all a recap. So let's talk about what's coming next. And while we're still in the early innings of that migration from training to inference, the next shift that's going to happen is from inference being focused on augmentation to inference being focused on automation. When you hear the word 'agent,' that's exactly what people are talking about. So I know that there's a lot of jargon that's flying around right now, and I think people are using the word 'agent' very loosely. So I wanted to share my framework for thinking about the different types of AI. The first type of AI is predictive AI. It's been around for a long time. It's often referred to as machine learning. Cloudflare has been using predictive AI since the very, very beginning of the company. And a really canonical example here is taking something like all past internet traffic and looking at it in order to be able to predict whether a new piece of traffic is a DDoS attack or someone just having a Black Friday sale. Now, generative AI is the new type of AI that's come in over the past couple of years, and as the name would suggest, it allows you to generate something that didn't quite exist before. 
That can be a piece of code, some text, an image, a video. But generative AI has largely been focused on making us more efficient through augmentation. So rather than having to manually artisanally handcraft an email, I can now ask AI to assist me with that. Now, agentic AI is focused not on just the augmentation piece but the full end-to-end automation of a task. So let's run through an example. I think everyone here is familiar with a UI that looks something like this. Maybe you've asked it for some help with a task recently. If I'm a junior salesperson and I need to follow up with a customer, I might ask it to help me write an email so I can follow up more quickly. With an agent, AI can not just help me write the email but it can help me automate the entire end-to-end flow of what you might ask a junior salesperson to do. So if I went to a conference last week and I met a whole bunch of people, I can ask an agent to generate me a list of all of the people that I talked to, then to write me that email. But it's not going to stop there. It will let me review that email if I want to, but then it will actually send it on my behalf. And when a customer follows up, it can either notify me or even write back to the customer. So how do agents work? Agents are really made up of three core pieces. So there's the AI piece, that's the thinking of the operation. There's the workflow piece, that's going to be the executive arm that helps the AI then go and make sure that those things all happen. And then there are the APIs or the tools that actually go and make those actions. But now let's take a look at this through a developer perspective. What would it take for someone to build an agent? So first you need a way to get user input, and more and more we're seeing agents that are communicating over voice. 
So for that you would need something that talks over WebRTC, which is the internet voice protocol, and then you need to convert that from speech to text in order to execute those next steps. You might also use a chat UI, that's a very common pattern. You need somewhere to host it. From there you need some kind of gateway with a cache, so that you can cache responses. If a query comes in that you've seen before, you want to be able to respond fast as opposed to doing the whole thing all over again. You want to be able to observe the AI, make sure that it gets better over time as opposed to worse. Then you need access to the actual large language model, the LLM, that's going to generate all the content, that's going to come up with a plan, that's going to constantly be doing the reassessing of it. And the LLM then needs access to an orchestration layer, kind of like the workflow. For the orchestration layer, you need a unique combination of compute that can actually do the task, but you also need some storage in order to keep track of all of the tasks that have been completed and make sure you're executing on the next one. You also need access to a whole bunch of different tools. So that can be a browser. If I go back to the example of a salesperson sending an email, if that email needs to be personalized, you might want to go to your customer's website, make sure you're including relevant information from there. You need an API for sending the actual email. You need maybe to connect to an internal service to pull up any open support cases to make sure that you're addressing them in the email. And then you need a vector database to hold all of the domain knowledge that's related to your own industry. Finally, sometimes AI is going to need to ask a human for help. This is commonly known as a human-in-the-loop architecture. 
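As an illustration of how the pieces just listed fit together, here is a minimal sketch in Python. It is not Cloudflare's implementation or API: `plan_with_llm`, the `Agent` class, the tool names, and the `approve` hook are all hypothetical stand-ins, and the "LLM" simply returns a fixed plan. The sketch shows a cache in front of the agent, an orchestration loop that records completed steps, a set of tools, and a human-in-the-loop check before the email is sent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def plan_with_llm(request: str) -> List[str]:
    """Hypothetical stand-in for the LLM: returns a fixed plan of tool names."""
    return ["browse_site", "open_cases", "draft_email", "send_email"]


@dataclass
class Agent:
    tools: Dict[str, Callable[[dict], dict]]   # the APIs/tools the agent can call
    approve: Callable[[dict], bool]            # human-in-the-loop hook (assumed)
    cache: Dict[str, dict] = field(default_factory=dict)   # gateway-style cache
    completed: List[str] = field(default_factory=list)     # orchestration state

    def run(self, request: str) -> dict:
        # A request seen before is answered from the cache, skipping all the work.
        if request in self.cache:
            return {**self.cache[request], "cached": True}

        context: dict = {"request": request}
        for step in plan_with_llm(request):
            # Ask a human before the one irreversible action in this sketch.
            if step == "send_email" and not self.approve(context):
                context["status"] = "awaiting_human"
                return context
            context = self.tools[step](context)
            self.completed.append(step)  # track which tasks have finished

        context["status"] = "done"
        self.cache[request] = context
        return {**context, "cached": False}


# Placeholder tools; real ones would call a browser, an internal service, an email API.
TOOLS = {
    "browse_site": lambda ctx: {**ctx, "site_notes": "placeholder notes"},
    "open_cases": lambda ctx: {**ctx, "cases": []},
    "draft_email": lambda ctx: {**ctx, "draft": f"Follow-up for: {ctx['request']}"},
    "send_email": lambda ctx: {**ctx, "sent": True},
}

agent = Agent(tools=TOOLS, approve=lambda ctx: True)
result = agent.run("follow up with conference contacts")
```

Calling `agent.run` a second time with the same request returns the cached result without re-running any tool, and swapping in an `approve` hook that returns `False` stops the run before `send_email`.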
Now, when we talk about building out all of the primitives and all of the pieces that developers need to build a full-stack application, agents are a perfect example of that type of application. And today Cloudflare provides all of the different pieces that you need in one place in order to build an agent. We offer everything from Realtime, which was previously known as our Calls product, to take in that voice input. We have Pages that can help you with the UI. We have our AI Gateway that can help you with caching responses and with constantly evaluating and observing your AI. We have Workers AI for hosting the models themselves, whether that's the LLM or the speech-to-text piece. We have all of these different tools for your agent to be able to connect to, whether it's a browser, an API, an internal service behind Zero Trust, or a vector database. And we have actually the perfect pieces for that orchestration layer. So if in the past you've seen us talk about a primitive that might have seemed a bit more obscure, like Durable Objects, it is actually such a good building block for exactly an agent, because it takes those two things, compute and storage, and puts them together into one piece. Now, I'm not going to take credit for having invented agents, though funnily enough we did consider agents as a name for a Durable Objects product. That would have been very prescient. But we did inadvertently build the best platform for building agents. And it's not just about having all of these different pieces. You might be thinking, well, don't cloud providers have hundreds of different services? Surely you could spin something like this up on one of them, right? So what makes Cloudflare unique? Why is Cloudflare the best platform for building agents? This comes down to three key reasons. The first is the cost and scalability that are always going to be intertwined with each other. The second is performance. And the third is developer experience. 
So let's get into it. I'll start with the cost and scalability piece, and within that I was talking before about how agents are made up of AI, workflows, and APIs. So let's start with that AI piece. Now Matthew was talking before about how GPU utilization tends to be really low for companies. It can float at around 30%. For 30% of organizations it's around 15%. So why is it so low? Well, if you were to run a workload on a hyperscaler, the very first thing that you need to do is pre-provision all the resources you're going to need. Now for training, this is very easy because you have all of the variables ahead of time. You know how large a data set you're looking at, you know how large a model you're going to train, and you're fully in control of the start and the stop buttons. Inference, on the other hand, is incredibly unpredictable. This is because inference relies on human behavior, and that's going to rely on what time it is in the day and what people are doing. Many of our customers are also launching inference products for the very first time. They have no frame of reference to look at for how the product is going to do. So what they have to do is hope for the best and provision their capacity right around that top line. Regardless of how the product actually does, that's how much capacity they need to have on hand to cover the best-case scenario. But what ends up happening in reality is that, again, that traffic is going to continue up and down, and with any other cloud provider you are paying for that 70% of resources that are, in reality, sitting there completely idle. With Cloudflare, we took a serverless approach to inference, which means that we can spin up those resources as you need and scale up, and that we're only going to charge you based on the usage that you end up having. So all of this blue area is what the cost savings look like for our customers. 
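The arithmetic behind that comparison can be sketched in a few lines of Python. The hourly demand figures and the price are made-up illustrative numbers, chosen so that utilization comes out at the roughly 30% mentioned above, and the sketch assumes the same unit price under both models, which real pricing may not match.

```python
from typing import List


def provisioned_cost(hourly_demand: List[int], price_per_unit_hour: float) -> float:
    """Pre-provisioned model: pay for peak capacity in every hour, used or not."""
    return max(hourly_demand) * len(hourly_demand) * price_per_unit_hour


def usage_cost(hourly_demand: List[int], price_per_unit_hour: float) -> float:
    """Usage-based model: pay only for the capacity actually consumed each hour."""
    return sum(hourly_demand) * price_per_unit_hour


def utilization(hourly_demand: List[int]) -> float:
    """Share of the provisioned capacity that is actually used."""
    return sum(hourly_demand) / (max(hourly_demand) * len(hourly_demand))


# Hypothetical traffic: mostly quiet, with one spike that sets the peak.
demand = [1, 1, 2, 2, 3, 10, 3, 2]   # GPU units needed in each hour
price = 1.0                          # assumed price per GPU unit per hour

print(f"utilization:       {utilization(demand):.0%}")
print(f"provisioned cost:  {provisioned_cost(demand, price):.2f}")
print(f"usage-based cost:  {usage_cost(demand, price):.2f}")
print(f"paid-for but idle: {1 - utilization(demand):.0%}")
```

With these numbers, capacity sized for the single spike is paid for in all eight hours, so the gap between the two costs corresponds to the idle share of the provisioned resources.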
And we're not doing this just because we're being nice, although this is a nice benefit for our customers, but it also helps us get better utilization out of our resources, because it means that when one customer is having that trough in traffic, we're able to schedule another workload on those exact same machines. Now this is just the AI piece. Now let's talk about the workflows piece. I was talking before about how the orchestration layer needs to connect to all of these different pieces outside of it, and all of these are again outside of your control. Frequently it can be an LLM that's going to take several seconds to think. It can be a human that could take minutes, hours, days to respond. On any hyperscaler out there, you're paying for that entire duration of the request. This is known as wall clock time, because if you were to look at the clock on the wall from the moment that an operation starts to the moment that it ends, you're paying for that entirety of time. Now I talked in the beginning about the big bet that we took with isolates, and this is where it really pays off. Because isolates are so lightweight, we're able to spin them down when we're waiting on something else to happen and then spin them back up just as quickly. This is something containers can't do. They're just not going to spin up quickly enough, so you have to leave them on the entire time. And we're able to pass those cost savings onto our customers. I'll use an analogy that I think every single person in this room is familiar with, which is if you've ever been in a cab to JFK and you're sitting in that stop-and-go traffic, there's nothing more unnerving than watching that meter tick up even though you're not moving. This is what it's like to be paying a hyperscaler for any of these workloads. Now, if you can imagine that you would only pay for the bits where you're actually getting closer to your destination, that's what it's like to be building this on Cloudflare. 
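The difference between paying for wall clock time and paying only while code is actually running can be made concrete with a small Python sketch. The `Segment` type and the durations are hypothetical, and this is a simplified model of the idea described above rather than any provider's actual billing formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    kind: str          # "compute" (code is running) or "wait" (blocked on something external)
    milliseconds: int


def wall_clock_ms(segments: List[Segment]) -> int:
    """Billed duration when the meter runs from start to finish, waits included."""
    return sum(s.milliseconds for s in segments)


def active_ms(segments: List[Segment]) -> int:
    """Billed duration when only the time spent actually computing counts."""
    return sum(s.milliseconds for s in segments if s.kind == "compute")


# One hypothetical agent run: short bursts of work separated by long waits.
run = [
    Segment("compute", 50),        # build the prompt
    Segment("wait", 4_000),        # LLM takes several seconds to think
    Segment("compute", 50),        # parse the plan, draft the email
    Segment("wait", 3_600_000),    # a human takes an hour to approve
    Segment("compute", 100),       # send the email, record the result
]

print("wall clock ms:", wall_clock_ms(run))
print("active ms:    ", active_ms(run))
```

In this made-up run almost all of the elapsed time is waiting, which is the stop-and-go traffic in the cab analogy: the wall clock meter keeps ticking while nothing is being computed.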
So cost and scalability was the first piece. Now let's talk about performance. And keep in mind that for agents to really take off, performance needs to be really, really fast, because our expectation of agents is going to be the same as it is of humans. And this is where Cloudflare is in the perfect Goldilocks zone of being able to service these use cases, because we're able to get as close to the user as possible but with the entire power of a GPU or an entire data center behind it, as opposed to devices that are always going to be just too small to be able to run the types of workloads you need for an agent. But it's not just the ability to get close to the user, it's about that end-to-end experience of an agent. And this is where we built a really connected and a really smart network that's able to move workloads around based on where it's going to make the most sense for the entire flow to run. And sometimes that does mean that you're going to have the AI piece, the API, all of these things running in the same data center, which is going to lead to the best end-to-end performance. Now, the third reason Cloudflare is the best place for building agents, and this is the one that's closest to me because I started my career in software engineering, it's the developer experience. And I've built applications on the hyperscalers, and the thing that's so painful about it is that you spend about 50% of your time on things that again are not getting you any closer to your destination. And actually, in this environment where things are moving so quickly with AI, every single day there's a new innovation, you just don't have time to waste on all of that. Whereas again, on Cloudflare, because you get all of these benefits out of the box, you don't have to spend time on them. That's time back that you get as a developer to actually build the best application possible. 
By being able to combine AI, compute, storage, our network, and then obsessing over the developer experience, we can create experiences that really resonate with developers. In fact, recently a customer was telling me about how ever since they moved their development from a hyperscaler to us, they have a new problem where their developers are going so fast that their product managers are having a hard time keeping up with them. That's a world-class problem to have. But it's not just about having the pieces, it's about making sure that they all connect and integrate with each other in a really seamless way. And this is what we spend a lot of time thinking about and optimizing: developers' workflows. Just a couple weeks ago we announced our agents framework that allows developers to spin up an architecture that's quite complex like this all in a single command. That's what we talk about when we talk about developer experience. So why is Cloudflare the best place for building agents? Well, Sun Microsystems said it better than we could have, they were just a few decades early. But it's because the network is the computer. And for agents, you need a really smart computer, you need a really interconnected computer, you need a really private computer, secure computer, and Cloudflare's network offers all of these different things. And we know that this is resonating for developers, because developers sometimes tweet about it, but more importantly, when developers are really excited about a technology, what they build is just really incredible. And today already over 13,300 AI companies are building on top of Cloudflare, and I'll share just a couple of the use cases with you today, but I am really, really inspired by the innovation that's coming out of developers. So one use case is in finance. 
It's an agent that actually helps out with due diligence, and it will go and research everything about a company, including the funding that they've raised, the founders, the team size, their budgets, all of that, and help you come up with the next decisions to make. There's even a company that's actually in the medical space that's helping build an agent for clinical trials, and it will go and analyze and summarize all of the data that's been collected so far and help doctors with those next steps, just as you would expect of a lab assistant. And these are just the tip of the iceberg. We're seeing so much innovation coming out of developers every single day. But before I go, I wanted to leave you with one final thought. We talked in the beginning about the big bet that Cloudflare took with structuring our platform on top of isolates. And at the time we got a lot of skepticism. People were thinking, well, it's a pretty big tradeoff that you can't take applications and lift and shift them. And yes, you can be the best in greenfield, but is that market even large enough? Aren't the incumbents too big? And this reminds me actually of something that happened in another industry where technology completely transformed the landscape. So 15 years ago, you might have asked similar questions about photos. Cameras had existed for over a century, and the players were pretty well established. But then a new technology came along, and we started carrying around a smartphone in our pockets at all times. And this completely changed the trajectory of the industry. All of a sudden, in a matter of a couple of days, the same number of photos was being taken as would have taken years to take before that. We're at a similar inflection point right now with AI, with AI driving an explosion of new applications that are being built. In fact, over the next five years, more code will be written than has been written over the course of the entire history of software. 
And for those applications to be successful, you need a platform that's going to make it as easy to deploy that code as it is to write that code. And that's exactly what we've built. The greenfield opportunity is bigger than it's ever been with AI and now agents, and I can't imagine a company better positioned to win all of that opportunity than Cloudflare. It's an exciting time to be a developer, and I'm really honored to get to work on this every day. So with that, I'll pass it over to Mark.