Mike Krieger & Jesse Zhang | Decagon Dialogues 2025

🎥 Sep 01, 2025 📺 Decagon AI ⏱ 22m 👁 349 views



About Mike Krieger

Mike Krieger, co-founder of Instagram and now Chief Product Officer at Anthropic, discussed the company's product strategy and recent developments in two interviews. In a December 2025 conversation, Krieger said he views Anthropic's focus not as "B2B versus B2C" but rather on "what are you trying to help people do," adding that the company is less focused on "entertainment, consumer use case." He noted that Anthropic's partnership with Amazon pushed the company to accelerate optimization work, and he described the company's culture as one where "research folks talking product all the time." Krieger also stated that about 100,000 people had used Anthropic's Claude Code product within a week of its launch, and said that with the Claude 3.7 Sonnet model the company "cut refusals down about 45%." In a March 2026 interview, Krieger said that AI models are "good at adding features" but not necessarily "about figuring out what to cut out of the product." He described the ability to go "from zero to end pretty quickly over the matter of hours" and said he had Claude rebuild the Bourbon app in about two hours, making it "feature complete." Krieger advised founders that "it's ultimately a product that you're building" and that products need to "solve problems 10 times better than anything else," cautioning against overemphasis on temporary technological advantages.

Source: AI-verified profile updated from Mike Krieger's recent appearances.

Transcript (22 segments)

✨ AI-enhanced transcript with speaker attribution

Host 0:04

We have one more session left today before we break for happy hour, to give everyone a chance to chat and mingle. So, to wrap things up today, we're really excited for a quick sit-down chat between Jesse and Mike Krieger. Mike is the chief product officer of Anthropic and, before that, the co-founder of Instagram. So there's a really interesting conversation upcoming about the future of AI. Please join me in welcoming Jesse and Mike to the stage.

Awesome.

Mike, welcome. Thanks for joining us.

Mike Krieger 0:34

Good to be here.

Host 0:36

So, for those who don't know, Mike was one of the co-founders of Instagram. Currently, he leads product at Anthropic. And so, maybe we start there. I think it's obviously a very cool and exciting journey that you've been on with your product background and especially building consumer apps. How has that been applied or merged into your world today, which is a frontier AI research lab?

Mike Krieger 0:59

Yeah, I think there are two challenges that the Instagram experience directly led me to on the Anthropic side. The first one is trying to make something that's emerging more understandable. We take camera phones for granted now; we don't even call them camera phones, they're just phones. But the idea of mobile interfaces, and of that being the primary way we might interact with something, was new when we were building Instagram for the first time. Similarly, you're seeing this paradigm shift where people are getting used to the idea of interacting with AI in all sorts of form factors, whether voice or text. So it's about meeting people where they are, but also being really experimental at the edge of the capabilities, because just as the UI paradigms that worked for classic web apps were not going to work on mobile, the pre-generative-AI interfaces also start breaking down. That's the first piece. The second one is actually maybe more surprising: as Instagram grew, we also became a platform for businesses to build themselves on. Similarly, one of the things I was really excited about coming into Anthropic is that sure, we have our first-party products and we'll work hard to make them understandable and user-friendly, but it's also a platform to power other companies like Decagon and others. That was very exciting for me because my in-between chapter between Instagram and starting a second company was doing some investing, and I liked seeing a lot of companies. I liked feeling that my day was not just a single product, but I missed building. So this is the best of both worlds.

Host 2:34

Awesome. So transitioning that into the theme of today and what's talked about a lot now: how do enterprises get ROI from these models? There's a lot of debate about some pilots and use cases that work and some that don't work as much yet. What is your view on what separates use cases that are getting a ton of traction in the market from others that have struggled?

Mike Krieger 2:59

Yeah, I think the art and science of AI deployment, as many of you are experiencing as you roll it out within your own companies, comes down to a phrase I heard internally and love: there are two ways of timing an exponential, too early and too late. It's really hard as these capabilities are advancing. Too early, and you end up in that trough of disillusionment among people who rolled it out internally, where you've overpromised. Some products that were earlier to market really did this: 'Great, you're just going to hit a button and all your work will be totally automated.' People try it and it's like, 'Well, it made something, but I'm not sure I'll ever do that again.' Or there are products that push too far or too quickly on autonomy or delegation before the actual quality is there. But you don't want to be too late and feel left behind while other companies have adopted and accelerated their own efforts. For us, the most successful deployments often start by having champions inside enterprises who are constantly playing with these capabilities. I was giving a talk last week with a very large bank, and they're fortunate to have leadership that day in and day out plays with these models and gives us feedback. It probably reflects more poorly on us than well on them: they'll report brokenness in features we didn't even notice ourselves. That's a sign we need to improve our auto-detection, but it also shows they're really out there trying these different pieces. What's interesting is that when it leads to a deployment, whether it's an internal productivity tool or a customer-facing feature, it's not grounded in the myth that AI will solve every problem, but more around 'I tried it, it's really good for these pieces, we're going to put it in production.' I love the conversation around starting with an ambitious but scoped use case and building up from there, with really strong evaluations: 'Is this actually working? Can we scale this up?'
Then what we find is that one successful use case starts unlocking others, because other product teams see a good deployment and ask how they did it. We also find that the first several months are just getting through contracting and getting comfortable. So if you can land that initial use case, the second, third, and fourth are much easier.

Host 5:39

Yeah, totally makes sense. And speaking of use cases, what we're seeing in industry is that there have been two big horizontal use cases that have gotten a lot of traction: one is customer service and the other is codegen. I know you had some interesting thoughts on why those have been successful and the similarities between them. In codegen, for example, with Claude Code or Cursor, a lot of companies are having success helping engineers code. So maybe you can speak to why those two use cases have been successful and what could be applied to the next big use cases.

Mike Krieger 6:17

Yeah, it's kind of funny. If you talk to customers of our models, the two things they often say the models are really good at are coding or using tools in general, and having a really good voice and being very friendly. You wouldn't expect those two to necessarily emerge from the same model, but it comes from the longer arc of research at Anthropic. We want to build powerful, intelligent AI, and we want to do it safely and responsibly. We think the critical path is models that can do two things really well: one is plan and work on tasks for longer and longer, which fits well into codegen but also into many customer service use cases where you don't just want a single answer. A few years ago, we could have done retrieval off an internal knowledge base and answered a simple question. What you really want is what we saw in the demos today: understanding context, retrieving data, and taking actions to ship a replacement. That all involves acting agentically. The second is models that can interact with humans over time, because many important interactions on the way to powerful AI will involve collaboration with humans. So those two pillars of our research are long-horizon tasks and Claude having a character—not a single character, because each customer steers it differently, but fundamentally an empathetic voice you want to interact with that doesn't feel robotic. Those have been two use cases where we've seen customer traction. Customer service actually took off before codegen in the initial phase, but I think there's more in common between those two than you might think. As we think about the agents we're building even inside Anthropic for problems outside of code, we deployed one for our legal team to help with redlining, and one for our security team to intake new product ideas before they hit a security engineer. They're all using the same tool-using, agentic loop, and they might write code under the hood, but the person on the other end doesn't know that. 
They're just trying to solve a problem, and Claude is solving it for them.

Host 8:34

Makes sense. You mentioned that as a lab, you've been developing some of your own applications. I'm curious how this plays out long-term. A lot of your team is hardcore training the models, making sure they get better. You also have parts building applications like Claude Code and maybe eventually further applications. What do you think will be the way that frontier labs end up working with application companies like Decagon, and how will that evolve?

Mike Krieger 9:03

Yeah, I describe it as an L-shaped platform. The breadth of the platform we want to power is everything, from companies like Decagon to life sciences, healthcare, and financial services. There's a lot of specialized knowledge about how to go to market, tools to be built on top, and integration into customers. That entire relationship is differentiated and unique for all those companies. We'll choose to build some verticals when it's something we know a lot about, like code, or in places that will teach us a lot about a particular vertical. The feedback from having Claude Code in market directly informs the next model releases and lets us improve. The feedback from work on the knowledge worker horizontal with Claude for Work has been helpful in training the model to produce PowerPoint documents better. So that's where we choose to build vertically. Connecting back to the Instagram experience, we were really focused on a small set of features we wanted to do extremely well, rather than a platform that did everything not as well. We're applying the same philosophy: we're not going to do 500 things, but a few things we try to go deep on.

Host 10:25

Got it. That makes a lot of sense. You mentioned there's almost a competitive differentiation element. In the AI space, there are so many companies building different things, both in applications and with AI labs training foundational models. How do you view competitive differentiation in the context of an AI lab compared to other labs or new research labs popping up?

Mike Krieger 10:53

Yeah, there are a few aspects: the model itself, the platform around it, and the relationship with companies building on your platform. I'll go in reverse order because relationships are really important. When I came into Anthropic about a year and a half ago, it wasn't obvious we would compete with other frontier labs. We had less funding and a much smaller team. Claude 3 had just come out. My hypothesis, shared by many, was that we could just lean in and care more, really lean into every customer and make sure they're successful. You see it in how much I join customer Slack channels. When customers are gearing up for a big launch, we try to get them over the line so their success is partnered with ours. Then there's the platform around the model. The first version of many APIs was just 'give us your prompts, we'll give you an answer.' That's fine for getting started, but over time there are more complex use cases: memory, multi-turn agentic use cases, helping improve prompts. We're building more than just models as platforms. In terms of differentiating on the model, I wish I could say labs will find a unique secret sauce no one else can replicate, but it's more like you can't stop running. We describe it internally as a marathon of sprints. It feels that way for the research team especially. But it's also what we signed up for, and it's an exciting moment. Each lab has its own taste and approach. It's not that we're building the same models, but it's a non-stop competition to push the frontier forward. The difference with Anthropic is we're trying to do that while also pushing forward the notion of building these models safely and showing that you can do both—they aren't in tension.

Host 13:19

Yeah, that definitely resonates with our team. We think about the same things at a granular level, working with customers every day to make sure problems are solved and they're happy. But in the long term, it's a large market with a very obvious use case. It feels like a constant sprint: there's always more to build, and you have to be the fastest. So I think that's true for applications as well. Maybe shifting gears: one big theme with AI is the future of work for humans. There are many jobs AI can do fairly well now. What do you view as the role of the human worker as AI gets more productive, and how will that change over time?

Mike Krieger 14:13

Yeah, this will definitely evolve. One thing we think a lot about is the comparative advantage for humans. A really strong pillar is trust-building and relationships. We have a growing sales team, and even as we bring Claude into parts of the sales process—prospecting, composing initial reachouts—it's an accelerant for good sales folks, but it's not going to close a sale or make a long-term relationship happen. It's more about building trust and deeply understanding the problem. There's a human at the other end of the line, at least for the next few years. So we need to keep that piece human. When I think about our own product team, we use Claude in many ways: honing our thinking, understanding user feedback at scale in a privacy-preserving way. But none of the models are at the point of creating novel ideas or synthesizing data in a way that sparks something new. There's a back and forth: our product team is deep in user problems, creating new ideas, and the models are great for vetting those or doing research. We still see it as a back and forth. As model capabilities improve, that comparative advantage might change. You might see more bifurcation, as in the case of support agents where more complex problems are being brought in. I think you'll see bifurcation and a continuing comparative advantage, but that's not a fixed point. It won't be the same three years from now. So from a company perspective, we're trying to get folks on that journey of experimenting with AI so they're not caught off guard and can continue to find where models are getting better and shift their roles accordingly.

Host 16:37

Got it. One reason we use Anthropic models is that for enterprises, security and trust always come up. Anthropic has done a good job building a brand of being enterprise-friendly, deployable in VPCs. Is that something you think about when developing models, and what's top of mind from a trust, safety, and security standpoint when building these new developments?

Mike Krieger 17:09

Yeah, even from the beginning, and it's ramped up more in the last year and a half. We've taken a more B2B enterprise view, which is funny coming from a consumer world, but I wanted to do something that didn't feel exactly like what I was doing before. There are a few pillars that matter. One is the partners we've chosen: being available in AWS and GCP, meeting enterprises where they are in their relationships with cloud providers. That's been important. Deploying in a way where we can't physically look at prompts and completions of enterprise data is really important for many deployments. At the model level, it wasn't obvious to me when I started, but now it's clear: the work on making models safe actually makes them enterprise-friendly. You want models that are aware of their limitations and won't make things up, that are good at instruction following so they don't go off somewhere else, that hallucinate as little as possible. We train our models to ground themselves in real data. You also want models that are hard to jailbreak. These two things sound different—important for customer service and for mitigating bio risk—but they're grounded in the same thing: you want models to remain true to their instructions and not be jailbroken. It's been cool to see that it's not that we can be both enterprise-friendly and safe, but that we are enterprise-friendly because we build models safely. We hear that from customers who say the model was robust against jailbreaking and honest, which made it the one they wanted to deploy.

Host 19:12

Great. One last topic: people love talking about AGI. How would you define AGI, how long will it take to get there, and what are the big milestones we haven't reached yet on the road to AGI?

Mike Krieger 19:23

It's become very clear to me, and I think to a lot of people, that this is not going to be a zero-to-one moment where we just say 'we got to AGI.' It's much more of an evolution of intelligence that is increasingly spiking in some areas and still not good in others. You see this in Claude's ability to write and run code for hours at a superhuman level, but if you put Claude in a Slack channel, it still has a long way to go before it sounds human: knowing when to chime in, how to synthesize, how to act human. There are many places where it's not there yet. You'll continue to see it be superhuman in many ways and still make mistakes a human wouldn't. I don't know when you'll be able to demarcate it and say 'this is AGI.' Maybe a good barometer is whether you'd trust it to be a full-time employee. We have an internal metric: the co-worker Turing test. If you put Claude behind Slack, would it pass for a full co-worker? None of the models are there yet. There's a long way to go. There are big areas left. One is maintaining memory intelligently: what's in the model's context, what did it remember, can it say 'I did this three months ago' and remember the right things? Managing memory over long horizons is a big piece. Taking feedback and incorporating it correctly is another. Knowing when to be proactive and when to lean back is another. Vision has improved, but even now, models can generate UIs, but can they look at them and say how they could be better? That loop isn't fully there. Even in coding, there are complex use cases they haven't quite gotten to. So these are some dimensions, but none are like 'one click and we'll achieve AGI.' You'll continue to see each generation improve. If you look back, it was just February that we launched Claude Code and Claude 3.7, and now in September, the models feel very different. I think you'll be able to look back five or six months from now and have a similar experience.
Whether that feels like AGI in some ways, it probably will, but in others it will still be far off.

Host 22:44

Yeah, super interesting. Well, thank you so much for coming, Mike. Let's give Mike a huge round of applause.

Mike Krieger 22:47

Yeah, thanks for having me.