Mustafa Suleyman

Cofounder and CEO, Inflection AI

🔴Mustafa Suleyman Interview Defining AI Intelligence

🎥 Dec 01, 2023 📺 AI Hack Code ⏱ 33m 👁 79 views

Chapters:
00:00:00 Trailer
00:00:56 Introduction
00:01:31 How Mustafa Suleyman saw AI's future
00:04:58 AGI is the AI that we don’t yet have
00:08:38 Open Source AI is about to change everything
00:13:42 How can startups collect high-quality AI data?
00:18:55 What UIs will AI-first companies build?
00:20:16 Should we pursue fully autonomous AI?
00:25:24 AI's path to 99% accuracy
00:27:30 Designing AI for ambiguous problem d...

About Mustafa Suleyman

At Microsoft Build 2026, Mustafa Suleyman announced seven new AI models from Microsoft AI (MAI), which he said were developed as part of the company’s pursuit of “Humanist Superintelligence.” He described this as AI “explicitly designed to serve people and organizations and not replace them,” and stated that the company’s philosophy prioritizes “human well-being and human progress.” Suleyman noted that the compute used to train frontier models has increased by “1 trillion fold” over 15 years and said that “intelligence is now a function of compute,” with three more orders of magnitude expected in the coming years. Suleyman also highlighted that MAI’s models are co-designed with Microsoft’s own Maia 200 silicon, claiming a “1.4x performance per watt gain” compared to running on the GB-200. He emphasized that unlike other companies, with MAI “you don’t rent intelligence from a shared model that learns from everybody,” and that users retain control of their own data and resulting models. Additionally, Suleyman announced a partnership with Mayo Clinic to jointly develop a frontier model for healthcare, stating that the goal is to “put the patient first, deliver the highest quality we can in a trusted way, and then hopefully share that with the world.”

Source: AI-verified profile updated from Mustafa Suleyman's recent appearances.

Transcript (49 segments)

✨ AI-enhanced transcript with speaker attribution

Mustafa Suleyman0:00

We're not going to have GPT-5 tomorrow and suddenly GPT-5 replaces all humans as the ultimate judge or teacher. You can get an 80% prototype going and it can look good, but a real consumer experience requires you to nail the 99th percentile experience. It has to be really high quality consistently. And as soon as the AI breaks out of character, gets something wrong, that destroys the illusion and it breaks trust. So then you lose your consumer. And so from a startup perspective, I think the real trick is finding existing data sources or, more importantly, creating a UI that allows you to collect high-quality data from an interaction with a product domain that you think is valuable, that is producing a moat of highly valuable data that you can then use to post-train and fine-tune your model again and get into that feedback loop. That is a path to creating an enormous amount of value, which I think is why it's such a creative time in entrepreneurship.

Interviewer0:56

Mustafa, thanks so much for joining.

Mustafa Suleyman0:58

Hey Seth, good to see you again. Thanks for having me.

Interviewer1:01

So, very excited to have Mustafa here. As many people in the audience will know, Mustafa is one of the leading pioneers in AI. He's currently the CEO of AI at Microsoft, venture partner at Greylock, former co-founder of Inflection AI, as well as co-founder of DeepMind, which was of course acquired by Google, where he became VP of AI at Google. So, Mustafa, I feel very privileged to be able to work with you through Greylock, but also to have you here to share your take on the current space which is evolving so quickly. So, maybe to kick it off, I'm curious to hear more about how you decided to focus your career on AI before it was obvious.

Mustafa Suleyman1:38

It's kind of strange looking back on it. I had to write my TED talk recently and I reflected on how crazy things are, you know, 15 years after we first started talking about founding DeepMind in 2010. It's hard to overstate how weird we were. Often people say being an entrepreneur is choosing to obsessively work on something that is really contrarian and attempting something that everybody else thinks is impossible. I think in our case, people didn't just think it was impossible. They thought it was completely absurd. And I honestly am not quite sure how we came to have so much faith in our ability to try to do something so out of distribution and exceptional.

Interviewer2:24

What was the initial insight around DeepMind?

Mustafa Suleyman2:26

We didn't just start working on AI or machine learning. We were fully committed to working on artificial general intelligence, producing a system that could exceed human capabilities and knowledge at all levels. And the reason we were motivated to do that is because we genuinely wanted to use AI to solve other problems and to make the world a better place. I think there was no academic lab or setting that could accommodate the scale of investment that we thought was necessary at that time. In academic labs, there wasn't a focus on large-scale engineering. There certainly wasn't a focus on products, and even in the kind of big national project investments that you get from working in government, there was just nothing that was like a technical effort to try to understand what intelligence was at scale and deploy it on important problems. So it was really only startups that were as brave and courageous as was necessary to be successful on this mission. So it was just always obvious that I would do another company at that point. By that point, that was my third effort at a company, and it just felt like this is the only vehicle, because you learn so much along the way by screwing it up and figuring out how to do it right. And I had come off the back of a bunch of years working in the nonprofit sector, in government, in conflict resolution and facilitation. I'd started two small companies, actually, one of which was focused on selling networking equipment and electronic point-of-sale systems to restaurants, but this was way before that was possible, and it was a failure. And the thing I realized is that we need more knowledge and insight in our world to help us address the overwhelming complexity of our systems. It's so difficult to make an intervention into a complex social system today, like an economy or a food production system or the financial system, and be confident that that intervention is going to make the impact that you think it is. 
And that's really one of the reasons why we need amazing AI. We need to be able to make good predictions about complexity in our world in order to create value and to change that world to help people live healthier and better lives. It sounds super cheesy, but that is really what was motivating us back then and still does.

Interviewer4:57

And I'm curious. So you had this mission that unlocking abundant intelligence for the world would actually solve problems that matter. How do you actually define intelligence in this context?

Mustafa Suleyman5:08

The thing that gave me confidence that we might actually be able to make progress towards inventing intelligence was that our third co-founder, Shane Legg, had spent his entire PhD researching various definitions of intelligence and trying to aggregate them into a single metric that we could use to turn the science of intelligence and the neuroscience of understanding biological intelligence into an engineering effort, and really make that a measurable, quantifiable exercise. And the definition that he came up with was the ability to perform well across a wide range of environments. So again, emphasizing generality. And that was a major thing. Now everyone takes the AGI part for granted, as though G is the central part of intelligence, but that's an assumption. Generality happens to be one of the characteristics of intelligence, but it's not the only important characteristic. And it turns out it's also very hard to measure and to scale it down to something that you can really grasp. Another definition, of course, was the Turing test: that a system would be intelligent if it could deceive a human into thinking that it was itself a human during natural conversation. And in some ways, we've sort of crossed the threshold of that intelligence, right? We have systems now that are really very good at conversation, and at least for a few turns, they're certainly better than humans in many respects. You can still tell that it's an AI or a chatbot and not a human, but in a few years' time, you really won't be able to tell. And yet that doesn't really tell us anything about whether these systems are actually intelligent. Every time we cross a threshold in terms of a benchmark or a milestone in AI, you then sort of turn around and go, okay, well here are all the problems with that mechanism of measurement, and here's the next thing that we need to measure.

Interviewer6:53

I think I've heard Reid say that AGI is the AI that we don't yet have. It's always pushed forward into the future.

Mustafa Suleyman7:00

Exactly. Exactly. And it's like the constant carrot that we dangle to kind of chase ahead to move ahead. So another measure that I have proposed is that we should focus more on the capabilities of the system, the actions, the things that it can do, the things that we can observe it doing that have an impact in some environment, rather than this kind of abstract idea of is it general or is it good at having conversation. So that would basically be like, can it produce human-quality labor in a practical environment and actually earn money for that? Or could it write software, for example? It's a measurable thing. So I sort of called that a modern Turing test and said that in the next 5 years, a system would be able to take a very abstract goal, like go create a new product, get that designed, manufactured, drop-shipped, and then distributed and marketed, and try to earn a profit from it. And then you could measure that profit in terms of making a million dollars or something like that. I think there's going to be a system that can actually do that certainly before 2030.

Interviewer8:02

Yeah, that's amazing. Do you expect that that type of system would actually trade off on the G, on the generalizability, and it's actually built for a specific use case?

Mustafa Suleyman8:13

Yeah. I think that it's more likely that we'll have really powerful systems that are specialized for specific use cases that have real deep domain expertise, than we're going to have this sort of very general-purpose system that can switch from being a marketer to a clinician to being a doctor, being a lawyer, whatever. Clearly, the general case is going to come afterwards.

Interviewer8:37

So I want to spend a moment just getting your take on the state of large models today, and maybe start with level-setting for the audience of what was the inflection point that led us to the current state of the GPT-3, GPT-4 style models, the Inflection 2.5 style models, this combination of the transformer architecture and scaled compute.

Mustafa Suleyman8:58

I think that this revolution has been driven by deep learning. We are still building deep learning models, albeit with a slightly different flavor now, the transformer architecture from 2017. And we're now turning those into composable units that are essentially going to act like parts of our software development ecosystem. You're just going to turn to your AI and it's going to actually generate code for you. We're already seeing that with GitHub Copilot, and we're seeing it as a member of your team able to take a natural language instruction and just act in unison with you. And I think people don't quite realize that these models are not going to be large forever. In the history of all technologies that have been valuable, anything that is significant gets cheaper and easier to use over time, and that curve is sort of double exponential over the last couple of years. It's incredible. I mean, Microsoft AI just released Phi-3 fully open source. It is close to, but not quite at, GPT-4 level. It's fully open source. It's 3.8 billion parameters, right? So that's more than 100x smaller in terms of inference compute than basically the absolute frontier of models today. Like I said, it's not quite as good, but it's certainly as good as or better than GPT-3.5. That's mind-blowing. I mean, that's something that can fit on your laptop or on a phone. So we should expect that trajectory to continue. I think open-source models are going to be very close behind the closed-source proprietary API models, months or maybe even just a year or a year and a half, and that's just going to change the whole creation landscape basically.

Interviewer10:54

That's super interesting. And what enabled this model to be almost equal in performance while also being much smaller?

Mustafa Suleyman11:02

Well, for the last couple of years, everybody has been focused on reinforcement learning from human feedback, where in the final stages of training, at the stage of fine-tuning or post-training, you have a bunch of trained raters or judges that compare two possible responses or completions from a model, and that pairwise comparison provides large-scale feedback for the kinds of behaviors that you want your model to exhibit. Everyone's familiar with that now. But what we've been focused on, as soon as that was showing promising signs, for the last 18 to 24 months, is reinforcement learning from AI feedback, where we really want very smart and capable models to do that pairwise comparison, because obviously we can automate that process and we can produce an even larger number of supervised fine-tuned labels to give more feedback to the pre-trained model across a wider range of experiences and moments that might be in tension with one another if you only have a small number of samples that come from expensive, highly trained humans. So that was one method: reinforcement learning from AI feedback. And then the second is generating training data from these models. Sometimes people refer to that as distillation, where you're trying to absorb as much of the best bits of a big powerful model as possible, and then you're using that to post-train or align your smaller model. And parameter count is no longer the primary proxy for capability. High-quality data is the real valuable asset here, in addition to the architectures. So for the last 6 to 12 months, everyone's been focused on compute, compute, compute, can I get compute, and obviously that's kind of important, or large models, large models, but really it's investing in high-quality data. 
So from a startup perspective, I think the real trick is finding existing data sources or, more importantly, creating a UI that allows you to collect high-quality data from an interaction with a product domain that you think is valuable, producing a moat of highly valuable data that you can then use to post-train and fine-tune your model again and get into that feedback loop. That is a path to creating an enormous amount of value, and it does not require you to depend on the large-scale model providers, which I think is why it's such a creative time in entrepreneurship.
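
Editor's sketch: the pairwise comparison step described above can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not the pipeline used at Inflection or Microsoft. The names `PreferencePair`, `label_pairs`, and `toy_judge` are hypothetical, and the toy judge simply prefers the shorter response so that the example runs without any model; in practice the judge would be a trained human rater or a stronger model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A judge takes (prompt, response_a, response_b) and returns "a" or "b".
Judge = Callable[[str, str, str], str]


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


def label_pairs(samples: List[Tuple[str, str, str]], judge: Judge) -> List[PreferencePair]:
    """Turn (prompt, response_a, response_b) triples into preference pairs."""
    pairs = []
    for prompt, a, b in samples:
        verdict = judge(prompt, a, b)
        if verdict not in ("a", "b"):
            raise ValueError(f"judge returned {verdict!r}, expected 'a' or 'b'")
        chosen, rejected = (a, b) if verdict == "a" else (b, a)
        pairs.append(PreferencePair(prompt, chosen, rejected))
    return pairs


def toy_judge(prompt: str, a: str, b: str) -> str:
    # Hypothetical stand-in for a human rater or a stronger model:
    # prefers the shorter response, ties go to "a".
    return "a" if len(a) <= len(b) else "b"


samples = [
    ("How are you?", "Fine, thanks. You?", "I am a large language model and cannot feel."),
    ("Name a fruit.", "There are many fruits one could name, such as apples.", "Apple."),
]
pairs = label_pairs(samples, toy_judge)
for pair in pairs:
    print(pair.chosen)
```

Swapping `toy_judge` for a call to a capable model is what would make this reinforcement learning from AI feedback rather than from human feedback; the resulting chosen/rejected pairs are the kind of data that post-training consumes.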

Interviewer13:35

Yeah, that's super interesting. As a startup, you're competing with incumbents that have access to large sets of data. So I'm curious if you could share anything more about the nuances of opportunities that might exist for startups to get specific types of data that's maybe more valuable than others.

Mustafa Suleyman13:53

Okay. So how do you go about collecting high-quality data? Because people have slightly different approaches. In pre-training, it's about volume of tokens, and there the hyperscalers will have a longstanding advantage because they already own search engines or YouTube or whatever it is. Whereas in post-training, you need a small number of very high-quality tokens to align the model to the behavior that you want for your product, and you can collect that from scratch. When we built Pi, we created what is, to this day, the most high-quality, humanlike conversational AI, with the best EQ in the market, right? And we didn't use any data from big providers. We collected all of it ourselves from scratch by training paid teachers. We called them AI teachers. Some people call them raters. But the crucial thing for a startup is you have to really, really, really pay attention to training those teachers. You have to pay them a lot of money. I'll just tell you from our perspective, we selected people who had an undergraduate education, nothing less, that largely spoke English as a first language, with some exceptions, that had a domain expertise that we thought was valuable. Like maybe they said they were very passionate about history, or they had good cultural knowledge, or they were movie buffs, or whatever it is. They had to pass 20 hours of training and testing by us. So we would give them reading and comprehension exams. We would give them multiple choice questions. They would have to do sentence completions. They'd have to do spot the differences. They would have to do really quite hard analytical tasks. And in order to keep my team super humble about how valuable this task was, I would obviously also have all my team go through the same training and go through the same test. And I can tell you, not even a majority of people passed.

Interviewer15:56

Yeah, I was about to say I'm nervous. I sat the test. Yeah.

Mustafa Suleyman16:02

It's actually not easy. It's a pretty tough thing to do because you're asking a human to read through two 10-turn conversations, look at the proposed answer by one model and by another model, and then absorb a huge behavior policy, very detailed line by line: the AI should do X, shouldn't do Y, in this situation it should do this, and then remember the training from the AI teacher that says all kinds of subtle exceptions and stylistic tones and brand and capability awareness, and then you have to find the correct intersection of all of those to decide: is this paragraph more in line with the behavior policy or is this one more in line? It's a painful task.

Interviewer16:49

That's super interesting. I'm curious how you view this evolving, right? Because you mentioned we're moving from reinforcement learning from human feedback to also learning from AI feedback. So how should application-layer startups think about being vertically integrated versus choosing which parts of the stack they really need to be experts in?

Mustafa Suleyman17:09

Yeah, that's a good question. I think you have to be very principled in answering that question, and that's where the bet of your startup is. You have to decide which part of it am I going to bet on. Obviously, a bunch of people are building tooling and infrastructure, and that's fine. We all are familiar with that kind of strategy. I'm a great believer in building and owning your own product and, as much as possible, controlling the key bit of the value there, which in my opinion is the LLM. Everything around that is secondary. The words that come out of the LLM are what you have to focus on. And that means that I think it's reasonable to break off the pre-trained model and get that from somebody else. That's a good approach. But I think you need to own your fine-tuning stack, and I would not give the fine-tuning stuff to somebody else. You have to train your teachers because that's not going to go away anytime soon. We're not going to have GPT-5 tomorrow and suddenly GPT-5 replaces all humans as the ultimate judge or teacher. I think that's quite unlikely. It's going to be a lot better than GPT-4, but even the people that have tried to do RL from AI feedback with GPT-4, the quality is okay and it is impressive. It's very cool, but it isn't on the verge of replacing humans entirely. You can get an 80% prototype going and it can look good, but a real consumer experience requires you to nail the 99th percentile experience. It has to be really high quality consistently. And as soon as the AI breaks out of character, gets something wrong, has a hallucination, whatever you want to call it, that destroys the illusion and it breaks trust. So then you lose your consumer. And so I think that the key thing for startups, at least for the next year, is to get really good at data collection and data filtering and data quality.

Interviewer18:53

That makes sense. I'm curious what are the different UIs that you think AI-first companies are going to be building, whether it's a chatbot, an agent, just regular SaaS that's enabled by AI?

Mustafa Suleyman19:07

Yeah, I think that in my opinion, the UI needs to get out of the way, especially for the consumer. Obviously for SaaS, you can have all the bells and whistles and all the developer features, right? But for a consumer, the goal is to get the UI out of the way. So we created a very pared-back, quiet, soft, calming, I think quite distinct-looking AI with very few buttons. But we also have one of the best voices in the world. In the end, I think we had nine or ten voices. They were really high quality and very, very humanlike. Pi is still live and people should try it out. But I think voice-first is a big part of the future UI.

Interviewer19:50

Yeah. I like that you have to choose the voice that you like as part of the onboarding experience.

Mustafa Suleyman19:56

Yeah. Because it's sort of like connecting with your AI. It's a personalization moment. And 30% of all our conversations took place on voice. And they were by far the longest, most engaged, most retained users. So people should keep that in mind. I think that's a very important insight.

Interviewer20:14

Yeah, that's super interesting. I've heard you talk about how AI has IQ and EQ, and then you've talked about AQ, right, an action quotient. So I feel like this is a very interesting topical thing at the moment where people are talking about these autonomous agents for certain use cases, which includes reasoning and planning. I'm curious how far away we are from the kind of chatbots that most people have experienced today to a fully autonomous agent like the one you described at the beginning that's capable of executing an end-to-end task. What's missing between where we are today and that vision in the next 6 months to 3 years?

Mustafa Suleyman20:57

So I think first of all, I don't think we're on a path towards fully autonomous, and I think that's actually quite undesirable. I think fully autonomous is quite dangerous. I got a lot of stick after my TED talk because I said that the autonomous capability was dangerous and that it should be regulated, and I don't really care. I still think that if you have an agent that can formulate its own plans, come up with its own goals, acquire its own resources, just objectively speaking, act completely independently of a human, that is going to be more potentially risky than not. So I think about it as these narrow veins of autonomy where you give it a specific goal and it has limited degrees of freedom to go off and act in some specific environment. So like making an API call to check some registry to collect some information, to observe some state, maybe writing something into a third-party API that is not yours but is again restricted with some specific degrees of freedom, because I think the security risks here are significant. So yeah, I just think we should tread carefully on the autonomous piece. In terms of the actions piece, it's still pretty hard to get these models to follow instructions with subtlety and nuance over extended periods of time. I think that they can do it, and there's a lot of cherry-picked examples that are impressive on Twitter and stuff like that, but to really get it to consistently do it in novel environments is pretty hard. And I think that it's going to be not one but two orders of magnitude more computation in training the models. So not GPT-5 but more like GPT-6 scale models. So I think we're talking about 2 years before we have systems that can really take actions.

Interviewer22:50

That makes sense. And if you were to break down some of the unsolved research or technical problems to get there?

Mustafa Suleyman22:57

Well, an action is no different really to predicting a sequence of words. So when you ask a model to complete a sequence of actions, let's say to book a restaurant that you and I can go to on a certain day, the first action would be to check the availability in both of our calendars. So that's a correct function call. Reconcile the correct moment. So that's the second action. Make sure that it's a restaurant that has availability. So that check is another one. And then go and sign in so that you can use the correct tool to book the right restaurant at the right time. Put your credit card details down, having also checked that it's a restaurant that we both like, etc. So it's like four or five or six different steps just to produce that one action. In order to get that right, you're basically saying that the model has to produce perfect function calling for each element and do so in sequence. So it can't just be arbitrary. It has to be in sequence. And that's like saying it has to write a four-page document in response to one question that is exactly that document, and it can't be something that is approximate or similar to that document. So we all think that these models are magic at the moment and they write beautiful poetry and creative copy and text and give you good answers, and sometimes they're grounded, etc. But for each one of those answers, there's a wide range of correct answers that it could have picked, tens, hundreds, thousands maybe. So it isn't producing a specific perfect answer where every single token that is output is the correct answer for each one. It's not there yet. So to get that level of precision, we have to scale up these two orders of magnitude. That's what's happened so far. Over the last five orders of magnitude of scaling transformers, with every 10x of compute and data, we get more precision. It's not just emergent capabilities. That's wrong. People say, oh, it's surprising we had these emergent capabilities. That's an anthropomorphic projection. 
They're not surprising emergent capabilities. It is just more precise attention to the correct mapping between a prompt and an output. You're just honing in on something more specific.
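
Editor's sketch: the point about getting every function call right, in sequence, can be made concrete with a little arithmetic. The sketch below assumes each call succeeds independently with the same probability, which is a simplification not stated in the interview; the function name `chain_success` and the per-call accuracies are illustrative only.

```python
def chain_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every call in a chain is correct, assuming independent steps."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


# Illustrative accuracies; step counts follow the "four or five or six" steps in the text.
for accuracy in (0.80, 0.99):
    for steps in (4, 5, 6):
        print(f"accuracy {accuracy:.2f}, {steps} steps: {chain_success(accuracy, steps):.3f}")
```

Under this assumption, a model that is right 80% of the time per call completes a five-call chain only about a third of the time, while 99% per call completes it about 95% of the time.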

Interviewer25:24

Now do you think we'll be able to get narrow forms of actions in specific domains before we get to GPT-6?

Mustafa Suleyman25:32

Yeah, definitely. There are some good actions today. You can see these orchestrators making good API calls at the right time. The question is, can it do it with 99% accuracy? Because if it does it 80% of the time, then one in five times it's getting it wrong, and it's not usable for a consumer. So you either have to constrain the action state so that every time you're asking your model to take an action, it's only got five options to pick from and the consequences of it getting it wrong are low, or you have to find a problem domain where four out of five accuracy is acceptable.
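
Editor's sketch: one way to read "only got five options to pick from" is a closed set of allowed actions, with anything outside the set replaced by a low-consequence fallback. The following is a minimal sketch under that reading. The `Action` names and the `select_action` helper are hypothetical and loosely follow the restaurant-booking example from earlier in the conversation.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical closed set of five actions for a restaurant-booking assistant.
    CHECK_CALENDARS = "check_calendars"
    FIND_RESTAURANT = "find_restaurant"
    CHECK_AVAILABILITY = "check_availability"
    BOOK_TABLE = "book_table"
    ASK_USER = "ask_user"


def select_action(model_output: str, fallback: Action = Action.ASK_USER) -> Action:
    """Map raw model text onto the closed action set.

    Anything outside the set falls back to a low-consequence action
    instead of being executed.
    """
    try:
        return Action(model_output.strip().lower())
    except ValueError:
        return fallback


print(select_action("book_table"))          # a valid choice passes through
print(select_action("transfer all funds"))  # an out-of-set choice is replaced by the fallback
```

The design choice here is that the model never executes free-form text: its output is only ever interpreted as one of the five members of `Action`, so a wrong answer costs at most one unnecessary question to the user.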

Interviewer26:08

Right. And if you think about the architecture of building one of these agents, are there any differences to what we had just talked about in terms of focusing on post-training, getting the UI out of the way, in terms of the architecture of building one of these narrow autonomous action agents?

Mustafa Suleyman26:24

One thing for people to keep in mind, I think, is that there's a whole range of tools in the toolbox now. And so the art is in designing a router or a classifier that can take some given input, either contextual information, metadata, or of course the incoming query from the user, and redirect that to a model that is appropriate to that context. And that's important for inference budget management, because obviously you can redirect the query to smaller, cheaper models or higher quality models or models that are specialized in a particular domain that have been fine-tuned for a particular area of expertise, or indeed have certain capabilities like they might be good at retrieval. They can retrieve from some knowledge base or even from the open web, or they might trigger a model that has been fine-tuned for a voice response, because obviously the length and style of a voice response would be quite different to one that was producing paragraphs and paragraphs in a traditional sense. So the router is a critical part of the architecture as well.
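
Editor's sketch: the router described above can be illustrated with a small rule-based dispatcher. This is a sketch under stated assumptions, not a description of any production system. A real router would more likely be a learned classifier; the hand-written rules, the `Request` fields, and the model names below are all hypothetical and do not refer to real endpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    query: str
    channel: str = "text"          # "text" or "voice"
    needs_retrieval: bool = False  # e.g. set by an upstream classifier or metadata


# Hypothetical model names; none of these refer to real endpoints.
VOICE_MODEL = "voice-tuned"
RETRIEVAL_MODEL = "retrieval-augmented"
SMALL_MODEL = "small-cheap"
LARGE_MODEL = "large-high-quality"


def route(request: Request, short_query_words: int = 8) -> str:
    """Pick a model name for a request using simple hand-written rules."""
    if request.channel == "voice":
        return VOICE_MODEL
    if request.needs_retrieval:
        return RETRIEVAL_MODEL
    if len(request.query.split()) <= short_query_words:
        return SMALL_MODEL
    return LARGE_MODEL


print(route(Request("What time is it?")))
print(route(Request("Read me the news", channel="voice")))
```

Sending short queries to the small model is the inference-budget lever mentioned above; the voice and retrieval branches correspond to the specialized models in the same passage.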

Interviewer27:29

Yeah. So Mustafa, I'm curious if you were starting your first company today, given everything that exists and the crazy rate of change, where would you focus your attention?

Mustafa Suleyman27:41

I would look for problem domains that make a virtue out of imprecision. So what is a problem where, if you solve that problem, the valuable contribution that you've made is that ambiguity, imprecision, multiple possible answers are the key thing? If you pick a problem domain where the consequences are really high if you get it wrong and where there's really only one or two correct answers, then your model is going to struggle. So that's the first thing I would say: look for more of those things.

Interviewer28:11

And just on that point, there is a ton of activity in domains that do require more precision, like law or accounting or tax. Do you think that is a futile effort at the moment?

Mustafa Suleyman28:23

Actually, law doesn't require as much precision as taking actions. Even in law, most of the applications are retrieving similar cases or giving you summaries of cases, where there are five possible summaries, all of which could be correct, or where you're retrieving one case over another, you don't have to. So the law thing is actually a high-stakes domain because the consequences if it gets it wrong are really bad, but, like generating marketing copy, there's a lot of right answers in that domain. Medical is much harder. Clearly there are fewer right answers in medical, and the consequences are really high. So that's a pretty difficult domain. A bunch of my really good friends from DeepMind who have now been at Google just released a paper last week. Unbelievable work that they have done showing that they can provide an amazing reasoning engine for clinicians, and I think in the future for patients too. That's coming soon.

Interviewer29:27

What other factors would you think about?

Mustafa Suleyman29:31

I would say where you can design an interface which naturally collects valuable label data for fine-tuning by virtue of the interface. That's really important because if you're successful, you want to be able to compound that success. The more users you get, the more data you get, the higher quality model you can produce, and then you get that virtuous cycle. That's a really important part of it. And then it sounds obvious, but I think a domain where you can monetize much faster than you might think, because you need to get people to pay for it pretty fast because GPUs are wildly expensive, as everyone knows.

Interviewer30:11

What's an example of that that comes to mind for you?

Mustafa Suleyman30:13

I think the companies that are doing specialist services for not quite 10,000 fans, but like 10,000 true fans, people that really do need that kind of niche, highly adapted expert system in your pocket. I don't know whether it's a mechanic or a dentist or someone who's passionate about a certain hobby or a chunk of IP. I think there's value in those sorts of specialist use cases that people would be prepared to pay for.

Interviewer30:46

Yeah, that makes sense. Curious to hear a little bit about the products, the AI products you're working on at Microsoft, what the portfolio looks like.

Mustafa Suleyman30:53

Yeah, so I'm responsible for Bing, Edge the browser, and all of Copilot, which is deployed in basically every Microsoft surface now. And it's been an impressive arrival actually. The quality of products and their scale and reach is much greater than you might think as a default Silicon Valley person who had grown up in Google.

Interviewer31:20

Well, you don't become a $3 trillion company for nothing.

Mustafa Suleyman31:25

Yeah. But the reputation that we give it in Silicon Valley, relative to what it has, I think, needs a rethink. And also just huge scale and distribution. My main goal is to uplevel the quality of Copilot. And so we're rapidly building some of the best models in the world, partnering very closely with OpenAI, building on top of all of OpenAI's models and infrastructure, fine-tuning their models. And the next phase is that we're really going to start focusing on memory and personalization. I mean, your AI should remember everything about you, all your context, all your personal data, everything that you've said, and be there to support you and be your aid and your sidekick throughout your life. So that's what we're going to be focused on.

Interviewer32:06

That's fascinating. I'm curious, how do you think about the constraint of the existing applications in Microsoft Office versus the ideal version of Copilot?

Mustafa Suleyman32:15

Yeah, that's a good question. People have often said that an AI subsumes all other interfaces and surfaces. And I think that probably overstates it, but it's the right direction. I think there will come a time in a few years when the first thing you think is you just say, 'Hey, Copilot, can you take care of this for me? What's the answer to that? Where do I find it? Can you book this? Remember that? Buy this. Do that.' You're just going to have this ever-present aid in your life. It's going to change what it means to use a keyboard. It's going to change what it feels like to have apps. It'll move us way beyond the search engine and the browser. And you're certainly not going to think, 'I need to go write a document or send a message in a traditional way.' You'll still have those things, but your AI will just manage a canvas of activity across your entire life and largely be coordinating with other AIs and other services and collecting information for you.