Sam Altman0:00
And welcome to San Francisco. San Francisco has been our home since day one. The city is important to us and to the tech industry in general. We're looking forward to continuing to grow here. So we've got some great stuff to announce today, but first I'd like to take a minute to talk about some of the stuff that we've done over the past year.
About a year ago, November 30th, we shipped ChatGPT as a low-key research preview, and that went pretty well. In March, we followed that up with the launch of GPT-4, still the most capable model out in the world. And in the last few months, we launched voice and vision capabilities so that ChatGPT can now see, hear, and speak. And more recently, we launched DALL-E 3, the world's most advanced image model you can use, of course, inside of ChatGPT.
For our Enterprise customers, we launched ChatGPT Enterprise, which offers Enterprise-grade security and privacy, higher speed, GPT-4 access, longer context windows, a lot more. Today, we've got about 2 million developers building on our API for a wide variety of use cases, doing amazing stuff. Over 92% of Fortune 500 companies building on our products, and we have about 100 million weekly active users now on ChatGPT.
And what's incredible on that is we got there entirely through word of mouth. People just find it useful and tell their friends. OpenAI is the most advanced and the most widely used AI platform in the world now. But numbers never tell the whole picture on something like this. What's really important is how people use the products, how people are using AI. And so I'd like to show you a quick video.
Here, so we love hearing the stories of how people are using the technology. It's really why we do all of this. Okay, so now on to the new stuff, and we have got a lot. First, we're going to talk about a bunch of improvements we've made, and then we'll talk about where we're headed next. Over the last year, we spent a lot of time talking to developers around the world. We've heard a lot of your feedback. It's really informed what we have to show you today.
Today we are launching a new model, GPT-4 Turbo. GPT-4 Turbo will address many of the things that you all have asked for. So let's go through what's new. We've got six major things to talk about for this part.
Number one: context length. A lot of people have tasks that require a much longer context length. GPT-4 supported up to 8K and in some cases up to 32K context length, but we know that isn't enough for many of you and what you want to do. GPT-4 Turbo supports up to 128,000 tokens of context. That's 300 pages of a standard book, 16 times longer than our 8K context. And in addition to longer context length, you'll notice that the model is much more accurate over a long context.
Number two: more control. We've heard loud and clear that developers need more control over the model's responses and outputs, so we've addressed that in a number of ways. We have a new feature called JSON mode, which ensures that the model will respond with valid JSON. This has been a huge developer request. It'll make calling APIs much easier. The model is also much better at function calling. You can now call many functions at once, and it'll do better at following instructions in general.
We're also introducing a new feature called reproducible outputs. You can pass a seed parameter, and it'll make the model return consistent outputs. This, of course, gives you a higher degree of control over model behavior. This rolls out in beta today. And in the coming weeks, we'll roll out a feature to let you view log probs in the API.
All right, number three: better world knowledge. You want these models to be able to access better knowledge about the world. So do we. So we're launching retrieval in the platform. You can bring knowledge from outside documents or databases into whatever you're building. We're also updating the knowledge cutoff. We are just as annoyed as all of you, probably more, that GPT-4's knowledge about the world ended in 2021. We will try to never let that get that out of date again. GPT-4 Turbo has knowledge about the world up to April of 2023, and we will continue to improve that over time.
Number four: new modalities. Surprising no one, DALL-E 3, GPT-4 Turbo with vision, and the new text-to-speech model are all going into the API today. We have a handful of customers that have just started using DALL-E 3 to programmatically generate images and designs. Today, Coupang is launching a campaign that lets its customers generate Diwali cards using DALL-E 3. And of course, our safety systems help developers protect their applications against misuse. Those tools are available in the API.
GPT-4 Turbo can now accept images as inputs via the API, can generate captions, classifications, and analysis. For example, Be My Eyes uses this technology to help people who are blind or have low vision with their daily tasks, like identifying products in front of them. And with our new text-to-speech model, you'll be able to generate incredibly natural-sounding audio from text in the API with six preset voices to choose from. I'll play an example.
This is much more natural than anything else we've heard out there. Voice can make apps more natural to interact with and more accessible. It also unlocks a lot of use cases like language learning and voice assistance. Speaking of new modalities, we're also releasing the next version of our open-source speech recognition model, Whisper V3, today, and it'll be coming soon to the API. It features improved performance across many languages, and we think you're really going to like it.
Okay, number five: customization. Fine-tuning has been working really well for GPT-3.5 since we launched it a few months ago. Starting today, we're going to expand that to the 16K version of the model. Also starting today, we're inviting active fine-tuning users to apply for the GPT-4 fine-tuning experimental access program. The fine-tuning API is great for adapting our models to achieve better performance in a wide variety of applications with a relatively small amount of data. But you may want a model to learn a completely new knowledge domain or to use a lot of proprietary data. So today we're launching a new program called Custom Models.
With Custom Models, our researchers will work closely with a company to help them make a great custom model, especially for them and their use case, using our tools. This includes modifying every step of the model training process, doing additional domain-specific pre-training, a custom RL post-training process tailored for a specific domain, and whatever else. We won't be able to do this with many companies to start. It'll take a lot of work, and in the interest of expectations, at least initially, it won't be cheap. But if you're excited to push things as far as they can currently go, please get in touch with us, and we think we can do something pretty great.
Okay, and then number six: higher rate limits. We're doubling the tokens per minute for all of our established GPT-4 customers so that it's easier to do more, and you'll be able to request changes to further rate limits and quotas directly in your API account settings. In addition to these rate limits, it's important to do everything we can do to make you successful building on our platform. So we're introducing Copyright Shield. Copyright Shield means that we will step in and defend our customers and pay the cost incurred if you face legal claims around copyright infringement. And this applies both to ChatGPT Enterprise and the API. And let me be clear, this is a good time to remind people: we do not train on data from the API or ChatGPT Enterprise ever.
All right, there's actually one more developer request that's been even bigger than all of these, and so I'd like to talk about that now. And that's pricing. GPT-4 Turbo is the industry-leading model. It delivers a lot of improvements that we just covered, and it's a smarter model than GPT-4. We've heard from developers that there are a lot of things that they want to build, but GPT-4 just cost too much. They've told us that if we could decrease the cost by 20-25%, that would be great, a huge leap forward. I'm super excited to announce that we worked really hard on this, and GPT-4 Turbo, a better model, is considerably cheaper than GPT-4 by a factor of 3x for prompt tokens and 2x for completion tokens, starting today.
So the new pricing is 1 cent per thousand prompt tokens and 3 cents per thousand completion tokens. For most customers, that will lead to a blended rate more than 2.75 times cheaper to use for GPT-4 Turbo than GPT-4. We worked super hard to make this happen. We hope you're as excited about it as we are. So we decided to prioritize price first because we had to choose one or the other, but we're going to work on speed next. We know that speed is important too. Soon you will notice GPT-4 Turbo becoming a lot faster.
We're also decreasing the cost of GPT-3.5 Turbo 16K. Also, input tokens are 3x less and output tokens are 2x less, which means the GPT-3.5 16K is now cheaper than the previous GPT-3.5 4K model. Running a fine-tuned GPT-3.5 Turbo 16K version is also cheaper than the old fine-tuned 4K version. Okay, so we just covered a lot about the model itself. We hope that these changes address your feedback. We're really excited to bring all of these improvements to everybody now. In all of this, we're lucky to have a partner who is integral in making it happen.