Earlier LLMs relied on a long, compute-heavy pre-training step in which the model built up a model of the world and absorbed as much information as it could. At test time, when we asked a question, the LLM simply answered directly from what it had learned. With O1, the LLM instead takes multiple steps to reason about its input before it comes up with an answer. O1 starts with a relatively small number of reasoning steps, about 10-20 steps taking 15-20 seconds, but OpenAI plans to scale this up to hours, days, and even weeks! Imagine asking an LLM to come up with a treatment for cancer and letting it reason for weeks before it answers.
In terms of benchmarks, O1 comfortably outperforms GPT-4o and Claude 3.5 Sonnet on the top benchmarks for complex tasks. Complex tasks here mean writing code, understanding and analyzing PRDs (product requirements documents), reading medical reports, or writing novels. Basically, any task that requires critical thinking.
On the other hand, O1’s basic capabilities are limited, and it sometimes performs even worse than GPT-4o on simple tasks such as personal writing or editing a blog post.
How to try O1!
Now let's talk about how to use O1! Currently, ChatGPT Plus users can use O1 directly in ChatGPT, but the rate limits are very strict.
O1-preview: 30 requests per week
O1-mini: 50 requests per week
You can also try O1 through Merlin Pro, which has much more generous rate limits!
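If you want to keep an eye on how much of a weekly allowance you have left, here is a minimal Python sketch of a client-side budget tracker. It uses the ChatGPT Plus numbers quoted above; the `WeeklyBudget` class, its method names, and the rolling seven-day window are all assumptions made for illustration, and this is not how ChatGPT itself enforces its limits.

```python
from collections import deque
from datetime import datetime, timedelta

# Weekly limits quoted in this post for ChatGPT Plus.
# Treating "per week" as a rolling 7-day window is an assumption.
WEEKLY_LIMITS = {"O1-preview": 30, "O1-mini": 50}


class WeeklyBudget:
    """Hypothetical tracker for a per-model weekly request budget."""

    def __init__(self, limits=WEEKLY_LIMITS, window=timedelta(days=7)):
        self.limits = dict(limits)
        self.window = window
        # One timestamp log per model, oldest request first.
        self.sent = {model: deque() for model in self.limits}

    def remaining(self, model, now):
        """Return how many requests are still available at time `now`."""
        log = self.sent[model]
        # Forget requests that have aged out of the window.
        while log and now - log[0] >= self.window:
            log.popleft()
        return self.limits[model] - len(log)

    def record(self, model, now):
        """Log one request; return False if the budget is used up."""
        if self.remaining(model, now) <= 0:
            return False
        self.sent[model].append(now)
        return True


if __name__ == "__main__":
    budget = WeeklyBudget()
    now = datetime.now()
    budget.record("O1-preview", now)
    print(budget.remaining("O1-preview", now), "O1-preview requests left")
```

Call `record` each time you send a prompt and `remaining` before a long session, so you do not run out of O1-preview requests halfway through a task.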