OpenAI Fights Back Against DeepSeek AI With Early o3-Mini Launch—Here’s How It Compares


OpenAI rushed to defend its market position on Friday with the release of o3-mini, a direct response to R1, the model from Chinese startup DeepSeek that sent shockwaves through the AI industry by matching top-tier performance at a fraction of the compute cost.

“We’re releasing OpenAI o3-mini, the newest, most cost-efficient model in our reasoning series, available in both ChatGPT and the API today,” OpenAI said in an official blog post. “Previewed in December 2024, this powerful and fast model advances the boundaries of what small models can achieve (…) while maintaining the low cost and reduced latency of OpenAI o1-mini.”

OpenAI also made reasoning capabilities available to free users for the first time, while tripling the daily message limit for paying customers, from 50 to 150, to encourage use of the new family of reasoning models.

Unlike GPT-4o and the GPT family of models, the “o” family of AI models focuses on reasoning tasks. They are less creative, but they have built-in chain-of-thought reasoning that makes them better at solving complex problems, backtracking on flawed analyses, and producing better-structured code.

At the highest level, OpenAI has two main families of AI models: Generative Pre-trained Transformers (GPT) and “Omni” (o).

  • GPT is like the family’s artist: a right-brained type, good at role-play, conversation, creative writing, summarization, explanations, brainstorming, chat, etc.
  • O is the family’s nerd. It is terrible at telling stories, but excellent at coding, solving math equations, analyzing complex problems, planning its reasoning process step by step, comparing research papers, etc.

The new o3-mini is available in three versions: low, medium, and high. These subcategories give users better answers in exchange for more “inference” (which is more expensive for developers, who pay per token).
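In the API, the tier is selected with a “reasoning effort” parameter on an otherwise ordinary chat request. The sketch below is ours, not OpenAI’s sample code: the helper name and the example prompt are invented, and it only builds the request payload rather than calling the API.

```python
def o3_mini_request(prompt: str, effort: str = "medium") -> dict:
    """Build a chat-completions payload selecting an o3-mini reasoning tier.

    "reasoning_effort" accepts "low", "medium", or "high"; higher tiers
    spend more inference tokens on hidden reasoning before answering,
    which costs developers more per request.
    """
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown reasoning effort: {effort}")
    return {
        "model": "o3-mini",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

# Example: ask for the highest tier on a hard prompt.
payload = o3_mini_request("Prove that sqrt(2) is irrational.", effort="high")
print(payload["reasoning_effort"])
```

The same payload shape works for all three tiers; only the `reasoning_effort` value changes.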

OpenAI o3-mini low, aimed at efficiency, scores worse than OpenAI o1-mini on general knowledge and multilingual chain of thought, but it scores better on other tasks such as coding and factuality. The other two versions (o3-mini medium and o3-mini high) beat OpenAI o1-mini on every benchmark.

Image: OpenAI

DeepSeek’s breakthrough, which delivered better results than OpenAI’s flagship model while using just a fraction of the computing power, triggered a massive tech sell-off that wiped nearly $1 trillion from U.S. markets. Nvidia alone lost $600 billion in market value as investors questioned future demand for its costly AI chips.

The efficiency gap stems from DeepSeek’s novel approach to model architecture.

While American companies focused on throwing more computing power at AI development, DeepSeek’s team found ways to streamline how models process information, making them more efficient. The competitive pressure intensified when Chinese tech giant Alibaba released Qwen2.5-Max, a model even more capable than the one DeepSeek used as a foundation, opening the way to what could be a new wave of Chinese AI innovation.

OpenAI o3-mini tries to close that gap again. The new model runs 24% faster than its predecessor, and matches or beats older models on key benchmarks while costing less to run.

Its pricing is also more competitive. OpenAI o3-mini’s rates of $0.55 per million input tokens and $4.40 per million output tokens are still much higher than DeepSeek R1’s $0.14 and $2.19 for the same volumes, but they narrow the gap between OpenAI and DeepSeek, and represent a major cut compared to the prices charged to run OpenAI o1.
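To make those per-million-token rates concrete, here is a small self-contained calculation using the published prices above; the example request size (a 2,000-token prompt with a 1,000-token answer) is an arbitrary illustration, not a benchmark figure.

```python
# USD per 1M tokens, (input, output), from the published rates above.
PRICES = {
    "o3-mini": (0.55, 4.40),
    "deepseek-r1": (0.14, 2.19),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the per-million-token rates for `model`."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a 2,000-token prompt with a 1,000-token answer.
for model in PRICES:
    print(f"{model}: ${cost_usd(model, 2_000, 1_000):.5f}")
```

At these rates the example request costs roughly twice as much on o3-mini as on R1, with output tokens dominating the bill for both models.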

Image: OpenAI

And that could be the key to its success. OpenAI o3-mini is closed source, unlike DeepSeek R1, which is available for free; but for those willing to pay for usage on hosted servers, its appeal will grow depending on the intended use.

OpenAI o3-mini medium scores 79.6 on AIME, a benchmark of difficult math problems. DeepSeek R1 scores 79.8, a score that is only beaten by the most powerful model of the family, OpenAI o3-mini high, which scores 87.3 points.

The same pattern shows up in other benchmarks: on GPQA, which measures proficiency across different scientific disciplines, DeepSeek R1 scores 71.5, o3-mini low 70.6, and o3-mini high 79.7. R1 sits at the 96.3rd percentile on Codeforces, a benchmark for coding tasks, while o3-mini low is at the 93rd percentile and o3-mini high at the 97th percentile.

So the differences exist, but in benchmark terms they can be negligible depending on which model is chosen for a given task.

Testing OpenAI o3-mini against DeepSeek R1

We tried the model on a few tasks to see how it performed against DeepSeek R1.

The first task was a spy game to test how good it was at multi-step reasoning. We chose the same sample from the BIG-bench dataset on GitHub that we used to evaluate DeepSeek R1. (The full story is available here and involves a school trip to a remote, snowy place, where students and teachers face a series of strange disappearances; the model must figure out who the stalker was.)

OpenAI o3-mini did not fare well, reaching the wrong conclusions about the story. According to the answer provided by the test, the stalker’s name is Leo. DeepSeek R1 got it right, while OpenAI o3-mini got it wrong, claiming the stalker’s name was Eric. (Fun fact: we cannot share a link to the conversation because it was flagged as unsafe by OpenAI.)

The model is reasonably good at logic-heavy language tasks that don’t involve math. For example, we asked the model to write five sentences ending in a specific word, and it was able to understand the task and evaluate its results before providing the final answer. It thought about its answer for four seconds, corrected one wrong sentence, and delivered a fully correct response.

It is also very good at math, able to solve problems considered extremely difficult in some benchmarks. The same complex problem that took DeepSeek R1 275 seconds to solve was completed by OpenAI o3-mini in just 33 seconds.

So, a very good effort, OpenAI. Your move, DeepSeek.

Edited by Andrew Hayward

