AI models achieved gold-level scores at the International Mathematical Olympiad, one of the world's most prestigious and challenging math competitions for high school students. Though Google's and OpenAI's cutting-edge AI programs performed well, humans still dominated the competition.

AI vs. Human Mathematicians

Photo: Google DeepMind

Though you may think a computer has a big advantage in a math competition, complex mathematics still challenges modern AI models. That is because, once an AI model receives a prompt, it breaks the words and letters down into "tokens" and predicts a response one token at a time.

Whereas humans reason over words, sentences, and complex ideas, an AI model simply produces the string of "tokens" that is statistically most likely to follow the prompt. As a result, AI models lack the "logic" abilities needed to answer complex math problems that have one correct solution, according to Popular Science.
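To make the idea concrete, here is a minimal illustrative sketch (not from the article) of how a prompt is split into tokens before a model predicts what comes next. It assumes OpenAI's open-source tiktoken library and its cl100k_base encoding; the proprietary competition models described here use their own, different tokenizers.

```python
# Illustrative only: shows how a prompt becomes a sequence of integer token ids.
# Assumes the open-source `tiktoken` library (pip install tiktoken); the
# proprietary models discussed in the article use their own tokenizers.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

prompt = "Prove that there are infinitely many prime numbers."
token_ids = enc.encode(prompt)                  # a list of integer token ids
pieces = [enc.decode([t]) for t in token_ids]   # the text fragment behind each id

print(token_ids)
print(pieces)

# A language model is trained to predict the next token id given the ones
# before it, which is why fluent-sounding text does not guarantee sound logic.
```

The point is that the model's native unit is the token, not the mathematical statement itself, which is what makes multi-step proofs such a demanding test.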

Open to students around the world, the 2025 competition drew 630 human participants. Over two days, students sat two four-and-a-half-hour exams covering six problems in total, each worth seven points for a maximum score of 42.

Competitors needed 35 of the 42 possible points to earn a gold medal. Google DeepMind's and OpenAI's models both reached gold-medal scores at the 2025 competition.

According to Interesting Engineering, the AI models solved five of the six problems, matching sixty-seven human students, or roughly 10% of competitors. Notably, whereas Google was officially invited by the IMO to participate in the competition, OpenAI tested its model on the problems independently and announced its gold-medal claim on its own.

In the competition, both Google and OpenAI used models that are not publicly available. Researchers reportedly tested the same questions on publicly available models, including Gemini 2.5 Pro, Grok-4, and OpenAI o4; none scored above 13 points.

According to IMO president Gregor Dolinar, one of the most notable aspects of this year's results was the way the AI programs explained their "thought" process in arriving at each answer.

“Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow,” Dolinar said in Google’s announcement.