There is a competition called the International Mathematical Olympiad (IMO), an international contest for high school students. It consists of just six questions that competitors must answer within a time limit. Only six, but they are extremely difficult questions, pitched at the level of excellent, if pre-university, mathematicians.
Some AI developers have worked with the IMO to have their LLMs attempt the competition as well. The AIs are not competing with the humans, and in fact it's not that kind of competition at all: medals are awarded by score thresholds, so if every entrant answered every question, they would all win. This never happens, by the way.
The Google DeepMind team entered their Gemini Deep Think model, and it did very well, answering five of the six questions correctly and earning gold medal status.
And here's the thing, and the reason for the "it figures" title: OpenAI ran the same questions through one of its models, said it got comparable results, and declared that it "qualifies for gold" as well. But they cheated. They simply ran the test themselves, without participating with the IMO, and awarded themselves a medal. See? It figures.
