Two AI systems from Google DeepMind together solved four of the six problems in this year's International Mathematical Olympiad — on par with silver medalists in the annual world math championship for high school students.
Why it matters: The ability to solve a range of math problems with rigorous, step-by-step proofs is considered a "grand challenge" in machine learning and has so far been beyond the reach of state-of-the-art AI systems.
- "These are extremely hard mathematical problems and no AI system has ever achieved a high success rate in these types of problems," Pushmeet Kohli, vice president of research focused on AI for science at DeepMind, said in a press briefing.
How it works: AlphaProof teaches itself through trial and error, without human intervention, in a process known as reinforcement learning. The same approach powered DeepMind's Go-mastering AlphaGo, StarCraft-crushing AlphaStar and other AI systems developed by the company.
- The team first fine-tuned Google's Gemini model to translate 1 million mathematics problem statements from English into a formal programming language called Lean, in which a computer can check every step of a proof, Thomas Hubert, a research engineer at DeepMind, said in the briefing. (A small illustrative example follows this list.)
- The formalized problems, which ranged in difficulty, were then given to AlphaProof, which generated candidate solutions and checked them by searching over possible proof steps in Lean.
- Solutions that check out are fed back into the model, which improves as it attempts more problems (the loop is sketched below).
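To make the translation step concrete, here is a minimal illustrative example, not one of the Olympiad problems: the simple statement "the sum of two even integers is even," written as a Lean 4 theorem using the Mathlib library, so that a proof checker can verify every step mechanically.

```lean
-- Illustrative only (not an Olympiad problem): "the sum of two even
-- integers is even," formalized in Lean 4 with Mathlib. The proof
-- checker verifies each step below mechanically.
import Mathlib

theorem sum_of_evens_is_even (a b : ℤ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  obtain ⟨x, hx⟩ := ha      -- a = x + x for some integer x
  obtain ⟨y, hy⟩ := hb      -- b = y + y for some integer y
  exact ⟨x + y, by omega⟩   -- so a + b = (x + y) + (x + y)
```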
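The generate-and-verify loop can be sketched roughly as follows. This is a hypothetical illustration of the idea, not DeepMind's code; `language_model`, `lean_verifier` and the method names are stand-ins, since the real interfaces haven't been published.

```python
# Hypothetical sketch of one self-training round: the model proposes proofs,
# a Lean checker accepts or rejects them, and verified proofs become new
# training data. Not DeepMind's actual implementation.

def self_training_round(language_model, lean_verifier, problems, attempts=64):
    verified = []
    for problem in problems:          # formal statements already written in Lean
        for _ in range(attempts):
            candidate = language_model.generate_proof(problem)
            if lean_verifier.check(problem, candidate):  # every step machine-checked
                verified.append((problem, candidate))
                break                 # move on once a correct proof is found
    language_model.fine_tune(verified)  # successful proofs reinforce the model
    return verified
```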
AlphaProof solved three of this year's math Olympiad problems: two in algebra and one in number theory.
- One was solved in minutes and the others took up to three days. (Students have two 4.5-hour sessions to submit answers.)
- It couldn't crack two combinatorics problems.
The other member of the AI team, AlphaGeometry 2, solved the competition's geometry problem in 19 seconds.
- There is very little data available to train math-focused AI models, so the DeepMind team used synthetic data generated by AI itself to train AlphaGeometry 2.
- The system can solve 83% of the Olympiad's geometry problems from the past 25 years, compared with 53% for its predecessor, the company said.
Overall, the AI systems scored 28 out of 42 possible points — putting them in silver-medal territory and one point shy of the gold-medal threshold, the company said.
- At the competition last week, 58 of the 609 high school contestants from around the world were gold medalists.
What they're saying: Fields medalist Timothy Gowers, a mathematician at the Collège de France, was one of the judges who checked the AI's work.
- He said he was surprised that the AI was sometimes able to come up with clever ideas for solving the problems — what he called "a magic key."
- "I find it very impressive and a significant jump from what was previously possible," Gowers said in the briefing, but he added further research is needed to analyze how the AI did it.
The big picture: At this point, the AI systems aren't adding to the body of mathematical knowledge that humans have created, said David Silver, DeepMind's vice president of reinforcement learning.
- "We're at the point where these systems can actually prove not open research problems but at least problems that are very challenging to the very best young mathematicians in the world," he said.