AI contributions to Erdős problems

This page collects the various ways in which AI tools have contributed to the understanding of Erdős problems. Note that a single problem may appear multiple times in these lists. For further discussion of these contributions, see this forum.

Legend

  • 🟢 : full solution to the problem
  • 🟡 : partial solution to the problem
  • 🔴 : failure to make progress on the problem

Disclaimers

One should interpret the contributions listed here with the following caveats in mind.

  • Erdős problems vary widely in difficulty (by several orders of magnitude), with a core of very interesting but extremely difficult problems at one end of the spectrum, and a "long tail" of under-explored problems at the other, many of which are "low-hanging fruit" well suited to attack by current AI tools. Unfortunately, it is hard to tell in advance which category a given problem falls into, short of an expert literature review. (However, if an Erdős problem is only stated once in the literature, and there is scant record of any followup work on the problem, this suggests that the problem may be of the second category.) In particular, I would advise against direct comparison of raw counts of problems solved by one methodology or another, as this may not be an apples-to-apples comparison; nor should one view this page as representing any sort of finely calibrated benchmark.
  • Many of the problems on the site currently lack a thorough literature review, and in the absence of such a review, any designation of a problem as "open" should be viewed as provisional. In particular, it has already happened multiple times that an AI tool has managed to solve a problem listed as open on the site, only for it to emerge shortly afterwards that the problem had in fact already been solved in the literature (though sometimes by a slightly different method); see Section 2 below.
  • The reporting of these AI tools on this page is highly incomplete, particularly with regard to negative results. One should keep selection bias in mind before drawing any conclusions about the success rate of these tools.
  • In a few cases Erdős stated the problem incorrectly (or the problem was not stated properly on the web site), allowing for the problem as literally stated to be solved on a technicality, even though this was not the actual question Erdős intended to ask. Working out what Erdős actually intended in such cases requires a somewhat subjective contextual analysis guided by expertise in the subject.
  • If an Erdős problem was first posed $N$ years ago, and was recently solved by an AI tool, it may be misleading to make announcements such as "this problem resisted all human efforts at proof for $N$ years" in order to imply that the problem is particularly difficult. Such an announcement may be technically true, especially if a comprehensive literature review turns up no significant prior literature on the problem, but it is also possible that this particular question received very little attention from human mathematicians during this time period, and the absence of any past progress may therefore be more a reflection of the obscurity of the problem than of its difficulty. (If on the other hand there is substantial literature expending significant effort on establishing partial results, without generating a clear path to a full solution, this is more convincing evidence of difficulty than mere absence of literature.)
  • The mathematical interest in these problems does not stem purely from their solution, but also from the insights and lessons that work on these problems provides for related topics, and from how the problem fits into the broader context of the field. When a solution is human-generated, such additional connections to the rest of the subject tend to be provided organically through the remarks and comments generated in the writing process (which often includes a literature review step). However, when a solution is primarily AI-generated, this useful penumbra of additional context may be largely absent, rendering the final product of less mathematical utility even if it does technically solve the stated problem. Thus, contributions to Erdős problems should be evaluated holistically, taking into account the extent to which the results have been integrated into the existing web of knowledge around the problem, in addition to whether the problem itself was solved.
  • For the above reasons, not every solution to an open Erdős problem would automatically qualify as a publishable paper in a reputable math journal, especially if the problem has not attracted significant prior literature, and the proof techniques ended up being minor modifications of existing methods. However, even when the problem solved was obscure and the proof was routine, it is still worth having the solution recorded somewhere in the published literature; the usual approach in such situations is to bundle such solutions into a larger paper which makes additional contributions, such as a broader literature review, an exploration of the strengths and weaknesses of known methods (using such solutions as examples), and a discussion of the open problems that continue to lie out of reach of these methods.
  • For additional guarantees of correctness, it is considered good practice to formalize AI-generated proofs in a proof assistant language such as Lean. However, there are still "exploits" available in such guarantees, for instance if the formal proof introduces additional axioms, or if there is a misformalization of the problem statement that may exploit some quirk of the math library or syntax used in the formalization. An AI-generated formal proof that is suspiciously short and elementary (or suspiciously verbose and trivial) is particularly subject to this issue, especially in the hands of a user without extensive experience formalizing proofs by hand.
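
The two kinds of "exploit" mentioned in the last caveat can be illustrated with a minimal Lean 4 sketch. The names `convenient` and `suspicious` are hypothetical, invented for this illustration, and are not taken from any formalization listed on this page. The sketch only shows that a proof can type-check while resting on an unjustified axiom, or while proving a statement whose literal meaning differs from the intended one.

```lean
-- Exploit 1: an extra axiom (hypothetical, and false) makes a false statement "provable".
axiom convenient : ∀ n : Nat, n = 0

theorem suspicious : (1 : Nat) = 0 := convenient 1

-- Auditing step: this command reports that `suspicious` depends on `convenient`.
#print axioms suspicious

-- Exploit 2: a misformalization that leans on a library convention.
-- Subtraction on `Nat` is truncated (0 - 1 = 0), so this statement, which would be
-- false over the integers, is true as literally written.
example : ∃ n : Nat, n - 1 = n := ⟨0, rfl⟩
```

A reasonable habit, under these assumptions, is to run `#print axioms` on the final theorem and to compare the formal statement against the informal problem by hand before treating a formal proof as settling the question.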

I just saw an exciting announcement on social media about AI being used to solve an Erdős problem. Why can't I find out about it on this page?

If the social media post is very recent, then most likely the claim is undergoing an expedited, but not instantaneous, peer review over at the Erdős problem web site. The discussion page for the Erdős problem in question is most likely the best place to get the most recent updates, and in particular to perform the due diligence needed to take into account the caveats listed above; in the meantime, the problem may be placed in the "pending assessment" category below. Once the discussion on that page has reached consensus, this contribution will most likely be moved from "pending assessment" to a different category on this site; but until then, I recommend taking any bold claims about this achievement with a grain of salt, and checking back in a couple of hours or days for a more accurate update.

1. AI-generated solutions, partial solutions, or negative results for previously open problems

Problem | AI tools used | Date | Outcome
[36] | AlphaEvolve | 3 Nov, 2025 | 🟡 Slight improvement to past construction
[51] | ChatGPT (free version) | 11 Jan, 2026 | 🔴 Incorrect proof found
[52] | AlphaEvolve | 3 Nov, 2025 | 🔴 Did not match past constructions
[64] | AlphaEvolve | 3 Nov, 2025 | No counterexample found
[67] | AlphaEvolve | 3 Nov, 2025 | 🔴 Did not match past constructions
[106] | AlphaEvolve | 3 Nov, 2025 | Matched past construction
[124] | Aristotle | 29 Nov, 2025 | 🟡 Partial result (Lean)
[205] | Aristotle, ChatGPT 5.2 Thinking | 10 Jan, 2026 | 🟢 Full solution (Lean)
[391] | AlphaEvolve | 3 Nov, 2025 | 🔴 Did not match past constructions
[477] | AlphaProof | 7 Jan, 2026 | 🟡 Proof of variant problem (Lean)
[486] | ChatGPT 5.2 | 11 Jan, 2026 | 🟡 Cheap counterexample to previous formulation of the problem
[493] | AlphaEvolve | 3 Nov, 2025 | No counterexample found
[507] | AlphaEvolve | 3 Nov, 2025 | 🟡 Surpassed some past constructions
[728] | Aristotle, ChatGPT 5.2 Pro | 6 Jan, 2026 | 🟢 Full solution (Lean), using arguments similar to 🟡 Pomerance (2014)
[729] | Aristotle, ChatGPT 5.2 Pro | 8-10 Jan, 2026 | 🟢 Full solution (Lean), using a modification of the solution to #728
[949] | AlphaProof | 7 Jan, 2026 | 🟡 Proof of variant problem (Lean)
[1097] | AlphaEvolve | 3 Nov, 2025 | 🟡 Slight improvement to past construction

2. Fully AI-generated solutions to problems thought open, but for which an earlier human solution was subsequently found

Problem | AI tools used | Date | Outcome | Literature result | Literature result found on | Similar proofs?
[333] | ChatGPT 5.2 Pro, Claude Opus 4.5 | 25 Dec, 2025 | 🟢 Full solution (Lean) | 🟢 Erdős and Newman (1977) | 25 Dec, 2025 | Yes
[397] | Aristotle, ChatGPT 5.2 Pro | 10 Jan, 2026 | 🟢 Full solution (Lean) | 🟡 Elkies (2013) [likely upgradeable to full solution] | 10 Jan, 2026 | No
[659] | Gemini 3.0 | 13 Jan, 2026 | 🟢 Full solution (Lean) | 🟡 Moree and Osburn (2006), 🟡 Sheffer (2014) [upgradeable to full solution] | 13 Jan, 2026 | Yes
[897] | Aristotle, Archivara | 26 Dec, 2025 | 🟢 Full solution (Lean) | 🟢 Wirsing (1981) | 26 Dec, 2025 | Yes
[1026] | Aristotle | 7 Dec, 2025 | 🟢 Full solution (Lean) | 🟢 Tidor, Wang, and Yang (2016) | 8 Dec, 2025 | Only after applying an argument from Seidenberg (1959)
[1077] | Aristotle | 24 Dec, 2025 | 🟡 Trivial counterexample to previous formulation of problem (Lean) | 🟢 Jiang and Longbrake (2025) | 28 Dec, 2025 | No

3. AI-generated tools applied to problems previously known to be solved (or partially solved)

Problem | AI tools used | Date | Literature result | Outcome
[43] | Aristotle | 4 Dec, 2025 | 🟡 Barreto (2025) | 🟡 New proof of partial result (Lean)
[124] | Gemini DeepThink | 30 Nov, 2025 | 🟡 Aristotle (2025) | 🔴 Did not reproduce a known partial result despite hints
[198] | AlphaProof | 2025 | 🟢 Baumgartner (1975) | 🟢 New proof found
[264] | Aristotle | 18 Dec, 2025 | 🟡 Kovač and Tao (2024) | 🟡 New proof of partial result (Lean); 🔴 no progress on remaining open problems
[379] | Seed-Prover 1.5 | 21 Dec, 2025 | 🟢 Cambie, Kovač and Tao (2025) | 🟢 New proof found
[392] | GPT-5.2 Pro, Aristotle | 31 Dec, 2025 | 🟢 Erdős and Graham (unpublished, 1980), Cambie (2025) | Unable to reconstruct proof
[488] | Aristotle | 27 Nov, 2025 | 🟡 Cambie (2025) | 🟡 New disproof of alternate version (Lean)
[493] | SeedProver, Aristotle, ChatGPT | 2025 | 🟢 Seamans (2025) | 🟢 New or existing proof found (Lean)
[679] | Aristotle | 12 Jan, 2026 | 🟡 DottedCalculator (2025) | 🟡 Improved proof by reducing dependence on prime number theorem
[871] | Claude Opus 4.5, Gemini 3 Pro | 5 Jan, 2026 | 🟡 Erdős and Nathanson (1989) | 🟢 Existing partial result upgraded to full solution (Lean)
[942] | Gemini | 23 Nov, 2025 | 🟡 Erdős (unpublished, 1976) | 🟡 New proof found
[958] | Aristotle | 27 Dec, 2025 | 🟢 Clemen, Dumitrescu, and Liu (2025) | 🟢 New proof found (Lean)
[1043] | Aristotle | 28 Dec, 2025 | 🟢 Pommerenke (1961) | 🟢 New proof found (Lean)
[1043] | ChatGPT | 30 Dec, 2025 | 🟢 Pommerenke (1961), Aristotle | 🟢 New proof found
[1095] | Aristotle | 30 Dec, 2025 | 🟡 Ecklund, Erdős, and Selfridge (1975) | 🟡 New proof of (slightly weaker) partial result found (Lean); 🔴 no progress on remaining open problems

4. Solutions generated by humans in collaboration with AI

Problem | Human collaborators | AI tools used | Date | Outcome
[367] | Boris Alexeev, Wouter van Doorn, Terence Tao | Gemini DeepThink, Aristotle | 20-22 Nov, 2025 | 🟡 Partial result
[401] | Boris Alexeev, Kevin Barreto, Leeham, Nat Sothanaphan | Aristotle, ChatGPT 5.2 Pro | 10-11 Jan, 2026 | 🟢 Counterexample to alternate formulation; full solution to revised formulation (Lean)
[848] | Mehtaab Sawhney, Mark Sellke | GPT-5 | 12 Oct-20 Nov, 2025 | 🟢 Full solution
[1026] | Boris Alexeev, Stijn Cambie, Terence Tao, Lawrence Wu | ChatGPT, Aristotle, AlphaEvolve, Gemini | 8 Dec, 2025 | 🟢 Full solution (Lean)
[1038] | Junnosuke Koizumi, jspier, Nat Sothanaphan, Terence Tao | ChatGPT, AlphaEvolve | In progress | 🟡 Partial result

5. Problems with an AI-powered literature review

Problem | AI tools used | Review performed on | Outcome
[35] | GPT-5 | 13 Oct, 2025 | 🟡 Partial results found
[66] | GPT-5 | 13 Oct, 2025 | 🟡 Partial results found
[94] | GPT-5 | 2 Nov, 2025 | 🟢 Full solution found
[124] | ChatGPT DeepResearch, Gemini DeepResearch | 30 Nov, 2025 | No significant results found
[167] | GPT-5 | 12 Oct, 2025 | 🟡 Partial results found
[188] | GPT-5 | 13 Oct, 2025 | 🟡 Partial results found
[203] | ChatGPT DeepResearch, Gemini DeepResearch | 19 Oct, 2025 | No significant results found
[205] | ChatGPT DeepResearch | 10 Jan, 2026 | No significant results found
[223] | GPT-5 | 13 Oct, 2025 | 🟢 Full solution found
[248] | Gemini DeepResearch | 19 Oct, 2025 | No significant results found
[330] | ChatGPT DeepResearch, Gemini DeepResearch, Claude | 19 Dec, 2025 | 🔴 Partial results found (with inaccuracies); did not find literature proof
[333] | ChatGPT 5.2 Pro | 25 Dec, 2025 | 🔴 Incorrect proof claimed; did not find literature proof
[334] | Gemini DeepResearch | 19 Nov, 2025 | No significant results found
[339] | GPT-5 | 11 Oct, 2025 | 🟢 Full solution found
[347] | ChatGPT DeepResearch | 25 Oct, 2025 | 🟢 Full solution found
[354] | ChatGPT DeepResearch | 19 Oct, 2025 | 🟡 Partial results found
[367] | ChatGPT DeepResearch, Gemini DeepResearch | 22 Nov, 2025 | 🔴 No significant results found; did not find Erdős problems community proof
[370] | ChatGPT DeepResearch, Gemini, Gemini DeepResearch | 17 Oct, 2025 | 🟡 Problem found to be misstated; solutions to variants found
[387] | ChatGPT DeepResearch | 1 Nov, 2025 | 🟡 Partial results found
[397] | ChatGPT DeepResearch | 10 Jan, 2026 | 🟢 Full solution found
[401] | Claude | 10 Jan, 2026 | No significant results found
[421] | ChatGPT DeepResearch, Gemini DeepResearch | 18 Oct, 2025 | No significant results found
[434] | GPT-5 | 29 Oct, 2025 | 🔴 Literature proof not found
[481] | ChatGPT DeepResearch, Gemini DeepResearch | 1 Dec, 2025 | 🟡 Unwittingly reproduced existing proof
[481] | ChatGPT | 3 Dec, 2025 | 🟢 Full solution found
[494] | GPT-5 | 13 Oct, 2025 | 🟢 Full solution found
[515] | GPT-5 | 15 Oct, 2025 | 🟢 Full solution found
[516] | ChatGPT DeepResearch, Claude, Gemini DeepResearch | 28 Dec, 2025 | 🟢 Problem found to be misstated; full solution to actual problem found
[524] | ChatGPT 5.2 Pro | 27 Dec, 2025 | 🟡 Partial results found
[559] | ChatGPT DeepResearch, Gemini DeepResearch | 26 Oct, 2025 | 🟡 Partial results found
[621] | GPT-5 | 13 Oct, 2025 | 🟢 Full solution found
[645] | GPT-5 | 20 Oct, 2025 | 🟢 Full solution found
[659] | Gemini DeepResearch | 13 Jan, 2026 | 🟡 Partial solution (upgradeable to full solution) found
[672] | ChatGPT | 7 Jan, 2025 | 🟡 Partial results found
[686] | ChatGPT, Gemini DeepThink, Gemini DeepResearch | 18 Oct, 2025 | No significant results found
[689] | ChatGPT DeepResearch, Gemini DeepResearch | 29 Oct, 2025 | No significant results found
[700] | ChatGPT DeepResearch | 17 Dec, 2025 | No significant results found
[707] | ChatGPT | 22 Oct, 2025 | 🔴 Literature proof not found
[728] | ChatGPT DeepResearch | 15 Dec, 2025 | No significant results found
[729] | ChatGPT DeepResearch | 10 Jan, 2026 | 🟡 Related results found
[737] | GPT-5 | 30 Sep, 2025 | 🟢 Full solution found
[750] | GPT-5 | 13 Oct, 2025 | 🟡 Partial results found
[786] | ChatGPT DeepResearch, Gemini DeepResearch | 18 Oct, 2025 | No significant results found
[788] | GPT-5 | 14 Oct, 2025 | 🟡 Partial results found
[793] | ChatGPT DeepResearch | 30 Nov, 2025 | 🟡 Partial results found
[811] | GPT-5 | 13 Oct, 2025 | 🟡 Partial results found
[822] | GPT-5 | 13 Oct, 2025 | 🟢 Full solution found
[827] | GPT-5 | 14 Oct, 2025 | 🟡 Partial results found
[829] | GPT-5 | 13 Oct, 2025 | 🟡 Partial results found
[871] | ChatGPT DeepResearch | 6 Dec, 2025 | 🟡 Partial results found
[903] | GPT-5 | 14 Oct, 2025 | 🟢 Full solution found
[906] | ChatGPT DeepResearch, Gemini DeepResearch | 17 Oct, 2025 | No significant results found
[915] | ChatGPT DeepResearch | 26 Oct, 2025 | 🟡 Partial results found
[940] | ChatGPT DeepResearch | 24 Oct, 2025 | No significant results found
[942] | ChatGPT DeepResearch | 23 Nov, 2025 | 🟡 Partial results found
[965] | ChatGPT | 2 Jan, 2026 | 🟢 Full solution found
[967] | GPT-5 | 29 Oct, 2025 | No significant results found
[990] | ChatGPT DeepResearch, Gemini DeepResearch | 20 Oct, 2025 | 🟡 Partial results found
[1002] | ChatGPT DeepResearch, Gemini DeepResearch | 3 Nov, 2025 | No significant results found
[1008] | GPT-5 | 29 Sep, 2025 | 🟢 Full solution found
[1011] | GPT-5 | 13 Oct, 2025 | 🟡 Partial results found
[1016] | ChatGPT DeepResearch, Gemini DeepResearch | 18 Oct, 2025 | 🟡 Partial results found
[1019] | ChatGPT DeepResearch, Gemini DeepResearch | 18 Oct, 2025 | 🟡 Solution claim found, but precise citation not found
[1022] | ChatGPT DeepResearch, Gemini DeepResearch | 4 Dec, 2025 | No significant results found
[1038] | Gemini DeepResearch | 16 Nov, 2025 | No significant results found
[1041] | GPT-5 | 16 Nov, 2025 | No significant results found
[1043] | GPT-5 | 12 Oct, 2025 | 🟢 Full solution found
[1079] | GPT-5 | 13 Oct, 2025 | 🟢 Full solution found
[1099] | ChatGPT DeepResearch, Gemini, Gemini DeepResearch | 19 Oct, 2025 | 🟢 Full solution found
[1124] | ChatGPT | 31 Dec, 2025 | 🟢 Full solution to proposed variant found

6. Proofs that were formalized by AI

Problem | Proof to be formalized by | AI tools used to formalize | Formalized on | Other known proofs
[26] | 🟢 Ruzsa | Aristotle | 28 Dec, 2025 |
[31] | 🟢 Lorentz (1954), Wouter van Doorn (2025) | ChatGPT, Aristotle | 24 Nov, 2025 |
[43] | 🟡 Barreto (2025) | Claude, Aristotle | 21 Dec, 2025 |
[56] | 🟢 Ahlswede-Khachatrian (1995) | ChatGPT, Aristotle | 25 Nov, 2025 |
[105] | 🟢 Xichuan (2025) | ChatGPT Pro, Aristotle | 17 Nov, 2025 |
[106] | 🟡 Baek, Koizumi, and Ueoro (2024) | Aristotle | 10 Dec, 2025 |
[189] | 🟢 Kovač (2023) | Gemini 3 Pro, Aristotle | 17 Dec, 2025 |
[198] | 🟢 Baumgartner (1975) | ChatGPT, Aristotle | 24 Nov, 2025 |
[226] | 🟢 Sato and Rankin (1974) | Aristotle | 29 Dec, 2025 | 🟢 Barth and Schneider (1970)
[229] | 🟢 Barth and Schneider (1972) | Aristotle | 28 Dec, 2025 |
[246] | 🟢 Birch (1959) | Aristotle | 28 Dec, 2025 |
[303] | 🟢 Brown and Rödl (1991) | SeedProver | 21 Dec, 2025 |
[337] | 🟢 Ruzsa and Turjányi (1985) | Aristotle | 10 Dec, 2025 |
[350] | 🟢 Ryavec (1974) | ChatGPT, Aristotle | 25 Nov, 2025 |
[367] | 🟡 Erdős problems community (2025) | Aristotle | 22 Nov, 2025 |
[370] | 🟢 Steinerberger (2025) | ChatGPT, Aristotle | 24 Nov, 2025 |
[418] | 🟢 Browkin and Schinzel (1995) | ChatGPT, Aristotle | 22 Nov, 2025 |
[480] | 🟢 Chung and Graham (1984) | Aristotle | 28 Nov, 2025 |
[481] | 🟢 Barreto (2025) | Claude, Aristotle | 1 Dec, 2025 | 🟢 Klarner (1982)
[499] | 🟢 Marcus and Minc (1962) | Aristotle | 29 Nov, 2025 |
[541] | 🟢 Grynkiewicz (2011) | ChatGPT, Aristotle | 30 Dec, 2025 | 🟢 Gao, Hamidoune, and Wang (2010)
[613] | 🟢 Pikhurko (2001) | ChatGPT Pro | 4 Nov, 2025 |
[645] | 🟢 Brown and Landman (1999) | ChatGPT, Aristotle | 23 Nov, 2025 |
[678] | 🟢 Cambie (2025) | Aristotle | 7 Jan, 2026 |
[707] | 🟢 Hall (1947) | ChatGPT | 23 Nov, 2025 | 🟢 Alexeev and Mixon (2025)
[845] | 🟢 van Doorn and Everts (2025) | Aristotle | 8 Jan, 2026 |
[897] | 🟢 Erdős and Wirsing (1975) | Aristotle | 27 Nov, 2025 |
[958] | 🟢 Clemen, Dumitrescu, and Liu (2025) | Bytedance Seed AI4Math | 2025 |
[967] | 🟢 Yip (2025) | Aristotle | 19 Dec, 2025 |
[1000] | 🟢 Haight | ChatGPT, Aristotle | 28 Dec, 2025 |
[1034] | 🟢 Ma-Tang (2025) | Aristotle | 4 Dec, 2025 |
[1080] | 🟢 De Caen and Székely (1992) | Aristotle | 28 Dec, 2025 |

7. Human solutions with some secondary AI tool use

Problem | Result | Date | AI tools used | Nature of tool use
[69], [248], [946] | 🟢🟢🟡 Tao and Teravainen | 1 Dec, 2025 | ChatGPT | Numerics and image generation
[114] | 🟡 Tao | 13 Dec, 2025 | Gemini, AlphaEvolve | Initial numerics, code and image generation
[682] | 🟢 Gafni and Tao | 8 Aug, 2025 | ChatGPT | Initial code generation

8. AI tools used to rewrite an existing argument

Problem | AI tools used | Date | Previous argument
[728] | Aristotle, ChatGPT | 5-7 Jan, 2026 | 🟢 Barreto / ChatGPT / Aristotle (2026)

9. AI tools used to perform numerical exploration

Problem | Date | AI tools used | Nature of tool use
[271] | 11 Jan, 2026 | Unspecified | OEIS sequence generation
[872] | 11 Jan, 2026 | ChatGPT | OEIS sequence generation

10. Pending assessment

Problem | AI tools used | Date | Status
[None currently]