I compared my daughter against SOTA models on math puzzles

blog.michalprzadka.com

15 points by przadka a year ago · 5 comments

przadkaOP a year ago

I tested o3, r1, 4o and other SOTA models against puzzles from an international math competition and compared their performance with my 11-year-old daughter's solutions. Full results include detailed conversations with each model and complete methodology.

ukituki a year ago

Interesting how the reasoning differs between models, e.g. DeepSeek trying brute-force tricks.

michalwarda a year ago

Very cool post! I wonder how much it will affect the psychology of the next generations.
