XentGame: Help Minimize LLM Surprise
xentlabs.ai
Interesting, I'm wondering if it is possible to cheat using LLMs!
It looks like you can (there are some LLM responses out there, e.g. from Sonnet 3.5). It's not clear whether they can be really good at this, though.
Quite impressed by Claude's entries.
Pretty cool, I managed to get a reasonable score. I wonder if the high scores are close to the 'theoretical maximum', or if we are an order of magnitude below.
I don't think it's an order of magnitude below... though it's a bit hard to know for sure