How well does your LLM do?
You go to a shop and buy three things. You work out the cost as being £5.88. When you get to the till, you realise that instead of adding the items together on your calculator, you actually pressed multiply each time. Interestingly, the cost at the register also comes out to £5.88. What was the cost of the three items?
Try asking your preferred LLM to solve this. How did it do?
What's the right answer here? If you are treating it as actual cents though, I don't think there's an exact answer, as a/100, b/100 and c/100 will come out of the factors of 588 and I just don't see getting enough big numbers to add up to 588.
IMHO, there are multiple solutions. Some calculators allow you to run in fixed precision mode, which would ignore the 0.0007 error.
Microsoft Copilot says: "Interesting puzzle! Let's work through it step-by-step:" then works through it with the kind of faulty thought process that I'd expect from a student who squeaks through an intro physics class and gets a D grade. It gave me 2.7, 1.1, and 0.22, which don't multiply or add to 5.88. I told it it was wrong, and it gave me some numbers that summed right but multiplied wrong. I pointed that out and it gave 1.20, 1.50, 3.18, which sums right and multiplies to 5.724, which is a little low.
That's not the response when I tried it with Microsoft Copilot, but then again I do prefer noise and lots of it.
QwQ Preview 32B Q5_K_M: Tried a couple of techniques, then decided to set C to 1.00 and used the quadratic formula to get 2.17 and 2.17 for the other values. Phi-4 (unsloth's dynamic 4-bit version) keeps trying random and repeating numbers with no particular strategy. Adding Maharshi's "contemplative prompt" did not really change anything. Oh wow, I just got an answer from Smallthinker 3B Preview. It talked for a long time, but after 6,900 tokens it eventually got the 1.00, 2.17, 2.71 answer. No system prompt, temperature 1.00 (which is probably too high, but I didn't notice until the test was over).
R1 got it correct in one shot; I just pasted the statement.
I'm not so sure, seemed to me like it confuses itself quite a bit.
a + b + c = 5.88
abc = 5.88
is terribly underdetermined and has a huge number of possible answers if the domain is the reals. If you postulate that a=1, then b=2.71 and c=2.17 is right to within rounding error (adds right, multiplies to 5.8807).
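For anyone who wants to check the whole-pence case directly rather than trust an LLM, here is a minimal brute-force sketch. It assumes all three prices are positive whole numbers of pence; the function name `exact_pence_solutions` and the variable names are made up for illustration. Working in integer pence x, y, z, the two conditions become x + y + z = 588 and x * y * z = 588 * 100^2 = 5,880,000, so there is no floating-point rounding to argue about.

```python
def exact_pence_solutions(total_pence=588):
    """Return all (x, y, z) in whole pence, with x <= y <= z, whose sum and
    product both equal the same amount in pounds (hypothetical helper)."""
    # (x/100) * (y/100) * (z/100) = total/100  <=>  x * y * z = total * 100**2
    target = total_pence * 100 ** 2
    found = []
    for x in range(1, total_pence // 3 + 1):        # smallest price
        if target % x:                              # x must divide the product
            continue
        for y in range(x, (total_pence - x) // 2 + 1):
            z = total_pence - x - y                 # forced by the sum, z >= y
            if x * y * z == target:
                found.append((x, y, z))
    return found


solutions = exact_pence_solutions()
for x, y, z in solutions:
    print(f"£{x / 100:.2f} + £{y / 100:.2f} + £{z / 100:.2f}")

# The a = 1 answer from the thread, checked the same way in integer pence.
near_miss = 100 * 217 * 271   # 5,880,700, against a required 5,880,000
```

Under that whole-pence assumption the search does turn up an exact answer: £0.98, £2.40 and £2.50 add to £5.88 and multiply to exactly 5.88 (2.40 × 2.50 = 6, and 6 × 0.98 = 5.88). The 1.00, 2.17, 2.71 answer adds up correctly but its product is 5.8807, so it only matches after rounding to the nearest penny.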