Which AI Calculates Taxes Correctly?


After reading my stories about AI helping me with home electrical and water heater problems in the previous post, AI Gives Better Answers Than Google, several readers asked me whether they could use AI to calculate taxes. They want a tax projection to help with setting tax withholding, paying estimated taxes, or planning for Roth conversions.

My immediate reaction was that calculating taxes isn’t the best use of AI, because it falls under the “verifiable facts” and “latest development” categories. The retirement plan contribution limits, tax brackets, the maximum deduction amounts, etc., are all verifiable facts. The IRS sets them every year, and they are what they are. Just go to the IRS website for the latest numbers, or Google them. If you ask AI, it had better get the latest numbers online anyway, because AI’s training lags.

On the other hand, tax rules are complicated. After you have all the latest numbers from the IRS, you still need to apply the complex rules — what counts and what doesn’t, and which rates apply to which income. Online tax calculators usually don’t cover the current year until late in the year, and they can be either too simple or too complicated. It would be nice to have AI calculate taxes specifically for the types of income and deductions we have: not too simple with only limited inputs, and not too complicated with everything under the sun.

The Quiz

I thought I would test how well AI calculates taxes. I came up with this quiz question:

Jill, single, age 63, has these incomes in 2026:

  • $30,000 from Social Security
  • $10,000 from interest and pre-tax IRA withdrawals
  • $37,000 from qualified dividends and long-term capital gains
  • $1,000 from muni bond interest

Jill contributes $4,400 to her HSA and donates $1,500 in cash directly to charities. What’s Jill’s federal income tax for the 2026 tax year?

I designed this question carefully to cover several calculations. Slightly less than 85% of Social Security is taxable in this case. AI has to know how to calculate the exact taxable amount, and not just make a straight 85% of Social Security taxable. The gross income consists of ordinary income, tax-exempt income, and investment income taxed at preferential rates. The ordinary income component goes across two tax brackets. So does the investment income. One deduction is above-the-line, and the other is below-the-line.
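To show why a straight 85% is not quite right here, this is a minimal Python sketch of the Social Security taxability formula for a single filer. It is an illustration under stated assumptions, not the code behind any calculator: the function name and parameters are made up for this example, and the $25,000 and $34,000 thresholds are the statutory single-filer figures rather than numbers taken from the quiz.

```python
def taxable_social_security(benefits, other_income, tax_exempt_interest,
                            above_the_line, base=25_000, upper=34_000):
    """Taxable portion of Social Security for a single filer (sketch).

    base/upper are the single-filer thresholds (an assumption of this
    example; other filing statuses use different figures).
    """
    # Provisional income: everything else that lands in AGI, plus
    # tax-exempt interest, plus half of the Social Security benefits.
    provisional = (other_income - above_the_line
                   + tax_exempt_interest + 0.5 * benefits)
    if provisional <= base:
        return 0.0
    if provisional <= upper:
        return min(0.5 * benefits, 0.5 * (provisional - base))
    # Above the upper threshold: 85% of the excess plus the 50% tier,
    # capped at 85% of total benefits.
    tier1 = min(0.5 * benefits, 0.5 * (upper - base))
    return min(0.85 * benefits, 0.85 * (provisional - upper) + tier1)


# Jill: $10,000 interest/IRA + $37,000 dividends and gains,
# $1,000 muni interest, $4,400 HSA contribution.
jill = taxable_social_security(30_000, 10_000 + 37_000, 1_000, 4_400)
print(round(jill))           # 25410
print(round(0.85 * 30_000))  # 25500, the straight-85% shortcut
```

The $90 gap between the two numbers is small, but it is enough to separate an exact answer from an approximate one.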

I added these instructions to encourage AI to get the latest tax numbers from the IRS:

It’s important to calculate it accurately. Please use tax amounts only for 2026 and only from official IRS sources.

Initial Results

First, I ran the test case through my tax calculator in Calculator: How Much of My Social Security Benefits Is Taxable?. It gave this result:

$25,410 of your Social Security benefits is taxable, which means 15% of your benefits is tax-free.

Your 2026 federal income tax is approximately $1,640.

This is calculated from a taxable income of $50,910 (AGI $68,010 – Standard Deduction $16,100 – Senior Deduction $0 – charitable contribution deduction $1,000).
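The step from a taxable income of $50,910 to a tax of about $1,640 can be reproduced with a short sketch of how ordinary income and preferential income stack. The bracket edges ($12,400 and $50,400) and the top of the 0% capital gains band ($49,450) are my assumptions for a single filer in 2026 and are not quoted in this post, so check them against the IRS before relying on them. The function name is also just for illustration.

```python
# Assumed 2026 single-filer figures (not stated in the article).
ORDINARY_BRACKETS = [(12_400, 0.10), (50_400, 0.12)]  # first two brackets only
ZERO_RATE_TOP = 49_450                                # top of the 0% gains band


def tax_with_stacking(taxable_income, preferential, pref_rate=0.15):
    """Tax when preferential income stacks on top of ordinary income."""
    ordinary = max(0.0, taxable_income - preferential)
    if ordinary > ORDINARY_BRACKETS[-1][0]:
        raise ValueError("sketch only covers the first two ordinary brackets")

    # Ordinary income fills the brackets from the bottom.
    tax, lower = 0.0, 0.0
    for top, rate in ORDINARY_BRACKETS:
        if ordinary > lower:
            tax += (min(ordinary, top) - lower) * rate
        lower = top

    # Preferential income sits on top of ordinary income. Only the part
    # above the 0% band is taxed, here at a flat 15%.
    pref_taxed = max(0.0, taxable_income - max(ordinary, ZERO_RATE_TOP))
    return tax + pref_taxed * pref_rate


print(round(tax_with_stacking(50_910, 37_000)))  # 1640
```

With $37,000 of the $50,910 being qualified dividends and long-term gains, only $13,910 is taxed at ordinary rates, and only the $1,460 that pokes above the 0% band is taxed at 15%. The sketch leaves out the 20% capital gains rate and anything else that doesn’t apply at this income level.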

Then I sent the quiz to all four major AI chatbots: ChatGPT, Claude, Gemini, and Grok. I only used the free version of each one, with the free Thinking, Pro, or Expert mode enabled. I compared the answers to the correct result from my tax calculator.

Source                       Answer
My tax calculator            $1,640
ChatGPT 5.4 Thinking Mini    $325
Claude Sonnet 4.6 Extended   $1,910
Gemini 3 Pro                 $1,910
Grok 4.20 Expert             $1,910

My quiz was too hard! All four major AI models failed to give the correct answer. The $1,500 cash donation threw them off. It falls squarely in the “latest development” category: the charity donation deduction for non-itemizers is new, and the AI models all have “donations are deductible only when you itemize” imprinted in their training. Claude, Gemini, and Grok would all have given the correct answer if I hadn’t included the cash donation. ChatGPT was also off on how much of Social Security is taxable.

When you see AI not giving you a deduction, you can raise a question and ask it to double-check. I followed up with this statement:

I heard that cash donations are deductible for people using the standard deduction, starting in 2026, up to a limit.

I only said “I heard,” which may or may not be true. AI still has to verify. I also added this request for ChatGPT:

Another AI model calculated a different amount for how much Social Security is taxable. Please double-check your calculation to see who’s right.

Round 2

All four AI chatbots double-checked and revised their answers.

Source                       Answer
My tax calculator            $1,640
ChatGPT 5.4 Thinking Mini    $1,640 ✅
Claude Sonnet 4.6 Extended   $1,640 ✅
Gemini 3 Pro                 $1,411
Grok 4.20 Expert             $1,411

ChatGPT and Claude got the correct answer. Gemini and Grok mistakenly treated the cash donation deduction as above-the-line and used it to reduce the Social Security taxable amount.
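To see how the placement of the donation deduction moves the answer, here is a rough end-to-end sketch that runs Jill’s numbers three ways: no donation deduction, the deduction taken after AGI, and the deduction wrongly taken above the line. The treatment labels and function names are hypothetical. The bracket edges, the 0% capital gains threshold, and the Social Security thresholds are my assumptions for a single filer in 2026; the $16,100 standard deduction and the $1,000 deduction amount come from the calculator output above.

```python
# Assumed 2026 single-filer figures; verify against the IRS before use.
STD_DEDUCTION = 16_100                        # from the calculator output
BRACKETS = [(12_400, 0.10), (50_400, 0.12)]   # first two ordinary brackets
ZERO_RATE_TOP = 49_450                        # top of the 0% capital gains band
DONATION_CAP = 1_000                          # non-itemizer cash donation limit


def taxable_ss(benefits, provisional):
    """Taxable Social Security, single filer, thresholds 25,000 / 34,000."""
    if provisional <= 25_000:
        return 0.0
    if provisional <= 34_000:
        return min(0.5 * benefits, 0.5 * (provisional - 25_000))
    tier1 = min(0.5 * benefits, 4_500)
    return min(0.85 * benefits, 0.85 * (provisional - 34_000) + tier1)


def federal_tax(treatment):
    """treatment: 'none', 'below_the_line', or 'above_the_line'."""
    ss, ordinary_income, preferential, muni, hsa = 30_000, 10_000, 37_000, 1_000, 4_400
    donation = 0 if treatment == "none" else min(1_500, DONATION_CAP)
    above = hsa + (donation if treatment == "above_the_line" else 0)
    below = donation if treatment == "below_the_line" else 0

    # Above-the-line items reduce provisional income and AGI;
    # below-the-line items only reduce taxable income.
    provisional = ordinary_income + preferential + muni - above + 0.5 * ss
    agi = ordinary_income + preferential + taxable_ss(ss, provisional) - above
    taxable_income = agi - STD_DEDUCTION - below

    ordinary = max(0.0, taxable_income - preferential)
    tax, lower = 0.0, 0
    for top, rate in BRACKETS:
        tax += max(0.0, min(ordinary, top) - lower) * rate
        lower = top
    tax += max(0.0, taxable_income - max(ordinary, ZERO_RATE_TOP)) * 0.15
    return round(tax)


for treatment in ("none", "below_the_line", "above_the_line"):
    print(treatment, federal_tax(treatment))
# none 1910
# below_the_line 1640
# above_the_line 1411
```

The three outputs line up with the three distinct answers in the tables. Moving the same $1,000 above the line lowers the taxable Social Security amount from $25,410 to $24,560, which is where the extra reduction in tax comes from.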

I asked Gemini and Grok to clarify whether the HSA contribution and the charity donation should be included in calculating the Social Security taxable amount. Gemini didn’t realize its mistake. It said both should be included. Grok refused to go further because I had reached the message limit for users without an account. [A reader with a Grok account did the same chat with Grok and told me that Grok also said both should be included.]

Here are the full chat transcripts. They make an interesting read.

https://chatgpt.com/share/69bf73fb-db44-8004-9905-b77b56fbb6ce

https://claude.ai/share/ef62408a-6502-4d70-82be-6ea46a4a4cc3

https://gemini.google.com/share/795bdad9f5ad

[No sharable link for Grok because I don’t have an account.]

Impressions

I don’t think we can draw definitive conclusions based on only one test. I call these impressions.

1. Whether AI can calculate taxes correctly depends on the complexity. Three out of the four major AI chatbots would’ve given the correct answer in one go if the question hadn’t included the new and tricky charity donation deduction. Explicitly asking for the latest information from official IRS sources and turning on the Thinking/Pro/Expert mode probably helped improve accuracy.

2. AI works better when you make it a conversation. Don’t stop and say it’s useless when you spot some errors in the initial answer. Help the tool help you. Both ChatGPT and Claude reached the correct answer in Round 2 after I asked them to double-check.

3. It takes little effort to send the same question to two or more AI models. You don’t need to know which AI is correct when you get different answers. You can tell one AI that another AI said something different. It’ll re-examine.

4. AI was wrong because human sources were wrong. Gemini and Grok treated the donation deduction as above-the-line because many human sources incorrectly called it above-the-line. If you Googled, you would encounter those incorrect sources too. ChatGPT and Claude used more reliable sources and correctly distinguished an above-the-line deduction, which lowers AGI, from a deduction that simply doesn’t require itemizing, which comes off after AGI.

5. An online tax calculator is faster and more accurate if you know which one to use. That’s a big “if.” Only saying “Don’t use AI because it’s often wrong” doesn’t say where you will find more reliable sources.

As much as I’d like to see everyone use my tax calculator, let’s face it: 99.99% of people don’t know that I exist. They have little chance of finding the few sources that do it accurately because Google doesn’t rank tax calculators by timeliness or thoroughness.

Here’s a small challenge for you:

Pretend you’re not familiar with tax calculations. Find another way to answer the quiz question. Don’t say “I have a spreadsheet,” or “I use the super-duper Case Study Spreadsheet or Excel1040.” 99.99% of people don’t have those. Please tell me in the comments section how you could do it otherwise.

Short of finding the good sources, the results from AI aren’t that bad (see #1).

6. AI’s general approach was correct even when it was wrong in some nuances. All four AI chatbots followed these correct steps:

  • Calculate how much of Social Security is taxable
  • Calculate AGI
  • Calculate taxable income
  • Separate ordinary income and preferential investment income, and know how they stack
  • Apply different tax brackets

This general approach is “common knowledge of an insider.” You can learn the steps from AI if you’re not familiar with tax calculations. That’s more valuable than just having a final number. When I saw that ChatGPT didn’t include details of some interim steps, I asked it to show its work. ChatGPT explained the details faster and better than I could (read the transcript).

An online tax calculator gives you a number but doesn’t explain the steps. AI is better at teaching you how to fish than at giving you a fish.
