ChatGPT vs. a specialized medical AI on 5 clinical cases (verbatim outputs)
wizey.one
Author here - we're the team behind Wizey, one of the two AIs in the comparison. A few things up front:
* Methodology was fixed before the runs.
* All outputs are quoted verbatim, including Case 2 (MGUS) where ChatGPT beat us cleanly.
* Panels are reconstructed from published case reports (Blood, Annals of Family Medicine, and others), so anyone can reproduce the experiment on Claude, Gemini, or Grok.
Full verbatim outputs for all five cases: https://wizey.one/blog/2026/04/17/wizey-vs-chatgpt-raw-exper...
Happy to answer anything on methodology or individual cases.
Do you have a medical doctor on the team?
Yes, we do: Dr. Aigerim Bissenova, our CMO. She completed an internal medicine residency at Mass General and a digital health fellowship. She designed the panel selection and reviewed all outputs before publication.