Polish top-performing language for complex AI tasks, finds study

notesfrompoland.com

2 points by thombles 2 months ago · 1 comment

curioussquirrel 2 months ago

Interesting experiment, but I'd say aggregating the scores across models is far from ideal. Gemini 1.5 Flash got close-to-perfect scores on most languages, so the differences between languages there probably boil down to small variances in temp/top_k and statistical error. Small models are generally quite bad at non-English languages and tank the overall performance.

BTW, newer generations of models seem to have made some real progress in multilingual performance.
