Polish top-performing language for complex AI tasks, finds study

2 points by thombles 5 months ago · 1 comment

Interesting experiment, but I'd say aggregating the scores across models is far from ideal. Gemini 1.5 Flash got close-to-perfect scores on most languages, so the differences between languages there probably boil down to small variance from temperature/top_k sampling and statistical error. Small models are generally quite bad at non-English languages, and they tank the overall scores.

BTW, newer generations of models seem to have made some real progress in multilingual performance.
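
To make the aggregation point concrete, here is a rough sketch. The numbers are made up for illustration and are not taken from the study; the model names and the helper functions `aggregate_by_language` and `spread_by_model` are hypothetical too. It only shows how a cross-model mean can end up reflecting the weakest model's language gap rather than any real difference between languages.

```python
from statistics import mean

# Hypothetical accuracy scores (NOT from the study), keyed by model, then language.
scores = {
    "large_model": {"en": 0.97, "pl": 0.98, "de": 0.97},
    "small_model": {"en": 0.80, "pl": 0.55, "de": 0.40},
}

def aggregate_by_language(scores):
    """Mean score per language across all models (the aggregation being criticized)."""
    languages = next(iter(scores.values())).keys()
    return {lang: mean(model[lang] for model in scores.values()) for lang in languages}

def spread_by_model(scores):
    """Best-minus-worst language score for each model, reported separately."""
    return {name: max(s.values()) - min(s.values()) for name, s in scores.items()}

aggregated = aggregate_by_language(scores)
spread = spread_by_model(scores)

print("aggregated:", aggregated)
print("per-model spread:", spread)
```

With these made-up numbers the large model's spread across languages is tiny (plausibly noise), while the small model's is large, so the aggregated ranking is mostly a ranking of where the small model struggles. Reporting per-model results next to any aggregate avoids that.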
