Accept my accept-language

blog.choibean.com

111 points by hc5 12 years ago · 75 comments

Udo 12 years ago

Pretty much every major site will completely ignore not only your browser's language settings but also onsite user account settings and other desperate attempts at selecting the language. The only thing that matters is your IP address.

For example, accessing Google: my browser is set to accept English only. I'm entering the English URL. In my account settings I periodically reset everything I can find to English (settings apparently decay, too). Google knows I want the English version. Yet, they still give me the interface in whatever language my IP address comes from. And not only the UI, search results as well.

Recently it's gotten even worse than that: Google figured out I'm actually German, so they start defaulting to German more often now - ignoring everything else. At least with the IP address-based routing it was impersonal.

I happened to be in Sweden when I linked my Facebook calendar to my Google calendar. Ever since that day, my friends' birthdays are given to me in Swedish. Facebook knows I want English, yet for some reason this is how it's got to be.

The same abuse is apparently considered best practice at new startups as well: recently I was testing a browser game for an acquaintance who's on their development team. Because I was in Portugal at the time, I of course got the site in Portuguese. Manually switching that to English, the game still started up in Portuguese. It's been doing that ever since. Every email I get from that company is in Portuguese, too, even though I tried everything I could to set my language to English.

It's a source of endless frustration, maybe even a hostile act. They're effectively saying "Your choices don't matter, we know what's best for you. You're from country X, so you _must_ speak Xish. People are on the internet to enjoy regional separation. Really, it's best."

  • morsch 12 years ago

    Go to http://www.google.com/ncr (once, it sets a cookie) to fix this at least for the search results.

  • nrser 12 years ago

    can anyone provide insight into the business reasoning behind this? i really can't conceive of why you would want to supersede a user's exact, known language with a guess. sites are pretty difficult to use when you can't read anything. maybe there are technical issues for some sites, but Google search is the worst i've ever dealt with, and they def have some resources behind that.

    • preinheimer 12 years ago

      Mostly it's because people don't know how to change that setting. Imagine walking up to a computer in a shared space (hotel lobby, library, etc.) and it's been configured to send out accept-language: <something you don't understand>.

      Many of the people reading Hacker News will be able to find and change that setting. My mom never will. She'll just know she went to google.com, and saw Chinese.

      If you're using a computer in a country, and websites seem to be showing you things in the language of that country, that's something you can probably understand. If you're using a computer and some websites insist on showing you some other language, you'll be confused.

      • ds9 12 years ago

        It's true that most people leave the defaults. However, there may be an easy solution for a subset of cases.

        Do the browser defaults in any country include multiple language settings? It seems likely that in most countries, the default would be only one language. And if this is the case, then if multiple alternatives are present in the request headers, it's very likely the user or computer admin has deliberately changed it, and that in turn would mean that sites should respect the choices.

        This might still be wrong when the settings were made by someone other than the current user, or when multiple languages are default-configured, but it might be a step in the right direction.
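
        A minimal sketch of that heuristic in Python, assuming "multiple languages" means more than one distinct primary language subtag in the header. The function name is hypothetical, and a browser that ships with two languages by default would still be misread as deliberate, which is the caveat already noted above.

```python
def looks_deliberately_configured(accept_language):
    """Guess whether an Accept-Language header was hand-edited.

    Heuristic (an assumption, not a standard): region variants of one
    language ("en-US", "en") count once; two or more distinct primary
    languages suggest someone changed the defaults on purpose.
    """
    primaries = set()
    for part in accept_language.split(","):
        tag = part.split(";")[0].strip().lower()
        if tag and tag != "*":
            primaries.add(tag.split("-")[0])
    return len(primaries) > 1


print(looks_deliberately_configured("en-US,en;q=0.8"))           # one language
print(looks_deliberately_configured("en-US,en;q=0.8,ko;q=0.6"))  # two languages
```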

      • derefr 12 years ago

        > Imagine walking up to a computer in a shared space (hotel lobby, library, etc.) and it's been configured to send out accept-language: <something you don't understand>.

        If web browsers could somehow figure out they were running either under a guest/public-use account, or in a kiosk mode, they could avoid sending an Accept-Language header at all. Then,

        1. in cases where the header is sent, it would mean a lot more (and hopefully override both online-profile-stored and IP-detection-based answers);

        and 2. in cases where the header isn't sent, using an answer from an online profile setting or IP-detection would no longer be against-standard.

      • mook 12 years ago

        While that excuses ignoring the Accept-Language header, it doesn't make sense for overriding the user's explicit configuration in their profile; you wouldn't expect that to be shared.

        That, and I'd expect public computers to disallow that sort of configuration anyway, so it would be stuck at the default value, which should be sensible for the location it's set up in; it's not like it would move around...

    • thefreeman 12 years ago

      It sounds like it could be for legal reasons? I know different countries have different agreements / restrictions with google about search results.

    • nocoment 12 years ago

      Well, the AdWords value of your visit is considerably higher in the primary language of your location.

  • kalleboo 12 years ago

    And don't forget that no matter what your IP, no matter what your language setting, no matter what country you have set, Google Maps will still default you to showing a view of the continental United States. You're in Tokyo, searching for Yokohama? Here's Yokohama Sushi in downtown Los Angeles. For a while, the new web Google Maps redesign even removed the HTML5 Location API "my position" button. It's back now, but it should be defaulting to showing your location, like the mobile apps do.

    • derefr 12 years ago

      The web version is actually localized by domain.

      http://maps.google.com is the US-local Google Maps; http://maps.google.ca defaults to Canada; http://maps.google.co.jp sends you to Japan; etc.

      • _delirium 12 years ago

        Huh, I didn't know that. But why geoIP-based redirects on some sites and not others? google.com redirects me to google.dk, and even blogspot.com redirects me to blogspot.dk, but maps.google.com doesn't take me to maps.google.dk.

        • derefr 12 years ago

          I believe the logic was that "google.com" always does the redirect, because people tend to type google.com manually into their address bar a lot. Google has never bothered to set up any other redirects itself.

          Other services that Google has acquired, though (e.g. Blogger, Youtube, etc.) may have come pre-set-up with redirection logic, and Google has mostly left that untouched.

rwg 12 years ago

You probably want something like "en-US,en;q=0.9,ko;q=0.8". (Note the addition of "en" between "en-US" and "ko".) Some quick testing with Firefox, which lets you directly alter the Accept-Language: header in requests in about:config, shows that fedoraproject.org has "en" versions of resources but not "en-US" versions. Since your Accept-Language: header only lists "en-US", "ko" ends up being selected.

EDIT: I just noticed your guesses at the bottom of the post. Your second guess is correct. See §14.4 of RFC 2616:

> As an example, users might assume that on selecting "en-gb", they will be served any kind of English document if British English is not available. A user agent might suggest in such a case to add "en" to get the best matching behavior.
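
A rough sketch of the matching rule being described, in Python. It assumes the site has only "en" and "ko" resources and that the original header was "en-US,ko;q=0.8" (a guess based on the description above). The function names are made up for illustration, and this is not fedoraproject.org's actual code.

```python
def parse_accept_language(header):
    """Return (language_range, q) pairs, highest q first; q defaults to 1.0."""
    ranges = []
    for part in header.split(","):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        q = 1.0  # a range without an explicit weight counts as q=1
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0  # assumption: treat a malformed weight as "not acceptable"
        ranges.append((tag, q))
    # sorted() is stable, so ranges with equal q keep their header order
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def range_matches(language_range, tag):
    """A range matches a tag if they are equal or the range is a prefix followed by '-'."""
    language_range, tag = language_range.lower(), tag.lower()
    return (
        language_range == "*"
        or tag == language_range
        or tag.startswith(language_range + "-")
    )


def pick_language(header, available):
    """Return the first available tag matched by the most preferred range, or None."""
    for language_range, q in parse_accept_language(header):
        if q <= 0:
            continue
        for tag in available:
            if range_matches(language_range, tag):
                return tag
    return None


available = ["en", "ko"]  # assumed set of translations on the server
print(pick_language("en-US,ko;q=0.8", available))
print(pick_language("en-US,en;q=0.9,ko;q=0.8", available))
```

Under this rule the range "en" matches "en-US" content, but the range "en-US" does not match plain "en" content, which is why adding "en" to the header changes the result.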

delroth 12 years ago

I just fixed dolphin-emu.org, this was a bug in our code that would not detect en-US as being "compatible" with en. See https://github.com/dolphin-emu/www/commit/ddef974c6f601bc2db...

As a French guy living in the German part of Switzerland with Accept-Language configured to get English content, I'm kind of ashamed to have that kind of bug in my language detection code. I'm always complaining about other websites' language detection, looks like I should have looked at my own code first!

  • hc5OP 12 years ago

    Thanks! There is a corollary to this that would have prevented all this - when I went back in the Chrome settings and set the settings to the same order, it reset my header to this: "en-US,en;q=0.8,ko;q=0.6" - which makes things work for all sites again. I haven't touched my language settings since ~2012, so it's possible Chrome "fixed" this a while back, but didn't change my existing settings.

  • spacehunt 12 years ago

    Please don't do this... I detest sites that try to be clever and serve me Simplified Chinese even though I only have zh-hk in Accept-Language:.

    • delroth 12 years ago

      I already have exceptions for things like that. I think our code handles zh_{CN,TW,HK} separately, as well as things like pt_BR vs. pt.

          > curl -I -H 'Accept-Language: zh-hk,en;q=0.8' https://dolphin-emu.org/
          HTTP/1.1 200 OK  # No zh_HK translation (yet!)
      
          > curl -I -H 'Accept-Language: zh-cn,en;q=0.8' https://dolphin-emu.org/
          HTTP/1.1 302 FOUND
          Location: http://cn.dolphin-emu.org/?cr=cn
      
          > curl -I -H 'Accept-Language: pt,en;q=0.8' https://dolphin-emu.org/
          HTTP/1.1 200 OK  # No pt translation (yet!)
      
          > curl -I -H 'Accept-Language: pt-br,en;q=0.8' https://dolphin-emu.org/
          HTTP/1.1 302 FOUND
          Location: http://br.dolphin-emu.org/?cr=br
      
      i18n is hard but I think I've been doing a fairly good job on it. Proud to have more than 50% of our visitors from outside of the US!

      • spacehunt 12 years ago

        Having now read the full code and not just the diff, I have to say it looks pretty good. I note that plain "zh" is not redirected to the cn site. ;) Whether it should or not is debatable though -- I actually think ignoring "zh" altogether is a rather prudent move if it is intentional.

crazygringo 12 years ago

Language choices are a mess. There can easily (and often) be conflicting data based on:

- accept-language header

- URL that includes language/region codes as a subdomain or part of the path

- language preferences set in a cookie or account

- IP region detection

In the end, any website is trying to provide the right language most often for their users, and there are no easy answers. When I access webmail from an Internet cafe in China, I don't want the interface popping up in Chinese just because the browser's accept-language is configured for Chinese. Fortunately, it doesn't.

Most web users have never even heard of accept-language, it's just automatically configured by whatever language their browser was installed in, which isn't always the language you want to be browsing in. (E.g. you bought your laptop overseas because it was cheaper, so it runs in English instead of your own language.) It's not a surprise that IP address detection provides the best default experience most of the time, which can then be overridden by URL or user choice, and that accept-language is fairly irrelevant.

  • delroth 12 years ago

    What we've done for dolphin-emu.org:

    * In all cases, a fairly visible language picker is displayed at the top of the page, with internationalized language names.

    * If someone goes to a language-specific subdomain (fr.dolphin-emu.org, cy.dolphin-emu.org, ast.dolphin-emu.org, ...), they get this version.

    * If someone goes to the generic/english dolphin-emu.org, the system checks whether the user has a "nocr" cookie. If so, they get the english website. Otherwise, they get redirected based on their Accept-Language.

    * If a user uses the language picker, we assume they know what they want and set the "nocr" cookie to disable redirections in the future.

    * When the user gets redirected from the standard/english version to an internationalized version, a message is shown in english saying that they have been redirected based on their browser preferences, with a link to go back to the english version (and set the "nocr" cookie).

    I thought for a pretty long time about this and think it is a good compromise between providing the best version for our users and not being annoying/guessing too much. In the end, more than 50% of our users now are shown internationalized versions of our website, which is a very good number in my opinion.
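
    The flow in the list above could be sketched roughly like this in Python. It is only an illustration: the domain, the translation table, and the function names are hypothetical, and the real site's code may differ. The table maps language tags to subdomains and deliberately has no plain "pt" or "zh" entry, so only region-specific requests are redirected for those.

```python
BASE = "example.org"  # hypothetical stand-in for the generic/English domain
TRANSLATIONS = {"fr": "fr", "pt-br": "br", "zh-cn": "cn"}  # tag -> subdomain (assumed)


def parse_accept_language(header):
    """Return acceptable language tags, most preferred first (q defaults to 1.0)."""
    entries = []
    for part in header.split(","):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip().lower()
        q = 1.0
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0  # assumption: ignore entries with a malformed weight
        if tag and q > 0:
            entries.append((tag, q))
    entries.sort(key=lambda entry: entry[1], reverse=True)  # stable sort
    return [tag for tag, _ in entries]


def decide(host, cookies, accept_language):
    """Return ("serve", subdomain_or_None) or ("redirect", subdomain)."""
    # A language-specific subdomain always wins.
    if host != BASE and host.endswith("." + BASE):
        return ("serve", host[: -len(BASE) - 1])
    # The "nocr" cookie disables redirection from the generic site.
    if "nocr" in cookies:
        return ("serve", None)
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-")[0]
        if primary == "en":
            return ("serve", None)  # English is the generic site itself
        for candidate in (tag, primary):
            if candidate in TRANSLATIONS:
                # The target page would also show the "you were redirected" notice.
                return ("redirect", TRANSLATIONS[candidate])
    return ("serve", None)


def use_language_picker(cookies):
    """An explicit choice in the picker turns off future redirects."""
    cookies["nocr"] = "1"


print(decide("example.org", {}, "pt-br,en;q=0.8"))
print(decide("example.org", {}, "zh-hk,en;q=0.8"))
```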

    • yepguy 12 years ago

      This seems like a pretty good solution, except that your language picker includes country flags, which don't make sense for many users.

      • delroth 12 years ago

        They do make sense for many users, and they are the closest you can find to a proper graphical representation of languages. When I add a language that I know to be official in several countries, I look at my analytics to see where most users come from and use the flag from their country. I can't remember a time where it did not also match the country with the most speakers.

        • yepguy 12 years ago

          It's a common enough practice that most people usually know what it means, but there's a reason you don't see flags on Wikipedia, Facebook, or Youtube. Languages are spoken in many countries, and countries are multilingual. There are quite a few articles around the web on this topic, but that's basically what they boil down to: languages are not countries. Some users may be confused or offended that their flag is not represented.

          • nfoz 12 years ago

            And as a Canadian I find it generally a little weird that the Canadian flag often means Canadian French, and I have to click the US flag to get English (which is of course a slightly different English than Canadian English which is probably unavailable).

            I guess it's something like "language most unique to that country", no but that's not right either... I don't know.

      • ketralnis 12 years ago

        Unless you have different pricing per country or something orthogonal to language, I'm sure that a speaker of Canadian French can figure out that clicking the French flag may help them understand this page better. It's a common enough idiom on the web.

        • _delirium 12 years ago

          I think in the case of more than one country per language, you're right, just picking a big and/or well-known country as "representative" is fine: French flag for French, US or UK flag for English, German flag for German.

          The bigger problem is the other situation, of more than one language per country. India has ~13 languages with >10m native speakers, and using the Indian flag for all of them would be pretty confusing. You could pick state flags (e.g. the flag of Gujarat to represent Gujarati), but that can be a politically tricky issue. In some cases choosing a representative flag for a language has even stronger political overtones, like using the flag of the Kurdistan independence movement to represent the Kurdish language. Plus, it's not always that clear which flag to pick, and user recognition may not be as high as in the French-flag-for-French case.

raving-richard 12 years ago

Google is^H^Hwas* really bloody annoying when it comes to this. English (en-us and en) is the only language in my accept header. When I lived in Geneva though, Google always used to serve me pages in German (presumably Swiss-German). Gee, that's logical. (Geneva is a mainly French speaking city, though over 40% of the population is non-Swiss.)

Where I live now is another French speaking area. I just checked and it seems they are no longer serving French pages to me. But they were even just one year ago. (I don't use Google by default, so I don't know when they changed.)

Admittedly, that was an issue with geo-detecting rather than the website having bad language detection.

* They seemed to have stopped.

Air France is (though they have many faults) actually alright at detecting my language. And mostly gives me English pages...

  • tazjin 12 years ago

    My accept-language header only has en_GB and en in it. Google still randomly serves me pages in Swedish and German (which are both languages I speak, but which I both explicitly disabled in my Google account settings).

    The best case of this was when they launched the preview for the new Google Maps version - there was a landing page with some information and a button in the middle. This page was served to me in three languages at the same time (the header, the button and the info text) - presumably served by different internal components that all handle languages differently.

  • ronaldx 12 years ago

    Worse: Google went through a period of normally serving me French-language pages.

    I'm not in a French-speaking country. I don't have French in my accept header. I never expressed any preference towards the French language.

    But, my ISP was Orange (France Telecom) and I had a variable IP from them.

  • bhrgunatha 12 years ago

    I always have to open google.com/ncr in a separate tab which sets a session cookie (I don't accept permanent cookies from google.) I guess they've changed their logic for some places, just not where I am :(

  • mrweasel 12 years ago

    Google has become worse when it comes to language detection. I often get Brazilian Portuguese on Google Analytics, even though I'm in Denmark. I believe one of my co-workers often got Russian.

  • mschuster91 12 years ago

    IIRC google also "detects" language based on DNS geolocation, i.e. doing a dig google.com may reveal different IP addresses in every country (depending on the language).

    There are some IP addresses which, when viewed "raw" like http://aaa.bbb.ccc.ddd/ will return a localized Google.

darklajid 12 years ago

My accept header only contains en-US and en. I tend to get served German content (and Google's especially bad about this).

Please, I hope someone hears your complaints and starts fixing things. That issue is highly annoying.

  • masklinn 12 years ago

    > and Google's especially bad about this

    Google is a Royal Pain in the Ass on this point. They completely disregard any request configuration and decide on output language based on IP geolocation (which is pretty much always Not What I Want, even more so in multilingual countries such as Belgium or Switzerland[0]), then Chrome "helpfully" suggests translating documents.

    [0] where it won't even send you something matching your actual geographical location's language, usually sending the country's most common language — dutch in Belgium and german in Switzerland

    • tonfa 12 years ago

      I live in Switzerland and google does follow my accept-languages (en-US,en;q=0.8,fr;q=0.6,de;q=0.4). When going to google.com in incognito I get google.ch in english which is what I asked.

      • _delirium 12 years ago

        I don't get that behavior in Firefox. I have 'en-US' and 'en' as my preferred languages (in that order), set via Preferences->Content->Languages. But when I go to google.com in incognito, I get google.dk in Danish.

        I guess English is preferred here commonly enough that it's at least listed as one of the two alternate google.dk languages in an easy-to-find place under the search box, along with Faroese. "Google.dk på: Føroyskt English". And if you click "English" it stays with it for the session. No luck if you wanted something else like German, though.

        • tonfa 12 years ago

          Ok, I think the trick is to have at least one more in addition to en-US+en (I have a couple more).

    • nolok 12 years ago

      This gets very fun once you start using a vpn from another country.

      • woodson 12 years ago

        Indeed, and sometimes strange things happen:

        ssh tunnel from Australia to server located in California. Geo IP tool [1] reports it's in Fremont, CA. But Google assigns Taiwanese locale..

        Outdated/erroneous geolocation database? or did I take a wrong turn somewhere ;-)

        [1] http://www.geoiptool.com

  • ZirconCode 12 years ago

    This. I travel a lot, I've had this happen too often. Sometimes they even lock payment methods based on where you are, it's horrible.

jtokoph 12 years ago

> 1. the default quality value is being parsed wrong, and English is being assigned q=0 instead of q=1 or

> 2. en-US doesn't match en and is being bypassed

Or: 3. They are simply checking your IP address and not looking at your header at all.

  • bhrgunatha 12 years ago

    This is what so many websites do now. It causes me constant aggravation. It's nothing to do with your browser settings. They infer your language from your IP address, and for most cases that IS the right thing to do. However for me it isn't. I really really wish there was a way to configure your browser to force websites to accept your language settings.

    The only other option is to enable cookies so that the website language choice is saved - which also invites countless tracking cookies which I do NOT want.

    Your web site does NOT know better than me which language I want to read.

    • hrktb 12 years ago

      Very minor point: the right thing to do is to not infer the language, especially not from the IP, if no explicit information is available. You should make the user choose or take him/her to the default language (there should be an obvious way to change the language from there anyway).

      If you have multiple languages, hopefully you already have a scheme to differentiate the language (e.g. wikipedia has the language in the URL). If the user went to a specific language URL you should ignore the other settings.

      If he/she didn't go directly to a specific language, it's fair to assume he/she is in a non standard situation or is OK with the defaults, and applying heuristics doesn't help.

      • userbinator 12 years ago

        Would assuming that the default language is English be valid? I know a large percentage of the Internet probably doesn't "know English", but if they can connect to the Internet, would they at least recognise enough words (like "language") that they can choose a different language?

        • hrktb 12 years ago

          I think it could be anything making sense (the default for a german company could be german for instance) as long as switching away is smooth and discoverable. People not familiar with the english alphabet for instance could be lost in the site, getting overwhelmed by the unknown information, even if they know the word "english" or "language". For people in that situation, the page could be in french and it wouldn't make much difference.

          As a visual marker for language switching I imagined having a flag, but looking at the replies, that seems non optimal.

          The best behavior could be a popup shown only to users whose accepted languages don't match the current language, keeping the choice in a cookie perhaps?

        • kiiski 12 years ago

          What makes you think that? If their computer came with an OS preinstalled, with Mongolian, for example, as the language, they would not have to ever see any English anywhere. A flag might be a more universally recognized way of selecting the right language.

    • wisty 12 years ago

      > They infer your language from your IP address, and for most cases that IS the right thing to do.

      If IP and accept-language don't match, why not make a prominent button (in the language they didn't pick) to allow you to quickly change?

      • pixelcort 12 years ago

        This is the best option.

        "It looks like you are in Japan, but your computer's language is set to English. Which language would you prefer? [English][日本語]"
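
        A small Python sketch of when such a prompt would be shown. It assumes the IP-based guess has already been turned into a language tag by some geolocation lookup that is not shown here, and the function names are hypothetical.

```python
def primary(tag):
    """Primary language subtag, e.g. "en" for "en-US"."""
    return tag.split("-")[0].lower()


def preferred_from_header(accept_language):
    """Return the highest-weighted tag in the header, or None if there is none."""
    best_tag, best_q = None, 0.0
    for part in accept_language.split(","):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip()
        q = 1.0  # no explicit weight means q=1
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0
        # Strict ">" keeps the earliest tag when weights tie.
        if tag and tag != "*" and q > best_q:
            best_tag, best_q = tag, q
    return best_tag


def language_choices(geo_language, accept_language):
    """Return the two languages to offer as buttons, or None if no prompt is needed."""
    header_language = preferred_from_header(accept_language)
    if header_language is None:
        return None
    if primary(header_language) == primary(geo_language):
        return None
    return [header_language, geo_language]


# Visitor with an English browser and an IP address that maps to Japanese.
print(language_choices("ja", "en-US,en;q=0.8"))
```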

    • epsylon 12 years ago

      > and for most cases that IS the right thing to do.

      The problem is that for a large minority of people this is absolutely catastrophic. Think of the Western business traveller going to Japan or China...

      • bhrgunatha 12 years ago

        You're preaching to the choir, I'm in the suffering minority; that is my problem exactly. I was just pre-empting the usual replies. Every time the subject comes up, people always respond with "Most people can't configure their browsers correctly" and "these websites do extensive testing and for most visitors they are right".

      • seszett 12 years ago

        Not just travelers. What content do you serve to people with an IP from Belgium? Switzerland? Canada?

  • ZoF 12 years ago

    His location is listed as California, so that wouldn't make much sense.

    The current top poster actually figured out the exact issue in this case.

junto 12 years ago

I have blogged about this several times. Google are one of the worst offenders. I'm not sure if it is insular non-travelled US developers with a deep love of IP-to-geolocation databases, or an anally retentive legal department, but it really sucks as a user experience.

From an advertising perspective this is a major market that is being overlooked, because guess what, I don't look at ads that much, but you can bet your bottom advertising dollar that I'm definitely not going to read it in a language that isn't my mother tongue.

IP address != language preference

It is about time that developers got that through their thick skulls.

Finally, over here in Europe we can live in whichever EU country we want to. This means that we can move countries easily. I've already been in four of them. I don't think I'm an edge case by any means. People migrate.

midas007 12 years ago

I'm working on locale stuff for a Rails app right now (just updated the i18n_data in fact).

The assumption will be that country is mostly orthogonal to language b/c people are übermobile. Further, that the dialect of the language should not force assumptions of other preferences... only autodiscover initial settings as close to desired as possible. (Fuck, why isn't there a standard for this common, hard-to-manage shit the OS already knows.)

i18n is taking up tons of time to get (mostly) right, but I believe it's one of those things not to botch because it's such a huge signal to everything else about your app.

If I want to be the most obscure hipster paying in Lesotho Loti, read Catalonian, have a "," for thousands separator and use UTC tz, by Flying Spaghetti Monster that's what it's gonna allow.

  • steveklabnik 12 years ago
    • midas007 12 years ago

      Interesting, thanks.

      Current Gemfile:

        # ...
      
        # i18n
        gem 'rails_locale_detection' # consider locale_setter
        gem 'rails-i18n', github: 'steakknife/rails-i18n'
        gem 'i18n_data', github: 'steakknife/i18n_data'
        gem 'countries_and_languages', require: 'countries_and_languages/rails'
        gem 'country_select' # for simple_form
      
        # tz
        gem 'tzinfo-data', '>= 1.2014.1'
        gem 'tzinfo'
      
        # symbols and images
        gem 'svg-flags-rails'
      
        # idn
        gem 'resolv-idn'   # resolv unicode patch
        gem 'idn-ruby'     # unicode IDNA domain resolution
        # ...
      
        # ...

pytrin 12 years ago

Those sites are not relying on accept-language, but rather on IP geolocation to select the default language. I sometimes use a non-US proxy when I'm feeling vigilant, and Google always uses the IP of the proxy to determine what language to serve me (even though my browser accept-language hasn't changed).

nemetroid 12 years ago

I'm running an English version of Windows 7 in Sweden. The accept-language headers in each of my installed browsers are:

* IE9: sv-SE

* Firefox: en,sv;q=0.5

* Chrome: en-US,en;q=0.8,sv;q=0.6

I'm going to go ahead and suggest that the reason English comes before Swedish is due to my system language, and that Swedish otherwise would come first. The "users will have the wrong settings" argument seems moot to me.

  • pornel 12 years ago

    I know people who have their OS in English even though they're not fluent in English - they can't be bothered to reinstall OS that came with the laptop (or cracked torrent) and understanding a few words like "ok/cancel" is enough.

gioele 12 years ago

To avoid such problems in Rack/Rails Ruby projects I suggest the rack-i18n_best_langs gem I wrote (despite the name, it does not depend on the i18n gem):

    https://github.com/gioele/rack-i18n_best_langs

> Differently from other similar Rack middleware components, rack-i18n_best_langs returns a list of languages in order of guessed importance, not a single language.

> Language discovery is done using three clues:

> * the presences of language tags in paths (e.g. /service/warranty/ita),

> * the content of the HTTP Accept-Language header,

> * the content of the rack.i18n_best_langs cookie when set.
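
To illustrate the idea of returning an ordered list rather than a single language, here is a language-neutral sketch in Python. It is not the gem's API or its actual weighting: the ordering (path first, then cookie, then header), the comma-separated cookie format, and the three-letter code table are all assumptions made for the example.

```python
# Hypothetical mapping from three-letter path codes to language tags.
PATH_CODES = {"ita": "it", "eng": "en", "fra": "fr"}


def best_languages(path, accept_language="", cookie=""):
    """Merge three clues into one list of tags, most likely language first."""
    ordered = []

    def add(tag):
        tag = tag.strip().lower()
        if tag and tag != "*" and tag not in ordered:
            ordered.append(tag)

    # Clue 1: a language code as the last path segment, e.g. /service/warranty/ita
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    if last_segment in PATH_CODES:
        add(PATH_CODES[last_segment])

    # Clue 2: languages remembered in a cookie (assumed comma-separated).
    for tag in cookie.split(","):
        add(tag)

    # Clue 3: the Accept-Language header, ordered by its q weights.
    weighted = []
    for part in accept_language.split(","):
        tag, _, params = part.strip().partition(";")
        q = 1.0
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0
        if q > 0:
            weighted.append((tag, q))
    for tag, _ in sorted(weighted, key=lambda item: item[1], reverse=True):
        add(tag)

    return ordered


print(best_languages("/service/warranty/ita", "en-US,en;q=0.8,it;q=0.5"))
```

A caller can then walk the list and serve the first language it actually has, instead of failing when the single "best" guess is unavailable.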

Aldo_MX 12 years ago

Unrelated to the accept-language issue, but somehow related:

I needed to create a yahoo account, and I registered it selecting the kimo.com domain (kimo.com is a Chinese domain owned by yahoo). Since the first moment I set my language preferences to English.

No matter which yahoo service I'm visiting, I always get welcomed by at least the login prompt in Chinese, I can't really complain, because I was the one who looked for a rare domain, but it's an annoyance for me, because yahoo assumes that I understand Chinese because of the domain.

nraynaud 12 years ago

There is also the problem of getting the original content: I speak 3 languages, and I intend to read the original source if it's in one of those languages. I don't want unpaid-intern translation.

MS C# documentation deserves a special kind of hell, because they detected I reverted the language to English, and now they present me a special translation mode of their freaking doc where there are huge tooltip texts everywhere.

  • Too 12 years ago

    I don't know when, might be due to the language of your Windows installation, but sometimes in .NET they even translate exceptions and other error codes to your local language making it impossible to use google for troubleshooting.

    • nraynaud 12 years ago

      yeah, translated error messages are a pain to google. sometimes we can get away with error codes.

      (I'm using mono on a mac, and I'm not really doing important stuff).

pornel 12 years ago

Sorry, when I implement language negotiation I interpret "en-US" as lack of preference.

The problem is that en-US is the default and I can't tell difference between user not setting language and user choosing en-US.

Add "en" or even "en-GB" to your Accept-Language header.

  • Eiwatah4 12 years ago

    It isn't the default for most people. Download a browser and OS localized to German, French, or British English and Accept-Language defaults to that instead of "en-US".

petercooper 12 years ago

accept-language is like any of 1000 other idealistic parts of Internet specs that has good intentions but is so poorly used (or misused) that almost no-one implements it correctly, instead simply doing the simplest thing that works best for 99% of the audience.

raverbashing 12 years ago

If I access Google using a European IP it will show the Google page for the country I am in, regardless of my accept-language.

(I don't have an accept-language for Dutch, Italian, German, French but in all these cases I was shown the local page)
