FormatJS – Internationalize your web apps on the client and server

formatjs.io

181 points by juandopazo 11 years ago · 38 comments

cj 11 years ago

Shameless plug: I've been working on a similar project for about a year now called Localize.js (https://localizejs.com), with the goal of automating the entire process of internationalization / localization. It works by scanning the DOM for translatable content, and injecting translations on-the-fly after the page loads (this happens so quickly that the user never sees the text in the original language).

  • nawitus 11 years ago

    >It works by scanning the DOM for translatable content, and injecting translations on-the-fly after the page loads

    I don't like this approach because it makes the framework/library harder to integrate with other DOM-modifying frameworks, such as the data binding frameworks that are very popular these days. A more modular approach, in my opinion, would be to simply provide a function (or functions) to do the localization conversions. That can easily be integrated into any data binding framework.

    >(this happens so quickly that the user never sees the text in the original language)

    And if you use it with something like AngularJS, the end result is visual flicker after DOM changes..

    • cj 11 years ago

      Localize.js is fully compatible with all DOM-modifying frameworks (Backbone, Angular, etc.). Localize.js doesn't actually replace existing DOM elements; rather, it simply changes the existing elements' contents so as not to interfere with bindings.

      We've also spent a ton of time making sure there's zero visual flicker as the DOM changes take place. We have a bunch of companies using it to translate Angular and Backbone apps (for example, http://venzee.com/ and https://www.verbling.com).

      • nawitus 11 years ago

        But when the contents change, how does Localize.js know to translate the DOM again? If you need to call something like 'Localize.translatePage', then you need to track the changes yourself, which is not the correct way. I noticed that one can bypass the DOM modification by calling Localize.translate to translate text directly, which is what I would do with AngularJS. I'd just write a simple directive that uses the Localize.js function to translate text.

        • cj 11 years ago

          We use MutationObserver (https://developer.mozilla.org/en-US/docs/Web/API/MutationObs...) to detect when content on the page changes, so when you add (or change) a <div> on your page, we're able to immediately translate the new content, on-the-fly, as it's being inserted into the DOM.

          Our goal with Localize.js is to make everything completely "plug and play", and require as little extra development work as possible. We're hoping this will make localization more accessible to startups / companies who don't have weeks or months to spend manually internationalizing their application.
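
A minimal sketch of the MutationObserver approach described above. It is not Localize.js's actual implementation: the `translations` dictionary and the `translateText`, `translateSubtree`, and `watchAndTranslate` functions are hypothetical names, and the observer is only started where a DOM exists.

```javascript
// Hypothetical source-text -> translation dictionary (not Localize.js data).
const translations = new Map([
  ["Hello", "Bonjour"],
  ["Add to cart", "Ajouter au panier"],
]);

// Pure lookup: translate known text, keep surrounding whitespace, and return
// unknown (or already translated) text unchanged.
function translateText(text) {
  const key = text.trim();
  if (!translations.has(key)) return text;
  return text.replace(key, () => translations.get(key));
}

// Rewrite text nodes in place, so existing elements (and any framework
// bindings attached to them) are never replaced.
function translateSubtree(root) {
  if (root.nodeType === 3) { // 3 === Node.TEXT_NODE
    const translated = translateText(root.nodeValue);
    if (translated !== root.nodeValue) root.nodeValue = translated;
    return;
  }
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  for (let node = walker.nextNode(); node; node = walker.nextNode()) {
    translateSubtree(node);
  }
}

// Translate what is already on the page, then react to later changes.
function watchAndTranslate(root) {
  translateSubtree(root);
  const observer = new MutationObserver((mutations) => {
    for (const mutation of mutations) {
      if (mutation.type === "characterData") translateSubtree(mutation.target);
      for (const added of mutation.addedNodes) translateSubtree(added);
    }
  });
  observer.observe(root, { childList: true, subtree: true, characterData: true });
  return observer;
}

// Only start observing in a browser-like environment.
if (typeof document !== "undefined" && typeof MutationObserver !== "undefined") {
  watchAndTranslate(document.body);
}
```

Because `translateText` leaves already translated text alone, the observer's own writes do not trigger another rewrite, as long as no translation is itself a key in the dictionary.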

          • schrodinger 11 years ago

            At first I was turned off by your approach, but the more I hear your points and think about it the more I like it.

            I wouldn't use it for greenfield development, but it's definitely a great option for websites that wouldn't otherwise be localized. It's great for people to have a near-zero-development option to translate existing sites.

            I'd certainly rather have websites translated using this approach than not translated at all!

          • uptown 11 years ago

            Could implementing things this way cause a problem if another script also post-processed your pages? Could you wind up in a loop where each script kept modifying the page in response to the changes made by the other script?

            • cj 11 years ago

              Technically, it's possible. To get stuck in a loop, you'd have to have another script that uses MutationObserver to reverse the changes that Localize.js makes to the DOM. We haven't run into this yet, but I'll see if there's a way to safeguard against it.
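
One possible safeguard, sketched under assumptions: cap how many times the same node gets rewritten, so two scripts undoing each other eventually stop. Nothing here is known to be what Localize.js does; `guardedWrite` and `MAX_REWRITES_PER_NODE` are hypothetical.

```javascript
// Arbitrary, assumed limit on rewrites of a single node.
const MAX_REWRITES_PER_NODE = 3;

// Tracks rewrites per node without keeping removed nodes alive.
const rewriteCounts = new WeakMap();

// Writes `newText` into `node.nodeValue` unless the node already has that
// text or has been rewritten too often. Returns true if a write happened.
function guardedWrite(node, newText) {
  if (node.nodeValue === newText) return false; // nothing to do
  const count = rewriteCounts.get(node) || 0;
  if (count >= MAX_REWRITES_PER_NODE) return false; // something keeps undoing us: give up
  rewriteCounts.set(node, count + 1);
  node.nodeValue = newText;
  return true;
}
```

The trade-off is that a node legitimately updated many times by the application would also stop being translated, so a real implementation would likely reset the counter over time.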

          • ttty 11 years ago

            Please stop.. Don't do things this way. This is like jquery monkey patch hell..

mikewhy 11 years ago

Are there any advantages over i18n-js[1]? Can't say I'm a huge fan of this method of pluralization:

    Cart: {itemCount} {itemCount, plural,
        one {item}
        other {items}
    }

[1]: https://github.com/fnando/i18n-js

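
For comparison, a rough sketch of what the `plural` part of that message asks the formatter to do, written with the standard `Intl.PluralRules` API. The `cartLabel` helper is hypothetical and is not part of FormatJS or i18n-js.

```javascript
// Hypothetical helper: pick the plural branch for a count, roughly what
// `{itemCount, plural, one {item} other {items}}` expresses declaratively.
function cartLabel(itemCount, locale = "en") {
  const forms = { one: "item", other: "items" };
  const category = new Intl.PluralRules(locale).select(itemCount);
  const noun = forms[category] || forms.other;
  return `Cart: ${itemCount} ${noun}`;
}

console.log(cartLabel(1)); // Cart: 1 item
console.log(cartLabel(3)); // Cart: 3 items
```
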
  • drewfish 11 years ago

    Yeah, for a simple plural that can be a bit longer. In other languages, though, the pluralization rules get rather complicated[1]. (For example, Arabic has both complicated pluralization rules -and- a lot of people who speak it.)

    The strength of the ICU message format, in my mind, is that the messages can be "nested" so that the translation can be customized for multiple concerns (plural, gender, whatever).

    Also, with the integrations (dust, handlebars, react) the details of translation and display of data lives in the message format and/or template. This is the "view layer", and means that your controller/code isn't littered with a bunch of calls to a translation library.

    [1] http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/la...
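
To make the Arabic point concrete, here is a small sketch that asks the runtime's CLDR data which plural category a few counts fall into. It assumes a JavaScript engine with `Intl.PluralRules` and Arabic locale data; `pluralCategories` is a hypothetical helper.

```javascript
// Hypothetical helper: list "count:category" pairs for a locale.
function pluralCategories(locale, counts) {
  const rules = new Intl.PluralRules(locale);
  return counts.map((n) => `${n}:${rules.select(n)}`);
}

const samples = [0, 1, 2, 3, 11, 100];

// English only distinguishes "one" from "other".
console.log(pluralCategories("en", samples).join(", "));

// Arabic uses six categories: zero, one, two, few, many, other.
console.log(pluralCategories("ar", samples).join(", "));
```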

    • mikewhy 11 years ago

      Nice, I didn't know that about Arabic, and the many other languages.

      Though i18n-js does let you write your own pluralization rules (taken from the readme), while supporting zero/one/many out of the box:

          I18n.pluralization["ru"] = function (count) {
            var mod10 = count % 10, mod100 = count % 100;
            var key = mod10 === 1 && mod100 !== 11 ? "one"
              : [2, 3, 4].indexOf(mod10) >= 0 && [12, 13, 14].indexOf(mod100) < 0 ? "few"
              : mod10 === 0 || [5, 6, 7, 8, 9].indexOf(mod10) >= 0 || [11, 12, 13, 14].indexOf(mod100) >= 0 ? "many"
              : "other";
            return [key];
          };
      
      I've posted an example below, but I don't consider `@div null, @t('welcomeMessage', { username })` "littering" my code.

      • drewfish 11 years ago

        There are a few NPM libraries (make-plural, cldr, probably others) which will help you write those pluralization functions. The CLDR data does get updated from time to time, so it's nice to rely on another package to trace those changes.

  • caridy 11 years ago

    I haven't looked into the i18n-js library in detail, but this is what I can spot so far:

    * The message format in i18n-js seems to be compatible with ICU message syntax, the industry standard used in other programming languages and the one used by FormatJS as well. We will have to check whether they really implemented the whole spec, which allows more advanced messages, e.g.:

        Cart: {itemCount, plural, =0 {no items} one {one item} other {# items} }

    including the fact that itemCount in the `other` option will be formatted as a number: "1,030" in EN vs. "1 030" in FR.

    * i18n-js is a JS library, which means you have to do the formatting in your JS code and then pass the formatted data into the template engine, where you have placeholders for it. FormatJS focuses more on a high-level declarative form that you can use directly in your templates, which makes things simpler. If you use Handlebars, you could write {{formatMessage "Cart" itemCount=numItems}} right in your template.
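
As a rough illustration of what that Cart message asks a formatter to do, here is a hand-rolled sketch using the standard `Intl.PluralRules` and `Intl.NumberFormat` APIs. It is not how FormatJS is implemented; `formatCart` and the `branches` object are hypothetical, and the branch text stays in English for every locale purely to show the number formatting.

```javascript
// Hypothetical equivalent of:
//   Cart: {itemCount, plural, =0 {no items} one {one item} other {# items}}
function formatCart(itemCount, locale) {
  const branches = { "=0": "no items", one: "one item", other: "# items" };

  // Exact matches such as "=0" win over plural categories.
  const exact = branches[`=${itemCount}`];
  const category = new Intl.PluralRules(locale).select(itemCount);
  const template = exact || branches[category] || branches.other;

  // "#" stands for the count, formatted as a number for the locale.
  const number = new Intl.NumberFormat(locale).format(itemCount);
  return `Cart: ${template.replace("#", () => number)}`;
}

console.log(formatCart(0, "en"));    // Cart: no items
console.log(formatCart(1, "en"));    // Cart: one item
console.log(formatCart(1030, "en")); // Cart: 1,030 items
// In French the digits are grouped with a space-like separator; the exact
// character depends on the locale data shipped with the runtime.
console.log(formatCart(1030, "fr"));
```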

    • mikewhy 11 years ago

      > i18n-js is a js library, which means you have to do the formatting in your js code, then passing the formatted data into the template engine where you have the placeholders for them

      Not sure I follow, taken from a React component:

          Component = React.createClass
            render: ->
              @div null,
                @t('welcomeMessage', username: @props.CurrentUser.displayName)

      • ougawouga 11 years ago

        I have nothing against CoffeeScript, and use it myself sometimes. But please don't respond to something about "js" and then put down a single CoffeeScript snippet with no additional context. It's very confusing.

        I had to re-read the last line several times before I looked up at the function arrow and realized it was CoffeeScript, and that the "@" signs were not part of the i18n library, but rather syntactic sugar for "this".

MaBu 11 years ago

How does it compare to normal "Gettext workflow"?

Currently I use i18next with custom functions, where I just write strings like _("car") and ngettext("window", "windows", number) in files. I can use translator comments, context, and everything else (like //TRANSLATORS: this is used in this way, etc.): https://www.gnu.org/software/gettext/manual/html_node/PO-Fil... Babel extracts all translatable strings to a POT file. I translate the POT file into PO files for the languages I want with the GUI tool of my choice. Each PO file is then converted to JSON and used for translations with the i18next library. When new strings are added, I just rerun Babel: the new strings are added, existing ones are retranslated if needed, and everything is converted to JSON again. I looked into a lot of JS libraries and extractors, and this was the only combination that supported plurals, context, translator comments, etc.

I looked into Mozilla's L20n, which seems nice, but there is no GUI translation editor, and you have to find all translatable strings yourself. It seems to be the same here.

One better thing is that with FormatJS I wouldn't need moment.js for date localization.
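
A minimal sketch of the runtime end of the gettext-style workflow described above, assuming the PO-to-JSON step produced a catalog like the one below. The catalog shape and the `_` and `ngettext` implementations are hypothetical; this is not i18next's API or file format.

```javascript
// Hypothetical catalog, shaped like something a PO-to-JSON step might emit.
const catalog = {
  locale: "fr",
  messages: {
    car: "voiture",
    window: { one: "fenêtre", other: "fenêtres" },
  },
};

// Look up a plain string; fall back to the source string if untranslated.
function _(msgid) {
  const entry = catalog.messages[msgid];
  return typeof entry === "string" ? entry : msgid;
}

// Look up a plural entry using the catalog locale's plural rules.
function ngettext(singular, plural, n) {
  const entry = catalog.messages[singular];
  if (entry && typeof entry === "object") {
    const category = new Intl.PluralRules(catalog.locale).select(n);
    return entry[category] || entry.other;
  }
  return n === 1 ? singular : plural; // untranslated: use the English rule
}

console.log(_("car"));                          // voiture
console.log(ngettext("window", "windows", 2));  // fenêtres
console.log(ngettext("door", "doors", 2));      // doors (no translation yet)
```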

  • caridy 11 years ago

    The workflow is, for now, out of the scope of this project. We assume developers will figure out how to produce a JavaScript object that contains key=value pairs, where each value is a message written in ICU message syntax, and where the values are fed into the template engine for helpers/methods to use.

    Internally at Yahoo (just like Facebook and other big companies), we have an infrastructure for translation that works from a source file written in English by developers, and the whole thing just works. But we have no plans to open up any of that. We believe such a system will grow from the community once people realize that ICU is good enough to internationalize their apps.

    As for moment.js, you're right, if you will never need to parse a date, or massage a date value, and the only thing you care about is to format a timestamp that is coming from an API, then `formatRelative` helper should be good enough.
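
For the format-only case mentioned above, a sketch using the standard `Intl.RelativeTimeFormat` API as a stand-in. This is not the FormatJS `formatRelative` helper; the `relativeTime` function and its unit thresholds are assumptions made for illustration.

```javascript
// Hypothetical helper: describe `timestamp` relative to `now` (both in ms).
function relativeTime(timestamp, now = Date.now(), locale = "en") {
  // Approximate unit lengths; months and years are rounded for simplicity.
  const units = [
    ["year", 365 * 24 * 60 * 60 * 1000],
    ["month", 30 * 24 * 60 * 60 * 1000],
    ["day", 24 * 60 * 60 * 1000],
    ["hour", 60 * 60 * 1000],
    ["minute", 60 * 1000],
    ["second", 1000],
  ];
  const diff = timestamp - now;
  const formatter = new Intl.RelativeTimeFormat(locale, { numeric: "auto" });
  // Use the largest unit that fits; seconds are the fallback.
  for (const [unit, ms] of units) {
    if (Math.abs(diff) >= ms || unit === "second") {
      return formatter.format(Math.round(diff / ms), unit);
    }
  }
}

const reference = Date.UTC(2014, 9, 14, 12, 0, 0);
console.log(relativeTime(reference - 3 * 60 * 60 * 1000, reference));  // 3 hours ago
console.log(relativeTime(reference - 24 * 60 * 60 * 1000, reference)); // yesterday
console.log(relativeTime(reference + 5 * 60 * 1000, reference));       // in 5 minutes
```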

gear54rus 11 years ago

A related thought: I wonder if we will some day live in a world where translation is not required. Where everyone knows English and has no trouble using English tools and consuming English content.

Same goes for measurement systems (metric), time, currency, and other formats. I reckon it would simplify our lives greatly and spare us the trouble of dealing with 1000s of encodings, multi-byte strings and different text directions.

Technological landscape is the only place where such unity between nations is possible, I tend to think that this is what should be pursued instead of translate-everything-everywhere approach.

  • sambeau 11 years ago

    What makes you think it won't be Chinese?

    • gear54rus 11 years ago

      The fact that English is easier to learn :)

      • Ralfp 11 years ago

        It's only a "fact" because you are accustomed to the Latin alphabet and you are a native speaker of a language that was either Latin-influenced or shares a root with English, or you were taught it since you were a child.

        The last one is actually sometimes called "little Chinese boy" in circles of language tutors, because young children can easily pick up languages different from their native one thanks to their lack of bias. My cousin, who is 5 years old now, is already bilingual because her parents exposed her to their native languages at all times.

        • gear54rus 11 years ago

          My original point also has a lot to do with single-byte encoding(s), as you may have noticed. Will we really want to change ASCII to encode something different than it does now? Will we throw away all our programming languages as well? I don't think so.

      • kiiski 11 years ago

        I doubt ease of learning really has anything to do with the current popularity of English. As soon as the USA stops being the dominant power in the world, English will start losing ground as well.

        edit: Looking at the EF English Proficiency Index[1], English doesn't seem quite as universal as it feels like in the Anglosphere anyway.

        [1]: http://en.wikipedia.org/wiki/EF_English_Proficiency_Index

caridy 11 years ago

The official announcement is now live:

http://yahooeng.tumblr.com/post/100006468771/announcing-form...

Hopefully it will help clarify a few of the questions...

  • BruceM 11 years ago

    Any thoughts on how hard this might be to integrate with Polymer (or web components in general)?

alexchamberlain 11 years ago

> 10/14/2014 English (US)
> 14/10/2014 French

Pedantic Englishman here: 14/10/2014 is used all over Europe, including England. If only we could persuade the world to use an international format: 2014-10-14, for example.

  • pimlottc 11 years ago

    I like using "14 Oct 2014" to avoid any ambiguity. 2014-10-14 is good too but it comes off a bit technical, like dropping scientific notation in an everyday message.
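
The formats discussed in this subthread can be reproduced with the standard `Intl.DateTimeFormat` API. A small sketch follows; `formatDate` is a hypothetical wrapper, and the time zone is pinned to UTC so the output does not depend on the machine's settings.

```javascript
// 14 October 2014, midnight UTC (months are 0-based).
const sampleDate = new Date(Date.UTC(2014, 9, 14));

// Hypothetical wrapper around Intl.DateTimeFormat with a fixed time zone.
function formatDate(locale, options = {}) {
  return new Intl.DateTimeFormat(locale, { timeZone: "UTC", ...options }).format(sampleDate);
}

console.log(formatDate("en-US")); // 10/14/2014
console.log(formatDate("en-GB")); // 14/10/2014
console.log(formatDate("fr-FR")); // 14/10/2014

// Spelling out the month avoids the day/month ambiguity.
console.log(formatDate("en-GB", { day: "numeric", month: "short", year: "numeric" })); // 14 Oct 2014

// ISO 8601 calendar date, independent of locale.
console.log(sampleDate.toISOString().slice(0, 10)); // 2014-10-14
```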

nawitus 11 years ago

I'm currently using Moment.js, i18next and Numeral.js with AngularJS. I wonder how FormatJS compares with this. At least one benefit FormatJS could have is having a unified collection of translation files.

  • caridy 11 years ago

    The main benefit of FormatJS is that it offers a declarative syntax at the template level, which simplifies things drastically. We don't have an integration for AngularJS or Ember just yet, but we are planning to add one very soon.

macca321 11 years ago

I would be interested to know if it offers anything over Globalize

https://github.com/jquery/globalize

  • drewfish 11 years ago

    Hmmm... after very quickly looking at Globalize, I'd say there are two things about formatjs.io that I see as main differences:

    * Integrations with Handlebars, Dust, and React hopefully make formatjs.io easy to use (since people are already using one of these).

    * Focus on the ICU message format, which is fairly simple yet fairly expressive. (Professional translators should hopefully be familiar with this syntax, and it's actually fairly straightforward for us engineers to use.)

    One thing that looks interesting (to me) about Globalize is the way the latest/freshest CLDR data is loaded.

mrmch 11 years ago

Super shameless plug, but easy internationalization for email is something we've added to Sendwithus. We're working with multiple partners on it (sample at https://www.sendwithus.com/translations), but are still looking for more beta users of the feature.

grakic 11 years ago

This may be offtopic, but is there an editor for translating ICU messageformat strings?

There are many tools to extract gettext strings from source and editors to translate PO catalogs, but I do not know of any that works with ICU messageformat syntax.

joshdance 11 years ago

Looks great. One thing that wasn't clear, can you provide your own translations or is it all machine translation?

  • juandopazo (OP) 11 years ago

    Hi! There are helpers for dealing with dynamic content like numbers and dates, but also there's message formatting for when you have your own translations like "I have {numCats, number} cats.". Here's the explanation for the Handlebars integration: http://formatjs.io/guide/#messageformat-syntax.
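
To show the idea behind a message like "I have {numCats, number} cats.", here is a deliberately tiny sketch that only handles `{name}` and `{name, number}` placeholders. It is not the FormatJS parser; `formatSimple` is a hypothetical function and ignores plural, select, and nesting.

```javascript
// Hypothetical mini-formatter: supports {name} and {name, number} only.
function formatSimple(message, values, locale) {
  return message.replace(/\{\s*(\w+)\s*(?:,\s*(\w+)\s*)?\}/g, (match, name, type) => {
    if (!(name in values)) return match; // leave unknown placeholders untouched
    const value = values[name];
    return type === "number"
      ? new Intl.NumberFormat(locale).format(value)
      : String(value);
  });
}

// The message string is what a translator would supply for each locale.
console.log(formatSimple("I have {numCats, number} cats.", { numCats: 1200 }, "en-US"));
// I have 1,200 cats.
```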

  • nols 11 years ago

    I don't think it translates any content, it just internationalizes the code so you can utilize separate translations (and numbering format, etc).

  • drewfish 11 years ago

    You have to provide your own translations.
