Invenio NATIVE LANGUAGE SUPPORT
===============================

About
-----

This document describes the Native Language Support (NLS) in Invenio.

Native Language Support information for administrators
------------------------------------------------------

Invenio is currently available in the following languages::

   af = Afrikaans
   ar = Arabic
   bg = Bulgarian
   ca = Catalan
   cs = Czech
   de = German
   el = Greek
   en = English
   es = Spanish
   fa = Persian (Farsi)
   fr = French
   gl = Galician
   hr = Croatian
   hu = Hungarian
   it = Italian
   ja = Japanese
   ka = Georgian
   lt = Lithuanian
   no = Norwegian (Bokmål)
   pl = Polish
   pt = Portuguese
   ro = Romanian
   ru = Russian
   rw = Kinyarwanda
   sk = Slovak
   sv = Swedish
   uk = Ukrainian
   zh_CN = Chinese (China)
   zh_TW = Chinese (Taiwan)

If you are installing Invenio and you want to enable/disable some
languages, please just follow the standard installation procedure as
described in the INSTALL file.  The default language of the
installation as well as the list of all user-seen languages can be
selected in the general invenio.conf file, see variables CFG_SITE_LANG
and CFG_SITE_LANGS.

(Please note that some runtime Invenio daemons -- such as webcoll,
responsible for updating the collection cache, running every hour or
so -- may work twice as long when twice as many user-seen languages
are selected, because it creates collection cache page elements for
every user-seen language.  Therefore, if you have defined thousands of
collections and if you find the webcoll speed to be slow in your
setup, you may want to try to limit the list of selected languages.)

Native Language Support information for translators
---------------------------------------------------

If you want to contibute a translation to Invenio, then please follow
the procedure below:

 - Please check out the existence of po/LL.po file for your language,
    where LL stands for the ISO 639 language code (e.g. `el` for
    Greek).  If such a file exists, then this language is already
    supported, in which case you may want to review the existing
    translation (see below).  If the file does not exist yet, then you
    can create an empty one by copying the invenio.pot template file
    into LL.po that you can review as described in the next item.
    (Please note that you would have to translate some dynamic
    elements that are currently not located in the PO file, see the
    appendix A below.)

  - Please edit LL.po to review existing translation.  The PO file
    format is a standard GNU gettext one and so you can take advantage
    of dedicated editing modes of programs such as GNU Emacs, KBabel,
    or poEdit to edit it.  Pay special attention to strings marked as
    fuzzy and untranslated.  (E.g. in the Emacs PO mode, press `f` and
    `u` to find them.)  Do not forget to remove fuzzy marks for
    reviewed translations.  (E.g. in the Emacs PO mode, press `TAB` to
    remove fuzzy status of a string.)

  - After you are done with translations, please validate your file to
    make sure it does not contain formatting errors.  (E.g. in the
    Emacs PO mode, press `V` to validate the file.)

  - If you have access to a test installation of Invenio, you may want
    to see your modified PO file in action::

       $ cd po
       $ emacs ja.po                      # edit Japanese translation
       $ make update-gmo
       $ make install
       $ sudo apachectl restart
       $ firefox http://your.site/?ln=ja  # check it out in context

    If you do not have access to a test installation, please
    contribute your PO file to the developers team (see the next step)
    and we shall install it on a test site and contact you so that you
    will be able to check your translation in the global context of
    the application.

    (Note to developers: note that ``make update-gmo`` command may be
    necessary to run before ``make`` if the latter fails, even if you
    are not touching translation business at all.  The reason being
    that the gmo files are not stored in CVS, while they are included
    in the distribution tarball.  So, if you are building from CVS,
    and you do not have them in your tree, you may get build errors in
    directories like modules/webhelp/web/admin saying things like "No
    rule to make target `index.bg.html`".  The solution is to run
    ``make update-gmo`` to produce the gmo files before running
    ``make``.  End of note to developers.)

  - Please contribute your translation by emailing the file to
    <info@invenio-software.org>.  You help is greatly appreciated and
    will be properly credited in the THANKS file.

See also the GNU gettext manual, especially the chapters 5, 6 and 11.
<http://www.gnu.org/software/gettext/manual/html_chapter/gettext_toc.html>

Native Language Support information for programmers
---------------------------------------------------

Invenio uses standard GNU gettext I18N and L12N philosophy.

In Python programs, all output strings should be made translatable via
the _() convention::

   from messages import gettext_set_language
   [...]
   def square(x, ln=CFG_SITE_LANG):
       _ = gettext_set_language(ln)
       print _("Hello there!")
       print _("The square of %s is %s.") % (x, x*x)

In webdoc source files, the convention is _()_::

   _(Search Help)_

Here are some tips for writing easily translatable output messages:

  - Do not cut big phrases into several pieces, the meaning may be
    harder to grasp and to render properly in another language.  Leave
    them in the context.  Do not try to economize and reuse
    standalone-translated words as parts of bigger sentences.  The
    translation could differ due to gender, for example.  Rather
    define two sentences instead::

       not: _("This %s is not available.") % x,
            where x is either _("basket") or _("alert")

       but: _("This basket is not available.") and
            _("This alert is not available.")

  - If you print some value in a translatable phrase, you can use an
    unnamed %i or %s string replacement placeholders::

       yes: _("There are %i baskets.") % nb_baskets

    But, as soon as you are printing more than one value, you should
    use named string placeholders because in some languages the parts
    of the sentence may be reversed when translated::

       not: _("There are %i baskets shared by %i groups.") % \
               (nb_baskets, nb_groups)
       but: _("There are %(x_nb_baskets)s baskets shared by %(x_nb_groups)s groups.") % \
               {'x_nb_baskets': nb_baskets, 'x_nb_groups': nb_groups,}

    Please use the `x_` prefix for the named placeholder variables to
    ease the localization task of the translator.

  - Do not mix HTML presentation inside phrases. If you want to
    reserve space for HTML markup, please use generic replacement
    placeholders as prologue and epilogue::

       not: _("This is <b>cold</b>.")
       but: _("This is %(x_fmt_open)scold%(x_fmt_close)s.")

    Ditto for links::

       not: _("This is <a href="%s">homepage</a>.")
       not: _("This is %(x_url_open)shomepage%(x_url_close)s.")

  - Do not leave unnecessary things in short commonly used
    translatable expressions, such as extraneous spaces or colons
    before or after them.  Rather put them in the business logic::

       not: _(" subject")
       but: " " + _("subject")

       not: _("Record %i:")
       but: _("Record") + "%i:" % recID

    On the other hand, in long sentences when the trailing punctuation
    has its meaning as an integral part of the label to be shown on
    the interface, you should leave them::

       not: _("Nearest terms in any collection are")
       but: _("Nearest terms in any collection are:")

  - Last but not least: the best is to follow the style of existing
    messages as a model, so that the translators are presented with a
    homogeneous and consistently presented output phrase set.

Introducing a new language
--------------------------

If you are introducing a new language for the first time, then please
firstly create and edit the PO file as described above in Section 2.
This will make the largest portion of the translating work done, but
it is not fully enough, because we currently have also to translate
some dynamic elements that aren't located in PO files.

The development team can edit the respective files ourself, if the
translator sends over the following translations by email:

   - demo server name, from invenio.conf::

        Atlantis Institute of Fictive Science

   - demo collection names, from democfgdata.sql::

        Preprints
        Books
        Theses
        Reports
        Articles
        Pictures
        CERN Divisions
        CERN Experiments
        Theoretical Physics (TH)
        Experimental Physics (EP)
        Articles & Preprints
        Books & Reports
        Multimedia & Arts
        Poetry
        Atlantis Times News
        Atlantis Times Arts
        Atlantis Times Science
        Atlantis Times
        Atlantis Institute Books
        Atlantis Institute Articles
        Atlantis Times Drafts
        Notes
        ALEPH Papers
        ALEPH Internal Notes
        ALEPH Theses
        ISOLDE Papers
        ISOLDE Internal Notes
        Drafts
        Videos
        Authorities
        People
        Institutes
        Journals
        Subjects

   - demo right-hand-side portalbox, from democfgdata.sql::

        ABOUT THIS SITE
        Welcome to the demo site of the Invenio, a free document server
        software coming from CERN. Please feel free to explore all the
        features of this demo site to the full.
        SEE ALSO

The development team will than edit various files (po/LINGUAS, config
files, sql files, plenty of Makefile files, etc) as needed.

The last phase of the initial introduction of the new language would
be to translate some short static HTML pages such as:

   - modules/webhelp/web/help-central.webdoc

Thanks for helping us to internationalize Invenio.

Integrating translation contributions
-------------------------------------

This appendix contains some tips on integrating translated phrases
that were prepared for different Invenio releases.  It is mostly
of interest to Invenio developers or the release manager.

Imagine that we have a working translation file sk.po and that we have
received a contribution co-CONTRIB.po that was prepared for previous
Invenio release, so that the messages do not fully correspond.
Moreover, another person might have had worked with the sk.po file in
the meantime.  The goal is to integrate the contributions.

Firstly, check whether the contributed file sk-CONTRIB.po was indeed
prepared for different software release version::

   $ msgcmp --use-fuzzy --use-untranslated sk-CONTRIB.po invenio.pot

If yes, then join its translations with the ones in the latest sk.po file::

   $ msgcat sk-CONTRIB.po sk.po > sk-TMP.po

and update the message references::

   $ msgmerge sk-TMP.po invenio.pot > sk-NEW.po

This will give the new file sk-NEW.po that should now be msgcmp'rable
to invenio.pot.

Lastly, we will have to go manually through sk-NEW.po in order to
resolve potential translation conflicts (marked via ``#-#-#-#-#``
fuzzy translations).  If the conflicts are evident and easy to
resolve, for example corrected typos, we can fix them.  If the
conflicts are of translational nature and cannot be resolved without
consulting the translators, we should warn them about the conflicts.
After the evident conflicts are resolved and the file validates okay,
we can rename it to sk.po and we are done.

(Note that we could have used ``--use-first`` option to msgcat if we
were fully sure that the first translation file (sk-CONTRIB) could
have been preferred as far as the quality of translation goes.)
