This is an obsolete specification
Assignee:
Created: 2005-07-26 by DafyddHarries
Status: BrazilTopic, DraftSpecification, InfrastructureSpecification
Contributors: MatthewPaulThomas
Queues:
Implementation branch: none yet
Malone bugs:
Introduction
Launchpad will be internationalised so that it can be translated into different languages.
Rationale
Users want to be able to translate Launchpad so that it can be used in their language.
Aspects of the problem
Internationalising code. This will be done using Zope's i18n framework.
Internationalising page templates.
Language preference UI. This is what users will use to tell Launchpad which language to display messages in. This is distinct from the user's preference about which languages they translate into. (A parallel is Google's preferences page, which allows the user to select an interface language and a number of search languages.)
Language preference storage. Users' language preferences need to be stored in the database.
Applying language preferences. We need to apply the user's language preference to the code and page template i18n systems.
Assumptions
- Rendering dates and times in the ISO format is fine for everybody. Localizing display of dates and times by updating the fmt: TALES namespace, if desired, should be done as a separate specification.
Use cases
"I know that translators should know English, but they know their language better, and feel better (feeling something like being home) ... When I'm working with some tool, it should be in my language." -- Hossein Noorikhah
Design
- The top left corner of every Launchpad page will contain a flags icon to visually indicate language selection, and (if you are not logged in) an option menu for changing the selected language.
When not logged in, changing the selection will set the persistent cookie launchpad_language, and reload the page in the selected language. Data loss isn't an issue here, because you can enter very little data when not logged in.
- When logged in, the selection menu will be on your preferences page instead. In this case, both the cookie and the user's Person.language will be changed. Setting the cookie ensures that the correct language is used when the user revisits Launchpad after their login session has expired.
Implementation
Schema Changes
None. The Person table already has a language column for this.
Data Migration
Code Changes
Default Language
Launchpad will select the display language using the first of the following rules that applies:
- If the user is logged in and they have a preferred language set, it is used. Also, the launchpad_language cookie is set if it differs from Person.language.
- Otherwise, if the launchpad_language cookie is set, it is used.
- Otherwise, the browser's preferred language is used, as per the Accept-Language HTTP header.
The launchpad_language cookie should never expire. Other parts of the UI may also set this cookie, such as clicking on a flag.
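The precedence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Launchpad code: the function names, the `available` set of translated languages, and the fallback from a regional code such as pt-br to its base language are all hypothetical additions that the specification does not mandate.

```python
def parse_accept_language(header):
    """Return the language codes in an Accept-Language header, best first."""
    choices = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        code = parts[0].strip().lower()
        if not code:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, sep, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            choices.append((-quality, position, code))
    return [choice[2] for choice in sorted(choices)]


def select_language(person_language, cookie_language, accept_language,
                    available, default="en"):
    """Pick the display language using the precedence in this section.

    Returns (language, set_cookie). set_cookie is True when the
    launchpad_language cookie should be rewritten to match Person.language.
    `available` (assumed) is the set of languages Launchpad is translated into.
    """
    if person_language in available:
        return person_language, cookie_language != person_language
    if cookie_language in available:
        return cookie_language, False
    for code in parse_accept_language(accept_language or ""):
        # Assumption: fall back from e.g. "pt-br" to "pt" if only the
        # base language has a translation.
        for candidate in (code, code.split("-")[0]):
            if candidate in available:
                return candidate, False
    return default, False
```

For an anonymous visitor, both person_language and cookie_language would be None, so only the browser header and the English default come into play.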
Page template default domain
All of our page templates will be in a single translation domain. To avoid having to include the i18n:domain attribute inside every page, we need to configure it to use the Launchpad domain by default.
To do this, we will add a new command line option to sourcecode/zope/utilities/i18nextract.py that explicitly toggles the inclusion of the default domain. This change is trivial and should be fed upstream.
Directory structures
Create a directory 'locales' at relevant points, i.e. canonical/launchpad/locales. Inside this directory we need to create launchpad.pot. As we add translations, we also want to include a directory for each language, such as en. Within the en directory we have an LC_MESSAGES directory containing launchpad.po. The .pot file will be extracted from our source code. The .po file will be the translated output from Rosetta, except in the case of English. Because the default language for Launchpad is English, we need to create locales/en/LC_MESSAGES/launchpad.po ourselves with the following boilerplate:
# This file contains no message ids because launchpad's default
# language is English
msgid ""
msgstr ""
"Project-Id-Version: launchpad\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
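As a rough sketch of how the English catalogue could be put in place, the helper below writes the boilerplate to the path described in this section. The function name and its parameters are hypothetical, and the code is written for a current Python rather than the one Launchpad runs on.

```python
from pathlib import Path

# Boilerplate copied from this specification. A raw string keeps the
# literal backslash-n sequences that the .po header lines require.
ENGLISH_CATALOG = r'''# This file contains no message ids because launchpad's default
# language is English
msgid ""
msgstr ""
"Project-Id-Version: launchpad\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
'''


def write_english_catalog(locales_dir, domain="launchpad"):
    """Create <locales_dir>/en/LC_MESSAGES/<domain>.po and return its path."""
    target = Path(locales_dir) / "en" / "LC_MESSAGES" / (domain + ".po")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(ENGLISH_CATALOG, encoding="utf-8")
    return target
```

Building the file name from the domain keeps the catalogue name and the translation domain from drifting apart.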
Existing Python Code
Python code needs access to a zope.i18n.messageid.MessageIDFactory to internationalize strings. Rather than construct this in each module, we should import it from a common location for clarity and to avoid errors.
Python modules inside canonical/launchpad should contain the following import:
from canonical.launchpad import _
Other packages inside canonical may contain their own MessageIDFactory when appropriate. This should be discussed.
Strings that should be translated then need to be passed as an argument to the '_' function, i.e.
a = 'hello'
becomes:
a = _('hello')
Any interpolation in strings should be done by writing placeholders as $name and providing the values to be interpolated as a dict to the mapping parameter. For example:
a = 'My interpolated values: 1 = %s, 2 = %s' % ('first', 'second')
becomes:
a = _('My interpolated values: 1 = ${value1}, 2 = ${value2}',
      mapping={'value1': 'first', 'value2': 'second'})
ZCML
The <configure> tag needs to have an i18n_domain attribute added. This will most likely only need to be done occasionally, as ZCML files should inherit the value from the ZCML document that included them. For example:
<configure
    xmlns="http://namespaces.zope.org/zope"
    i18n_domain="launchpad">
  [...]
</configure>
In the file lib/canonical/launchpad/configure.zcml, we need to register the message catalogs:
<configure
    xmlns="http://namespaces.zope.org/zope"
    xmlns:i18n="http://namespaces.zope.org/i18n">
  [...]
  <i18n:registerTranslations directory="locales" />
  [...]
</configure>
The filename of the message catalog in the locales directory must match the domain name (as per gettext).
Test Suite
The test runner needs to be updated to run the tests in a specific locale.
UI Changes
We will be using ZPT's default domain for our Launchpad templates. This means that i18ning ZPT pages is simply a case of changing:
<div title="Some title">Some text</div>
to:
<div title="Some title" i18n:attributes="title" i18n:translate="">Some text</div>
or if you wish to explicitly list your message ids:
<div title="Some title" i18n:attributes="title some-title" i18n:translate="some-text">Some text</div>
Note that you can translate multiple attributes by separating the i18n:attributes items with a semicolon:
<img alt="foo" title="bar" src="baz.gif" i18n:attributes="alt; title title-msg-id" />
More information about page template internationalization can be found at: http://dev.zope.org/Zope3/ZPTInternationalizationSupport
For a real life example, here is a fragment of the Launchpad homepage showing it before and after the i18n markup has been added:
<p>Rosetta is a <b>web-based translation portal</b>. If your application uses the open source standard <i>gettext</i> system for software internationalisation, then you can easily have users contribute translations for your software through Rosetta. You simply upload the existing translations, point your community at Rosetta and then download their translations before you make your release. <a title="Rosetta" href="rosetta">Translate now!</a> </p>
<p i18n:translate="rosetta-blurb">Rosetta is a <b>web-based translation portal</b>. If your application uses the open source standard <i>gettext</i> system for software internationalisation, then you can easily have users contribute translations for your software through Rosetta. You simply upload the existing translations, point your community at Rosetta and then download their translations before you make your release. <a i18n:attributes="title" title="Rosetta" href="rosetta" i18n:translate="">Translate now!</a></p>
Note that in this example, some of the nested inline tags have been left as part of the rosetta-blurb string, whilst the final anchor tag has been broken out into a separate translatable string.
Extracting translatable strings and translating
For Launchpad, the string extractor process will be run using:
% make potemplates
This will run the following command:
python sourcecode/zope/utilities/i18nextract.py \
    -d launchpad -p lib/canonical/launchpad \
    -o locales
This will store the generated file at lib/canonical/launchpad/locales/launchpad.pot.
This file needs to be added to revision control.
This generated .pot file will be uploaded to the production Launchpad server for translation using Rosetta. Ideally, this will be done automatically by PQM on commit.
On a regular basis, and when tagging production releases, the .po files need to be downloaded from Rosetta. scripts/merge-launchpad-pofiles.py will be written to extract the Rosetta output into the correct location and baz add them if they are not already under revision control.
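The core of such a merge script might look like the sketch below. It rests on assumptions the specification does not settle: that the Rosetta download has been unpacked into a directory of files named <language>.po, and that adding new files to revision control is a separate step left to the caller. The function name and parameters are hypothetical.

```python
import shutil
from pathlib import Path


def merge_pofiles(export_dir, locales_dir, domain="launchpad"):
    """Copy <language>.po files from export_dir into the locales tree.

    Returns the paths that did not exist before, i.e. the files that
    would still need to be added to revision control.
    """
    created = []
    for source in sorted(Path(export_dir).glob("*.po")):
        language = source.stem
        if language == "en":
            # The English catalogue is hand-written boilerplate and
            # does not come from Rosetta.
            continue
        target = (Path(locales_dir) / language / "LC_MESSAGES"
                  / (domain + ".po"))
        is_new = not target.exists()
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
        if is_new:
            created.append(target)
    return created
```

Returning only the newly created paths keeps the revision control step explicit, so a second run over the same export reports nothing new to add.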
Discussion
Internationalizing Launchpad will affect our rollout procedures. All but the most trivial changes will change the English text and require translators to update their translations. If we fail to get the translations done before rollout, our internationalization efforts will look cheap and dodgy, which is hardly a ringing endorsement for Launchpad, Ubuntu or Canonical. This means we will need to move to a much stricter freeze/staging/rollout model, as discussed in WeeklyPlannedRollouts. However, we may find we need up to a month of code freeze to get the translations in. Also, unless we employ the translators, we are relying on community generosity to keep Launchpad internationalized. This means that we will need to be careful about which languages we enable Launchpad to run in: we must ensure not only that Launchpad has been translated, but also that someone is interested enough to maintain the translation on a regular basis.
Unresolved Issues
- What locale should the tests be run in?
- We should be consistent about whether or not we use explicit message ids.
- What strings should not be translated? Launchpad? Soyuz? Malone? Registry? We will need a page on the wiki detailing policies for various branding issues.
- When should a translatable string be given an explicit message id?
- Should these policy decisions/style guidelines be documented in this spec or on another wiki page such as LaunchpadI18nStyleGuide?
- The standard ZPT i18n markup is very verbose. Perhaps we should fix it so the strings are shorter (e.g. change i18n:translate to i:t and i18n:attributes to i:a).
- What should we do with the xml:lang="en" and lang="en" attributes that are in some of our page templates? Should these be deleted, or are they magically updated?
- Initial tests show the i18nextract.py script is buggy (the current launchpad.pot has unparsable bits extracted from the page templates). We need to see if this is fixed in later versions of Z3. If not, we will either need to fix it ourselves or ensure our documented best practice for marking up translatable strings does not trigger the bugs.
Questions and Answers
Reviewer comments
- This spec needs notes about which areas of text get translated and which do not, and how we make this work properly. For example, we will want the login page to be translated. We will want the 'what is this about?' portlet title and contents to be translated. We will not want most schema titles and descriptions to be translated. So, the spec should have a section listing in priority order the things that should be made translatable, and a listing of those things that we do not want translated at all.