Revision 1 as of 2011-10-17 06:43:07

Engineering Overview: Translations

This page is for engineers who are already familiar with the Launchpad codebase and are about to start working on the Translations subsystem.

Update this page. If you find any part of this missing or out of date, fix it or get someone to fix it!

Use cases

The purpose of Launchpad Translations is to translate programs' user-interface strings into users' natural languages. To that end it supports online translation, offline translation, uploads of translation files from elsewhere, generation and download of translation files, import from bzr branches, export to bzr branches, exports of language packs, and so on. Something we're not very good at yet is helping users bring Launchpad translations back upstream.

We've got two major uses for Translations:

  1. Ubuntu and derived distributions.
  2. Launchpad-registered projects.

Sometimes we refer to these as the two “sides” of translation in Launchpad: the Ubuntu side and the upstream side.

Where possible, the two sides are unified (in technical terms) and integrated (in collaboration terms). But you'll see a lot of cases where they are treated somewhat differently. Permissions can differ, organizational structures differ, and some processes only exist on one side or the other.

At the most fundamental level, the two sides are integrated through:

Ubuntu side

In a distribution, translation happens in the context of a source package. That is, a given SourcePackageName in a given DistroSeries.

Translations sharing happens within a source package, between different distribution release series.

Most translations come in from upstream (Debian, GNOME), but we have a sizable community of users completing and updating these translations in Launchpad.

Ubuntu has a team of translations coordinators in charge of this process.

Projects side

In a project, translation happens in the context of a project release series. That is, a ProductSeries.

Translations sharing happens between the release series of a single project.

Project groups also play a small role in permissions management, but we otherwise pretend they don't exist.

Structure and terminology

Essentially all translations in Launchpad are based on gettext. Software authors mark strings in their codebase as translatable; they then use the gettext tools to extract these and get them into Launchpad in one of several ways. We also call the translatable strings messages.

The top-level grouping of translations is a template. A ProductSeries or SourcePackage can contain any number of templates. Typically it needs only one or two: one for the main program, one for a main library that the program is built around, and so on. Some projects, on the other hand, create a template for each module.

Because of our gettext heritage, we also refer to these templates as “POTs,” “PO templates,” or “pot files.”

In Python terms, think:

productseries.potemplates = [potemplate1]
potemplate1.productseries = productseries

sourcepackage.potemplates = [potemplate2]
potemplate2.sourcepackage = sourcepackage

Each template can be translated to one or more languages. Again because of our gettext heritage, the translation of a template into a language is referred to as a PO file. A PO file is not just a shapeless bag of translated messages; it specifically translates the messages currently found in its template.

In Python terms:

potemplate.pofiles = {
    language: pofile,
    }

pofile.language = language
pofile.potemplate = potemplate

(A gettext PO file is pretty much the same as a template file. A bit of metadata aside, the big difference is that a template leaves the translations blank.)
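
To make the difference concrete, here is a small sketch that renders a single gettext entry the way it would appear in a template and in a PO file. The format_entry helper and the sample strings are invented for this illustration. It ignores escaping, comments, plural forms and the file header, so it is not a real PO writer.

```python
def format_entry(msgid, msgstr=""):
    """Render one simplified gettext entry.

    An empty msgstr is what a template carries; a PO file fills it in.
    Quotes and newlines inside the strings are not escaped here.
    """
    return 'msgid "%s"\nmsgstr "%s"\n' % (msgid, msgstr)


# The same message as it would appear in a template and in a Dutch PO file.
template_entry = format_entry("Open file")
po_entry = format_entry("Open file", "Bestand openen")

print(template_entry)
print(po_entry)
```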

The currently translatable messages in a template (“pot message sets”) are kept in a numbered sequence. This sequence defines which messages need to be translated in the PO files. Messages that are no longer in the template are obsolete; we may still track them but they are no longer an active part of the template.

In Python terms, think:

potemplate.potmsgsets = [potmsgset1]
potemplate.obsolete_potmsgsets = {potmsgset2}
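
The following sketch shows one way the sequence and the obsolete set could behave when a new version of a template arrives. TemplateSketch and its update method are hypothetical and use plain strings as messages. They illustrate the idea only and do not reflect how Launchpad's own classes or schema work.

```python
class TemplateSketch:
    """Hypothetical stand-in for a template; not Launchpad's own class."""

    def __init__(self, msgids):
        # Ordered sequence of the currently translatable messages.
        self.potmsgsets = list(msgids)
        # Messages dropped from the template that we still track.
        self.obsolete_potmsgsets = set()

    def update(self, new_msgids):
        """Replace the sequence with the messages of a new template version."""
        new_msgids = list(new_msgids)
        # Messages that were in the sequence but are gone now become obsolete.
        self.obsolete_potmsgsets |= set(self.potmsgsets) - set(new_msgids)
        # A message that reappears is active again, so it is no longer obsolete.
        self.obsolete_potmsgsets -= set(new_msgids)
        self.potmsgsets = new_msgids


template = TemplateSketch(["Open", "Save", "Quit"])
template.update(["Open", "Quit", "Print"])

# Number the active messages starting from 1.
for sequence, msgid in enumerate(template.potmsgsets, start=1):
    print(sequence, msgid)
print(sorted(template.obsolete_potmsgsets))
```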

Think of a translated string in a PO file as a translation message. This gets a bit more complicated once you start looking at the database schema, but from the perspective of a PO file it's accurate.

translation_message1.potmsgset = potmsgset1
translation_message1.language = pofile.language

A translation message can be current in a given PO file, or not. Whether it is current is an emergent property of more complex shared data structures. So you can view a PO file as a customizable “view” on the current translations of a particular template into a given language.

pofile.current_translation_messages = {
    potmsgset1: translation_message1,
    }

Often a translation message translates a message from a PO file's template into the PO file's language, but is not current (from the perspective of that PO file). In that case we consider it a suggestion. We make it easy for users with the right privileges to select suggestions to become current translations.

pofile.suggestions = {
    potmsgset2: [translation_message2],
    }
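
Putting the last two ideas together, here is a sketch of how a PO file's view could be derived from a pool of translation messages. The Message tuple, its is_current flag and the split_messages function are assumptions made for this example. In particular, the flag is a deliberate simplification: as noted above, whether a message is current really emerges from more complex shared data structures.

```python
from collections import namedtuple

# Simplified, hypothetical translation message. Real messages carry more data.
Message = namedtuple("Message", "potmsgset language text is_current")


def split_messages(messages, language, potmsgsets):
    """Split messages into current translations and suggestions for one PO file.

    Only messages in the PO file's language that translate a message
    still present in the template (potmsgsets) are considered.
    """
    current = {}
    suggestions = {}
    for message in messages:
        if message.language != language or message.potmsgset not in potmsgsets:
            continue
        if message.is_current:
            current[message.potmsgset] = message
        else:
            suggestions.setdefault(message.potmsgset, []).append(message)
    return current, suggestions


messages = [
    Message("Open", "nl", "Openen", True),
    Message("Open", "nl", "Open doen", False),
    Message("Save", "nl", "Opslaan", False),
    Message("Open", "fr", "Ouvrir", True),
    Message("Gone", "nl", "Weg", True),
]
current, suggestions = split_messages(messages, "nl", ["Open", "Save"])
```

Here the Dutch PO file ends up with one current translation for “Open”, one suggestion each for “Open” and “Save”, and nothing for the French message or for the message that is no longer in the template.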

Workflow

You guessed it: this still needs writing. This is a wiki page, so it's permanently Under Construction.

Suggestions and translations

Permissions and organization

Objects and schema

Processes

Import queue

Gardener

Export queue

Bazaar imports

Bazaar exports

Template generation