Engineering Overview: Translations
This page is for engineers who already know the Launchpad codebase and are about to start work on the Translations subsystem.
Update this page. If you find any part of this missing or out of date, fix it or get someone to fix it!
Use cases
The purpose of Launchpad Translations is to translate programs' user-interface strings into users' natural languages. To that end it supports online translation, offline translation, uploads of translation files from elsewhere, generation and download of translation files, import from bzr branches, export to bzr branches, exports of language packs, and so on. Something we're not very good at yet is helping users bring Launchpad translations back upstream.
We've got two major uses for Translations:
- Ubuntu and derived distributions.
- Launchpad-registered projects.
Sometimes we refer to these as the two “sides” of translation in Launchpad: the Ubuntu side and the upstream side.
Where possible, the two sides are unified (in technical terms) and integrated (in collaboration terms). But you'll see a lot of cases where they are treated somewhat differently. Permissions can differ, organizational structures differ, and some processes only exist on one side or the other.
At the most fundamental level, the two sides are integrated through:
- global suggestions: "here's a translation that was used / suggested elsewhere for this same string"
- translations sharing: individual translations of the same string can be shared in multiple places. This is a complex and multi-layered affair that you'll see coming back later in this document.
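As a rough illustration of global suggestions, the sketch below indexes translations by source string and language, so that a translation entered in one place can be offered for the same string elsewhere. The names `SuggestionIndex`, `record`, `suggest` and the origin labels are hypothetical and not taken from the Launchpad codebase; Launchpad does this against its database, not an in-memory dictionary.

```python
from collections import defaultdict


class SuggestionIndex:
    """Hypothetical in-memory index of translations, keyed by source string."""

    def __init__(self):
        # (msgid, language code) -> list of (origin, translation) pairs
        self._by_string = defaultdict(list)

    def record(self, msgid, language, translation, origin):
        """Remember that `origin` translated `msgid` this way."""
        self._by_string[(msgid, language)].append((origin, translation))

    def suggest(self, msgid, language, exclude_origin=None):
        """Return translations of the same string that were used elsewhere."""
        found = []
        for origin, translation in self._by_string.get((msgid, language), []):
            if origin != exclude_origin and translation not in found:
                found.append(translation)
        return found


index = SuggestionIndex()
index.record("Open file", "nl", "Bestand openen", origin="ubuntu/gedit")
index.record("Open file", "nl", "Open bestand", origin="upstream/evince")

# A third place translating the same string sees both as suggestions.
suggestions = index.suggest("Open file", "nl", exclude_origin="upstream/foo")
```

Keying on the string itself rather than on the template is what lets suggestions cross between the Ubuntu side and the upstream side.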
Ubuntu side
In a distribution, translation happens in the context of a source package. That is, a given SourcePackageName in a given DistroSeries.
Translations sharing happens within a source package, between different distribution release series.
Most translations come in from upstream (Debian, GNOME), but we have a sizable community of users completing and updating these translations in Launchpad.
Ubuntu has a team of translations coordinators in charge of this process.
Projects side
In a project, translation happens in the context of a project release series. That is, a ProductSeries.
Translations sharing happens between the release series of a single project.
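The two sharing rules (within a source package across distribution series, and within a project across release series) can be sketched as a grouping key that ignores the series. `TranslationContext` and `sharing_scope` are hypothetical names used only for illustration, and Launchpad may apply further conditions that this page does not describe.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TranslationContext:
    """Hypothetical stand-in for the place where a translation lives."""
    # Ubuntu side: a source package name in a distribution series.
    distribution: Optional[str] = None
    source_package_name: Optional[str] = None
    distro_series: Optional[str] = None
    # Projects side: a release series of a project.
    product: Optional[str] = None
    product_series: Optional[str] = None

    def sharing_scope(self):
        """Contexts with equal scopes share translations; the series is ignored."""
        if self.product is not None:
            return ("product", self.product)
        return ("package", self.distribution, self.source_package_name)


def group_by_sharing_scope(contexts):
    groups = defaultdict(list)
    for context in contexts:
        groups[context.sharing_scope()].append(context)
    return dict(groups)


contexts = [
    TranslationContext(distribution="ubuntu", source_package_name="gedit",
                       distro_series="jammy"),
    TranslationContext(distribution="ubuntu", source_package_name="gedit",
                       distro_series="noble"),
    TranslationContext(product="foo", product_series="trunk"),
    TranslationContext(product="foo", product_series="1.0"),
]
groups = group_by_sharing_scope(contexts)
```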
Project groups also play a small role in permissions management, but we otherwise pretend they don't exist.
Structure and terminology
Essentially all translations in Launchpad are based on gettext. Software authors mark strings in their codebase as translatable; they then use the gettext tools to extract these and get them into Launchpad in one of several ways. We also call the translatable strings messages.
The top-level grouping of translations is a template. A ProductSeries or SourcePackage can contain any number of templates. Typically it needs only one or two: one for the main program, perhaps one for a main library that the program is built around, and so on. Some projects, on the other hand, create a template for each module.
Because of our gettext heritage, we also refer to these templates as “POTs,” “PO templates,” or “pot files.”
In Python terms, think:

    productseries.potemplates = [potemplate1]
    potemplate1.productseries = productseries
    sourcepackage.potemplates = [potemplate2]
    potemplate2.sourcepackage = sourcepackage
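A runnable version of the same relationships, as a minimal sketch. The class and attribute names follow the ones used in this page, but the constructors and the `add_template` helper are assumptions made for the example, not the real model classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare by identity, as you would with database rows
class POTemplate:
    name: str
    productseries: Optional["ProductSeries"] = None
    sourcepackage: Optional["SourcePackage"] = None


@dataclass(eq=False)
class ProductSeries:
    name: str
    potemplates: List[POTemplate] = field(default_factory=list)

    def add_template(self, name):
        # Hypothetical helper that keeps both directions of the link in sync.
        template = POTemplate(name, productseries=self)
        self.potemplates.append(template)
        return template


@dataclass(eq=False)
class SourcePackage:
    name: str
    potemplates: List[POTemplate] = field(default_factory=list)

    def add_template(self, name):
        template = POTemplate(name, sourcepackage=self)
        self.potemplates.append(template)
        return template


productseries = ProductSeries("trunk")
potemplate1 = productseries.add_template("foo")

sourcepackage = SourcePackage("foo")
potemplate2 = sourcepackage.add_template("foo")
```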
Each template can be translated to one or more languages. Again because of our gettext heritage, translation of a template into a language is referred to as a PO file. A PO file is not just a shapeless bag of translated messages; it specifically translates the messages currently found in its template.
In Python terms:

    potemplate.pofiles = {
        language: pofile,
    }
    pofile.language = language
    pofile.potemplate = potemplate
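The sketch below shows one way to keep that mapping consistent: a template hands out at most one PO file per language. The `get_or_create_pofile` method is a hypothetical helper introduced for this example, and languages are plain language codes here.

```python
class POFile:
    """Translation of one template into one language."""

    def __init__(self, potemplate, language):
        self.potemplate = potemplate
        self.language = language


class POTemplate:
    def __init__(self, name):
        self.name = name
        # language code -> POFile
        self.pofiles = {}

    def get_or_create_pofile(self, language):
        # Hypothetical helper: reuse the existing PO file for this language,
        # or create it on first request.
        if language not in self.pofiles:
            self.pofiles[language] = POFile(self, language)
        return self.pofiles[language]


potemplate = POTemplate("foo")
dutch = potemplate.get_or_create_pofile("nl")
french = potemplate.get_or_create_pofile("fr")
```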
(A gettext PO file is pretty much the same as a template file. A bit of metadata aside, the big difference is that a template leaves the translations blank.)
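To make the comparison concrete, here is a template and a Dutch PO file for the same two messages, with a deliberately tiny parser. The sample strings are invented, the metadata header is left out, and `parse_entries` only handles single-line entries; it is a sketch for reading the example, not a gettext parser.

```python
import re

# A template: every msgstr is blank.
POT_TEXT = '''\
msgid "Open file"
msgstr ""

msgid "Quit"
msgstr ""
'''

# A PO file for the same template: same msgids, translations filled in.
PO_TEXT_NL = '''\
msgid "Open file"
msgstr "Bestand openen"

msgid "Quit"
msgstr "Afsluiten"
'''

ENTRY = re.compile(r'msgid "(.*)"\nmsgstr "(.*)"')


def parse_entries(text):
    """Return {msgid: msgstr} for simple single-line entries only."""
    return dict(ENTRY.findall(text))


template_entries = parse_entries(POT_TEXT)
dutch_entries = parse_entries(PO_TEXT_NL)
```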
The currently translatable messages in a template (“pot message sets”) are kept in a numbered sequence. This sequence defines which messages need to be translated in the PO files. Messages that are no longer in the template are obsolete; we may still track them but they are no longer an active part of the template.
In Python terms, think:

    potemplate.potmsgsets = [potmsgset1]
    potemplate.obsolete_potmsgsets = set([potmsgset2])
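A minimal sketch of how a template's sequence might change when a new version of the template arrives. Plain strings stand in for pot message sets, and `update_from` and `sequence_number` are hypothetical names. Two details are assumptions rather than statements from this page: numbering starts at 1, and an obsolete message that reappears becomes active again.

```python
class POTemplate:
    def __init__(self):
        self.potmsgsets = []               # active messages, in order
        self.obsolete_potmsgsets = set()   # tracked, but no longer active

    def update_from(self, msgids):
        """Replace the active sequence with `msgids`, keeping their order."""
        new_active = list(dict.fromkeys(msgids))  # drop duplicates, keep order
        # Anything that was active but is missing now becomes obsolete.
        self.obsolete_potmsgsets |= set(self.potmsgsets) - set(new_active)
        # Assumption: a message that reappears is no longer obsolete.
        self.obsolete_potmsgsets -= set(new_active)
        self.potmsgsets = new_active

    def sequence_number(self, msgid):
        """1-based position in the template, or 0 if the message is not active."""
        if msgid in self.potmsgsets:
            return self.potmsgsets.index(msgid) + 1
        return 0


potemplate = POTemplate()
potemplate.update_from(["Open", "Save", "Quit"])
potemplate.update_from(["Open", "Quit", "Print"])
```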
Think of a translated string in a PO file as a translation message. This gets a bit more complicated once you start looking at the database schema, but from the perspective of a PO file it's accurate.
    translation_message1.potmsgset = potmsgset1
    translation_message1.language = pofile.language
A translation message can be current in a given PO file, or not. Whether it is current is not a simple attribute of the message; it is an emergent property of more complex shared data structures. So you can view a PO file as a customizable “view” on the current translations of a particular template into a given language.
    pofile.current_translation_messages = {
        potmsgset1: translation_message1,
    }
Often a translation message translates a message from a PO file's template into the PO file's language, but is not current (from the perspective of that PO file). In that case we consider it a suggestion. We make it easy for users with the right privileges to select suggestions to become current translations.
    pofile.suggestions = {
        potmsgset2: [translation_message2],
    }
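Putting current translations and suggestions together, a PO file can be computed as a view over a pool of translation messages. This sketch simplifies heavily: it uses a single `is_current` flag, whereas the text above says that being current emerges from more complex shared data. `TranslationMessage` here is a toy class, and `pofile_view` is a hypothetical function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationMessage:
    potmsgset: str           # the message being translated (a plain string here)
    language: str
    text: str
    is_current: bool = False  # simplification; see the lead-in


def pofile_view(potmsgsets, language, messages):
    """Split the messages relevant to one PO file into current and suggestions."""
    current = {}
    suggestions = {}
    for message in messages:
        # Only messages in this PO file's language, for its template's messages.
        if message.language != language or message.potmsgset not in potmsgsets:
            continue
        if message.is_current:
            current[message.potmsgset] = message
        else:
            suggestions.setdefault(message.potmsgset, []).append(message)
    return current, suggestions


messages = [
    TranslationMessage("Open file", "nl", "Bestand openen", is_current=True),
    TranslationMessage("Open file", "nl", "Open bestand"),
    TranslationMessage("Quit", "nl", "Afsluiten"),
    TranslationMessage("Quit", "fr", "Quitter", is_current=True),
]
current, suggestions = pofile_view(["Open file", "Quit"], "nl", messages)
```

Selecting a suggestion then amounts to changing which message counts as current for that PO file.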
Workflow
You guessed it: this still needs writing. This is a wiki page, so it's permanently Under Construction.
Suggestions and translations
Permissions and organization
Objects and schema
Processes
Import queue
Gardener
Export queue
Bazaar imports
Bazaar exports