Upstream/Ubuntu sharing of messages
This is one approach to making Ubuntu packages and upstream projects registered in Launchpad share translations.
Change summary
We add a foreign key "upstream_potmsgset" to POTMsgSet. In the original draft it was called fallback_potmsgset, hence the name of this page.
(Also available as dia source)
Requirements: at most 1 known Launchpad project for each Ubuntu package. This is a {0,1} : n relationship.
In templates belonging to a project, upstream_potmsgset would be null. In an Ubuntu package it would refer to a POTMsgSet (identical to the package's own POTMsgSet) in an upstream project for the package, if any.
When finding current translations for page display or export in Ubuntu, we look for the first current TranslationMessage with the given language, trying in this order:
- the given POTMsgSet and POTemplate, which would be a diverged message (unchanged);
- the given POTMsgSet and a null POTemplate, which would be a shared message (unchanged);
- a null POTemplate and the upstream_potmsgset for the given POTMsgSet (new);
- optionally, a "chained" lookup on the upstream_potmsgset's upstream_potmsgset.
This will pick up only shared TranslationMessages from upstream. The original version of this proposal put upstream_potmsgset in TranslationTemplateItem, allowing for more detailed control. But I think any differences between sharing templates would be more likely to be produced by mistake than for any good reason.
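As a rough illustration of the lookup order, here is a minimal Python sketch. The classes and the find_current function are hypothetical stand-ins rather than Launchpad's actual model code; templates are represented by plain strings, and a potemplate of None stands for a shared message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class POTMsgSet:
    msgid: str
    # Proposed foreign key; None for templates that belong to a project.
    upstream_potmsgset: Optional["POTMsgSet"] = None


@dataclass(eq=False)
class TranslationMessage:
    potmsgset: POTMsgSet
    language: str
    text: str
    potemplate: Optional[str] = None  # None means "shared"
    is_current: bool = True


def find_current(messages, potmsgset, potemplate, language, chained=True):
    """Return the first current message in the lookup order, or None."""

    def current(pms, template):
        for tm in messages:
            if (tm.is_current and tm.language == language
                    and tm.potmsgset is pms and tm.potemplate == template):
                return tm
        return None

    # Steps 1 and 2: diverged message, then shared message, for the
    # POTMsgSet itself.
    found = current(potmsgset, potemplate) or current(potmsgset, None)
    seen = {id(potmsgset)}  # guards against accidental cycles in the chain
    upstream = potmsgset.upstream_potmsgset
    # Step 3: shared upstream message. Step 4: optionally follow the chain.
    while found is None and upstream is not None and id(upstream) not in seen:
        found = current(upstream, None)
        seen.add(id(upstream))
        upstream = upstream.upstream_potmsgset if chained else None
    return found
```

Only messages with a null potemplate are ever considered on the upstream side, so diverged upstream messages never leak into Ubuntu.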
Suggestions for one POTMsgSet will show up on the other as external suggestions. I don't think it's worth giving them special status. The knowledge that a translation has been suggested but not yet reviewed in the same place upstream shouldn't influence the reviewer's choice more than the quality and suitability of the translation itself.
The primary use-case for all this is the one where an Ubuntu package uses a Launchpad project as a source of upstream translations, but there is no reason why it should be limited to that. A package may also use another package as an upstream, or a project another project. There may even be a longer chain: Launchpad-managed project A has an Ubuntu-specific derivative project B, which is in Ubuntu as package C, but package D has been split out of it in the latest Ubuntu release and also reuses its translations. If we allow chaining of upstream_potmsgsets, we'll have reasonably manageable one-way "sharing" of translations all along this chain.
Replacing "imported"
It should be possible to replace the is_imported attribute on Ubuntu TranslationMessages with one that emerges in context: "did we get this TranslationMessage from the upstream_potmsgset?" All upstream imports (and/or UI translation, if upstream uses Launchpad) go into projects, and are included in Ubuntu only through sharing. For upstream projects, as far as we know, "published" has no real practical meaning. We have used it sometimes as a way to hack around import semantics.
This option would take an entire dimension out of TranslationMessage management. There would be no explicit interaction between is_imported and is_current, leaving us with a simpler problem domain (yay!). We'd still manage the is_current flag on the project, and we'd manage another is_current flag on the package, but the two would be completely separate. The much-feared updateTranslation code would lose one complicating parameter. And we could finally get rid of the confusing "is this translation published elsewhere" option on POFile upload pages.
Questions for this option:
- Are there any real, well-understood use-cases for is_imported in projects, or is it just an unwanted extra colour in the statistics bars?
- Do we have any licensing impact to deal with?
- Can we get absolutely all the Ubuntu imports into project templates?
I believe Danilo is looking into this option (or something close to it) in more detail.
Pros and Cons
Advantages:
- Upstream translations appear instantly in Ubuntu.
- We get fewer TranslationMessages, not more.
- Ubuntu can diverge from upstream where the two differ in translation style or vocabulary.
- Ubuntu-specific translations can still be shared between Ubuntu releases.
- Strings specific to the Ubuntu package show up as green or red in the statistics, never blue or purple.
- Ubuntu can make string changes to a Launchpad-native project by layering an intermediate Launchpad package over it.
Disadvantages:
- Schema change required.
- Data migration is desirable.
Complaints here please.
Detailed changes
The concepts of message sharing stay unchanged. A sharing template is still one within either the same package, or the same project. In the message-sharing sense, there is no sharing between Ubuntu and upstream. What happens between Ubuntu and upstream is more like a "glass plate" than like message sharing.
Translation review
This includes anything that activates a TranslationMessage: selection in the UI by a reviewer, or its creation by someone with edit privileges (who did not check the "someone should review this translation" checkbox), or import by someone with edit privileges.
Review still works primarily on the translation that's being reviewed, with the unchanged exception that a change involving message sharing will affect sharing templates. There is one change: if the translation being approved is identical to the current shared upstream message, we timestamp the suggestion as reviewed but do not set its is_current flag. Instead, we clear the is_current flag for any existing translation. This will unmask the current shared upstream message: Ubuntu has converged with upstream. If we decide to keep the is_imported flag, we give it the same treatment.
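The convergence rule could look roughly like the following sketch. Message and approve are hypothetical names, and the sketch deliberately ignores the distinction between diverged and shared messages: it just treats own_messages as the Ubuntu-side messages for one POTMsgSet and language in whatever scope is being reviewed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Message:
    text: str
    is_current: bool = False
    date_reviewed: Optional[datetime] = None


def approve(suggestion, own_messages, upstream_current):
    """Apply the proposed review rule; return True if Ubuntu converged.

    own_messages: Ubuntu-side messages, normally including the suggestion.
    upstream_current: the current shared upstream message, or None.
    """
    suggestion.date_reviewed = datetime.now(timezone.utc)
    # Whatever was current before stops being current in both cases.
    for message in own_messages:
        message.is_current = False
    converged = (upstream_current is not None
                 and suggestion.text == upstream_current.text)
    if not converged:
        suggestion.is_current = True
    # When converged, nothing on the Ubuntu side is current, so the lookup
    # falls through to the upstream message.
    return converged
```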
Template import
Assumption: we will still import templates separately into Ubuntu, which would allow for string changes caused by Ubuntu patches.
New POTMsgSet downstream from existing POTMsgSet
This happens when a template change flows downstream into Ubuntu. When creating a new POTMsgSet, we can follow one of two algorithms:
(a) Match by template.
- Look up the template sharing subset in the upstream project that the current template would fit into if it were in the same project.
- In that sharing subset, look for a matching POTMsgSet.
- Create a new POTMsgSet with whatever was found upstream as the upstream_potmsgset.
(b) Match by string.
- Look up matching POTMsgSets in the entire upstream project.
- If there are multiple matches, pick a "best" one (active/inactive templates, most recent upstream choice, closest template name match, most translation activity).
- Use whatever we end up with as the new POTMsgSet's upstream_potmsgset.
The latter would allow Ubuntu to arrange POTMsgSets differently between templates. Use cases:
- Upstream splits templates, merges templates, or moves strings between templates but the change has not filtered down to Ubuntu yet.
- Ubuntu splits an upstream template, as it does with firefox (where xulrunner is a separate package coming off the same template).
In fact the same might be worth considering for message sharing as well, but we can treat that as a separate issue.
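To make the difference between (a) and (b) concrete, here is a small sketch. UpstreamEntry and both functions are hypothetical. Two simplifying assumptions: templates are taken to be in the same sharing subset when their names are equal, and the "best" ranking in (b) uses only two of the criteria listed above (active templates first, then an exact template name match).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpstreamEntry:
    """One upstream POTMsgSet as it appears in one upstream template."""
    template_name: str
    template_active: bool
    msgid: str
    potmsgset_id: int


def match_by_template(entries, template_name, msgid):
    """(a) Only look in the sharing subset the template would fit into."""
    for entry in entries:
        if entry.template_name == template_name and entry.msgid == msgid:
            return entry.potmsgset_id
    return None


def match_by_string(entries, template_name, msgid):
    """(b) Look in the whole upstream project and pick a "best" match."""
    candidates = [e for e in entries if e.msgid == msgid]
    if not candidates:
        return None
    # Illustrative ranking only: active templates beat inactive ones, and
    # an exact template name match breaks ties.
    best = max(
        candidates,
        key=lambda e: (e.template_active, e.template_name == template_name),
    )
    return best.potmsgset_id
```

With an upstream "firefox" template and an Ubuntu "xulrunner" template, (a) finds nothing for a string that exists only in "firefox", while (b) still links it.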
New POTMsgSet upstream from existing POTMsgSet
This happens when an Ubuntu string change makes its way back upstream.
When creating a new POTMsgSet, the importer should:
- Look for packages (or projects) that might use the current project (or package) as a direct upstream.
- Find matching POTMsgSets there (see above).
- For any that don't have an upstream_potmsgset yet, set it to the new POTMsgSet.
Open questions:
- What do we do if a downstream POTMsgSet already has a different upstream_potmsgset set?
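A sketch of the importer step, under the assumption that "matching" simply means an equal msgid. The names are hypothetical. Since the question above is still open, the sketch only reports downstream POTMsgSets that already point somewhere else and leaves them untouched.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class POTMsgSet:
    msgid: str
    upstream_potmsgset: Optional["POTMsgSet"] = None


def link_downstreams(new_upstream, downstream_potmsgsets):
    """Point matching, still-unlinked downstream POTMsgSets at new_upstream.

    Returns (linked, conflicts). Conflicts are matching POTMsgSets that
    already have a different upstream_potmsgset; they are not modified.
    """
    linked, conflicts = [], []
    for pms in downstream_potmsgsets:
        if pms.msgid != new_upstream.msgid:
            continue
        if pms.upstream_potmsgset is None:
            pms.upstream_potmsgset = new_upstream
            linked.append(pms)
        elif pms.upstream_potmsgset is not new_upstream:
            conflicts.append(pms)  # open question: what to do with these
    return linked, conflicts
```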
Changes-only export
The "Launchpad changes only" export would filter its results by ignoring POTMsgSets that have the same translation when ignoring upstream_potmsgset as they do when including upstream lookup.
It's not enough simply to ignore upstream_potmsgset and export only the messages with their own translations:
- Transition period. We may go through a migration period where an Ubuntu POTMsgSet can still have a translation identical to the translation of its upstream_potmsgset.
- Double divergence. Ubuntu may have its own translations, and then one specific Ubuntu version may have a diverged translation that is identical to the shared upstream translation.
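One way to read the filter, and this reading is an assumption, is to compare the translation Ubuntu actually ends up with against what the upstream lookup alone would give, and to export only where the two differ. In the sketch below both lookups are passed in as plain callables with hypothetical names; each returns the translation text for a msgid, or None.

```python
def changes_only(msgids, effective, upstream_only):
    """Yield (msgid, text) for messages whose translation differs from upstream.

    effective: translation Ubuntu would display or export (diverged, shared,
        or inherited from upstream).
    upstream_only: translation found by looking only at upstream_potmsgset.
    """
    for msgid in msgids:
        text = effective(msgid)
        if text is not None and text != upstream_only(msgid):
            yield msgid, text
```

Both cases listed above come out as "no change" under this reading, because the comparison is on translation content and not on where the TranslationMessage happens to live.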
Unchanged
I can't think of anything that would have to change in these:
- Adding suggestions.
- Displaying suggestions.
- Template creation.
- POFile creation.
- Creating or changing TranslationTemplateItems.
- Regular export.
Migration
We would once again want to eliminate duplicate TranslationMessages just like we're doing with message sharing. We should aim to introduce the code changes before starting to import large numbers of translations into projects. That way we minimize the creation of unnecessary TranslationMessages, and eliminate duplicates only where projects were already being translated natively in Launchpad.
It will still be worthwhile to get rid of unnecessary is_imported TranslationMessages. After all, this is the bulk of Ubuntu translations and we'd be duplicating most of them. Once upstream projects have been imported, we can eliminate any downstream TranslationMessage d where:
- d is not current and contains the same translations as some TranslationMessage for its upstream_potmsgset and language.
- d is a current, shared translation and contains the same translations as the current shared translation of its upstream POTMsgSet.
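The two elimination rules can be written as a single predicate. This is only a sketch: TM and is_redundant are hypothetical names, translations are reduced to one text field, and upstream_messages is assumed to hold the TranslationMessages for d's upstream_potmsgset in d's language.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TM:
    text: str
    is_current: bool
    potemplate: Optional[str] = None  # None means "shared"


def is_redundant(d, upstream_messages):
    """Return True if downstream message d can be eliminated."""
    upstream_messages = list(upstream_messages)
    if not d.is_current:
        # Rule 1: any upstream message with the same translation will do.
        return any(u.text == d.text for u in upstream_messages)
    if d.potemplate is None:
        # Rule 2: d must equal the current shared upstream translation.
        return any(u.text == d.text for u in upstream_messages
                   if u.is_current and u.potemplate is None)
    return False  # current diverged messages are kept
```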