Debtags Service
This LEP describes one way of implementing what the Debtags LEP requires.
Contact: Jonathan Lange & James Westby
On Launchpad: Link to a blueprint, milestone or (best) a bug tag search across launchpad-project
Consider clarifying the feature by describing what it is not?
Link this from LEP
Rationale
See LEP/DebTags for why we want this.
The approach described here is in line with Launchpad's SOA strategy, and also makes it easier to reuse Debian's existing code for dealing with debtags.
Stakeholders
| Canonical Consumer Applications | Drafting |
| Launchpad Technical Architect | Not yet |
Constraints and Requirements
Must
- Allow debtags info to be delivered along with the Ubuntu archive
- Allow for debtags info to be imported from Debian
- Allow for overrides of the information coming from Debian
  - This applies both to packages added in Ubuntu and to packages where Ubuntu needs different or extra tags.
Nice to have
- Allow debtag information to be included in PPAs
- Allow debtag information to be included in other hosted distributions (is this a MUST due to OEM use cases?)
- Support private distributions/PPAs (is this a MUST due to OEM use cases?)
Must not
- Significantly slow down publication
- Needlessly prevent publication from happening
Out of scope
- Editing debtags (though if we support PPAs how will people edit the debtags for the PPA?)
Success
How will we know when we are done?
- The Ubuntu main archive contains debtag information in the Packages file that doesn't come from the debian/control of a package.
How will we measure how well we have done?
- Slowdown to publishing cycle
- Number of times publishing fails due to debtags
Design
There will be a debtags service that knows (considering one archive for now) which tags should be set on each binary package name (or possibly on each package name + series combination?).
At publishing time, Launchpad will include that information in the Packages file as needed, using an API provided by the debtags service to fetch it.
API Design
The debtags service will provide an API that Launchpad can use to get the information. Publishing works at the suite/component level, so that should be the smallest granularity at which Launchpad requests information; requesting per package would lead to excessive round-trips. However, Launchpad may prefer to request information for the whole archive in one go. Points to weigh:
- The info may get quite large, and so breaking it up may avoid hitting request limits in one of the pieces.
- Requesting at the component/series level may make it easier to restructure the publication cycle later if that is desired for other reasons.
- If PPAs are included then it may even be desirable to request info for several archives at once.
The API response is straightforward: it needs to return a mapping from package name -> debtags string, which Launchpad can insert into the Packages file as needed. (If the API is per-archive and the tags are per-component, the mapping would have to be (package name, component) -> debtags string.)
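To make the mapping concrete, here is a minimal sketch of how a publisher might apply it to a single Packages stanza. It assumes the tags are written as a `Tag:` field, as in Debian's Packages files; the `TagMap` type and the `add_tag_field` helper are hypothetical names for illustration, not part of Launchpad or of any agreed API.

```python
from typing import Dict, Tuple

# Assumed response shape: (package name, component) -> debtags string.
TagMap = Dict[Tuple[str, str], str]


def add_tag_field(stanza: str, package: str, component: str, tags: TagMap) -> str:
    """Return the stanza with its Tag field set from the service's mapping.

    If the mapping has no entry for the package, the stanza is returned
    unchanged. Any existing Tag field (including continuation lines) is
    replaced, so the service's value wins.
    """
    debtags = tags.get((package, component))
    if not debtags:
        return stanza

    kept = []
    skipping = False
    for line in stanza.rstrip("\n").split("\n"):
        if line.startswith("Tag:"):
            skipping = True  # drop the old Tag field
            continue
        if skipping and line[:1] in (" ", "\t"):
            continue  # drop continuation lines of the old Tag field
        skipping = False
        kept.append(line)

    kept.append("Tag: " + debtags)
    return "\n".join(kept) + "\n"
```

Replacing an existing `Tag:` field rather than merging with it is one possible policy; whether overrides should replace or extend tags from debian/control is left open here.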
Fault Tolerance
Publishing delays cause a lot of problems for both Ubuntu and those relying on PPAs, so all steps in the publishing process need to be reliable. Having Launchpad contact an external service during publishing could jeopardise this.
However, we assume that once debtags are available they will be relied upon and so cannot be dropped from the Packages file if the service isn't available. This seems to suggest three possible approaches to make the system fault-tolerant: (TODO: decide between these)
Debtags database mirror
Launchpad's database could contain tables for holding the relevant information, and the publisher could just consult those, meaning that there isn't a dependency on an external service that could affect the publisher.
The mirror would then be kept up to date either by:
- Having the publisher make the API call and update the database at the start of its run, but continuing if the service is not responding.
- Messages sent from the debtags service on any changes.
This approach ensures that the publisher always runs promptly (assuming that timeouts on the API call are low), even if the debtags data is stale. However, it does have the overhead of having to add all the database tables to Launchpad and write the logic to keep the data in sync with the external service.
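The first update strategy (refresh at the start of the run, continue on failure) could look roughly like the sketch below. SQLite stands in for Launchpad's database purely so the example is runnable; the table name `debtags_mirror`, the `fetch` callable, and all function names are assumptions for illustration.

```python
import sqlite3
from typing import Callable, Dict


def ensure_schema(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) mirror table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS debtags_mirror ("
        "package TEXT PRIMARY KEY, tags TEXT NOT NULL)"
    )


def refresh_mirror(conn: sqlite3.Connection,
                   fetch: Callable[[], Dict[str, str]]) -> bool:
    """Try to refresh the mirror; leave it untouched if the service fails.

    `fetch` is expected to apply its own (low) timeout and to raise
    OSError, or a subclass such as TimeoutError, when the service is down.
    """
    try:
        fresh = fetch()
    except OSError:
        return False  # keep the stale data and let publishing continue
    with conn:  # one transaction: the mirror is never left half-updated
        conn.execute("DELETE FROM debtags_mirror")
        conn.executemany(
            "INSERT INTO debtags_mirror (package, tags) VALUES (?, ?)",
            fresh.items(),
        )
    return True


def tags_for_publisher(conn: sqlite3.Connection) -> Dict[str, str]:
    """What the publisher reads: always the local mirror, never the service."""
    return dict(conn.execute("SELECT package, tags FROM debtags_mirror"))
```

The publisher only ever reads `tags_for_publisher`, so an outage of the debtags service costs at most one timed-out request per run.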
Falling back to old Packages files
A similar approach, but without having to create the database tables, is for the publisher to look at the old Packages files if the debtags service isn't responding.
At the start of its cycle the publisher would make the API request to the debtags service. If the request is successfully answered then the data is used. If it is not then the publisher parses the last-written Packages files and reads the needed debtags data from there.
Again, this approach ensures that the publisher always runs promptly (assuming that timeouts on the API call are low and the time taken to parse the Packages files is negligible), even if the debtags data is stale. It is likely easier to implement than duplicated database tables, at the expense of a little ugliness and a few failure modes that the database approach doesn't have (e.g. publishing without debtags information if the debtags service is down *and* the Packages files have been deleted for some reason).
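A rough sketch of this fallback follows. It assumes the debtags live in a `Tag:` field of each stanza, as in Debian's Packages files, and that stanzas are separated by blank lines; `fetch` and `read_old_packages` are hypothetical callables standing in for the API request and for reading the last-written Packages file.

```python
from typing import Callable, Dict


def tags_from_packages(text: str) -> Dict[str, str]:
    """Extract package name -> debtags string from Packages file contents."""
    tags: Dict[str, str] = {}
    for stanza in text.split("\n\n"):
        package = None
        tag = None
        current = None  # name of the field the last non-continuation line set
        for line in stanza.split("\n"):
            if line[:1] in (" ", "\t"):
                # Continuation line: only relevant if it extends the Tag field.
                if current == "Tag" and tag is not None:
                    tag += " " + line.strip()
                continue
            name, sep, value = line.partition(":")
            if not sep:
                current = None
                continue
            current = name
            if name == "Package":
                package = value.strip()
            elif name == "Tag":
                tag = value.strip()
        if package and tag:
            tags[package] = tag
    return tags


def get_debtags(fetch: Callable[[], Dict[str, str]],
                read_old_packages: Callable[[], str]) -> Dict[str, str]:
    """Prefer the service; fall back to the previous Packages file."""
    try:
        return fetch()
    except OSError:  # includes timeouts and connection errors
        return tags_from_packages(read_old_packages())
```

If the old Packages file is missing as well, `read_old_packages` would raise and the publisher would have to decide whether to publish without tags, which is the extra failure mode noted above.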
HA
This approach builds the debtags service so that it is always available to respond to API requests, and has the publisher abort if the debtags service is not responding (perhaps after a couple of retries).
It puts more constraints on the implementation and administration of the debtags service, while simplifying the implementation on the Launchpad side.
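On the Launchpad side, the HA approach reduces to "retry a couple of times, then abort". A minimal sketch under that assumption is shown below; `DebtagsUnavailable`, `fetch_or_abort`, and the default retry count and delay are illustrative choices, not values taken from this proposal.

```python
import time
from typing import Callable, Dict


class DebtagsUnavailable(Exception):
    """Raised when the debtags service cannot be reached after all retries."""


def fetch_or_abort(fetch: Callable[[], Dict[str, str]],
                   retries: int = 2,
                   delay: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep) -> Dict[str, str]:
    """Call the service, retrying on failure, and abort if it stays down.

    `fetch` is expected to raise OSError (or a subclass) on timeouts and
    connection errors. `sleep` is injectable so tests need not wait.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch()
        except OSError as error:
            last_error = error
            if attempt < retries:
                sleep(delay)
    raise DebtagsUnavailable(
        "debtags service did not respond after %d attempts" % (retries + 1)
    ) from last_error
```

The publisher would let `DebtagsUnavailable` propagate and stop the run, which is why this option depends on the service itself being highly available.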
Implementation
Some more stuff here
Publishing
How does the publisher actually inject the information into the Packages file?
Thoughts?
Put everything else here. Better out than in.