Debtags Service
This LEP describes one way of implementing what is needed for the Debtags LEP.
Contact: Jonathan Lange & James Westby
Rationale
See LEP/DebTags for why we want this.
The approach described here is in line with Launchpad's SOA strategy, and also allows us to more easily re-use the code from Debian for dealing with debtags.
Stakeholders
| Canonical Consumer Applications | Drafting |
| Launchpad Technical Architect | Not yet |
Constraints and Requirements
Must
- Allow debtags info to be delivered along with the Ubuntu archive
- Allow for debtags info to be imported from Debian
- Allow for overrides of the information coming from Debian
  - Either for packages added in Ubuntu, or where Ubuntu needs different/extra tags for a package.
Nice to have
- Allow debtag information to be included in PPAs
- Allow debtag information to be included in other hosted distributions (is this a MUST due to OEM use cases?)
- Support private distributions/PPAs (is this a MUST due to OEM use cases?)
Must not
- Significantly slow down publication
- Needlessly prevent publication from happening
Out of scope
- Editing debtags (though if we support PPAs how will people edit the debtags for the PPA?)
Success
How will we know when we are done?
- The Ubuntu main archive contains debtag information in the Packages file that doesn't come from the debian/control of a package.
How will we measure how well we have done?
- Slowdown to publishing cycle
- Number of times publishing fails due to debtags
Design
There will be a debtags service that knows (considering one archive for now) which tags should be set on each binary package name (package name + series combination?).
At publishing time, Launchpad will include that information in the Packages file as needed, using an API provided by the debtags service to get the information.
API Design
The debtags service will provide an API that Launchpad can use to get the information. Publishing works at the suite/component level, so that is the smallest set for which Launchpad will want to request information. Requesting at the per-package level would lead to excessive round-trips. However, it may be that Launchpad would prefer to request information for the whole archive in one go.
- The info may get quite large, and so breaking it up may avoid hitting request limits in one of the pieces.
- Requesting at the component/series level may make it easier to restructure the publication cycle later if that is desired for other reasons.
- If PPAs are included then it may even be desirable to request info for several archives at once.
The API response is pretty straightforward, needing to return a mapping from package name -> debtags string, which Launchpad can insert into the Packages file as needed. (Obviously, if the API is per-archive and the tags are per-component, the mapping would have to be (package name, component) -> debtags string.)
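To make that shape concrete, here is a minimal sketch of the publisher-side call, assuming a plain HTTP+JSON endpoint keyed by distribution/suite/component. The URL layout, parameter names, and response fields are illustrative only, not a proposed interface.

```python
import json
from urllib.request import urlopen

def fetch_debtags(base_url, distribution, suite, component):
    """Fetch the package-name -> debtags-string mapping for one
    suite/component, the smallest unit the publisher works with.

    Hypothetical endpoint; the real service's URL scheme is undecided.
    """
    url = "%s/tags/%s/%s/%s" % (base_url, distribution, suite, component)
    with urlopen(url, timeout=10) as response:
        # Expected shape, e.g.:
        # {"0ad": "game::strategy, interface::x11, ...", ...}
        return json.load(response)
```

The publisher would then look tags up by binary package name while writing each stanza.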
Fault Tolerance
Publishing delays cause a lot of problems for both Ubuntu and those relying on PPAs, so all steps in the publishing process need to be reliable. Having Launchpad contact an external service during publishing could jeopardise this.
However, we assume that once debtags are available they will be relied upon and so cannot be dropped from the Packages file if the service isn't available. This seems to suggest three possible approaches to make the system fault-tolerant: (TODO: decide between these)
Debtags database mirror
Launchpad's database could contain tables for holding the relevant information, and the publisher could just consult those, meaning that there isn't a dependency on an external service that could affect the publisher.
The mirror would then be kept up to date either by:
- Having the publisher make the API call and update the database at the start of its run, but continuing if the service is not responding.
- Messages sent from the debtags service on any changes.
This approach ensures that the publisher always runs promptly (assuming that timeouts on the API call are low), even if the debtags data is stale. However, it does have the overhead of adding the database tables to Launchpad and writing the logic to keep the data in sync with the external service.
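A rough sketch of the first synchronisation option (refresh the mirror at the start of the publisher run, carry on if the service is down). Both callables are stand-ins: the fetch would be something like the fetch_debtags() sketch under API Design, and the store would be whatever mirrored-table code Launchpad grows.

```python
import logging

log = logging.getLogger("publisher.debtags")

def refresh_debtags_mirror(fetch, store_mapping):
    """Try to refresh the local mirror before publishing starts.

    fetch: callable returning the package-name -> tags mapping.
    store_mapping: callable that writes the mapping into Launchpad's
        mirrored tables (a stand-in for real storage code).

    On any failure, log and carry on, so the publisher keeps running
    with whatever (possibly stale) data is already in the database.
    """
    try:
        mapping = fetch()
    except Exception:
        log.warning("debtags service unavailable; using stale mirror data")
        return False
    store_mapping(mapping)
    return True
```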
Falling back to old Packages files
An approach similar to the above, but without having to create the database tables, is for the publisher to look at the old Packages files if the debtags service isn't responding.
At the start of its cycle the publisher would make the API request to the debtags service. If the request is successfully answered then the data is used. If it is not then the publisher parses the last-written Packages files and reads the needed debtags data from there.
Again, this approach ensures that the publisher always runs promptly (assuming that timeouts on the API call are low and the time taken to parse the Packages files is negligible), even if the debtags data is stale. It is likely easier to implement than the database mirror, at the expense of a little ugliness, and it has a few failure modes that the database approach doesn't (e.g. publishing without debtags information if the debtags service is down *and* the Packages files have been deleted for some reason).
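A sketch of that fallback, assuming python-debian is available to the publisher for parsing the previously written Packages file; the gzip handling and file path are illustrative.

```python
import gzip

from debian import deb822

def debtags_from_old_packages(packages_path):
    """Return a package-name -> debtags-string mapping read from the
    previously written (gzipped) Packages file."""
    mapping = {}
    with gzip.open(packages_path, "rt") as packages_file:
        for stanza in deb822.Packages.iter_paragraphs(packages_file):
            tag = stanza.get("Tag")
            if tag:
                mapping[stanza["Package"]] = tag
    return mapping
```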
High availability (HA)
This approach involves building the debtags service so that it is always available to respond to API requests, and having the publisher abort if the service is not responding (perhaps after a couple of retries).
It puts more constraints on the implementation and administration of the debtags service while simplifying the implementation on the Launchpad side.
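For illustration, the retry-then-abort behaviour might look like the following; the retry count and delay are placeholders, and the fetch callable would be something like the fetch_debtags() sketch under API Design.

```python
import time

class DebtagsUnavailable(Exception):
    """Raised to abort the publisher run rather than publish without tags."""

def fetch_or_abort(fetch, attempts=3, delay=30):
    """Call fetch(), retrying a couple of times before giving up and
    aborting publication."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == attempts:
                raise DebtagsUnavailable(
                    "debtags service did not respond after %d attempts"
                    % attempts)
            time.sleep(delay)
```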
Implementation
Publishing
How does the publisher actually inject the information into the Packages file?
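One possible answer, sketched with python-debian: set a Tag field on each stanza as it is written out. Whether the real publisher builds stanzas this way is exactly the open question above, so this only shows where the injection would happen.

```python
from debian import deb822

def write_packages(stanzas, debtags, out_file):
    """Write Packages stanzas to out_file, injecting debtags where known.

    stanzas: deb822.Packages paragraphs (or any dict-like stanzas).
    debtags: package name -> debtags string, from the debtags service.
    """
    for stanza in stanzas:
        tags = debtags.get(stanza["Package"])
        if tags:
            stanza["Tag"] = tags
        out_file.write(str(stanza))
        out_file.write("\n")

# e.g., re-tagging an existing uncompressed Packages file:
# with open("Packages") as src, open("Packages.new", "w") as dst:
#     write_packages(deb822.Packages.iter_paragraphs(src), mapping, dst)
```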
Services Requirements
This section deals with how the service would meet all of the requirements set out in ArchitectureGuide/ServicesRequirements.
No-downtime deployability
This could be achieved using dual servers and haproxy. Care would need to be taken over database migrations.
No single point of failure (SPOF)
With dual servers and haproxy an SPOF could be avoided. However, as discussed above, the service needs to be up in some form to keep publishing going, unless one of the alternative strategies outlined there is used.
Uptime and KPI monitoring
From the Launchpad point of view the primary KPI is whether the service is up, with the response time also being important. If syncing of tags from Debian is performed then monitoring of that would be needed.
KPI graphing
haproxy could provide the basic information for graphing, with specific APIs being added for other desired information.
Access logs
The service used by Debian is built on Django. Assuming we go with that, we should have access logs with little effort.
Error reporting
Again, if Debian's Django-based system is used, then oops-wsgi can be used to get error reports.
Access controls
If private PPAs are not supported then the interface used by Launchpad could be a public read-only API. If they are, it will have to be authenticated.
The interface for editing tags would have to be authenticated either way, but discussion of that interface is not included here.
Thoughts?