Diskless Apt Archives
Serve PPA files directly from the librarian, rather than from a single machine's multi-terabyte filesystem
Contact: William Grant
On Launchpad: https://bugs.launchpad.net/launchpad-project/+bugs?field.tag=diskless-archives
Rationale
This project aims to solve scaling and latency problems that degrade the experience of publishing and consuming software sold for a fee.
Such software is built into commercial PPAs which, like all PPAs, are currently hosted on a single machine, germanium. Access to a specific piece of software is granted by the system writing a separate access control file in each archive. This happens an arbitrary amount of time after the API call to grant access completes, so customers see delays and a poor experience.
The current architecture imposes a very large footprint on any machine that is to serve as ppa.launchpad.net: it needs multiple TB of disk, and enough IO and CPU capacity to run all the PPA maintenance functions (uploading, publication, access control, log analysis). We are struggling with load today. A newer machine would defer that struggle, but without a scaling story it would be only a moderate amount of time before we faced the same problem again, this time without the option of fixing it by upgrading. Uploading is not (at present) a scaling problem for us, but it is bound to the same hostname, so we need to change how uploading is handled before we can scale ppa.launchpad.net.
The project will be successful if our sysadmins can easily and effectively add capacity to handle rapid and substantial increases in the number of PPAs and the number of PPA users, and if Software Center users get their purchases immediately, without hassles introduced by Launchpad.
Stakeholders
- Consumer Apps
- IS
User stories
As a Software Center customer
I want my download to start immediately after purchase
so that I can use my new application as soon as possible.
As a Software Center customer
I want my downloads to be quick
so that I can use my new application as soon as possible.
As a package uploader
I want my archive to be updated quickly
so that my users can use my new package as soon as possible.
As a commercial application provider
I want Launchpad PPAs to scale easily to cope with my app's downloads
so that I can worry about more important things than distribution.
As a Launchpad sysadmin
I want to add new PPA download capacity easily and rapidly on modest hardware
so that I can quickly respond to and mitigate high load situations.
Constraints and Requirements
Must
- Allow ppa.launchpad.net HTTP(S) download frontends to scale to handle additional load without service disruption.
- Let private PPAs scale to millions of subscribers.
- Let private PPAs scale to tens of thousands of additional archives.
- Permit people access to private PPAs immediately after activating their subscription.
- Commission and activate a new scalable ppa.launchpad.net node in less than one hour (after base OS install).
- Run scalable nodes on our stock hardware build without requiring special RAM or disk configuration.
- Retain interoperability with PPA access from all supported versions of Ubuntu (hardy and up at time of writing).
Nice to have
- Scale PPA archive publication by adding machines.
- Scale PPA uploading and upload processing by adding machines.
- Reduce PPA package publication delay.
- Decrease or eliminate downtime for ppa.launchpad.net. Ideally it becomes a regular nodowntime target.
Must not
- Break compatibility with existing apt sources.list entries, including private PPA credentials.
- Interfere with other parts of Launchpad (e.g. PPA statistics, Ubuntu main and universe being regular archives on disk)
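To make the sources.list constraint concrete, the sketch below splits a one-line apt source entry into the parts a new download frontend would have to keep honouring: scheme, host, embedded credentials, path, and suite. It is an illustration only, not part of Launchpad. The host, owner, archive name, and credentials are hypothetical, and entries that carry bracketed options are not handled.

```python
from urllib.parse import urlsplit


def parse_deb_line(line):
    """Split a one-line apt source entry into its parts.

    Assumes the simple form "deb URL SUITE [COMPONENT...]" with no
    bracketed options between the type and the URL.
    """
    fields = line.split()
    if len(fields) < 3 or fields[0] not in ("deb", "deb-src"):
        raise ValueError("not a deb/deb-src line: %r" % line)
    url = urlsplit(fields[1])
    return {
        "type": fields[0],
        "scheme": url.scheme,
        "host": url.hostname,
        "username": url.username,  # None when no credentials are embedded
        "password": url.password,
        "path": url.path,
        "suite": fields[2],
        "components": fields[3:],
    }


# Hypothetical private PPA entry; every value here is made up.
entry = parse_deb_line(
    "deb https://alice:s3cret@ppa.example.net/owner/name/ubuntu hardy main"
)
```

Any field in `entry` that a redesigned frontend stopped accepting would force users to edit their sources.list, which is what this constraint rules out.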
Undesirable
- Doing a lockstep break of (S)FTP uploads to the existing overloaded ppa.launchpad.net hostname. There were 4000 distinct uploaders over all of 2011, so contacting them is doable if we need to. We have a long-term desire to move PPA hosting to its own domain, as the librarian has been.
Success
How will we know when we are done?
- We can seamlessly increase capacity to handle additional PPA downloads, without downtime or other service disruption.
- Users can download packages from private PPAs immediately after activating their subscription.
How will we measure how well we have done?
- SC stops seeing user pain related to the performance (download rate, subscription activation latency) of ppa.launchpad.net.
- E.g. no more bug reports or questions.
- Publication latency for all PPAs drops to 60 seconds or less 99% of the time.
- Upload latency remains constant or decreases.
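The publication latency target above can be checked mechanically once latencies are recorded. The sketch below is one possible reading of that target: it assumes we have a list of per-publication latencies in seconds (how these are collected is not specified here) and tests whether at least 99% of them are at or below 60 seconds. The function names are hypothetical.

```python
def fraction_within(latencies_s, threshold_s=60.0):
    """Return the fraction of samples at or below threshold_s."""
    if not latencies_s:
        raise ValueError("no latency samples")
    within = sum(1 for sample in latencies_s if sample <= threshold_s)
    return within / len(latencies_s)


def meets_target(latencies_s, threshold_s=60.0, target=0.99):
    """True if the share of samples within the threshold reaches the target."""
    return fraction_within(latencies_s, threshold_s) >= target


# Made-up samples: 995 publications at 10 seconds and 5 at 120 seconds.
samples = [10.0] * 995 + [120.0] * 5
ok = meets_target(samples)
```

Measuring the share of publications within the threshold, rather than the mean, keeps a few very slow publications from being hidden by many fast ones.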
Thoughts?
See also design and implementation notes