Soyuz Publishing

"Publishing" in Soyuz encompasses a range of activities done on a regular cycle:

The publishers

There are three publishers, one for Ubuntu, one for PPAs and a third for derivative distributions:

The Ubuntu publisher runs at the top of each hour, every hour. The PPA and derivatives publishers run every 5 minutes, or less often if the previous run takes longer than 5 minutes.
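The "every 5 minutes, or less often" behaviour can be illustrated with a small sketch. It assumes that runs are triggered on fixed 5-minute ticks and that a tick is simply skipped while the previous run is still going (for example because of a lock file); the text above does not say how the skipping is actually implemented, and `next_start` is a hypothetical helper, not part of Soyuz.

```python
INTERVAL_MINUTES = 5  # scheduling interval stated in the text


def next_start(prev_start, duration, interval=INTERVAL_MINUTES):
    """Return the minute at which the next run would start.

    Assumption: runs may only start on multiples of `interval`, and a
    tick is skipped while the previous run is still in progress.
    All arguments are whole minutes.
    """
    finished = prev_start + duration
    # The next run cannot start before the following tick, nor before
    # the previous run has finished.
    earliest = max(finished, prev_start + interval)
    # Round up to the next multiple of `interval` (integer ceiling).
    return ((earliest + interval - 1) // interval) * interval
```

Under these assumptions, a run that starts at minute 0 and takes 2 minutes is followed by a run at minute 5, while one that takes 7 minutes pushes the next run back to minute 10.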

The PPA publisher is very straightforward; it's a wrapper script that calls process-accepted.py and then publish-distro.py.

The distro publishers are far more complicated. Here's a rough overview of the workflow.

Distribution publisher workflow

All of this is what the cronscripts/publisher-ftpmaster.py wrapper script does. It defers to other scripts (script objects, actually; we don't Popen() new processes) to do the meat of the work. Consider it the conductor for the publishing orchestra, if you like.
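The difference between calling script objects and spawning processes can be sketched roughly as follows. Everything here is illustrative: the `Step` class, its `main()` method, and the step names are placeholders invented for the example, and are not the real Launchpad script classes.

```python
class Step:
    """Hypothetical stand-in for a script object with a main() method."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def main(self):
        # A real script object would do its work here; this one only
        # records that it was called.
        self.log.append(self.name)


def run_in_process(steps):
    """Call each step's main() in turn, inside the current process.

    No subprocess is started. An exception raised by a step propagates
    and prevents the remaining steps from running.
    """
    for step in steps:
        step.main()


log = []
# Placeholder step names, for illustration only.
run_in_process([Step("first-step", log), Step("second-step", log)])
```

Running the steps in one process means they can share state such as a database connection and a logger, at the cost of one failing step being able to take the whole run down with it.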

publish-distro.py

The vast bulk of the work happens in this script. It does the intricate work of assembling a Debian-style repository based on the metadata and files available in Launchpad's database. It goes through various phases of operation, each method on the publisher script called in turn:

Death row processing

In a separate cron job, but still considered part of the publisher, we run scripts/process-death-row.py (every 30 minutes for PPAs, hourly for distros). It examines all the superseded sources and works out whether their files can be removed from the pool yet. There are many considerations, such as GPL conformance and whether the same source is still published in a different place in the archive.

After a file is condemned, it gets a stay of execution of about a day, after which it is permanently removed from the archive.
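A minimal sketch of the two decisions described above, assuming a one-day stay of execution (the text only says "about a day") and modelling just one of the many considerations, namely whether the source is still published elsewhere. The function names and parameters are hypothetical and do not reflect the real process-death-row.py code.

```python
from datetime import datetime, timedelta

# Assumed value; the text only says the stay is "about a day".
STAY_OF_EXECUTION = timedelta(days=1)


def can_condemn(is_superseded, still_published_elsewhere):
    """Decide whether a source's files may be put on death row.

    Only one consideration is modelled here; the real checks (GPL
    conformance and others) are deliberately left out.
    """
    return is_superseded and not still_published_elsewhere


def removal_due(condemned_at, now, stay=STAY_OF_EXECUTION):
    """Return True once the stay of execution has elapsed."""
    return now >= condemned_at + stay
```

For example, a file condemned at noon on one day would not be removed by a run at 11:30 the next day, but would be by a run at 12:00 or later.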

Cleaning up the librarian

Removed files remain in the librarian for longer. A script called cronscripts/expire-archive-files.py, which currently only processes PPAs, removes files from the librarian 7 days after the package files are removed from the archive repository.

Distribution files in the librarian are currently only removed when a distribution series goes obsolete at its end of life. This is currently a manual task requiring SQL.
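The expiry rule for PPA files can be sketched as a filter over file records. The `ArchiveFile` record and its fields are invented for this example and are not the Launchpad data model; only the 7-day period and the PPA-only restriction come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

LIBRARIAN_GRACE = timedelta(days=7)  # period stated in the text


@dataclass
class ArchiveFile:
    """Hypothetical record describing one file held in the librarian."""
    name: str
    is_ppa: bool
    removed_from_archive_at: Optional[datetime]  # None if still published


def files_to_expire(files, now, grace=LIBRARIAN_GRACE):
    """Return the files whose librarian copy may be expired.

    Distribution files are never selected, matching the statement that
    only PPAs are processed automatically.
    """
    return [
        f for f in files
        if f.is_ppa
        and f.removed_from_archive_at is not None
        and now >= f.removed_from_archive_at + grace
    ]
```

A distribution file is skipped by this filter however long ago it was removed, which reflects the manual end-of-life clean-up described above.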

Soyuz/TechnicalDetails/Publishing (last edited 2011-09-12 16:12:15 by julian-edwards)