Soyuz/TechnicalDetails


Revision 3 as of 2009-11-25 23:45:49


This page gives you a technical overview of Soyuz. However, it's terribly out of date. Please, please fix it.

Soyuz Technical Overview

Soyuz is a distribution package management system for Launchpad, encompassing the build system, package management and archive publishing. It allows users to upload packages, have them built on a variety of processor architectures and then published for others to download.

Whenever you upload a package to Ubuntu, or need build information for that package, or download a package from the archive, you are using Soyuz.

UPLOAD + BUILD + PUBLISH = SOYUZ 

Workflow

TBD.

Database Overview

Soyuz Model

(If you want to edit this diagram, the source Dia file is here: SoyuzDatabase.dia)

Uploading

Uploading is done in several discrete steps:

Uploading Stages

Poppy Server

daemons/poppy-upload.py

This is a Zope3 script FTP daemon. It takes an FTP upload and creates a directory containing the upload's content.

Upload Parsing

scripts/process-upload.py

Run via cron every 5 minutes. This takes the content in the directory created by Poppy and parses it as a package upload. Various things are validated, such as the presence of a changes file, the GPG key, etc. process-upload.py is itself a small file; the real work is done in changesfile.py, customupload.py, ddtp_tarball.py, debian_installer.py, dist_upgrader.py, dscfile.py, nascentupload.py, nascentuploadfile.py, uploadpolicy.py and uploadprocessor.py.

This stage creates some database entries: pairs of a PackageUpload and one of DistroReleaseQueue{Build,Custom,Binary}, depending on the type of upload. (For a PPA upload, the PackageUpload is automatically in the 'Accepted' state; otherwise it moves through New->Accepted->Done.) Also created are SourcePackageRelease, BinaryPackageRelease, SourcePackageReleaseFile, BinaryPackageReleaseFile, SourcePackageReleaseName and BinaryPackageReleaseName. Which ones are created depends on the type of package that was uploaded and what was in it (sources and/or binaries).

Any uploaded files are added to the librarian.
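One of the checks mentioned above, the presence of a changes file, can be sketched as follows. This is only an illustration and not the code in uploadprocessor.py: the function name find_changes_files, the UploadError exception and the example file name are all hypothetical, and the real scripts perform many more checks (signatures, file lists, policy).

```python
import tempfile
from pathlib import Path
from typing import List


class UploadError(Exception):
    """Raised when an upload directory fails a basic sanity check."""


def find_changes_files(upload_dir: Path) -> List[Path]:
    """Return the .changes files in upload_dir, or raise if there are none.

    Hypothetical helper: it only checks for presence, nothing else.
    """
    candidates = sorted(upload_dir.glob("*.changes"))
    if not candidates:
        raise UploadError(f"no changes file found in {upload_dir}")
    return candidates


if __name__ == "__main__":
    # Simulate a directory as Poppy might leave it (file name is made up).
    with tempfile.TemporaryDirectory() as tmp:
        upload_dir = Path(tmp)
        (upload_dir / "hello_1.0_source.changes").write_text("")
        print([p.name for p in find_changes_files(upload_dir)])
```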

Vetting

lib/canonical/launchpad/scripts/queue.py

This is a manual process. It allows a real person to check that an uploaded package is valid. Once the check is done, the upload's queue state changes from NEW to ACCEPTED. In certain circumstances process-upload.py will already have set the status to ACCEPTED, e.g. when the submitter/package is already trusted.
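The queue states described so far can be modelled as a small state machine. The sketch below covers only the three states named on this page (New, Accepted, Done); the names QueueState, initial_state and advance are illustrative assumptions and do not come from the Launchpad code.

```python
from enum import Enum


class QueueState(Enum):
    NEW = "New"
    ACCEPTED = "Accepted"
    DONE = "Done"


# Allowed forward transitions, following the New->Accepted->Done flow.
TRANSITIONS = {
    QueueState.NEW: {QueueState.ACCEPTED},
    QueueState.ACCEPTED: {QueueState.DONE},
    QueueState.DONE: set(),
}


def initial_state(is_ppa: bool, is_trusted: bool) -> QueueState:
    """PPA uploads and already-trusted uploads skip manual vetting."""
    if is_ppa or is_trusted:
        return QueueState.ACCEPTED
    return QueueState.NEW


def advance(current: QueueState, target: QueueState) -> QueueState:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.name} to {target.name}")
    return target
```

Under these assumptions, an ordinary upload starts at NEW and reaches ACCEPTED only through vetting, while a PPA upload starts at ACCEPTED directly.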

Final acceptance

scripts/process-accepted.py

This is an hourly cron job that takes DistroReleaseQueue rows with a status of ACCEPTED and creates the corresponding SecureSourcePackagePublishingHistory and SecureBinaryPackagePublishingHistory rows (depending on the upload). The script also takes custom uploads, unpacks their tar files and adds the resulting files to the archive.

publish-distro.py will look for pending publishing histories, take the corresponding files from the librarian and publish them in the archive (setting the publishing history to 'Published').
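The publishing step can be pictured roughly as below. This is a toy model under stated assumptions: the librarian and the archive are plain dictionaries, and PublishingRecord, librarian_key and publish_pending are hypothetical names, not the actual publish-distro.py interfaces.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PublishingRecord:
    filename: str        # name the file will have in the archive
    librarian_key: str   # hypothetical lookup key into the librarian
    status: str = "Pending"


def publish_pending(records: List[PublishingRecord],
                    librarian: Dict[str, bytes],
                    archive: Dict[str, bytes]) -> int:
    """Copy files for pending records into the archive.

    Each published record has its status set to 'Published'.
    Returns the number of records published in this run.
    """
    published = 0
    for record in records:
        if record.status != "Pending":
            continue  # already handled in an earlier run
        archive[record.filename] = librarian[record.librarian_key]
        record.status = "Published"
        published += 1
    return published
```

Running publish_pending a second time over the same records publishes nothing, since only records still marked 'Pending' are picked up.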

Building

(If you want to edit this diagram, the source Dia file is here: SoyuzBuilders.dia)

As can be seen from the diagram, a build is handed off to one of a number of builders running on separate machines, which will perform the build in a chroot on the appropriate processor architecture.

At the top level, building is controlled by a "build sequencer" daemon, scripts/ftpmaster-tools/buildd-sequencer, which is a shell script that starts a Twisted Application. <XXX Add link to Twisted> The sequencer starts two other scripts, a Queue Builder (cronscripts/buildd-queue-builder.py) and a Slave Scanner (cronscripts/buildd-slave-scanner.py), in the correct sequence. A special sequence is required because builds sometimes fail in a non-fatal manner (e.g. missing dependencies) and can be retried.

The Queue Builder is responsible for creating database entries in Build and BuildQueue. Build entries are persistent to maintain a history but can pass through several states. BuildQueue entries exist as long as the Queue Builder thinks that a slave builder is working on a build. <XXX maybe add more details about scoring, retries, etc.?>

The Builddmanager looks for BuildQueue entries and finds idle machines to run new jobs on. It essentially keeps track of each build's progress and of which slave it is building on. When a build has finished, it calls process-upload.py to get the resulting binaries published.
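The dispatching described above amounts to matching queued builds with idle builders of the right architecture. The following is a minimal sketch of that idea only: Builder, QueuedBuild and dispatch are hypothetical names, and scoring and retries are deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Builder:
    name: str
    arch: str
    current_job: Optional[str] = None  # None means the builder is idle


@dataclass
class QueuedBuild:
    package: str
    arch: str


def dispatch(queue: List[QueuedBuild],
             builders: List[Builder]) -> List[Tuple[str, str]]:
    """Assign queued builds to idle builders of the same architecture.

    Returns (package, builder name) pairs. Builds for which no idle
    builder exists stay in the queue for a later pass.
    """
    assignments = []
    remaining = []
    for build in queue:
        builder = next(
            (b for b in builders
             if b.current_job is None and b.arch == build.arch),
            None,
        )
        if builder is None:
            remaining.append(build)
            continue
        builder.current_job = build.package  # track which slave has the job
        assignments.append((build.package, builder.name))
    queue[:] = remaining  # keep only the builds that could not be placed
    return assignments
```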

Publishing

Publishing is run from cron at intervals of 1 hour using the file cronscripts/publishing/cron.publish. This will do the following:

Current issues

Documents / Notes

Glossary

Terms used in Soyuz that might confuse outsiders!

Term

Description

BPR

BinaryPackageRelease. A database table that stores details of a binary package at a particular version.

BPRF

BinaryPackageReleaseFile. A database table that links a BinaryPackageRelease to all its files in the librarian.

(S)BPPH

(Secure)BinaryPackagePublishingHistory. See (S)SPPH, except this is for binary packages.

ChangesFile

Part of a package upload, this file describes the upload (e.g. the file list, checksums, maintainer/changer names).

Deathrow

The process that handles all files for a publishing marked as PENDINGREMOVAL. Files that are not used by another package are removed, and the publishing state is set to REMOVED.

Domination

The act of modifying publishing records to a state of SUPERSEDED if more recent publishings exist.

p-a

process-accepted.py - a script that takes accepted uploads and creates publishing records for them.

Poppy

An FTP server for package uploads.

p-u

process-upload.py - a script that scans the files stored by Poppy and attempts to load them into the database.

PU

PackageUpload. A database table that stores details from the changes file for an upload.

SPR

SourcePackageRelease. A database table that holds details of a source package at a particular version.

SPRF

SourcePackageReleaseFile. A database table that links a SourcePackageRelease to all its files in the librarian.

(S)SPPH

(Secure)SourcePackagePublishingHistory. A database table that records current and historical publishing status for SourcePackageRelease. The non-secure variant is just a view on the secure table, restricted to rows where embargo is false.

ogre-model

A concept that controls build-dependencies in a layered model: sources published in the 'main' component can only fetch build-dependencies from 'main'; sources published in 'universe' only have access to 'main' and 'universe', and so on.
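The ogre-model entry can be illustrated with a simple lookup table. The sketch below lists only the two components named in the glossary; ALLOWED_COMPONENTS and can_build_depend are hypothetical names, and any further components would need their own entries following the same layering.

```python
# Which components a source package may draw build-dependencies from.
# Only the layers named in the glossary entry are listed here.
ALLOWED_COMPONENTS = {
    "main": {"main"},
    "universe": {"main", "universe"},
}


def can_build_depend(source_component: str, dependency_component: str) -> bool:
    """Return True if a source in source_component may use a
    build-dependency published in dependency_component."""
    try:
        allowed = ALLOWED_COMPONENTS[source_component]
    except KeyError:
        raise ValueError(f"unknown component: {source_component}") from None
    return dependency_component in allowed
```

With this table, a source in 'universe' may build-depend on a package in 'main', but not the other way round.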