Diff for "Soyuz/TechnicalDetails"

Differences between revisions 1 and 2
Revision 1 as of 2009-10-21 11:16:20
Size: 7427
Editor: jml
Comment:
Revision 2 as of 2009-11-17 15:23:35
Size: 7870
Comment: Notes needing cleaning up.
Line 1: Line 1:
Line 29: Line 28:
|| '''Upload Parsing''' || {{{scripts/process-upload.py}}} || Run via cron every 5 minutes. This takes the content in the directory created by Poppy and parses it as a package upload. Various things are validated such as the presence of a changes file, GPG key etc. process-upload.py is a small file itself, the real work is done in changesfile.py, customupload.py, ddtp_tarball.py, debian_installer.py, dist_upgrader.py, dscfile.py, nascentupload.py, nascentuploadfile.py, uploadpolicy.py and uploadprocessor.py. <<BR>> This stage creates some database entries: pairs of `PackageUpload` and one of `DistroReleaseQueue{Build,Custom,Binary}` depending on the type of upload. Also created are `SourcePackageRelease`, `BinaryPackageRelease`, `SourcePackageReleaseFile`, `BinaryPackageReleaseFile`, `SourcePackageReleaseName` and `BinaryPackageReleaseName`. Which ones that are created depends on the type of package that was uploaded and what was in it (sources and/or binaries). <<BR>> Any uploaded files are added to the librarian. || || '''Upload Parsing''' || {{{scripts/process-upload.py}}} || Run via cron every 5 minutes. This takes the content in the directory created by Poppy and parses it as a package upload. Various things are validated such as the presence of a changes file, GPG key etc. process-upload.py is a small file itself, the real work is done in changesfile.py, customupload.py, ddtp_tarball.py, debian_installer.py, dist_upgrader.py, dscfile.py, nascentupload.py, nascentuploadfile.py, uploadpolicy.py and uploadprocessor.py. <<BR>> This stage creates some database entries: pairs of `PackageUpload` and one of `DistroReleaseQueue{Build,Custom,Binary}` depending on the type of upload (For a PPA upload, the PackageUpload is automatically in the 'Accepted' state, otherwise New->Accepted->Done).
Also created are `SourcePackageRelease`, `BinaryPackageRelease`, `SourcePackageReleaseFile`, `BinaryPackageReleaseFile`, `SourcePackageReleaseName` and `BinaryPackageReleaseName`. Which ones that are created depends on the type of package that was uploaded and what was in it (sources and/or binaries). <<BR>> Any uploaded files are added to the librarian. ||
Line 31: Line 31:
|| '''Final acceptance''' || `scripts/process-accepted.py` || This is an hourly cron job script that takes {{{DistroReleaseQueue}}} rows with a status of ACCEPTED and creates corresponding {{{SecureSourcePackagePublishingHistory}}} and {{{SecureBinaryPackagePublishingHistory}}} rows. This script will also take custom uploads, unpack its tar file and add resulting files to the archive. || || '''Final acceptance''' || `scripts/process-accepted.py` || This is an hourly cron job script that takes {{{DistroReleaseQueue}}} rows with a status of ACCEPTED and creates corresponding {{{SecureSourcePackagePublishingHistory}}} and {{{SecureBinaryPackagePublishingHistory}}} rows (depending on the upload). This script will also take custom uploads, unpack its tar file and add resulting files to the archive. ||

publish-distro.py will look for pending publishing histories, take the corresponding files from the librarian and publish them in the archive (setting the publishing history to 'Published').
Line 47: Line 49:
The Slave Scanner will look for {{{BuildQueue}}} entries and will find idle machines to run new jobs on and essentially keeps track of a build's progress and which slave it's building on. When the build has finished, it will call process-upload.py to get the resulting binaries published. The Builddmanager will look for {{{BuildQueue}}} entries and will find idle machines to run new jobs on and essentially keeps track of a build's progress and which slave it's building on. When the build has finished, it will call process-upload.py to get the resulting binaries published.

Line 51: Line 55:
Publishing is run from cron at intervals of 1 hour using the humorously-named file {{{cronscripts/publishing/cron.daily}}}. This will do the following: Publishing is run from cron at intervals of 1 hour using the file {{{cronscripts/publishing/cron.publish}}}. This will do the following:
Line 56: Line 60:

== Current issues ==

 * buildd-manager putting binaries back on the FTPMaster blocks (BuilddManagerUploadDecoupling).

This page gives you a technical overview of Soyuz. However, it's terribly out of date. Please, please fix it.

Soyuz Technical Overview

Soyuz is a distribution package management system for Launchpad, encompassing the build system, package management and archive publishing. It allows users to upload packages, have them built on a variety of processor architectures and then published for others to download.

Whenever you upload a package to Ubuntu, or need build information for that package, or download a package from the archive, you are using Soyuz.

UPLOAD + BUILD + PUBLISH = SOYUZ 

Workflow

TBD.

Database Overview

[Diagram: Soyuz Model]

(If you want to edit this diagram, the source Dia file is here: SoyuzDatabase.dia)

Uploading

Uploading is done in several discrete steps:

Uploading Stages

|| Poppy Server || daemons/poppy-upload.py || This is a Zope3 FTP daemon script. It takes an FTP upload and creates a directory containing the upload's content. ||
|| Upload Parsing || scripts/process-upload.py || Run via cron every 5 minutes. This takes the content in the directory created by Poppy and parses it as a package upload. Various things are validated, such as the presence of a changes file, the GPG key, etc. process-upload.py is itself a small file; the real work is done in changesfile.py, customupload.py, ddtp_tarball.py, debian_installer.py, dist_upgrader.py, dscfile.py, nascentupload.py, nascentuploadfile.py, uploadpolicy.py and uploadprocessor.py. <<BR>> This stage creates some database entries: pairs of PackageUpload and one of DistroReleaseQueue{Build,Custom,Binary}, depending on the type of upload (for a PPA upload, the PackageUpload is automatically in the 'Accepted' state; otherwise it goes New->Accepted->Done). Also created are SourcePackageRelease, BinaryPackageRelease, SourcePackageReleaseFile, BinaryPackageReleaseFile, SourcePackageReleaseName and BinaryPackageReleaseName. Which ones are created depends on the type of package that was uploaded and what was in it (sources and/or binaries). <<BR>> Any uploaded files are added to the librarian. ||
|| Vetting || lib/canonical/launchpad/scripts/queue.py || This is a manual process. It allows a real person to check that an uploaded package is valid. Once done, this changes its queue state from NEW to ACCEPTED. In certain circumstances process-upload.py will have set the status to ACCEPTED immediately, e.g. when the submitter/package is already trusted. ||
|| Final acceptance || scripts/process-accepted.py || This is an hourly cron job script that takes DistroReleaseQueue rows with a status of ACCEPTED and creates corresponding SecureSourcePackagePublishingHistory and SecureBinaryPackagePublishingHistory rows (depending on the upload). This script will also take custom uploads, unpack their tar files and add the resulting files to the archive. ||

publish-distro.py will look for pending publishing histories, take the corresponding files from the librarian and publish them in the archive (setting the publishing history to 'Published').
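
The queue states described above can be read as a small state machine. The following is only an illustrative sketch, not Soyuz code: `QueueStatus`, `initial_status` and `advance` are hypothetical names, and the only facts taken from this page are the New->Accepted->Done sequence and the shortcut for PPA or already trusted uploads.

```python
from enum import Enum


class QueueStatus(Enum):
    # Status names follow the text above; the enum itself is hypothetical.
    NEW = "New"
    ACCEPTED = "Accepted"
    DONE = "Done"


# Forward transitions as described on this page: New -> Accepted -> Done.
TRANSITIONS = {
    QueueStatus.NEW: QueueStatus.ACCEPTED,
    QueueStatus.ACCEPTED: QueueStatus.DONE,
}


def initial_status(is_ppa, is_trusted=False):
    """PPA uploads (and already trusted ones) skip manual vetting."""
    if is_ppa or is_trusted:
        return QueueStatus.ACCEPTED
    return QueueStatus.NEW


def advance(status):
    """Move an upload to its next status; DONE has no successor."""
    if status not in TRANSITIONS:
        raise ValueError("no transition from %s" % status.name)
    return TRANSITIONS[status]
```

In this sketch a normal upload starts as NEW and needs one `advance` call (the vetting step) before it is ACCEPTED, while a PPA upload starts out ACCEPTED.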

Building

(If you want to edit this diagram, the source Dia file is here: SoyuzBuilders.dia)

As can be seen from the diagram, a build is handed off to one of a number of builders on separate machines, which will perform the build in a chroot on the appropriate processor architecture.

At the top level, building is controlled by a "build sequencer" daemon, scripts/ftpmaster-tools/buildd-sequencer, which is a shell script that starts a Twisted Application. <XXX Add link to Twisted> The sequencer controls the starting of two other scripts, a Queue Builder (cronscripts/buildd-queue-builder.py) and a Slave Scanner (cronscripts/buildd-slave-scanner.py), in the correct sequence. A special sequence is required because builds can sometimes fail in a non-fatal manner (e.g. missing dependencies) and can then be retried.

The Queue Builder is responsible for creating database entries in Build and BuildQueue. Build entries are persistent to maintain a history but can pass through several states. BuildQueue entries exist as long as the Queue Builder thinks that a slave builder is working on a build. <XXX maybe add more details about scoring, retries, etc.?>

The Builddmanager looks for BuildQueue entries, finds idle machines to run new jobs on, and keeps track of each build's progress and which slave it is building on. When a build has finished, it calls process-upload.py to get the resulting binaries published.
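
As a rough illustration of the dispatch step, here is a minimal sketch of matching BuildQueue entries to idle builders. It is not the real implementation: the `Builder` and `QueueEntry` classes, their fields, and the idea of dispatching the highest `score` first are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Builder:
    # Hypothetical stand-in for a build slave.
    name: str
    arch: str
    current_job: Optional[int] = None  # id of the build it is working on


@dataclass
class QueueEntry:
    # Hypothetical stand-in for a BuildQueue row.
    build_id: int
    arch: str
    score: int = 0                 # assumed priority, higher goes first
    builder: Optional[str] = None  # set once the entry is dispatched


def dispatch(queue, builders):
    """Assign waiting entries to idle builders of the same architecture.

    Returns the (build_id, builder_name) pairs that were dispatched.
    """
    dispatched = []
    waiting = sorted(
        (entry for entry in queue if entry.builder is None),
        key=lambda entry: entry.score,
        reverse=True,
    )
    for entry in waiting:
        for builder in builders:
            if builder.current_job is None and builder.arch == entry.arch:
                builder.current_job = entry.build_id
                entry.builder = builder.name
                dispatched.append((entry.build_id, builder.name))
                break
    return dispatched
```

Entries that find no idle builder of the right architecture simply stay in the queue, so they are picked up on a later pass.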

Publishing

Publishing is run from cron at intervals of 1 hour using the file cronscripts/publishing/cron.publish. This will do the following:

  • Take files from the librarian and put them in the archive's pool tree.
  • Generate the archive's dist tree (the archive's indexes) using apt-ftparchive.
  • Expire old packages by giving them a status of SUPERSEDED. (domination.py)
  • Look for packages no longer referenced by any archive index, delete their files on disk and set the package status to REMOVED. (deathrow.py)
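
The last two steps can be sketched roughly as follows. This is a simplified illustration, not the logic of domination.py or deathrow.py: the `Publication` class is hypothetical, versions are plain integers rather than Debian version strings, and the intermediate PENDINGREMOVAL state mentioned in the glossary is skipped.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    # Hypothetical stand-in for a publishing history row.
    package: str
    version: int   # simplified; real Debian versions need proper comparison
    filename: str
    status: str = "PUBLISHED"


def dominate(publications):
    """Mark a published record SUPERSEDED if a newer publishing exists."""
    newest = {}
    for pub in publications:
        if pub.status == "PUBLISHED":
            current = newest.get(pub.package, pub.version)
            newest[pub.package] = max(current, pub.version)
    for pub in publications:
        if pub.status == "PUBLISHED" and pub.version < newest[pub.package]:
            pub.status = "SUPERSEDED"


def death_row(publications, indexed_files):
    """Return files that can be deleted, marking their records REMOVED.

    A file is kept if a still-published record uses it or if it is
    still listed in an archive index.
    """
    in_use = {pub.filename for pub in publications if pub.status == "PUBLISHED"}
    in_use |= set(indexed_files)
    doomed = []
    for pub in publications:
        if pub.status == "SUPERSEDED" and pub.filename not in in_use:
            pub.status = "REMOVED"
            doomed.append(pub.filename)
    return doomed
```

The sketch only returns the doomed file names; actually deleting them from disk is left to the caller.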

Current issues

Documents / Notes

Glossary

Terms used in Soyuz that might confuse outsiders!

|| Term || Description ||
|| BPR || BinaryPackageRelease. A database table that stores details of a binary package at a particular version. ||
|| BPRF || BinaryPackageReleaseFile. A database table that links a BinaryPackageRelease to all its files in the librarian. ||
|| (S)BPPH || (Secure)BinaryPackagePublishingHistory. See (S)SPPH, except this is for binary packages. ||
|| ChangesFile || Part of a package upload, this file describes the upload (e.g. the file list, checksums, maintainer/changer names). ||
|| Deathrow || Processes all files for a publishing marked as PENDINGREMOVAL. If they are not used by another package they are removed and the publishing state is set to REMOVED. ||
|| Domination || The act of modifying publishing records to a state of SUPERSEDED if more recent publishings exist. ||
|| p-a || process-accepted.py - a script that takes accepted uploads and creates publishing records for them. ||
|| Poppy || An FTP server for package uploads. ||
|| p-u || process-upload.py - a script that scans the files stored by Poppy and attempts to load them into the database. ||
|| PU || PackageUpload. A database table that stores details from the changes file for an upload. ||
|| SPR || SourcePackageRelease. A database table that holds details of a source package at a particular version. ||
|| SPRF || SourcePackageReleaseFile. A database table that links a SourcePackageRelease to all its files in the librarian. ||
|| (S)SPPH || (Secure)SourcePackagePublishingHistory. A database table that records current and historical publishing status for SourcePackageRelease. The non-secure variant is just a view on the secure table, where embargoed is not false. ||
|| ogre-model || A concept that controls build-dependencies in a layered model, i.e. sources published in the 'main' component can only fetch build-dependencies from 'main'; sources published in 'universe' only have access to 'main' and 'universe'; and so on. ||
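
The ogre-model entry above can be pictured as a lookup table. This is a minimal sketch under stated assumptions: `OGRE_MODEL` and `allowed_build_dep` are hypothetical names, and only the two components spelled out in the glossary are included; the rest of the layering is left out rather than guessed.

```python
# Components each source component may draw build-dependencies from.
# Only 'main' and 'universe' are described in the glossary entry; other
# components are deliberately omitted from this illustration.
OGRE_MODEL = {
    "main": ("main",),
    "universe": ("main", "universe"),
}


def allowed_build_dep(source_component, dep_component):
    """True if a source published in source_component may fetch a
    build-dependency published in dep_component."""
    return dep_component in OGRE_MODEL[source_component]
```

For example, `allowed_build_dep("universe", "main")` holds, while `allowed_build_dep("main", "universe")` does not.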

Soyuz/TechnicalDetails (last edited 2015-01-21 11:36:30 by cjwatson)