
Derived Distros Feature Architecture

This document explains how the Derived Distros feature works, showing the workflow and major architectural components.

What is a Derived Distro?

The feature aims to help people who want to build a distribution that is based on Ubuntu; these types of distributions are termed "derivatives" of Ubuntu. The "parent" distribution can in fact be any distribution, not just Ubuntu, but it must be completely hosted in Launchpad.

This means that someone could in fact create a derivative of Ubuntu which itself is then used to derive another distribution.

Ok, where do I start?

There are several steps involved in creating a new derived distribution:

Overlay Distributions

At this point the driver also needs to know whether this is an overlay distribution or not. An overlay distribution is one that depends on its parent's packages for building its own (much like a PPA depends on Ubuntu). When setting which parent(s) to use for initialization, if the parent is an overlay then the driver must also set which component and pocket we use for build dependencies.

Components and pockets each have their own dependency chain, so these parent dependencies have a profound effect on what can be built in the derived series and must be set carefully. For example, if "main" is chosen as the component, then no packages outside of main will be available. If "universe" is chosen, all of universe and main will be available. This dependency chain is summarised in this table:

    Component  | Dependencies
          main | main
    multiverse | main restricted universe multiverse
       partner | partner
    restricted | main restricted
      universe | main universe
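
The table can also be read as a simple lookup. The following is a minimal sketch rather than Launchpad's own code; the names `COMPONENT_DEPENDENCIES` and `available_components` are hypothetical and only mirror the rows above.

```python
# Illustrative only: mirrors the component dependency table above.
COMPONENT_DEPENDENCIES = {
    "main": ["main"],
    "multiverse": ["main", "restricted", "universe", "multiverse"],
    "partner": ["partner"],
    "restricted": ["main", "restricted"],
    "universe": ["main", "universe"],
}


def available_components(chosen):
    """Return the components usable for build dependencies.

    `chosen` is the component selected for the overlay's parent.
    """
    try:
        # Return a copy so callers cannot mutate the shared table.
        return list(COMPONENT_DEPENDENCIES[chosen])
    except KeyError:
        raise ValueError(f"unknown component: {chosen!r}") from None
```

For example, `available_components("restricted")` yields main and restricted, so a package that build-depends on something in universe could not be built.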

Similarly for the pocket dependency:

    Pocket    | Dependencies

If the driver sets up more than one parent series, then they must be ordered. The ordering tells the builders in which order to write the parent distroseries into sources.list and will affect where the dependencies are pulled from in the event that the same dependency is in more than one parent series.
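
To illustrate why the ordering matters, here is a rough sketch of how ordered parents might be turned into sources.list entries. The `Parent` class, the `sources_list_lines` function and the archive URL are all assumptions made for this example; they are not taken from the Launchpad builders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parent:
    """Hypothetical view of one parent series chosen by the driver."""
    ordering: int
    distribution: str
    series: str
    components: tuple


def sources_list_lines(parents, base_url="http://archive.example.invalid"):
    """Return one `deb` line per parent, in the driver's chosen order."""
    lines = []
    for parent in sorted(parents, key=lambda p: p.ordering):
        components = " ".join(parent.components)
        lines.append(
            f"deb {base_url}/{parent.distribution} {parent.series} {components}"
        )
    return lines
```

Whatever order the parents are stored in, the entry with the lowest `ordering` value is written first.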

How does initialization work?

The +initseries page does very little work itself. It creates an InitialiseDistroseriesJob which will do all the work of copying the packages, packagesets, architectures, builds, packaging links and everything else necessary to set up an initial Launchpad distribution.

Once the initialization has been set in motion, it happens in the background and may take anything from a few minutes to a few hours depending on the number of packages being copied.

While this job is running, the distroseries page will show that initialization is in progress. When the job finishes, it creates one or more rows in the DistroSeriesParent table, one row for each parent distroseries. These rows are the indicator to the rest of Launchpad that this is a derivative distribution and that this distroseries inherited its packages from the parent distroseries named in them. The table also contains the overlay, component and pocket selections.
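
As a rough picture of what these rows carry, the sketch below models them in memory. The field names and the `is_derived` helper are assumptions for illustration; the real table's column names may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DistroSeriesParentRow:
    """Hypothetical in-memory stand-in for one DistroSeriesParent row."""
    derived_series: str
    parent_series: str
    is_overlay: bool = False
    component: Optional[str] = None  # only meaningful for overlays
    pocket: Optional[str] = None     # only meaningful for overlays


def is_derived(series, rows):
    """A series counts as derived once at least one row names it."""
    return any(row.derived_series == series for row in rows)


def parents_of(series, rows):
    """List the parent series recorded for `series`."""
    return [row.parent_series for row in rows if row.derived_series == series]
```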

After initialization, visualization

After the distroseries has been initialised, the distroseries page will show some information that looks like this:

Derived from Sid
    95 packages with differences
    2686 packages only in Sid
    2031 packages only in Oneiric

This is a fairly self-explanatory summary of the differences between the parent(s) and the new series. Each line is clickable and takes you to one of three new pages that shows each package difference in detail.

Source package differences page (+localpackagediffs)

This page shows packages that are in both series but differ in version somehow. There is a search form that allows you to search by an optional package name and a radio button setting that shows what sort of differences you are looking for:

Packages in this list have a state that can be "non-ignored", "ignored" or "resolved".

Taking each in turn:

The important thing to note is that the default "non-ignored packages" setting will only contain packages that the typical archive admin will be interested in, as they form the most common cases that need to be synchronised to the derived series.

Any differences where the derived version is higher than the parent are auto-ignored because there is nothing to synchronize from the parent - it is just carrying a local patch that is not pushed upstream yet.
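
The auto-ignore rule can be sketched as a version comparison. This example deliberately uses a simplified version parser that only understands dotted integers such as "1.10"; real Debian version ordering (epochs, revisions, tildes) is more involved, and the function names here are hypothetical.

```python
def parse_simple_version(version):
    """Parse a dotted-integer version like '1.10' into a tuple.

    Simplification: this is NOT Debian version comparison.
    """
    return tuple(int(part) for part in version.split("."))


def is_auto_ignored(derived_version, parent_version):
    """True when the derived series is ahead of its parent.

    In that case there is nothing to synchronize from the parent.
    """
    derived = parse_simple_version(derived_version)
    parent = parse_simple_version(parent_version)
    return derived > parent
```

Comparing parsed tuples rather than raw strings avoids treating "1.10" as lower than "1.9".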

Drilling down to each package

Click the green package name to open up a new section underneath. It shows the binaries that the source builds and their descriptions, the "last common version" for the difference and a link to generate "debdiffs". The last common version (otherwise known as the base version) is the version that was detected as being the version at which the package diverged in one or both of the series and is used when calculating debdiffs.

Clicking "Compute differences from last common version" will send an AJAX request to the server to initiate a debdiff calculation between the base version and each of the versions in the parent and derived packages, where applicable (this uses the existing PackageDiff process and is not part of this feature). When the debdiffs are complete, the page will automatically update and show link(s) to the file containing the diff(s).

If you are someone who has any upload permissions to the distribution, you are additionally allowed to add comments about this difference, which show up at the bottom of the expanded section in the familiar bug-style conversation.

How the differences are calculated

All of the differences are tracked in a database table called DistroSeriesDifference:

Its contents are updated whenever the derivative has a new source publication, or any of its parents have a new source publication in one of the packagesets that we initialized with.

Because calculating the base_version requires the code to examine the changelogs for the packages, the actual update is done in a Job. When the new source publication happens, a corresponding DistroSeriesDifferenceJob is created. At some point in the near future the job runner will pick this up and re-calculate all of the changes required in the DistroSeriesDifference table.

While a job is waiting to be run, the page will also show "Updating ..." next to the relevant package name.
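
The deferred update described above can be sketched with a small in-memory queue. This is only an illustration of the flow: the `DifferenceJobQueue` class is hypothetical, and collapsing duplicate pending jobs for the same package is an assumption of the sketch, not something the feature is documented to do.

```python
from collections import deque


class DifferenceJobQueue:
    """Hypothetical stand-in for pending DistroSeriesDifferenceJobs."""

    def __init__(self):
        self._pending = deque()

    def on_source_publication(self, package_name):
        # Assumption: one pending job per package is enough.
        if package_name not in self._pending:
            self._pending.append(package_name)

    def label(self, package_name):
        """Text shown on the page for this package."""
        if package_name in self._pending:
            return f"{package_name} Updating ..."
        return package_name

    def run_pending(self, recalculate):
        """Play the job runner: recalculate each queued package once."""
        while self._pending:
            recalculate(self._pending.popleft())
```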

Synchronizing changes

Depending on your permissions, one or two buttons may appear at the bottom of the page:

When clicking the "Sync selected ..." button it will synchronize all of the selected packages into the derived series, superseding the package that is currently published. The person clicking the button must have upload privileges for each separate package being synchronized.

The "Upgrade packages" button will take all of the upstream packages that are a higher version than the derived package, and where there are no derived series-only changes, and sync them. Because this can potentially copy hundreds of packages in one go, it requires Archive Admin privileges (the "queueadmin" permission in ArchivePermission).
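
The selection made by "Upgrade packages" can be sketched as a filter. Everything here is an assumption made for illustration: the `Difference` class, its field names, the use of integer tuples for versions, and the `has_derived_only_changes` flag standing in for however Launchpad detects local changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difference:
    """Hypothetical summary of one package difference."""
    package: str
    derived_version: tuple   # simplified: tuple of ints, e.g. (1, 2)
    parent_version: tuple
    has_derived_only_changes: bool


def select_upgrades(differences, is_archive_admin):
    """Return the names of packages a bulk upgrade would sync."""
    if not is_archive_admin:
        raise PermissionError("bulk upgrade requires archive admin privileges")
    return [
        diff.package
        for diff in differences
        if diff.parent_version > diff.derived_version
        and not diff.has_derived_only_changes
    ]
```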

Both of these methods of synchronizing will observe the normal distribution upload rules and cause uploads to be held in the NEW and UNAPPROVED queues as appropriate (see the /distribution/series/+queue page).

How Synchronizing works

Pressing the buttons creates a PackageCopyJob (PCJ) for each package being copied. The job runner examines each PCJ and checks the policy (ICopyPolicy from lp/soyuz/adapters/) associated with it. The policy decides whether the package is automatically accepted into the distro or is held in one of the queues for Archive Admin checks.

If the package is accepted, it gains the component and section override from the existing ancestry in the distroseries and is published immediately.

If the package is held, the default overrides are applied to the job's metadata, a PackageUpload with a FK reference to PackageCopyJob is created, and the PCJ is suspended.

If the archive admin rejects the package, the job is immediately failed. If the archive admin accepts it, then the job is released from suspension and the job runner will pick it up next time it runs. Because the job will have an associated PackageUpload at this point, it knows that the admin must have accepted the package and immediately publishes the package.

When a package is held in the queues, the archive admin may choose to override the component or section. In this case the override is embedded in the Job's metadata and the runner picks it up and applies it when creating the publication.
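
The lifecycle described in this section can be sketched as a small state machine. This is an illustration under stated assumptions, not the PackageCopyJob implementation: the `CopyJob` class, the status strings, the `has_upload` flag (standing in for the PackageUpload reference) and all function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CopyJob:
    """Hypothetical stand-in for a PackageCopyJob."""
    package: str
    status: str = "WAITING"
    has_upload: bool = False  # stands in for the PackageUpload reference
    metadata: dict = field(default_factory=dict)


def run_job(job, policy_auto_accepts, ancestry_overrides, default_overrides):
    """One pass of the job runner over a single job."""
    if job.status != "WAITING":
        return job
    if job.has_upload:
        # An admin already accepted it; metadata holds the overrides.
        job.status = "PUBLISHED"
    elif policy_auto_accepts:
        job.metadata.update(ancestry_overrides)
        job.status = "PUBLISHED"
    else:
        # Held: record defaults, create the upload, suspend the job.
        job.metadata.update(default_overrides)
        job.has_upload = True
        job.status = "SUSPENDED"
    return job


def admin_accept(job, component=None, section=None):
    """Release a held job, embedding any overrides in its metadata."""
    if job.status != "SUSPENDED":
        raise ValueError("only suspended jobs can be accepted")
    if component is not None:
        job.metadata["component"] = component
    if section is not None:
        job.metadata["section"] = section
    job.status = "WAITING"


def admin_reject(job):
    """Rejecting a held job fails it immediately."""
    job.status = "FAILED"
```

A held job therefore passes through the runner twice: once to be suspended, and once more after the archive admin accepts it.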

Other differences pages

    2686 packages only in Sid (+missingpackages)
    2031 packages only in Oneiric (+uniquepackages)

These two pages are very similar but have reduced functionality as some of the above operations don't apply. The +missingpackages page allows you to sync to the derived series, but the +uniquepackages page doesn't allow you to sync "upwards" as it doesn't always know the upstream's upload policies.

Soyuz/DerivativeDistributions (last edited 2011-06-14 10:31:48 by matthew.revell)