Branches and Revisions
The links between branches and revisions are currently (Sep 2010) handled using the BranchRevision table. There is one row in this table for every revision in each branch's ancestry. This is mildly insane given the number of feature branches that we encourage projects to use, because the vast majority of the revisions are common to the branches.
Consider the Launchpad project itself. There are over 90k revisions in the ancestry, so every branch adds 90k rows to the BranchRevision table.
Before we can simplify, reduce and clean up this relationship, we need to understand what the entries are used for.
Uses for the BranchRevision table
- Branch page
- shows the recent commits for any given branch, up to 10, and includes those already in trunk
Is that actually wanted, though? I think people might prefer seeing only revisions created since branching. — AaronBentley
I think you are right for most branches. Developers of feature branches are only concerned with those since branching, but there are trunk branches where the information is (slightly) useful. Ideally I'd like this to come from loggerhead. — TimPenhey
- Merge proposal page
- unmerged revisions (up to 10; the UI for this is confusing)
- commits since the start of the review
- Finding the most relevant branch for any given revision (primarily used in the revision feeds)
What about just keeping track of which branch introduced the revision? — AaronBentley
This approximation is probably fine. — TimPenhey
- Allocating revision karma
Is the branch relevant here, or just the project/package? — AaronBentley
Just the project/package that the branch is connected to. — TimPenhey
- Merge detection in the scanner
- Is the tip of this branch in the ancestry of the development focus branch?
- And if scanning a series linked branch, is the tip of any unmerged branches of the same target present in my ancestry?
- I feel that this use case is the harder one to solve if we keep a limited ancestry.
Meta: For revision feeds and karma, it seems we use the list of all branches containing a revision only to find a single branch containing it. If we store just that single branch, we can be more efficient.
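The two merge-detection questions listed above can be sketched in a few lines of Python. This is only an illustration over an in-memory parent map; the function names (`ancestry`, `is_merged`, `merged_branches`) and the dictionary layout are assumptions made for the example, not the scanner's actual code.

```python
def ancestry(graph, tip):
    """Return every revision id reachable from tip through parent links.

    graph maps a revision id to a list of its parent ids. A parent that is
    missing from graph (a ghost) is treated as having no parents.
    """
    seen = set()
    pending = [tip]
    while pending:
        rev = pending.pop()
        if rev in seen:
            continue
        seen.add(rev)
        pending.extend(graph.get(rev, ()))
    return seen


def is_merged(graph, branch_tip, target_tip):
    """Is the tip of this branch in the ancestry of the target branch?"""
    return branch_tip in ancestry(graph, target_tip)


def merged_branches(graph, series_tip, unmerged_tips):
    """Names of unmerged branches whose tip is in the series branch ancestry.

    unmerged_tips maps a (hypothetical) branch name to its tip revision id.
    """
    trunk = ancestry(graph, series_tip)
    return sorted(name for name, tip in unmerged_tips.items() if tip in trunk)
```

Both functions need the full ancestry of the target branch, which is why this use case is the hard one if only a limited ancestry is stored.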
Possible Solutions
Delta-compress the branch-revision table
This solution is highly compatible with our existing approach. It is a trade-off of performance for space, but with care, the performance reduction may be unobservable. It applies to all use cases.
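One possible reading of delta compression, sketched below, is to store for each branch only the revisions it does not share with a base branch, plus the point at which it diverged. The tuple layout and the names `expand` and `stored_rows` are assumptions for illustration; the page does not specify a scheme.

```python
def expand(deltas, branch):
    """Rebuild the full revision list for branch, oldest first.

    deltas maps a branch name to (base, base_length, added), where base is
    the name of another branch or None, base_length is how many of the
    base's revisions are shared, and added lists the remaining revisions.
    """
    base, base_length, added = deltas[branch]
    if base is None:
        return list(added)
    # Reading a compressed branch costs an extra lookup per base level.
    return expand(deltas, base)[:base_length] + list(added)


def stored_rows(deltas):
    """Number of revision rows actually stored across all branches."""
    return sum(len(added) for _, _, added in deltas.values())
```

With a four-revision trunk and a feature branch that shares three of them and adds two, this stores six rows instead of nine, at the cost of following the base link on every read.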
Use loggerhead
This applies only to display use cases: branch revision listings and possibly merge proposal revision listings.
Scan only for revisions in the current branch that merge the tips of other branches
In the common case, adding a revision to a branch does not enable detecting a merge, because the revision being added to the branch will already be in the ancestry of the merging branch. The exceptions are new branches (which generally should not be set to merged) and ghost-filling. Ghost-filling is believed to be extremely rare.
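A rough sketch of this scan follows. It assumes the scanner knows which revisions are new to the branch being scanned and has a map from tip revision id to branch; the function name `detect_merges` and both data structures are hypothetical. It also assumes that only merged-in parents (every parent after the first) count, so that a revision committed directly on top of another branch's tip is not reported as a merge.

```python
def detect_merges(new_revisions, parents, tips):
    """Find new revisions that directly merge the tip of another branch.

    new_revisions: revision ids newly added to the scanned branch.
    parents: maps a revision id to its list of parent ids, leftmost first.
    tips: maps a tip revision id to the name of the branch it belongs to.
    Returns (branch name, merging revision id) pairs in scan order.
    """
    found = []
    for rev in new_revisions:
        # Skip the leftmost parent: it is the branch's own previous revision.
        for parent in parents.get(rev, [])[1:]:
            if parent in tips:
                found.append((tips[parent], rev))
    return found
```

The work done is proportional to the number of new revisions rather than to the size of the ancestry.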
Store only tip revision info and do multiple DB queries
This models the underlying branches well, but has performance costs. It applies to display use cases.
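The performance cost is easy to see in a sketch. Here `get_parents` stands in for a single database lookup of one revision's parents; that callable and the name `recent_revisions` are assumptions for the example.

```python
def recent_revisions(get_parents, tip, limit=10):
    """Walk left-hand parents from tip, returning up to limit revision ids.

    get_parents(revision_id) returns the list of parent ids, leftmost
    first, and represents one database query per call.
    """
    result = []
    rev = tip
    while rev is not None and len(result) < limit:
        result.append(rev)
        parents = get_parents(rev)  # one query per revision returned
        rev = parents[0] if parents else None
    return result
```

Listing ten revisions this way takes ten round trips, compared with a single query against the current table.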
Store tip revision info, and group revisions by ancestry
Storing groups of, say, 100 revisions according to ancestry would allow retrieving the latest revisions in one or two database queries and then doing in-memory graph operations. This models the underlying branches well, and applies to display use cases. It could be implemented to provide fast ghost-filling.
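The "one or two queries" claim can be checked with a simplified sketch that groups a linear mainline rather than a full graph. The names `group_mainline` and `latest` are hypothetical, and each group read stands in for one database query.

```python
def group_mainline(mainline, size=100):
    """Split a mainline (oldest first) into consecutive groups of size."""
    return [mainline[i:i + size] for i in range(0, len(mainline), size)]


def latest(groups, count=10):
    """Return the newest count revisions (newest first) and groups read."""
    result = []
    reads = 0
    for group in reversed(groups):
        reads += 1  # each group fetched models one database query
        result.extend(reversed(group))
        if len(result) >= count:
            break
    return result[:count], reads
```

With groups of 100 and a request for 10 revisions, the newest group is enough unless it holds fewer than 10 revisions, in which case a second group is read.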
Store most-relevant branch on Revision
Since there is only one most-relevant branch, we do not need the one-to-many relationship that BranchRevision provides. However, if the most-relevant branch is deleted, we would either need to accept a NULL field or find a new most-relevant branch. If we allow the field to become NULL, we can call it "introducing branch" rather than "most-relevant branch". This supports allocating revision karma and revision feeds.
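A minimal sketch of the NULL-on-delete variant follows, using an in-memory dictionary in place of the Revision table. The field name `introducing_branch` and the helpers `record_scan` and `delete_branch` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Revision:
    revision_id: str
    # None plays the role of a NULL column after the branch is deleted.
    introducing_branch: Optional[str] = None


def record_scan(revisions, branch, revision_ids):
    """Credit branch with any revision that has not been seen before."""
    for rid in revision_ids:
        if rid not in revisions:
            revisions[rid] = Revision(rid, branch)


def delete_branch(revisions, branch):
    """Clear the link on every revision the deleted branch introduced."""
    for rev in revisions.values():
        if rev.introducing_branch == branch:
            rev.introducing_branch = None
```

Scanning trunk and then a feature branch credits the shared revisions to trunk and only the new ones to the feature branch; deleting the feature branch leaves those revisions with no introducing branch.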
Store introducing project/package on Revision
This supports the Revision Karma use case. It is not affected by branch deletion, but will not track branch moves. It is subject to project/package deletion.
Associate merge proposals with Revisions
This supports the use case of displaying unmerged revisions.
Store the last 10 revisions for a branch
This supports the use case of displaying branch revisions.
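As a sketch, a bounded queue per branch is enough for this. The class name `RecentRevisions` is hypothetical, and the example assumes revisions are only ever appended to the branch.

```python
from collections import deque


class RecentRevisions:
    """Keep only the newest few revision ids of one branch."""

    def __init__(self, limit=10):
        # A deque with maxlen silently drops the oldest entry when full.
        self._recent = deque(maxlen=limit)

    def add(self, revision_id):
        """Record a newly scanned revision as the branch's newest."""
        self._recent.append(revision_id)

    def newest_first(self):
        """Return the stored revision ids, newest first."""
        return list(reversed(self._recent))
```

Storage per branch is constant regardless of the size of the ancestry.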