Diff for "Code/BranchRevisions"


Differences between revisions 4 and 5
Revision 4 as of 2010-09-17 05:04:18
Size: 2325
Editor: thumper
Comment:
Revision 5 as of 2010-10-15 15:14:52
Size: 4646
Editor: abentley
Comment:

Branches and Revisions

The links between branches and revisions are currently (Sep 2010) handled using the BranchRevision table. There is one row in this table for every revision in each branch. This is mildly insane given the number of feature branches that we encourage projects to use, as the vast majority of the revisions are common to the branches.

Consider the Launchpad project itself. There are over 90k revisions in the ancestry, so every branch adds over 90k rows to the BranchRevision table.

Before we can simplify, reduce and clean up this relationship, we need to understand what the entries are used for.

Uses for the BranchRevision table.

  • Branch page
    • shows the recent commits for any given branch, up to 10, and includes those already in trunk
      • Is that actually wanted, though? I think people might prefer seeing only revisions created since branching. -- AaronBentley

      • I think you are right for most branches. Developers of feature branches are only concerned with those since branching, but there are trunk branches where the information is (slightly) useful. Ideally I'd like this to come from loggerhead. -- TimPenhey

  • Merge proposal page
    • unmerged revisions (up to 10; the UI is confusing)
    • commits since the start of the review
  • Finding the most relevant branch for any given revision (primarily used in the revision feeds)
    • What about just keeping track of which branch introduced the revision? -- AaronBentley

    • This approximation is probably fine. -- TimPenhey

  • Allocating revision karma
    • Is the branch relevant here, or just the project/package? -- AaronBentley

    • Just the project/package that the branch is connected to. -- TimPenhey

  • Merge detection in the scanner
    • Is the tip of this branch in the ancestry of the development focus branch?
    • And if scanning a series linked branch, is the tip of any unmerged branches of the same target present in my ancestry?
    • I feel that this use case is the harder one to solve if we keep a limited ancestry.
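
Both scanner questions above reduce to ancestry membership tests. A minimal sketch, assuming the revision graph is available in memory as a mapping from revision id to parent ids; the `ancestry` and `is_merged` helpers and the sample revision ids are hypothetical, not Launchpad code:

```python
def ancestry(graph, tip):
    """Return the set of revision ids reachable from tip, including tip.

    graph maps revision id -> tuple of parent revision ids. Parents that
    are missing from graph (ghosts) are recorded but not traversed.
    """
    seen = set()
    pending = [tip]
    while pending:
        rev = pending.pop()
        if rev in seen:
            continue
        seen.add(rev)
        pending.extend(graph.get(rev, ()))
    return seen


def is_merged(graph, branch_tip, target_tip):
    """True if branch_tip is in the ancestry of target_tip."""
    return branch_tip in ancestry(graph, target_tip)


# Hypothetical history: trunk is r1 <- r2 <- r4, and r4 merges feature
# revision f1. The feature branch has since gained f2, which is unmerged.
graph = {
    "r1": (),
    "r2": ("r1",),
    "f1": ("r2",),
    "f2": ("f1",),
    "r4": ("r2", "f1"),
}
```

With a limited stored ancestry, `ancestry()` could stop before reaching the tip in question, which is why this use case is the hard one.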

Meta: For revision feeds and karma, it seems like we're using a list of all branches containing the revision to find a single branch containing the revision. If we store just that single branch, we can be more efficient.

Possible Solutions

Delta-compress the branch-revision table

This solution is highly compatible with our existing approach. It is a trade-off of performance for space, but with care, the performance reduction may be unobservable. It applies to all use cases.
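
One possible reading of delta compression, sketched here as an assumption rather than a settled design: store for each branch only the revisions it has beyond a base branch, and rebuild the full set on demand. The `compress` and `expand` names are hypothetical, and the sketch assumes a branch contains every revision of its base.

```python
def compress(branch_revisions, base_revisions):
    """Return only the revisions a branch has beyond its base."""
    base = set(base_revisions)
    return [rev for rev in branch_revisions if rev not in base]


def expand(name, deltas, bases):
    """Rebuild the full revision set of a branch by walking its base chain."""
    revisions = set()
    while name is not None:
        revisions.update(deltas[name])
        name = bases[name]
    return revisions


trunk = ["r1", "r2", "r3"]
feature = ["r1", "r2", "r3", "f1", "f2"]

# The feature branch stores two rows instead of five.
deltas = {"trunk": trunk, "feature": compress(feature, trunk)}
bases = {"trunk": None, "feature": "trunk"}
```

The space saving is the rows not stored per branch; the performance cost is the chain walk in `expand()`.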

Use loggerhead

This applies only to display use cases: branch revision listings and possibly merge proposal revision listings.

Scan only for revisions in the current branch that merge the tips of other branches

In the common case, adding a revision to a branch does not enable detecting a merge, because the revision being added to the branch will already be in the ancestry of the merging branch. The exceptions are new branches (which generally should not be set to merged) and ghost-filling. Ghost-filling is believed to be extremely rare.
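
A rough sketch of that scan, assuming merges are recorded as revisions with more than one parent (as in Bazaar, where the first parent is the branch's own previous revision). The `detect_merges` function and its arguments are hypothetical names for illustration:

```python
def detect_merges(new_revisions, parents, unmerged_tips):
    """Return names of branches whose tip is merged by a new revision.

    new_revisions: revision ids just added to the branch being scanned.
    parents: mapping of revision id -> tuple of parent revision ids.
    unmerged_tips: mapping of tip revision id -> branch name.
    """
    merged = set()
    for rev in new_revisions:
        # Skip the leftmost parent: it is the branch's own previous
        # revision, not something this revision merged in.
        for parent in parents.get(rev, ())[1:]:
            if parent in unmerged_tips:
                merged.add(unmerged_tips[parent])
    return merged


parents = {"r4": ("r3", "f2"), "r5": ("r4",)}
unmerged_tips = {"f2": "feature", "g1": "other"}
```

Only the newly added revisions are inspected, so the cost is proportional to the size of the push rather than to the size of the ancestry.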

Store only tip revision info and do multiple DB queries

This models the underlying branches well, but has performance costs. It applies to display use cases.

Store tip revision info, and group revisions by ancestry

Storing groups of, say, 100 revisions according to ancestry would allow retrieving the latest revisions in one or two database queries and then doing in-memory graph operations. This models the underlying branches well, and applies to display use cases. It could be implemented to provide fast ghost-filling.
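
To illustrate why one or two queries are enough, here is a simplified sketch that groups only a linear mainline; a real implementation would have to group a revision graph. The function names are hypothetical, and each group read stands in for one database query.

```python
def group_mainline(mainline, group_size=100):
    """Split a mainline (oldest first) into consecutive groups."""
    return [
        mainline[i:i + group_size]
        for i in range(0, len(mainline), group_size)
    ]


def latest_revisions(groups, count=10):
    """Return the newest `count` revisions and the number of groups read."""
    result = []
    fetched = 0
    for group in reversed(groups):  # newest group first
        fetched += 1  # stands in for one database query
        result = group + result
        if len(result) >= count:
            break
    return result[-count:], fetched


# 205 revisions: two full groups plus a newest group of only 5.
mainline = [f"r{i}" for i in range(1, 206)]
groups = group_mainline(mainline)
revisions, queries = latest_revisions(groups)
```

The newest group here holds fewer than 10 revisions, so a second group is read; that is the worst case for a display of 10.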

Store most-relevant branch on Revision

Since there is only one most-relevant branch, we do not need the one-to-many relationship that BranchRevision provides. However, if the most-relevant branch is deleted, we would either need to accept a NULL field or find a new most-relevant branch. If we allow the field to become NULL, we can call it "introducing branch" rather than "most-relevant branch". This supports allocating revision karma and revision feeds.
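
The NULL-on-deletion variant maps onto a nullable foreign key. A minimal sketch using SQLite from the Python standard library as a stand-in for the real database; the table and column names are assumptions for illustration, not the actual Launchpad schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE branch (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE revision (
        id INTEGER PRIMARY KEY,
        revision_id TEXT NOT NULL UNIQUE,
        introducing_branch INTEGER
            REFERENCES branch(id) ON DELETE SET NULL
    );
    INSERT INTO branch (id, name) VALUES (1, 'feature');
    INSERT INTO revision (revision_id, introducing_branch)
        VALUES ('f1', 1);
""")

# Deleting the branch leaves the revision in place with a NULL reference.
conn.execute("DELETE FROM branch WHERE id = 1")
row = conn.execute(
    "SELECT introducing_branch FROM revision WHERE revision_id = 'f1'"
).fetchone()
```

Anything that reads the field, such as karma allocation or the revision feeds, then has to handle a missing introducing branch.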

Store introducing project/package on Revision

This supports the Revision Karma use case. It is not affected by branch deletion, but will not track branch moves. It is subject to project/package deletion.

Associate merge proposals with Revisions

This supports the use case of displaying unmerged revisions.

Store the last 10 revisions for a branch

This supports the use case of displaying branch revisions.
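
As a rough in-memory illustration, a bounded queue per branch keeps exactly this much history. The `RecentRevisions` class is a hypothetical name; a real version would presumably live in the database and be updated by the scanner.

```python
from collections import deque


class RecentRevisions:
    """Keep only the newest `limit` revision ids for each branch."""

    def __init__(self, limit=10):
        self._limit = limit
        self._recent = {}

    def record(self, branch, revision_id):
        """Record a new tip; the oldest entry is dropped once full."""
        queue = self._recent.setdefault(branch, deque(maxlen=self._limit))
        queue.append(revision_id)

    def latest(self, branch):
        """Return the stored revision ids, newest first."""
        return list(reversed(self._recent.get(branch, ())))


recent = RecentRevisions()
for number in range(1, 13):
    recent.record("feature", f"r{number}")
```

Storage per branch stays constant however long the ancestry is, at the cost of serving only the branch page listing.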

Code/BranchRevisions (last edited 2010-10-26 14:45:32 by abentley)