= Collections =
Collections are a way of selecting a group of objects based on some criteria, and then
either just retrieving the objects or manipulating the set as a whole.
They have two kinds of methods: restrict methods, which narrow the set
of objects according to some criteria, and manipulation methods, which act on
each of the objects in the current set.
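The split between restrict and manipulation methods can be sketched in plain Python. This is a toy, not Launchpad's `Collection` class: `ToyCollection`, `Branch`, `where`, `owned_by` and `add_prefix` are hypothetical names used only for illustration, and the "database" is an in-memory list.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    owner: str


class ToyCollection:
    """Toy stand-in for a collection: a data source plus a tuple of filters."""

    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)

    def where(self, predicate):
        # Restrict method: return a NEW, narrower collection; self is unchanged.
        return ToyCollection(self._rows, self._predicates + (predicate,))

    def owned_by(self, owner):
        return self.where(lambda branch: branch.owner == owner)

    def select(self):
        # Retrieve the objects that satisfy every accumulated restriction.
        return [r for r in self._rows if all(p(r) for p in self._predicates)]

    def add_prefix(self, prefix):
        # Manipulation method: act on each object currently in the set.
        for branch in self.select():
            branch.name = prefix + branch.name


rows = [Branch("trunk", "alice"), Branch("fix-1", "bob"), Branch("devel", "alice")]
all_branches = ToyCollection(rows)
alices = all_branches.owned_by("alice")
alices.add_prefix("old-")
print([b.name for b in all_branches.select()])
```

Restricting never changes the collection it is called on, so `all_branches` still covers every row after `alices` has been derived from it.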
== Examples ==
* '''`BranchCollection`''' - '''lp.code.{model,interfaces}.branchcollection'''
* '''`TranslationTemplatesCollection`''' - '''lp.translations.{model,interfaces}.potemplate'''
* '''`ArchiveCollection`''' - '''lp.soyuz.{model,interfaces}.archivecollection'''
=== Example use ===
{{{
all_branches = getUtility(IAllBranches)
my_branches = all_branches.ownedBy(me)
branches_i_can_see = all_branches.visibleByUser(me)
merge_proposals_on_my_branches = my_branches.getMergeProposals()
my_branch_objects = my_branches.getBranches()
}}}
== Creating a collection ==
In [[https://git.launchpad.net/launchpad/tree/lib/lp/services/database/collection.py|lp.services.database.collection]] there is a base class you can use for creating
your own collection.
In '''lp.app.model.foocollection''' add
{{{
from lp.app.model.foo import Foo
from lp.services.database.collection import Collection


class FooCollection(Collection):
    """A collection of `Foo`."""

    starting_table = Foo
This is the basic collection.
You can then add methods to it, such as:
{{{
    def ownedBy(self, owner):
        return self.refine(Foo.owner == owner)
}}}
with appropriate tests (see `lp.soyuz.tests.test_archivecollection` for some inspiration).
Once you have an object with the methods that will be useful to you, you need
to add an interface and a utility.
In '''lp.app.interfaces.foocollection''' add the following:
{{{
from zope.interface import Interface


class IFooCollection(Interface):
    """Collection of `Foo`."""

    def select(*args):
        """See `Collection`."""
with the methods you want on the interface.
The `select` method, or something like it, has to be there, since it's how you retrieve a Storm `ResultSet` with the objects and/or columns you want from the collection. Instead of a select method, you might wish to have multiple methods to get different kinds of objects. For example, `IBranchCollection` has `getBranches()` and `getMergeProposals()`, where the latter returns all merge proposals associated with the collection of branches.
Once you have that, add a marker interface for a utility that returns all `Foo`s:
{{{
class IAllFoos(IFooCollection):
    """Get all foos."""
}}}
You can add other marker interfaces here if you wish to provide other entry points,
for instance if it is very common to be interested in all foos of a particular type
or status.
Next comes the ZCML:
{{{
}}}
This means that you can call '''getUtility(IAllFoos)''' to start working with a
collection.
== Using the collection ==
{{{
all_foos = getUtility(IAllFoos)
foos = all_foos.ownedBy(person).withStatus(status).select()
}}}
The arguments to select are the same as the first argument to `Store.find()`.
== Adding adapters ==
It is possible to add adapters for objects of interest to get a collection
initialized as appropriate.
For instance you could add adapters such that:
{{{
IFooCollection(product)
}}}
returned you a `FooCollection` for all the Foos associated with that product.
To do so, define a function that takes the original object and returns an `IFooCollection`, e.g.:
{{{
def product_to_foo_collection(product):
    return getUtility(IAllFoos).inProduct(product)
}}}
And then add something like this to the relevant ZCML:
{{{
}}}
== Adding Joins ==
Two `Collection` methods help you join other tables into a collection: `joinInner`, which creates a run-of-the-mill inner join, and `joinOuter`, which adds the new table using an outer (or "left") join. Both are called the same way:
{{{
joined_collection = base_collection.joinInner(Person, Person.id == Foo.owner)
joined_collection = base_collection.joinOuter(Person, Person.id == Foo.owner)
}}}
(Of course the "outer" case means that the `Person` will be `None` if there is no `Person.id` matching `Foo.owner`. The "inner" case will just filter out `Foo` items that don't have an owner.)
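The difference between the two join kinds can be illustrated without a database. The sketch below uses plain dictionaries rather than Storm; `join_inner`, `join_outer` and the sample data are hypothetical and only mimic the row-matching behaviour described above.

```python
foos = [
    {"id": 1, "owner": 10},
    {"id": 2, "owner": None},  # no owner at all
    {"id": 3, "owner": 99},    # owner id that matches no person
]
people = {10: "alice"}


def join_inner(foos, people):
    # Inner join: keep only the foos for which a matching person exists.
    return [(foo["id"], people[foo["owner"]])
            for foo in foos if foo["owner"] in people]


def join_outer(foos, people):
    # Outer (left) join: keep every foo; the person side is None on no match.
    return [(foo["id"], people.get(foo["owner"])) for foo in foos]


inner_rows = join_inner(foos, people)
outer_rows = join_outer(foos, people)
print(inner_rows)
print(outer_rows)
```

The inner join silently drops foos 2 and 3, while the outer join keeps them and pairs them with `None`.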
== Custom Selects ==
The `select` method returns a `ResultSet` of `Foo` by default:
{{{
num_foos = all_foos.select().count()
print("There are %d foo(s)." % num_foos)
if num_foos > 0:
    # This second select() builds a new ResultSet and runs its own query.
    print("The oldest foo is %s." % all_foos.select().order_by(Foo.id)[0])
}}}
However, you can select any combination of columns and objects that are in the query. The default is to select `Foo` objects, but you can ask for more (or different) data when you invoke `select`. Each `select` creates a new `ResultSet`, so each one is executed separately.
{{{
foos_and_owners = all_foos.joinInner(Person, Person.id == Foo.owner)
for foo, owner_name in foos_and_owners.select(Foo, Person.name):
    print("Foo #%d is owned by %s." % (foo.id, owner_name))
}}}
== Optimization ==
As you know, most Foos are publicly accessible, but a few are private. Finding private Foo objects that are visible to the current user is expensive:
{{{
    def visibleTo(self, user):
        """Restrict to `Foo`s that `user` can see."""
        Owner = ClassAlias(Person)
        with_owner = self.joinOuter(Owner, Owner.id == Foo.owner)
        with_user = with_owner.joinOuter(
            TeamParticipation,
            TeamParticipation.team_id == Owner.id)
        return with_user.refine(
            Or(
                # Return Foos that are public, or are owned by
                # "user," or are owned by teams that "user" is in.
                Foo.is_private == False,
                TeamParticipation.person_id == user.id))
}}}
This is a big "performance pattern" in Launchpad. There are really two queries in here: a narrow-but-deep one that only looks at public `Foo` and gets lots of results, and a wide-but-shallow one that needs to join in other tables and check further details for the few private `Foo`.
You can speed this up by treating these two as separate collections. Since each refinement on a collection creates a new one and leaves the old one intact, it's easy to re-use the common parts between both:
{{{
interesting_foos = all_foos.refineOneWay().refineAnotherWay()
public_foos = interesting_foos.isPublic(True)
private_foos = interesting_foos.isPublic(False).visibleTo(user)
# Combine the two Storm result sets into one with ResultSet.union().
return public_foos.select().union(private_foos.select())
}}}
(Of course this also leaves a lot of dead wood in `visibleTo` that you can cut to make it faster: you no longer need the `Or` and the joins can become inner joins).
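To see that the split returns the same rows as the combined query, here is a self-contained sketch using SQLite from the Python standard library instead of Storm and PostgreSQL. The table layout, column names and sample data are assumptions made for illustration, including the assumption that every person has a `teamparticipation` row for themselves. It demonstrates only that the two forms are equivalent, not how fast either runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, owner INTEGER, is_private INTEGER);
    CREATE TABLE teamparticipation (team INTEGER, person INTEGER);
    -- Person 1 is the viewing user, person 2 is someone else, 3 is a team.
    INSERT INTO foo VALUES (1, 2, 0), (2, 2, 1), (3, 1, 1), (4, 3, 1), (5, 3, 0);
    INSERT INTO teamparticipation VALUES (1, 1), (2, 2), (3, 1);
""")

# One query: outer join plus an OR over "public" and "visible to user".
COMBINED = """
    SELECT DISTINCT foo.id
    FROM foo
    LEFT JOIN teamparticipation AS tp ON tp.team = foo.owner
    WHERE foo.is_private = 0 OR tp.person = ?
"""

# Two simpler queries: no join for public rows, inner join for private ones.
SPLIT = """
    SELECT foo.id FROM foo WHERE foo.is_private = 0
    UNION
    SELECT foo.id
    FROM foo
    JOIN teamparticipation AS tp ON tp.team = foo.owner
    WHERE foo.is_private = 1 AND tp.person = ?
"""


def visible_ids(query, user_id):
    # Both queries take the viewing user's id as their only parameter.
    return sorted(row[0] for row in conn.execute(query, (user_id,)))


combined_ids = visible_ids(COMBINED, 1)
split_ids = visible_ids(SPLIT, 1)
print(combined_ids, split_ids)
```

In the split form the public half needs no join at all, and the private half can use an inner join because a private `Foo` with no matching participation row is never visible anyway.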
One example of this is in [[https://git.launchpad.net/launchpad/tree/lib/lp/code/model/branchcollection.py|lp.code.model.branchcollection]]. Look for the `visibleByUser` method.