LEP/BetterPrivacy/ImplementationNotes

ACL Implementation Notes

This document contains various notes about the implementation of the ACL system, from discussions between jml, flacoste, and bjornt, as well as follow-up discussions on the launchpad-dev mailing list.

For an overview of how the current ACL system works, see [[http://bazaar.launchpad.net/~bjornt/launchpad/privacy-spike/annotate/head:/lib/lp/services/doc/acl.txt|acl.txt]] and test_acl.py in lp:~bjornt/launchpad/privacy-spike.

We didn't have the UI specced out yet, but instead of blocking on that, we went ahead and designed the API anyway, keeping it flexible enough that most use cases could be fulfilled.

We wanted to create an ACL system that tracks who has permission to perform which operations on an object. The call sites that create objects are responsible for populating the ACLs for the created object. We would add a base security adapter that queries the ACLs to check whether someone has the right permissions. In the first iteration, this could sit on top of the existing security adapters.

Permissions

To make things easier to grasp, we decided to focus only on the read aspect of things. So in the first iteration we want to use ACLs to control who can see an object, but we won't use them to control who can modify an object. We thought it would be easier to have just one (READ) permission to begin with, although we also had to add MODIFY_ACL, to control who can give people permission to see the objects.

An open question is how to model the bug supervisor using the ACL system.
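The two permissions could be sketched roughly as follows. This is only an illustration: the `Permission` enum, the `ACL` class, and its `has_permission` and `grant` methods are hypothetical names, not the API in the privacy-spike branch, and people are represented by plain strings.

```python
from dataclasses import dataclass, field
from enum import Enum


class Permission(Enum):
    READ = "read"              # may see the object
    MODIFY_ACL = "modify_acl"  # may give others permission on the object


@dataclass
class ACL:
    """Hypothetical in-memory ACL: a set of (person, permission) entries."""
    entries: set = field(default_factory=set)

    def has_permission(self, person, permission):
        return (person, permission) in self.entries

    def grant(self, granter, person, permission):
        # Only someone holding MODIFY_ACL may change who can see the object.
        if not self.has_permission(granter, Permission.MODIFY_ACL):
            raise PermissionError(f"{granter} may not modify this ACL")
        self.entries.add((person, permission))


# The call site that creates the object populates its ACL.
bug_acl = ACL({("owner", Permission.READ), ("owner", Permission.MODIFY_ACL)})
bug_acl.grant("owner", "alice", Permission.READ)
```

Here "alice" ends up with READ but not MODIFY_ACL, so she can see the bug but cannot extend access to anyone else.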

Inheritance

ACLs need to be inherited by child objects. For example, a project's bugs should automatically get the same ACLs that the project has. If the project's ACLs change, so should the bugs' ACLs.

It gets a bit trickier if someone gets access to a single bug within a project. That bug's ACLs are no longer inherited, so if the project's ACLs change, it's not clear whether that bug's ACLs should change as well. We don't propagate changes to such overridden ACLs by default, but we'll provide a list of overridden ACLs, so that the UI can choose what to do. It could let the user choose which ones to propagate the change to, or simply propagate (or not) without asking questions.
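A minimal sketch of this propagation rule, assuming a single READ permission and one level of children. The `Node` class, the `overridden` flag, and `set_readers` are hypothetical names used only for illustration.

```python
class Node:
    """Hypothetical ACL-bearing object; readers is the set of people with READ."""

    def __init__(self, name, parent=None, readers=()):
        self.name = name
        self.children = []
        self.overridden = False
        if parent is None:
            self.readers = set(readers)
        else:
            # Child objects start out with the same ACL as their parent.
            self.readers = set(parent.readers)
            parent.children.append(self)

    def grant_read(self, person):
        """Grant access to this object only, marking its ACL as overridden."""
        self.readers.add(person)
        self.overridden = True


def set_readers(parent, readers):
    """Replace the parent's ACL and propagate it to non-overridden children.

    Returns the overridden children, which are left unchanged so that the
    caller (for example the UI) can decide what to do with them.
    """
    parent.readers = set(readers)
    skipped = []
    for child in parent.children:
        if child.overridden:
            skipped.append(child)
        else:
            child.readers = set(readers)
    return skipped


project = Node("project", readers={"team"})
bug1 = Node("bug-1", parent=project)
bug2 = Node("bug-2", parent=project)
bug2.grant_read("reporter")  # bug-2 now has an overridden ACL

skipped = set_readers(project, {"team", "security"})
```

After the call, bug-1 follows the project's new ACL, while bug-2 keeps its own and is reported back in `skipped`.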

One open question is how to deal with a private project that wants someone to be able to see a specific bug within the project. That person will need permission to see at least the project name, so that they can traverse to the bug. It's unclear how much of the project information can be exposed, and how to do it.

Model

Each object needs to have at least one ACL entry. We considered having either one big ACL table for all object types (e.g. ObjectACL), with the object type as an enum, or one ACL table for each object type (e.g. ProductACL, BugTaskACL, BranchACL, etc.). We believed that the latter would be better from a complexity and performance point of view, but we didn't confirm it with actual testing.

Inheritance

As for inheritance, we considered either copying an object's ACLs to its child objects, or using recursive queries to get the parent ACLs. Copying requires more storage space, but should lead to faster reads, which is important. It also reduces the query complexity. It does lead to slower writes, but that should be OK, since we do far more reads than writes.

Permissions by App

There's a need to have permissions per app (e.g. Bugs, Code). For example, branches might be private, while bugs aren't. To deal with this, a fake object is created, e.g. ProductBugs, which will have ACLs associated with it. BugTaskACLs will specify the ProductBugs object as parent, and ProductBugs will have the Product as parent.
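The parent chain could look something like the sketch below. `ACLObject` and `can_read` are hypothetical names, and `ProductCode` is an assumed counterpart to ProductBugs for the Code app. A `readers` value of `None` is used here to mean "public".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ACLObject:
    """Hypothetical object in the ACL hierarchy."""
    name: str
    parent: Optional["ACLObject"] = None
    readers: Optional[frozenset] = None  # None means "public"

    def __post_init__(self):
        # Take over the parent's ACL on creation unless one was given explicitly.
        if self.readers is None and self.parent is not None:
            self.readers = self.parent.readers

    def can_read(self, person):
        return self.readers is None or person in self.readers


product = ACLObject("product")                    # public
product_bugs = ACLObject("ProductBugs", product)  # inherits: public
product_code = ACLObject("ProductCode", product, frozenset({"dev"}))  # private

bug_task = ACLObject("bugtask-1", product_bugs)
branch = ACLObject("branch-1", product_code)
```

With this layout, anyone can read the bug task, while the branch is visible only to "dev", even though both belong to the same product.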

Collections and Searches

A tricky part is to make sure our SQL queries take ACL information into account when searching for objects that can be private. Standardising on a model similar to the existing BranchCollection would be good. To get such a collection, you should always have to specify which user the searches are performed as, by calling something like getCollectionForUser() to get the main collection. For scripts, we would either have to pass a special parameter, or have another method.

The BaseCollection class could automatically inject the SQL for the ACLs.
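As a rough illustration of the idea, the sketch below uses an in-memory SQLite database as a stand-in for the real database. The table layout, the `find` method, and the module-level `getCollectionForUser` function are assumptions made for the example, not the existing BranchCollection API.

```python
import sqlite3


class BaseCollection:
    """Hypothetical collection that always filters by the viewing user."""

    table = None      # table holding the objects
    acl_table = None  # table holding (object, person) READ entries

    def __init__(self, conn, user):
        self.conn = conn
        self.user = user

    def find(self, where="1=1", params=()):
        # The ACL clause is appended to every query, so callers cannot forget it.
        # `where` must be trusted SQL; values are passed through `params`.
        sql = (
            f"SELECT id FROM {self.table} WHERE ({where}) AND ("
            f"private = 0 OR id IN "
            f"(SELECT object FROM {self.acl_table} WHERE person = ?)) "
            f"ORDER BY id"
        )
        return [row[0] for row in self.conn.execute(sql, (*params, self.user))]


class BugTaskCollection(BaseCollection):
    table = "BugTask"
    acl_table = "BugTaskACL"


def getCollectionForUser(conn, user):
    """Single entry point: a collection cannot be obtained without a user."""
    return BugTaskCollection(conn, user)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BugTask (id INTEGER PRIMARY KEY, status TEXT, private INTEGER);
    CREATE TABLE BugTaskACL (object INTEGER, person TEXT);
    INSERT INTO BugTask VALUES (1, 'New', 0), (2, 'New', 1), (3, 'Fixed', 0);
    INSERT INTO BugTaskACL VALUES (2, 'alice');
""")

alice_new = getCollectionForUser(conn, "alice").find("status = ?", ("New",))
anon_new = getCollectionForUser(conn, None).find("status = ?", ("New",))
```

Passing `None` as the user models an anonymous search: the ACL subquery matches nothing, so only public rows are returned.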

Performance

It's important that the existing SQL queries in Launchpad don't get much slower by including ACL information. Local testing showed that to keep the common case (a public object) fast, it's worth keeping a private attribute that can be updated using triggers on the ACL tables. Not having to join in the ACL tables is a big win. For queries involving private objects, the proposed model is fairly similar to what we have today, so it shouldn't slow things down much.
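The trigger-maintained attribute might be sketched like this, using SQLite from Python as a stand-in for the production database. The schema is an assumption: an ACL row whose person is NULL is taken to mean "everyone can read", and an object with no such row is treated as private. Updates to existing ACL rows are not handled in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BugTask (
        id INTEGER PRIMARY KEY,
        private INTEGER NOT NULL DEFAULT 1);
    -- person IS NULL is assumed to mean "everyone can read".
    CREATE TABLE BugTaskACL (bugtask INTEGER NOT NULL, person TEXT);

    -- Recompute the cached flag whenever an ACL row is added or removed.
    CREATE TRIGGER bugtaskacl_insert AFTER INSERT ON BugTaskACL
    BEGIN
        UPDATE BugTask SET private = NOT EXISTS (
            SELECT 1 FROM BugTaskACL
            WHERE bugtask = NEW.bugtask AND person IS NULL)
        WHERE id = NEW.bugtask;
    END;

    CREATE TRIGGER bugtaskacl_delete AFTER DELETE ON BugTaskACL
    BEGIN
        UPDATE BugTask SET private = NOT EXISTS (
            SELECT 1 FROM BugTaskACL
            WHERE bugtask = OLD.bugtask AND person IS NULL)
        WHERE id = OLD.bugtask;
    END;
""")

conn.execute("INSERT INTO BugTask (id) VALUES (1), (2)")
conn.execute("INSERT INTO BugTaskACL VALUES (1, NULL)")     # bug 1 is public
conn.execute("INSERT INTO BugTaskACL VALUES (2, 'alice')")  # bug 2 is private

# The common case needs no join against the ACL table.
public_ids = [row[0] for row in conn.execute(
    "SELECT id FROM BugTask WHERE private = 0 ORDER BY id")]
```

Queries for public objects can then filter on `private = 0` alone, and the ACL table only needs to be consulted for private objects.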

In local testing with searching for bug tasks, using ACLs was a bit slower than using bug subscriptions. It should be possible to tweak this when we test on real data, though, since in theory it shouldn't be slower.

Raw Notes from Local Testing Session

Tested having 50 000 public bugs for a product with random statuses.
Selecting the first 40 (as anonymous) went from around 1100 ms to 1300
ms. At a later test, after a fresh db import, it went from 520 ms to
660 ms.

Worth keeping the private attribute, as a cache. Querying for private =
False is much faster than ACL.person is NULL. The private attribute can
be kept up-to-date using triggers on the BugTaskACL table.

Testing with 10 000 of the 50 000 bugs private resulted in the
existing query taking around 50 seconds. Not sure why. Adding an index
on BugSubscription (bug, person) took it down to 700 ms. Using ACLs took
around 850 ms.

Currently, Launchpad has the largest number of branches for a product,
at around 5 500 branches. No real testing was done, but looking at the
current queries used for branches, switching to ACLs should be just as
fast, maybe even faster, since the queries will be less complex.

One thought was to use recursive queries, instead of copying ACLs for
inheritance. I didn't pursue that, since the query would become too
complex: it would involve knowledge about all objects and ACL tables
used.

LEP/BetterPrivacy/ImplementationNotes (last edited 2010-06-11 15:05:56 by bjornt)