This page tracks the work to discover and implement performance improvements to the Launchpad web service. To see how to improve the performance of a launchpadlib-based client, see Client-Side Web Service Performance.

Solutions implemented

Request service root conditionally

Due to a bug in httplib2, launchpadlib was never making conditional requests for the service root even though the lazr.restfulclient tests worked. We changed the headers Launchpad serves and the problem went away.

Benefit: launchpadlib now downloads WADL only in very rare cases (when we upgrade Launchpad). Benefit accrues to existing launchpadlib installations.

In a live test, this reduced startup time from 3.9 seconds to 0.8 seconds.
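A conditional request can be sketched as follows. This is a minimal illustration, assuming ETag-based validation with If-None-Match and a 304 response; fake_server, fetch_service_root, and the sample ETag and body are hypothetical stand-ins, and no real HTTP traffic or httplib2 code is involved.

```python
# Hypothetical in-memory stand-in for the server; no real HTTP is involved.
SERVER_ETAG = '"wadl-v1"'
SERVER_BODY = "<application>...large WADL document...</application>"


def fake_server(request_headers):
    """Return (status, headers, body), honouring If-None-Match."""
    if request_headers.get("If-None-Match") == SERVER_ETAG:
        # The client's copy is still valid: send no body.
        return 304, {"ETag": SERVER_ETAG}, ""
    return 200, {"ETag": SERVER_ETAG}, SERVER_BODY


def fetch_service_root(cache):
    """Fetch the document, revalidating a cached copy when one exists.

    Returns (body, bytes_downloaded).
    """
    headers = {}
    if "etag" in cache:
        headers["If-None-Match"] = cache["etag"]
    status, response_headers, body = fake_server(headers)
    if status == 304:
        return cache["body"], 0  # nothing downloaded
    cache["etag"] = response_headers["ETag"]
    cache["body"] = body
    return body, len(body)


cache = {}
first_body, first_bytes = fetch_service_root(cache)
second_body, second_bytes = fetch_service_root(cache)
print(first_bytes, second_bytes)
```

The second call still costs one round trip, but the body is not sent again.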

Cache the service root client-side

We changed lazr.restful to serve Cache-Control headers along with the service root (WADL and JSON). For frozen versions of the web service (beta and 1.0) the Cache-Control max-age is one week; for devel it's one hour. We can tweak this further in the future.

Benefit: launchpadlib now makes HTTP requests on startup only once a week (or hour). Due to a bug in httplib2, benefit only accrues to installations with an up-to-date lazr.restfulclient.
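The freshness check that a caching client performs can be sketched like this. It is a simplified illustration, not httplib2's logic: max_age and is_fresh are hypothetical helpers, and other Cache-Control directives and the Age header are ignored.

```python
import re

ONE_WEEK = 7 * 24 * 60 * 60  # max-age for frozen versions, per the text above
ONE_HOUR = 60 * 60           # max-age for devel, per the text above


def max_age(cache_control):
    """Extract max-age (in seconds) from a Cache-Control value, or None."""
    match = re.search(r"\bmax-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


def is_fresh(cache_control, stored_at, now):
    """True if a response stored at `stored_at` may be reused at `now`.

    Both times are seconds on the same clock.
    """
    age_limit = max_age(cache_control)
    if age_limit is None:
        return False
    return (now - stored_at) < age_limit


# A frozen version cached six days ago needs no request at all.
six_days = 6 * 24 * 60 * 60
print(is_fresh("max-age=%d" % ONE_WEEK, stored_at=0, now=six_days))
```

While the cached copy is fresh, the client makes no request at all, which is what distinguishes this from a conditional request.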

Remove lazr.restfulclient's dependency on lazr.restful

This wasn't done for performance reasons, but it seems to be what brought launchpadlib import time from 0.36 seconds to 0.20 seconds due to time saved in pkg_resources.

Fix bug 568301

Named operations on collections were fetching the collections before invoking the operation.

Benefit: In a test against edge, it took 13-19 seconds to invoke getByUniqueName() on /branches without the fix in place. With the fix in place, it took about 1.3 seconds.

Reinstate mod_compress on the 'api' vhost

This reduces bandwidth usage by about 90%, which is especially useful when serving WADL documents and large collections. mod_compress was disabled in 2008 due to a bug which has since been fixed. We had a workaround in place, but it never worked.

As of May 21, mod_compress has been re-enabled on staging and edge. Benefit accrues only to clients using lazr.restfulclient 0.9.17 or above.

Benefit: A simple test was run on edge with a completely empty cache: get the WADL file and the first fifty members of the 'people' collection. With an old client, a total of 1.3 megabytes was served and the test took 16.2 seconds to run. With a new client, a total of 130 kilobytes was served and the test took 9.5 seconds to run.
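The effect of compression on repetitive XML can be tried locally with Python's gzip module. This is only a sketch: the document below is made-up filler, not real WADL, so the ratio it prints says nothing about Launchpad's actual numbers.

```python
import gzip

# Made-up, highly repetitive XML standing in for a WADL document.
line = "<resource><param name='example' type='xsd:string'/></resource>\n"
document = (line * 2000).encode("utf-8")

compressed = gzip.compress(document)
saving = 1 - len(compressed) / len(document)

# A client has to advertise support, otherwise the server sends plain bytes.
request_headers = {"Accept-Encoding": "gzip"}

# What a client does with a response marked "Content-Encoding: gzip".
restored = gzip.decompress(compressed)

print(f"{len(document)} bytes -> {len(compressed)} bytes ({saving:.0%} saved)")
```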

Fix bug 583318

A script that looked up an object from a top-level collection (like launchpad.projects['firefox']) and then invoked a named operation on it was sending out two HTTP requests: one for the object itself, and another for the named operation. I was able to eliminate that first HTTP request for all top-level collections.

Benefit: Many scripts will simply not make one or more HTTP requests they would have otherwise made. Benefit usually accrues on startup, and only with new versions of launchpadlib and lazr.restfulclient.
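The idea can be sketched as an object that knows its URL up front and fetches its representation only when an attribute is actually needed. FakeTransport and LazyEntry are hypothetical names for illustration, and this is not the actual lazr.restfulclient implementation.

```python
class FakeTransport:
    """Hypothetical transport that records requests instead of sending them."""

    def __init__(self):
        self.requests = []

    def get(self, url):
        self.requests.append(("GET", url))
        return {"name": url.rsplit("/", 1)[-1]}

    def invoke(self, url, operation, **params):
        self.requests.append(("GET", f"{url}?ws.op={operation}"))
        return {"operation": operation, "params": params}


class LazyEntry:
    """Knows its URL up front; fetches its representation only on demand."""

    def __init__(self, transport, url):
        self._transport = transport
        self._url = url
        self._representation = None

    def attribute(self, name):
        if self._representation is None:
            self._representation = self._transport.get(self._url)
        return self._representation[name]

    def named_operation(self, operation, **params):
        # The URL alone is enough, so the entry itself is never fetched here.
        return self._transport.invoke(self._url, operation, **params)


transport = FakeTransport()
project = LazyEntry(transport, "/firefox")
result = project.named_operation("searchTasks", status="New")
print(transport.requests)
```

Only one request is recorded; the GET for the entry happens later, and only if the script reads one of its attributes.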

Worthwhile but not implemented

Speed up attribute access

When you access an attribute like person.name, it's handled through getattr(). We could use getattr() the first time and then store the value in person.name.

I ran a test comparing C attribute access x 40,000 to lazr.restfulclient attribute access x 40,000. C attribute access took 0.01 seconds. Going through getattr() took 123 seconds. So getattr() is 12,300 times slower. Assuming we can add this optimization in passing without adding a lot of complexity, it's a worthwhile improvement, but it won't speed up typical scripts.

Not worthwhile/too much work

Store representations in memcached

We went a long way down this path but ultimately decided it wasn't worth the effort given current usage patterns. For more details, see The Representation Cache.

Speed up launchpadlib startup time

This is dominated by pkg_resources setup, so there's not that much we can do. We did improve this a bit by accident (see above).

Speed up wadllib parse time

I ran an experiment to see whether it would be faster to load the wadllib Application object from a pickle rather than parsing it every time. To get pickle to work I had to use elementtree.ElementTree (pure Python) instead of cElementTree. This made the initial parse go from about 0.3 seconds to about 3 seconds. Unpickling the pickle took about 0.63 seconds, twice the time it took to just parse the XML. It doesn't seem worth it. (Though I don't really see how it can be faster to create the Application from XML than from a pickle--maybe cElementTree is just really really fast.)
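A comparable experiment can be sketched on a current Python 3, where xml.etree.ElementTree uses its C accelerator by default and elements can be pickled. That setup is an assumption and differs from the one described above. The sketch times only a made-up element tree, not wadllib's Application object, so its numbers are not comparable to the figures quoted here.

```python
import pickle
import timeit
import xml.etree.ElementTree as ET

# Made-up document standing in for the WADL; the size is arbitrary.
xml_text = (
    "<application>"
    + "<resource path='x'><method name='GET'/></resource>" * 5000
    + "</application>"
)

root = ET.fromstring(xml_text)
pickled = pickle.dumps(root, protocol=pickle.HIGHEST_PROTOCOL)

# Time five runs of each way of obtaining the tree.
parse_seconds = timeit.timeit(lambda: ET.fromstring(xml_text), number=5)
unpickle_seconds = timeit.timeit(lambda: pickle.loads(pickled), number=5)

restored = pickle.loads(pickled)
print(f"parse: {parse_seconds:.3f}s  unpickle: {unpickle_seconds:.3f}s")
```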

Avoid database access when building entry representations

While implementing "Store representations in memcached" (q.v.) we discovered that building a representation of a Launchpad bug required 6 database queries. We investigated the possibility of pre-caching these queries and making "build a representation" much faster as an alternative to the representation cache. The problem is that these queries are initiated by Python logic defined in @property methods, making it difficult to consolidate the different database queries. We might be able to get this down to 2 or 3, but 2 or 3 database queries is still pretty slow, so the representation cache is still useful.

Cache collections in a noSQL database

For example, MongoDB. The point of this story is to keep questions about collections from hitting postgres. That is much more expensive than just getting the values for a single row. If we can get the collections very fast from a noSQL db, that might be a big win. It would also support getting "nested" requests (see idea below) quickly. The proposed implementation is similar to the memcached story, except that triggers in postgres would completely maintain the pre-rendered data in the persistent noSQL db, rather than invalidating cached data. We would then use indexes in MongoDB to get the nested collections back. (The problem with this is we don't have good rules for collection cache invalidation.)

Use HTTP 1.1 KeepAlive

According to Gary, getting the Launchpad stack to support HTTP 1.1 is too risky: it fails in production under as-yet-unknown circumstances.

No decision yet

Cache entries and/or collections on the client side

When serving a representation of an entry or collection, send a Cache-Control header that specifies max-age. The client will not make subsequent requests for the entry until the max-age expires. Since the client requested this representation once, it's at least somewhat likely to do it again later. How much time this saves depends on what we choose for max-age.

Collections are currently never cached, but I think that's just because we don't serve any information that would let httplib2 make a conditional request or know when the cache would expire.

This is easy to implement, but to benefit we must accept some level of cache staleness on the client side. It has to be okay for a client to spend a while ignorant of some change to an entry, and (for a collection) ignorant of entries' addition to or removal from the collection.

Caching bug comments client-side is a clear win, since they never change, but it's rare for a client to request a specific bug comment, so it's a very small win. Caching the _collection_ of a bug's comments would be a much bigger win, but then clients would go a while without knowing that a new comment was added to a bug, which we don't like.

Because we are so insistent on providing up-to-date data, I believe the scope for this solution is very small. It's possible some of the HWDB resources could use this.

Collection-specific ETags

We have no general way of calculating an ETag or a Last-Modified for a collection, but it might be possible to set up a hook method that calculates these values for specific collections. This would let launchpadlib make conditional requests for collections. _This_ would let us cache a bug's comments on the client side.
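One possible shape for such a hook, as a sketch only: derive the ETag from each member's identifier and a per-member revision marker, so the value changes when a member is added, removed, or modified. The collection_etag function and the (id, last_modified) tuple shape are hypothetical, not part of lazr.restful.

```python
import hashlib


def collection_etag(members):
    """Derive an ETag from (identifier, last_modified) pairs.

    The pair shape is a hypothetical choice for this sketch. Members are
    sorted so the ETag does not depend on the order they are supplied in.
    """
    digest = hashlib.sha1()
    for identifier, last_modified in sorted(members):
        digest.update(f"{identifier}:{last_modified};".encode("utf-8"))
    return '"%s"' % digest.hexdigest()


comments = [(1, "2010-06-01T10:00:00"), (2, "2010-06-02T09:30:00")]
before = collection_etag(comments)
after = collection_etag(comments + [(3, "2010-06-25T11:46:00")])
print(before != after)
```

For bug comments, which never change once written, the member count and the newest comment's identifier might be enough; the hash above is the more general form.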

Expand links in representations

When you get a list of bugs, each one has a 'bugtask_link' field that links to a bugtask resource. Under this system, you'd be able to GET '/bugs?ws.expand=bugtask_link' and retrieve a representation in which 'bugtask' was a JSON dict containing the representation of a bugtask. If you wanted to look at the bugtask for 50 bugs, this would save you 50 round-trip HTTP requests, at the cost of doing one really expensive HTTP request.

You could expand a collection link as well as an entry link. Get /~leonardr?ws.expand=assigned_bugs_collection_link, and you'll get a representation of an entry in which 'assigned_bugs_collection' is a JSON list of dicts.
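The transformation itself can be sketched in a few lines. This assumes that an expanded field is named by dropping the '_link' suffix, as in the examples above; expand_links, the fetch callable, and the sample URLs are hypothetical, and nothing like this exists in lazr.restful yet.

```python
def expand_links(entry, fields, fetch):
    """Return a copy of `entry` with each named *_link field inlined.

    `fetch` maps a link URL to the representation found there.
    """
    expanded = dict(entry)
    for field in fields:
        if not field.endswith("_link"):
            raise ValueError(f"{field!r} is not a link field")
        target = field[: -len("_link")]
        expanded[target] = fetch(entry[field])
    return expanded


# Hypothetical stand-in for the server's resources.
RESOURCES = {
    "/bugs/1/bugtask": {"status": "New"},
    "/~leonardr/assigned_bugs": [{"id": 1}, {"id": 2}],
}

bug = {"id": 1, "bugtask_link": "/bugs/1/bugtask"}
person = {
    "name": "leonardr",
    "assigned_bugs_collection_link": "/~leonardr/assigned_bugs",
}

expanded_bug = expand_links(bug, ["bugtask_link"], RESOURCES.__getitem__)
expanded_person = expand_links(
    person, ["assigned_bugs_collection_link"], RESOURCES.__getitem__
)
print(expanded_bug)
```

On the server, fetch would build the linked representation in-process, which is where the single expensive request comes from.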

Remove <doc> tags from WADL

The gzipped WADL is currently about 100K. If we removed all the <doc> tags it would be 40K. Serving a stripped-down WADL by default would save some startup time. However, if the WADL contains the <doc> tags we can make doc(launchpad.bugs.search) do something useful. If we don't serve any human-readable description, users will have to go to the HTML apidoc. We could make "make doc() work" a constructor option--you could turn it on during development and turn it off when you released your script.

Since Launchpad now serves the full WADL relatively rarely, this isn't as important as it used to be.
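Stripping the tags is mechanical. The sketch below removes every element whose local name is 'doc', comparing local names so that it does not depend on the exact WADL namespace URI. The strip_doc_elements function and the sample document with its namespace are made up for illustration.

```python
import xml.etree.ElementTree as ET


def strip_doc_elements(wadl_text):
    """Return the document with every <doc> element removed."""
    root = ET.fromstring(wadl_text)
    # Snapshot the elements first so removal cannot disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            # Tags look like '{namespace}doc'; compare the local name only.
            if child.tag.rsplit("}", 1)[-1] == "doc":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


sample = """<application xmlns="http://example.invalid/wadl">
  <resource path="bugs">
    <doc>Search for bugs.</doc>
    <method name="GET"><doc>Nested docs too.</doc></method>
  </resource>
</application>"""

stripped = strip_doc_elements(sample)
print(stripped)
```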

Others

[Network] Switch many requests to HTTP, to avoid SSL handshake costs. Since Launchpad is doing this, we should see how much time this would save and how much work it would be to piggyback on Launchpad's success.
Examine actual usage of launchpadlib in popular scripts to find broken abstractions, cost savings through named operations, etc.
[Client] Profile the client and examine the results.
[Server] Profile server code and examine. Maybe add zc.zservertracelog notes for when lazr.restful code starts, when it passes off to launchpad code, when it gets the result from launchpad, and when it hands the result back to the publisher. First and last values may be unnecessary--equivalent to already-existing values.
[Client?] Bug 274074 - It's not only more annoying, but slower to get the size of a collection returned by a named operation. This might be a client- or a server-side fix.

Process

First step: quantify performance

We want to be able to measure our performance. Ideally, this would be both end-to-end and subdivided into our network performance, our performance on the client, and our performance on the server. These measurements have four goals.

Help us more accurately guess the potential effectiveness of a given solution, to help us winnow and prioritize the list.
Help us evaluate the effectiveness of a given solution after a full or partial implementation, to validate our efforts.
Help us determine what quantifiable performance level gives our users a qualitatively positive experience.
Help us move quickly.

The last goal means we need to find a balance between thoroughness and expediency in our construction of tests.

https://dev.launchpad.net/Foundations/Webservice?action=AttachFile&do=view&target=performance_test.py
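A subdivided measurement can be sketched with a small timing helper. This is not the attached performance_test.py; the timed context manager and the phase names are hypothetical, and the sleeps and the sum are stand-ins for real client, network, and server work.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(phase):
    """Add the wall-clock time spent inside the block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[phase] = timings.get(phase, 0.0) + elapsed


with timed("end_to_end"):
    with timed("client"):
        sum(range(10000))   # stand-in for client-side processing
    with timed("network"):
        time.sleep(0.01)    # stand-in for the round trip
    with timed("server"):
        time.sleep(0.005)   # stand-in for server-side work

for phase, seconds in sorted(timings.items()):
    print(f"{phase}: {seconds:.4f}s")
```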

Second step: collect, evaluate, winnow, and prioritize possible solutions

We are particularly responsible for the systemic performance of the webservice. This means that we want the average performance to be good. We need to work with the Launchpad team to create good performance within individual requests, but we are more interested here in things that can make the whole webservice faster. Tools that help developers make individual pages faster, even if they take some effort and customization to use, are also of interest.

Again, our solutions will focus on different aspects of the end-to-end performance of the webservice. We then have three basic areas to attack.

Reduce and speed network requests.
Make the launchpadlib requests faster systemically on the server.
Make the launchpadlib client faster.

Third step: implement the next solution

The next solution is TBD.

Next...

Rinse and repeat back to the first step, trying to determine if our quantifiable performance gives a qualitative experience that we find acceptable.

Foundations/Webservice/Performance (last edited 2010-06-25 11:46:00 by leonardr)
