Foundations/Webservice/Performance

This page tracks the work to discover and implement the best performance improvements to the Launchpad web service.

First step: quantify performance

We want to be able to measure our performance. Ideally, we would measure both end-to-end performance and its major subdivisions: network performance, client-side performance, and server-side performance. This measurement work has four goals.

The last goal means we need to find a balance between thoroughness and expediency in our construction of tests.
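As a starting point for end-to-end measurement, the kind of harness we have in mind looks roughly like this (a minimal sketch only: the service root, consumer name, and the entry being fetched are placeholder assumptions, not the values the real performance_test.py uses):

{{{
# Minimal timing-harness sketch.  The service root, consumer name, and the
# entry being fetched ('salgado') are placeholders for illustration.
import time

start = time.time()
from launchpadlib.launchpad import Launchpad
import_cost = time.time() - start

start = time.time()
launchpad = Launchpad.login_anonymously(
    'performance-test', 'https://api.launchpad.dev/')
person = launchpad.people['salgado']
startup_cost = time.time() - start

timings = []
for i in range(30):
    start = time.time()
    person.lp_refresh()  # re-fetch the entry's representation from the server
    timings.append(time.time() - start)

print 'Import cost: %.2f sec' % import_cost
print 'Startup cost: %.2f sec' % startup_cost
print 'First fetch took %.2f sec' % timings[0]
print 'First five fetches took %.2f sec (mean: %.2f sec)' % (
    sum(timings[:5]), sum(timings[:5]) / 5)
print 'All 30 fetches took %.2f sec (mean: %.2f sec)' % (
    sum(timings), sum(timings) / len(timings))
}}}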

Second step: collect, evaluate, winnow, and prioritize possible solutions

We are particularly responsible for the systemic performance of the webservice, which means we want the average performance to be good. We need to work with the Launchpad team to achieve good performance within individual requests, but here we are more interested in changes that can make the whole webservice faster. Tools that make it easy for developers to speed up individual pages, even if they require some effort and customization, are also of interest.

Again, our solutions will focus on different aspects of the end-to-end performance of the webservice. We then have three basic areas to attack.

The following collects brainstormed ideas so far.

Third step: implement the next solution

The next solution is TBD.

Next...

Rinse and repeat: return to the first step and try to determine whether our measured performance yields a qualitative experience we find acceptable.

memcached tests

I hacked lazr.restful to cache completed representations in memcached, and to use them when they were present. This would not work in a real situation, but it gives an upper bound on how much time we can possibly save by using memcached. I used the performance_test.py script (https://dev.launchpad.net/Foundations/Webservice?action=AttachFile&do=view&target=performance_test.py) throughout.
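In outline, the hack amounted to wrapping representation generation in a memcached lookup, roughly like this (a sketch only, assuming the python-memcached client; entry_url() and build_representation() are made-up stand-ins for the real lazr.restful internals, not its actual API):

{{{
# Rough outline of the caching hack; not the actual lazr.restful patch.
import memcache

cache = memcache.Client(['127.0.0.1:11211'])

def representation_for(entry, media_type='application/json'):
    # entry_url() and build_representation() are hypothetical stand-ins
    # for the real lazr.restful machinery.
    key = 'representation:%s:%s' % (entry_url(entry), media_type)
    cached = cache.get(key)
    if cached is not None:
        # Serve the completed representation straight from memcached.
        return cached
    representation = build_representation(entry, media_type)  # the expensive part
    cache.set(key, representation)
    return representation
}}}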

Entries

Here are two runs of the script retrieving an entry 30 times. (I had to disable conditional requests.)

Import cost: 0.44 sec
Startup cost: 1.27 sec
First fetch took 0.18 sec
First five fetches took 0.66 sec (mean: 0.13 sec)
All 30 fetches took 3.13 sec (mean: 0.10 sec)

Import cost: 0.44 sec
Startup cost: 0.84 sec
First fetch took 0.10 sec
First five fetches took 0.50 sec (mean: 0.10 sec)
All 30 fetches took 3.31 sec (mean: 0.11 sec)

Then I introduced memcached; here are the results:

Import cost: 0.47 sec
Startup cost: 1.27 sec
First fetch took 0.17 sec
First five fetches took 0.58 sec (mean: 0.12 sec)
All 30 fetches took 2.80 sec (mean: 0.09 sec)

Import cost: 0.44 sec
Startup cost: 0.86 sec
First fetch took 0.08 sec
First five fetches took 0.43 sec (mean: 0.09 sec)
All 30 fetches took 2.86 sec (mean: 0.10 sec)

As you can see, there's no significant benefit to caching a single entry representation over not caching it.

Collections

Here's the script retrieving the first page of a collection 30 times.

Import cost: 1.34 sec
Startup cost: 2.73 sec
First fetch took 0.77 sec
First five fetches took 3.01 sec (mean: 0.60 sec)
All 30 fetches took 18.28 sec (mean: 0.61 sec)

Then I introduced memcached; here are the results:

Import cost: 0.99 sec
Startup cost: 2.67 sec
First fetch took 0.91 sec
First five fetches took 1.98 sec (mean: 0.40 sec)
All 30 fetches took 5.26 sec (mean: 0.18 sec)

Here there is a very significant benefit to using memcached.

ETags

Then I wanted to see how much benefit would flow from caching entry ETags. I reinstated the conditional GET code and ran another entry test. This time I did 300 fetches.

Import cost: 0.42 sec
Startup cost: 1.22 sec
First fetch took 0.17 sec
First five fetches took 0.62 sec (mean: 0.12 sec)
All 300 fetches took 31.22 sec (mean: 0.10 sec)

Then I added code that would store the calculated ETag in memcached (sketched at the end of this section). The result:

Import cost: 0.42 sec
Startup cost: 0.81 sec
First fetch took 0.13 sec
First five fetches took 0.56 sec (mean: 0.11 sec)
All 300 fetches took 32.85 sec (mean: 0.11 sec)

Again, there was no significant difference at the level of individual entries.
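For reference, the ETag change amounted to something like the following (again a sketch with made-up helper names rather than the real conditional-GET code in lazr.restful; cache, entry_url() and representation_for() are as in the earlier sketch):

{{{
# Sketch of caching the calculated ETag in memcached; helper names are made up.
def etag_for(entry):
    key = 'etag:%s' % entry_url(entry)
    etag = cache.get(key)
    if etag is None:
        etag = calculate_etag(entry)  # the work we hoped to avoid repeating
        cache.set(key, etag)
    return etag

def handle_get(request, entry):
    # Conditional GET: if the client's cached copy is still current, answer
    # with 304 Not Modified and skip building the representation entirely.
    etag = etag_for(entry)
    if request.headers.get('If-None-Match') == etag:
        return not_modified_response(etag)
    return full_response(representation_for(entry), etag)
}}}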

Conclusion

If we're going to get benefits from using memcached, they will have to be small benefits multiplied across the large number of entries found in a collection.