Diff for "Foundations/Webservice"

Differences between revisions 10 and 11
Revision 10 as of 2010-03-24 20:46:07
Size: 7378
Editor: gary
Comment:
Revision 11 as of 2010-03-25 18:21:36
Size: 7528
Editor: gary
Comment:

Webservice

The Foundations team is responsible both for the infrastructure and for the overall quality of service of the Launchpad webservice.

Infrastructure

XXX: launchpadlib (lazr.restful, lazr.restfulclient, wadllib)

QA

We need to QA some of the key launchpadlib applications before a release. Here are the instructions we have gathered for this so far.

apport

Open Bugs

Webservice Bugs

We use the "api" tag. (A short version of the URL is http://tr.im/RAGU)

Infrastructure Bugs:

Versions

The Launchpad webservice is published in versions. This lets us support released applications, such as apport, while continuing to improve our API in backwards-incompatible ways. A version remains supported until the Ubuntu release that uses it becomes unsupported.

NOTE: We have the mechanism to have separate, frozen versions. We need to actually have tests showing that APIs supporting key applications do not change. We may need to get community help for this.

See this page for the available versions: https://edge.launchpad.net/+apidoc/

Performance improvements

An upcoming initiative is to address one of our most frequently heard complaints: slow performance of the webservice.

What we have now are preliminary notes. This section will probably deserve its own page soon.

First step: quantify performance

We want to be able to measure our performance. Ideally, this would be both end-to-end and subdivided into our network performance, our performance on the client, and our performance on the server. These measurements have four goals.

  • Help us more accurately guess the potential effectiveness of a given solution, to help us winnow and prioritize the list.
  • Help us evaluate the effectiveness of a given solution after a full or partial implementation, to validate our efforts.
  • Help us determine what quantifiable performance level gives our users a qualitatively positive experience.
  • Help us move quickly.

The last goal means we need to find a balance between thoroughness and expediency in our construction of tests.
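As a starting point for the expedient end of that balance, here is a minimal sketch of a client-side, end-to-end timing harness. It measures wall-clock time only and does not split the result into network, client, and server shares. The names time_call, summarize, and fake_request are hypothetical, and fake_request stands in for a real launchpadlib call.

```python
import statistics
import time


def time_call(func, repeats=5):
    """Call func() `repeats` times; return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to a few numbers we can compare over time."""
    return {
        "count": len(durations),
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def fake_request():
    # Hypothetical placeholder for a real webservice request; it only sleeps.
    time.sleep(0.01)


if __name__ == "__main__":
    print(summarize(time_call(fake_request)))
```

Recording the median alongside the minimum and maximum makes it easier to tell a systemic change from a single slow request.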

Second step: collect, evaluate, winnow, and prioritize possible solutions

We are particularly responsible for the systemic performance of the webservice: we want the average performance to be good. We need to work with the Launchpad team to create good performance within individual requests, but here we are more interested in things that can make the whole webservice faster. Tools that make it easier for developers to speed up individual pages, even if they require some effort and customization, are also of interest.

Again, our solutions will focus on different aspects of the end-to-end performance of the webservice. We then have three basic areas to attack.

  • Reduce and speed network requests.
  • Make the launchpadlib requests faster systemically on the server.
  • Make the launchpadlib client faster.

The following collects brainstormed ideas so far.

  • [Network] Switch many requests to HTTP, to avoid SSL handshake costs
  • [Network] Investigate whether HTTP 1.1 KeepAlive is enabled and, if not, how it can be enabled

  • [Server] Incorporate memcached and/or a NoSQL database such as MongoDB
    • memcached story: memcached would cache object pre-renderings. These include etag information and all the pre-security-screening information that lazr.restful would send over the wire. lazr.restful would get the pre-renderings, turn them into security-screened versions, and send those over the wire. (Questions: will the security-screening code need to touch the Storm object so much that it negates most of the speed increase? If so, can we cache the security questions in a way that makes this faster?) When the database performs an update, it would use triggers to invalidate the associated values stored in memcached, if any. (Questions: how badly will this affect our write performance? Can we mitigate that?)
    • MongoDB story: the point of this story is to keep questions about collections from hitting postgres. Those are much more expensive than just getting the values for a single row. If we can get the collections very fast from a NoSQL database, that might be a big win. It would also support getting "nested" requests (see idea below) quickly. The proposed implementation is similar to the memcached story, except that triggers in postgres would completely maintain the pre-rendered data in the persistent NoSQL database, rather than invalidating cached data. We would then use indexes in MongoDB to get the nested collections back.
  • [Network] Support "nested" requests: e.g., get a person and the first 50 of her bugs in a single request.
  • [Client] Profile client code and examine
  • [Server] Profile server code and examine
  • [Server] Add zc.zservertracelog notes for when lazr.restful code starts, when it passes off to launchpad code, when it gets the result from launchpad, and when it hands the result back to the publisher. First and last values may be unnecessary--equivalent to already-existing values.
  • [Client] Faster import of launchpadlib (what is the problem?)
  • [Client] Can we provide cache control for the WADL file? Maybe cache it for one day?
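
To make the memcached story in the list above concrete, here is a minimal sketch of its read and write paths. It is not lazr.restful code. A plain dictionary stands in for memcached, another for the database, and a set of visible field names stands in for security screening. Every name here (PreRenderCache, render_full, screen, fetch, update) is hypothetical, and etag handling is left out.

```python
class PreRenderCache:
    """In-memory stand-in for memcached; holds pre-screening renderings."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

    def delete(self, key):
        self._store.pop(key, None)


def render_full(row):
    # Hypothetical pre-rendering: everything that could go over the wire,
    # before any security screening is applied.
    return dict(row)


def screen(rendering, visible_fields):
    # Hypothetical security screening: keep only fields this user may see.
    return {k: v for k, v in rendering.items() if k in visible_fields}


def fetch(cache, database, key, visible_fields):
    """Read path: use the cached pre-rendering if present, then screen it."""
    rendering = cache.get(key)
    if rendering is None:
        rendering = render_full(database[key])
        cache.set(key, rendering)
    return screen(rendering, visible_fields)


def update(cache, database, key, **changes):
    """Write path: change the row, then invalidate the cached rendering.

    The explicit delete stands in for the database trigger in the story.
    """
    database[key].update(changes)
    cache.delete(key)


if __name__ == "__main__":
    db = {"bug/1": {"title": "Crash on start", "private_note": "internal"}}
    cache = PreRenderCache()
    print(fetch(cache, db, "bug/1", {"title"}))
    update(cache, db, "bug/1", title="Crash on startup")
    print(fetch(cache, db, "bug/1", {"title"}))
```

The sketch screens on every read, which is where the first open question applies: if screening is as expensive as rendering, caching the pre-rendering saves little.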

Third step: implement the next solution

The next solution is TBD.

Next...

Rinse and repeat from the first step, trying to determine whether our quantifiable performance gives a qualitative experience that we find acceptable.

Usability improvements

After working on performance, we want to focus on finding some key usability improvements.

XXX list pertinent bugs and other thoughts. Here's a start.

534363 no easy way to call destructor

274074 Missing total_size on collections returned by named operations

481090 Cannot define a method that returns a dict

487522 named_get's do not support custom batches via slice

539070 Unhelpful error on badly declared API export

541637 Mutated items in launchpadlib search results iterator can break iteration

380504 Handle 502 Bad Gateway error automatically

Foundations/Webservice (last edited 2010-11-02 00:52:18 by gary)