Foundations/Memcached


Revision 1 as of 2010-03-26 19:37:57


(Raw text dump from email from Stuart Bishop)

You may be aware that we have added cache support to TALES to allow us to easily cache rendered portions of page templates in memcached.

We have been running this on edge for a while with caching on two pages - https://edge.launchpad.net and https://code.edge.launchpad.net.

For actual measurements, I had been getting odd results. This has been tracked down to one of the edge servers being unable to access the memcached servers. I've opened RT #38309 to get this sorted. This had the side effect of demonstrating that things continue running uninterrupted when the memcached servers are down.

The overhead of storing chunks in memcached seems insignificant. The performance improvements can be dramatic. With a populated cache, the code front page is generated several seconds faster. When all chunks 'hit', the code front page is generated in under a second. With all misses, the code front page takes a minimum of 5 seconds to generate. There is still a fair bit of variation in generation time though. I think a better approach will be to measure statement counts. With an empty cache, the code front page issues over 70 database queries. With a populated cache, it issues 15.
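The hit/miss mechanics behind those numbers can be sketched in a few lines. This is a minimal illustration, not Launchpad's actual code: the dict-backed class stands in for a real memcached client (matching its get/set interface), and render_tag_cloud and its query count are invented for the example.

```python
class FakeMemcache:
    """Dict-backed stand-in for a memcached client (get/set only)."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        return self._store.get(key)
    def set(self, key, value):
        self._store[key] = value

cache = FakeMemcache()
queries_issued = 0

def render_tag_cloud():
    """Pretend-expensive render: bumps the query counter."""
    global queries_issued
    queries_issued += 5  # imagine 5 database queries per render
    return "<div>tag cloud</div>"

def cached_chunk(key, render):
    """Return the cached rendering if present, else render and store it."""
    chunk = cache.get(key)
    if chunk is None:
        chunk = render()
        cache.set(key, chunk)
    return chunk

first = cached_chunk("code-frontpage:tag-cloud", render_tag_cloud)   # miss: queries run
second = cached_chunk("code-frontpage:tag-cloud", render_tag_cloud)  # hit: no queries
```

The miss pays the full rendering cost once; every subsequent hit skips the database entirely, which is where the drop from 70+ queries to 15 comes from.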

I think we are ready to use this to cache chunks of content outside of tal:repeat loops. Inside tal:repeat loops, the syntax needs to be extended. At the moment we can say 'this is the first comment' but not 'this is comment #42'; caching by position fails when a new comment is added and comment #42 becomes the second comment in the repeated section.

The doctest for the current syntax is at http://bazaar.launchpad.net/~launchpad-pqm/launchpad/devel/annotate/head%3A/lib/lp/services/memcache/doc/tales-cache.txt. I believe the loop fix will just involve a third, optional parameter being added so no need to worry about the syntax changing.
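The loop problem can be demonstrated outside of TAL. In this sketch (the data and key formats are illustrative), keying a cached chunk by its position in the loop goes stale as soon as an item is inserted, while keying by a stable identifier (which is what a third optional parameter could supply) stays correct.

```python
cache = {}

def render_comments(comments, key_for):
    """Render each comment, caching chunks under key_for(index, comment)."""
    out = []
    for index, comment in enumerate(comments):
        key = key_for(index, comment)
        if key not in cache:
            cache[key] = "<p>%s</p>" % comment["text"]
        out.append(cache[key])
    return out

comments = [{"id": 41, "text": "first"}, {"id": 42, "text": "second"}]

# Positional keys: 'this is the first comment'.
positional = lambda i, c: "comment:%d" % i
render_comments(comments, positional)

# A new comment arrives at the top; every position shifts, but the
# positional cache entries do not, so stale chunks are served.
comments.insert(0, {"id": 43, "text": "newest"})
stale = render_comments(comments, positional)  # stale[0] still shows "first"

# Stable keys: 'this is comment #42'. Insertions cannot shift them.
cache.clear()
stable = lambda i, c: "comment:%d" % c["id"]
render_comments(comments, stable)
fresh = render_comments(comments, stable)
```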

One important thing to remember: This feature will not fix timeouts. It might make them less frequent, but it will still take just as long to generate a page with no hits as it does without using the feature. This feature aims to improve the average generation time of pages, improving usability and perceived performance.

Another important thing: Look before you leak. For best performance, you want your cached chunks to be publicly visible. Before you do this, you should ensure no private information could be included in the cached chunk. For example, the code front page has a big public chunk (the tag cloud), but the recent activity sections are private as private branches may be listed there. (As an alternative, consider removing the display of private information in the chunk - I suspect nobody would care if we stopped listing private branches on the code front page and caching and performance would be improved).
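One way to make the public/private distinction concrete in key terms: a publicly visible chunk is shared under a single key, while a chunk that might contain private data has to be scoped to the viewing user (trading away most of its cache hits). This is a sketch with hypothetical names, not the actual key scheme.

```python
def chunk_key(template, chunk, user=None):
    """Public chunks share one key; chunks that could contain
    private data are scoped to the viewing user instead."""
    if user is None:
        return "%s:%s:public" % (template, chunk)
    return "%s:%s:user=%s" % (template, chunk, user)

# The tag cloud is the same for everyone: one cached copy serves all.
public_key = chunk_key("code-frontpage", "tag-cloud")

# Recent activity may list private branches, so each user gets a
# separate cached copy - far fewer hits, but no leaked information.
alice_key = chunk_key("code-frontpage", "recent-activity", user="alice")
bob_key = chunk_key("code-frontpage", "recent-activity", user="bob")
```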

An exercise for an interested reader: some information needs to be fairly up to date but can be expensive to calculate - subscriber lists, for example. How about we cache them for a lengthy period of time, but refresh that information asynchronously on page load? Perceived performance is improved, bots still get their links to crawl, and interactive users still see up-to-date information.
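The stale-while-refreshing idea above could look something like this: serve whatever is cached immediately, and recompute the value on a background thread so the next request sees fresh data. The function names are hypothetical, and a real implementation would also need to guard against stampedes of concurrent refreshes.

```python
import threading

cache = {}

def compute_subscribers():
    """Stand-in for an expensive subscriber-list query."""
    return ["alice", "bob"]

def get_subscribers_async():
    """Return the cached list at once; refresh it in the background."""
    stale = cache.get("subscribers", [])
    worker = threading.Thread(
        target=lambda: cache.__setitem__("subscribers", compute_subscribers()))
    worker.start()
    return stale, worker

first, worker = get_subscribers_async()   # cache empty: caller sees []
worker.join()                             # background refresh completes
second, worker = get_subscribers_async()  # now the fresh list is served
worker.join()
```

The page renders without ever blocking on the expensive calculation; it is at most one refresh behind.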

We would like to get this information upstream into the Z3 code base, as the new syntax could be added there much more cleanly and with fewer expletives. There are a few LPisms that will need to be abstracted, such as using the tree revision number and $LPCONFIG when calculating cache keys.
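The key-calculation LPisms mentioned above amount to mixing deployment-specific values into each cache key so that different code revisions and configurations never share stale chunks. A sketch of the shape of that calculation (the exact inputs Launchpad uses are assumptions here):

```python
import hashlib
import os

def cache_key(template_path, chunk_id, revision, config=None):
    """Hash the identifying parts into a fixed-length memcached key.

    Including the tree revision means an upgrade invalidates every
    cached chunk; including $LPCONFIG keeps e.g. edge and production
    from sharing entries.
    """
    if config is None:
        config = os.environ.get("LPCONFIG", "development")
    raw = "%s|%s|r%s|%s" % (template_path, chunk_id, revision, config)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()

k1 = cache_key("code/frontpage.pt", "tag-cloud", 10000, config="edge")
k2 = cache_key("code/frontpage.pt", "tag-cloud", 10001, config="edge")
# k1 != k2: rolling out a new revision starts from a cold cache.
```

Abstracting these two inputs behind a hook is presumably what would need doing before the syntax could land cleanly in the Z3 code base.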