
= Memcached integration =

== TALES support ==

We have added cache support to TALES to allow us to easily cache rendered portions of page templates in memcached.

The doctest for the current syntax is at http://bazaar.launchpad.net/~launchpad-pqm/launchpad/devel/annotate/head%3A/lib/lp/services/memcache/doc/tales-cache.txt.
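The underlying pattern is the usual cache-aside one: look a rendered chunk up by key, and only re-render on a miss. The sketch below is illustrative only; the real feature is a TALES expression (see the doctest linked above), and the names and the dict standing in for the memcached client here are hypothetical.

```python
RENDER_COUNT = 0

def render_chunk():
    """Pretend this is an expensive template-rendering step."""
    global RENDER_COUNT
    RENDER_COUNT += 1
    return "<div>tag cloud</div>"

class FakeMemcache:
    """Minimal stand-in for a memcached client (get/set subset)."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        return self._store.get(key)
    def set(self, key, value, time=0):
        self._store[key] = value

cache = FakeMemcache()

def cached_chunk(key, render, client=cache):
    """Return the cached rendering for `key`, rendering only on a miss."""
    value = client.get(key)
    if value is None:          # miss: render and store for next time
        value = render()
        client.set(key, value)
    return value               # hit: no rendering cost

first = cached_chunk("code-frontpage:tag-cloud", render_chunk)
second = cached_chunk("code-frontpage:tag-cloud", render_chunk)
```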

One important thing to remember: this feature will not fix timeouts. It might make them less frequent, but a page with no cache hits still takes just as long to generate as it would without the feature. The aim is to improve the average generation time of pages, improving usability and perceived performance.

Another important thing: look before you leak. For best performance, you want your cached chunks to be publicly visible. Before doing this, you should ensure no private information could end up in the cached chunk. For example, the code front page has a big public chunk (the tag cloud), but the recent activity sections are private, as private branches may be listed there. (As an alternative, consider not displaying the private information in the chunk at all; I suspect nobody would care if we stopped listing private branches on the code front page, and both caching and performance would improve.)
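One way to make the public/private split mechanical is to bake visibility into the cache key, so a private chunk is scoped to the viewing user and can never be served to anyone else. This is a hypothetical sketch of that rule, not the feature's actual key scheme:

```python
def chunk_key(chunk_id, public, user=None):
    """Build a cache key; private chunks are scoped to one user."""
    if public:
        # Safe to share one entry between all viewers.
        return "chunk:%s:public" % chunk_id
    if user is None:
        raise ValueError("private chunk rendered for anonymous user")
    # Each user gets their own entry, so nothing can leak across users.
    return "chunk:%s:user:%s" % (chunk_id, user)

# The tag cloud is safe to share; recent activity may list private branches.
public_key = chunk_key("tag-cloud", public=True)
alice_key = chunk_key("recent-activity", public=False, user="alice")
bob_key = chunk_key("recent-activity", public=False, user="bob")
```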

=== Performance notes ===

The overhead of storing chunks in memcached seems insignificant, and the performance improvements can be dramatic. With a populated cache, the code front page is generated several seconds faster. When all chunks hit, the code front page is generated in under a second; with all misses, it takes a minimum of 5 seconds to generate. There is still a fair bit of variation in generation time, though, so a better approach would be to measure statement counts: with an empty cache, the code front page issues over 70 database queries; with a populated cache, it issues 15.

=== To do ===

We would like to get this work upstream into the Z3 code base, as the new syntax could be added there much more cleanly and with fewer expletives. There are a few LPisms that will need to be abstracted first, such as using the tree revision number and $LPCONFIG when calculating cache keys.
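Prefixing every key with the tree revision and the config instance means a new rollout (or a different environment) never reads entries written by another. A hedged sketch of that idea; the key layout and `revision_number` parameter are assumptions, not the real code:

```python
import os

def cache_key(template_path, chunk_id, revision_number, config=None):
    """Scope a cache key to one code revision and one $LPCONFIG instance."""
    if config is None:
        # Fall back to the environment, as Launchpad configs are selected
        # via $LPCONFIG; the "development" default here is an assumption.
        config = os.environ.get("LPCONFIG", "development")
    return "%s:r%d:%s:%s" % (config, revision_number, template_path, chunk_id)

# Deploying revision 9001 cold-starts with fresh keys; edge and production
# configs never collide even on a shared memcached cluster.
old = cache_key("code/frontpage.pt", "tag-cloud", 9000, config="edge")
new = cache_key("code/frontpage.pt", "tag-cloud", 9001, config="edge")
prod = cache_key("code/frontpage.pt", "tag-cloud", 9001, config="production")
```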

=== Ideas for further refinement ===

==== Refresh asynchronously on page load ====

(Stuart) Some information needs to be fairly up to date, but can be expensive to calculate (subscriber lists, for example). How about we cache it for a lengthy period of time, but refresh it asynchronously on page load? Perceived performance is improved, bots still get their links to crawl, and interactive users still see up-to-date information.

Gary: Maybe. Do you have ideas on how we might perform the asynchronous jobs? I'd prefer to have an invalidation story, if that can work.
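One possible shape for the asynchronous refresh discussed above is a stale-while-revalidate cache: answer the request from the cached value immediately, and if the entry is older than the refresh interval, recompute it on a background thread for later requests. This is a sketch of the pattern, not Launchpad code; the names are hypothetical.

```python
import threading
import time

class AsyncRefreshCache:
    """Serve a possibly stale value instantly; refresh it off the request path."""

    def __init__(self, compute, refresh_after):
        self._compute = compute          # expensive call, e.g. a subscriber list
        self._refresh_after = refresh_after
        self._value = None
        self._stamp = 0.0
        self._lock = threading.Lock()

    def get(self):
        if self._value is None:
            # First request ever: nothing to serve, compute synchronously.
            self._value = self._compute()
            self._stamp = time.time()
        elif time.time() - self._stamp > self._refresh_after:
            # Stale: kick off a background refresh, but answer right now.
            threading.Thread(target=self._refresh).start()
        return self._value

    def _refresh(self):
        value = self._compute()          # recomputed outside any request
        with self._lock:
            self._value = value
            self._stamp = time.time()

# Cache a (fake) subscriber list for an hour.
subscribers = AsyncRefreshCache(lambda: ["alice", "bob"], refresh_after=3600)
```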

== Webservice integration ==

This was investigated, attempted and shelved. See [[Foundations/Webservice/RepresentationCache|the notes]].

== Invalidating cached data ==

See the [[Foundations/Proposals/InvalidatingPageSnippets|proposal]].

== Cache Team Participation Checks ==

We might be able to cache team participation checks in the security system. Stuart intends to construct a cheap test to see whether this has value. (XXX more description or a proposal would be nice.)
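A cheap test of the idea might be a per-request memoiser for (person, team) lookups, so repeated security checks hit the database once. Everything here is hypothetical; `query_team_participation` is a stand-in for the real lookup.

```python
QUERY_COUNT = 0

def query_team_participation(person, team):
    """Stand-in for the real TeamParticipation database lookup."""
    global QUERY_COUNT
    QUERY_COUNT += 1
    return (person, team) in {("stuart", "launchpad-dev")}

class ParticipationCache:
    """Memoise participation checks for the lifetime of one request."""

    def __init__(self):
        self._cache = {}

    def in_team(self, person, team):
        key = (person, team)
        if key not in self._cache:
            self._cache[key] = query_team_participation(person, team)
        return self._cache[key]

checks = ParticipationCache()
first = checks.in_team("stuart", "launchpad-dev")   # queries the database
second = checks.in_team("stuart", "launchpad-dev")  # served from the cache
```

Measuring how often the same check repeats within one request would show whether the cache has value.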

Foundations/Memcached (last edited 2010-06-22 17:20:16 by gary)