Diff for "Database/Performance"

Not logged in - Log In / Register

Differences between revisions 1 and 2
Revision 1 as of 2010-08-11 23:55:32
Size: 24
Editor: lifeless
Comment: start!
Revision 2 as of 2010-08-12 01:50:52
Size: 2117
Editor: lifeless
Comment: start
Deletions are marked like this. Additions are marked like this.
Line 2: Line 2:

= Poor query times - looks right, takes ages =

Poor query times can be caused by bad query plans generated on the DB - talk to a team lead to get an explain analyse done on staging, or to Stuart or a LOSA to get the same done on production (if the staging one looks ok its important to check production too).

Bad plans can happen due to out of date statistics or corner cases, sometimes rewriting the query to be simpler/slightly different can help. Specific things to watch out for are nested joins and unneeded tables (which can dramatically change the lookup).

Poor query times can also be caused by fat indices - if the explain looks sensible, talk to Stuart about this. Another cause is missing indices - check that the query should be indexable (and if in doubt chat to Stuart).

= Many small queries [in a webapp context]=

Databases work best when used to work with sets of data, not objects - but we write in python, which is procedural and we define per-object code paths.

One particular trip up that can occur is with related and derived data.

Consider:
{{{
def get_finished_foos(self):
    return Store.of(self).find(Foo, And(Foo.me == self.id, Foo.finished == True))
}}}

This will perform great for one object, but if you use it in a loop going over even as few as 30 or 40 objects you will cause a large amount of work - 30 to 40 separate round trips to the database.

Its much better to prepopulate a cache of these finished_foos when you request the owning object in the first place, when you know that you will need them.

To do this, use a Tuple Query with Storm, and assign the related objects to a cached attribute which your method can return. For attributes the {{{@cachedproperty('_foo_cached')}}} can be used to do this in combination with a {{{DecoratedResultSet}}}

Be sure to clear these caches with a Storm invalidation hook, to avoid test suite fallout. Objects are not reused between requests on the appservers, so we're generally safe there.

A word of warning too - Utilities will often get in the way of optimising this :)

Poor query times - looks right, takes ages

Poor query times can be caused by bad query plans generated on the DB - talk to a team lead to get an explain analyse done on staging, or to Stuart or a LOSA to get the same done on production (if the staging one looks ok its important to check production too).

Bad plans can happen due to out of date statistics or corner cases, sometimes rewriting the query to be simpler/slightly different can help. Specific things to watch out for are nested joins and unneeded tables (which can dramatically change the lookup).

Poor query times can also be caused by fat indices - if the explain looks sensible, talk to Stuart about this. Another cause is missing indices - check that the query should be indexable (and if in doubt chat to Stuart).

= Many small queries [in a webapp context]=

Databases work best when used to work with sets of data, not objects - but we write in python, which is procedural and we define per-object code paths.

One particular trip up that can occur is with related and derived data.

Consider:

def get_finished_foos(self):
    return Store.of(self).find(Foo, And(Foo.me == self.id, Foo.finished == True))

This will perform great for one object, but if you use it in a loop going over even as few as 30 or 40 objects you will cause a large amount of work - 30 to 40 separate round trips to the database.

Its much better to prepopulate a cache of these finished_foos when you request the owning object in the first place, when you know that you will need them.

To do this, use a Tuple Query with Storm, and assign the related objects to a cached attribute which your method can return. For attributes the @cachedproperty('_foo_cached') can be used to do this in combination with a DecoratedResultSet

Be sure to clear these caches with a Storm invalidation hook, to avoid test suite fallout. Objects are not reused between requests on the appservers, so we're generally safe there.

A word of warning too - Utilities will often get in the way of optimising this :)

Database/Performance (last edited 2011-09-28 09:06:12 by adeuring)