QA tagger tools audit

This is a performance audit of the QA tagger tools, focusing on tag_bugs_to_qa.py.

Obsolete code

Bug 274074 is fixed released in wadllib 1.5.4, so you shouldn't need length() anymore.

You can now set a bugtask's assignee/status/milestone directly (and then call lp_save) rather than having to invoke transitionToAssignee/transitionToStatus/transitionToMilestone.

Extra HTTP requests

Here's some code from find_bugtask_and_assignee(). I've annotated the lines of code that make HTTP requests. "1" means the code makes one HTTP request. "N" means the code is inside a loop and makes one HTTP request for every loop element.

...
            # Checking if there's a task assigned to the committer,
            # and changing its status.
            found_it = False
1           for bug_task in fixed_bug.bug_tasks:
N                if not bug_task.assignee:
                    continue
N                if bug_task.assignee.self_link == assignee.self_link:
                    found_it = True
                    logger.info("Committer is assigned to the"
                        " bugtask.")
                    break

            if not found_it and project:
1               for bug_task in fixed_bug.bug_tasks:
N                   if (bug_task.target.self_link == project.self_link):
                        found_it = True
                        logger.info(
                            "Found bugtask of project %s." % project.name)
                        break
            if not found_it:
1               for bug_task in fixed_bug.bug_tasks:
N                   if (bug_task.target.self_link in self.lp_projects_links):
                        found_it = True
                        logger.info("Found bugtask of lp-project"
                            " group: %s." % bug_task.target.self_link)
                        break
            if not found_it:
                logger.error("This bug isn't ours! Not marking bug %s"
                    " as Fix committed." % fixed_bug.id)
                return None, False
...

In the worst case, you are making 4N+3 HTTP requests, where N is the number of bug tasks the bug has. Three of those requests are for collections, which are more expensive to process than requests for individual objects (though in this case the time is probably dwarfed by the HTTP latency).

Why so many requests? Well, look at the first two instances of 'N': if bug_task.assignee is not None, it gets fetched twice, and since that happens inside a loop, you get 2N.

Basically, every time you dereference a property of a launchpadlib object and get another launchpadlib object (or a collection), launchpadlib will make an HTTP request for that object. The key is to assigne these objects to variables, unless you really need the most up-to-date data and are worried about someone else changing the data while you work.

Similarly, every time you run this code:

            for bug_task in fixed_bug.bug_tasks:

launchpadlib makes a brand new request for the bug's bugtasks. This gives you the three instances of "1". Unless you're worried about bug tasks being added in the middle of your processing, you should assign list(fixed_bug.bug_tasks) to a variable and iterate over that list.

Putting all this together, I would rewrite find_bugtask_and_assignee() like this (at least for now):

...
            # Checking if there's a task assigned to the committer,
            # and changing its status.
            found_it = False
1           bug_tasks = list(fixed_bug.bug_tasks)
            for bug_task in bug_tasks:
N               bug_task_assignee = bug_task.assignee
                if not bug_task_assignee:
                    continue
                if bug_task_assignee.self_link == assignee.self_link:
                    found_it = True
                    logger.info("Committer is assigned to the bugtask.")
                    break

            if not found_it and project:
                targets = []
                for bug_task in bug_tasks:
N                   target = bug_task.target
                    if (target.self_link == project.self_link):
                        found_it = True
                        logger.info(
                            "Found bugtask of project %s." % project.name)
                        break
                    targets.append((bug_task, target))

            if not found_it:
                for bug_task, target in targets:
                    if (target.self_link in self.lp_projects_links):
                        found_it = True
                        logger.info("Found bugtask of lp-project"
                            " group: %s." % bug_task.target.self_link)
                        break
...

A rewrite that stores intermediary values in variables makes only 2N+1 requests instead of 4N+3. This required no changes to launchpadlib or the web service, and it only made the code a little more complicated. Note that the final loop doesn't make any HTTP requests at all, because it's working entirely on data that was stored locally in the previous loop.

Similarly, this code makes two requests for bug_task.milestone in the worst case:

        if ((bug_task.milestone is not None and
                self.milestone != bug_task.milestone.name) or
                bug_task.milestone is None):

This should do the same thing with one HTTP request:

        milestone = bug_task.milestone
        if milestone is None or self.milestone != milestone.name:

Something very similar happens in commit.py, in _filter_only_lp_bugs.

Server-side lessons

1. get_staging_revision() and get_edge_revision() screen-scrape the revision number. That's something we can publish on the service root (/1.0) or at the forthcoming multiversioned service root (/). We already use the revision number implicitly to force clients to fetch the WADL again when Launchpad is upgraded; I think it makes sense to also publish it explicitly.

2. This script could benefit from representation expansion. For the portion of find_bugtask_and_assignee() I looked at in detail, this would mean making one big HTTP request that got a bug's bug tasks along with each task's assignee and target. This way you could make one HTTP request instead of 2N+1.

3. When writing Python code it's natural to dereference objects very casually. That's why we wrote launchpadlib the way we did. But it leads to inefficient code that needs to be cleaned up with lots of variable assignments. But we could write launchpadlib to cache dereferenced objects rather than fetching them every time. This would allow developers to write code like find_bugtask_and_assignee() without worrying so much about performance.

The reason we didn't do this in the first place is that if launchpadlib caches dereferenced objects in memory, we have no way of knowing when to expire the cache. If we knew when the cache should be expired, Launchpad could serve the appropriate Cache-Control headers, and launchpadlib could cache those resources on disk, allowing them to persist (in some cases) even across script invocations.

Cache expiration doesn't seem like a problem when you're dealing with a quick-running script like tag_bugs_to_qa.py, but some launchpadlib scripts run for hours or are continuously running. If launchpadlib keeps bug.bug_tasks cached in memory forever, those scripts will never know that new data is available. It might make sense to implement in-memory object caching with a configurable expiration time.

Alternatively, we could implement a feature that allowed explicit markers for caches. For example, code might look something like this:

    lp.cache_start()
    ...do stuff that reuses cached values...
    lp.cache_stop()

Within the start and stop, values are cached. The "stop" flushes the cache, and outside of the start-stop markers, nothing is cached (that is, you get the current behavior).

Foundations/Webservice/ClientSidePerformance/QATaggerPerformanceAudit (last edited 2010-06-15 16:23:03 by gary)