Diff for "Foundations/Updates/August2010"


Differences between revisions 4 and 5
Revision 4 as of 2010-08-03 12:58:33
Size: 8328
Editor: ursinha
Comment:
Revision 5 as of 2010-08-03 14:38:10
Size: 8319
Editor: gary
Comment:

DRAFT

August 2010

Lucid/Python 2.6 upgrade

Software is ready. LOSAs are working on the upgrade. Foundations may be responsible for some of the image upgrades, but this is mostly now a LOSA task.

Slony and Postgres upgrade

The software changes are done for both of these. We are going to migrate to Slony first. spm has backported the Slony packages from Lucid, and they will be deployed to Hardy very soon, hopefully in the next week or two. After that, the LOSAs should be able to migrate the DB servers to Lucid. At the moment this is out of Foundations' hands.

Webservice

  • Leonard stopped the performance effort after making some nice gains (https://dev.launchpad.net/Foundations/Webservice/Performance).

  • Leonard has started on the desktop integration work (a script for managing OAuth tokens without opening a web browser).
  • Leonard and Benji are working on changing the len implementation for Soyuz bug 590708, which I incorrectly handled as a critical bug, interrupting the webservice desktop integration.

  • After we are out of the 590708 hole, and have finished desktop integration, Leonard and Benji will be working with jml to identify webservice usability improvements. Example candidate bugs: 534363 274074 481090 487522 539070 541637 583761. Leonard and Benji will then work to implement the selected improvements in lazr.restful and in the Launchpad webservice itself.

  • Diogo is leading an effort to make our webservice clients more easily QAd, with the help of Leonard, Martin Pitt and the bugs team (539705).

  • We're trying to make sure that webservice OOPSes and performance are reported clearly in our various tools (606184 and 607154).

Openid bugs

We had just started trying to dig ourselves out of the openid bug hole at the epic (https://bugs.edge.launchpad.net/launchpad-foundations/+bugs?field.tag=openid) but some other priorities pushed this work away. The majority of these are about the fact that we need to be able to associate multiple accounts with a single person. Stuart will be tackling 580461 in the next few weeks. Hopefully that will set us up to clean up the rest of the bugs like dominoes.

Launchpad performance

  • Reporting
    • We've already announced our performance report work. Stuart is finishing up some database report work. In general we've been very pleased with the information we've been able to gather from the tests we've assembled, and with the ability they give us to analyze our problems. We want to make sure that these reports are understandable and usable for everyone on the team who is interested, though, and we believe there's still some work there to be done.

    • Maris made progress with Robert on implementing ++profile++ (598289) but it needs to be picked back up before the work is complete.

  • According to our webpagetest.org tests, the most obvious way to speed up the average page is through networking and client-side optimizations.
    • After an initial cut at a risk analysis of trying to remove HTTPS for some interactions with the site, we have rejected it. Instead, Robert and James Troupe are going to be doing a private VPN experiment to see if an approach like that can help our SSL and static resource costs, both of which are surprisingly high.
    • Maris also has some client-side improvements he will attempt soon.
    • Finally we also want to implement an ability to test page dependencies in the browser using Windmill, so that we can make assertions about the number and size of resources a page loads in tests, both initially and on a page reload. We think that will help us write tests to maintain fixes that we need to do to our pages (see 609885, for instance).

  • The work on reducing timeouts is even more important than I had thought: the outliers really are a big problem. According to our graphs, the average server page rendering time in Launchpad is in fact not inexcusably slow at all (median of 0.15 seconds, average of 0.36 seconds). Improving the network and client side, as discussed above, will make a big difference for the average page. However, arguably the bigger usability problem (at least for users relatively close to the London data center) does appear to be the server-side edge cases that show up as outliers on our graphs. They happen fairly infrequently, but Launchpad is so big, and has so many users, that a relatively small percentage can be a big problem for one of the many constituents that we care about.

    • Stuart is giving his assistance on some SQL aspects of this.
    • We do plan to retry the Chameleon integration in the next few months, which seems to give about a 15% speed increase on average on the server. It should speed up a few pages even more, perhaps including the problem page that Danilo showed me in Translations.
    • We also have some ideas on how to make the OOPS tools and the OOPS reports more effective for both communication and research.
    • We have done some work on memcached and may do more.
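
To illustrate the point about outliers above, here is a minimal sketch of how a handful of very slow requests can leave the median untouched while pulling the average up. The sample timings, the 5-second timeout, and the summarize helper are all made up for illustration; they are not taken from our graphs or tools.

```python
import statistics

# Hypothetical render times in seconds: mostly fast requests plus a few
# very slow outliers. These numbers are invented for illustration only.
render_times = [0.10] * 40 + [0.15] * 50 + [0.40] * 7 + [9.0] * 3


def summarize(times, timeout):
    """Return the median, the mean, and the fraction of requests over `timeout`."""
    over = sum(1 for t in times if t > timeout)
    return {
        "median": statistics.median(times),
        "mean": statistics.mean(times),
        "over_timeout": over / len(times),
    }


# The 5.0 second timeout is an assumed value, not Launchpad's real setting.
summary = summarize(render_times, timeout=5.0)
print(summary)
```

In this made-up sample only 3% of requests are outliers, yet they move the mean to roughly 0.41 seconds while the median stays at 0.15. With enough traffic, even that small fraction is a lot of slow or timed-out pages.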

Workflow changes

Robert is leading the way on a number of great workflow changes, mostly centered on continuous deployment, and we're supporting him with changes to the related machinery. Here are some details on progress for the changes that we are a part of now.

Test speedups

Stuart is planning to investigate his long-proposed Mock Database work.

Build improvements

The changes we made to Buildout were accepted upstream by the maintainer, but a beta release caused issues particularly for people who used Buildout with virtualenv. A few people (Tim, for instance) reported some issues that have been squashed in the software but not deployed to Launchpad. A new beta release of Buildout should go out publicly this week, and be incorporated into Launchpad this week or after the upcoming LP release. This will be good for Landscape too, it seems.

This also should be a small part of some long-needed simplification and cleanup of our lazr packages to make the build easier to use and understand. Maris has been leading the cleanups there.

We're also hoping to find time to switch away from the local download-cache in favor of a shared stash of our sdists somewhere in the data center. This hopefully just takes a small amount of work, and once it is done it should make some things simpler for lazr packages and Launchpad's build too (and remove the download-cache-as-bzr-branch that has annoyed several folks).

Smoldering Fires

Robert's trying to improve our search. I'm sure that will take some of Stuart's time, at the least.

The Librarian needs some love. Robert is giving it some, bless him. There's a memory leak that needs to be addressed too (556245).

App server machines are going into swap every week or two now. We closed a memory leak, but there's at least another slow one we know about. We will probably need to tackle this again soon.

Foundations/Updates/August2010 (last edited 2010-08-03 14:38:10 by gary)