Parallel Testing Status: 2012-03-21

Overview

These two weeks were a roller coaster for the project, moving back and forth between seemingly final failure and exhilarating success. We addressed over 2800 test failures, conquered hangs, and fixed issues in Launchpad, Testrepository, and LXC. Today we have our first unimpeded runs to completion on an eight-core EC2 machine. These runs took around 55 minutes, and had a handful of test failures.

We also made progress on related jobs, such as getting our Python shell tools packaged in Ubuntu, getting Python charm helpers added to the official charm helpers package, and moving along our replace-rocketfuel-* slack time project.

We still have a lot of work to do. We need to improve the continuous integration steps in a variety of ways, for stability, reporting, and speed; address the remaining test failures as we find them, including getting help on a kernel issue; complete experiments to determine the incremental value of each additional core to the parallelization; and get the well-oiled machine we have with Juju and EC2 also running manually in the data center. However, this update marks a major milestone for the project, and we are pleased with what we accomplished over these two weeks.

Progress towards biweekly action items

Other accomplishments

Progress on tracked items

Completed by others

New and incomplete

Carried over and incomplete

Goals for next meeting

  1. Bug fixes for test failures discovered in parallel test runs. Already known targets:
    • Launchpad
      • 953913: test isolation error
      • /dev/random exhaustion solution in setuplxc
    • Testrepository/zope.testing
      • 609986: subunit support for layer failures
    • Buildbot improvements
      • clean up old broken ephemeral LXC containers
      • keep .testrepository data around between builds
      • report failures more accurately
      • make tests always randomly ordered
    • Tracking
      • LXC 959352
  2. Deliver scaling assessment based on experimental results, using EC2 (carried over from the previous two weeks)
  3. Get the data center box running tests, and do a single comparison run against EC2.
    • /dev/random exhaustion solution approved and installed
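
As an illustration of the "make tests always randomly ordered" item above, here is a minimal sketch of seeded shuffling. It is not how buildbot, Testrepository, or zope.testing implement ordering; the function name shuffled_test_ids and the idea of logging the seed are assumptions made for this example. The point it shows is that a random order is only useful for chasing isolation errors if the same order can be replayed later.

```python
import random


def shuffled_test_ids(test_ids, seed=None):
    """Return (seed, ordering) for a reproducible random test order.

    Hypothetical helper for illustration only; not part of any
    Launchpad, Testrepository, or zope.testing API.
    """
    if seed is None:
        # No seed supplied: pick a fresh one so each build differs.
        seed = random.randrange(2 ** 32)
    # Sort first so the result depends only on the seed, not on the
    # order in which the test ids happened to be discovered.
    ordering = sorted(test_ids)
    random.Random(seed).shuffle(ordering)
    return seed, ordering


if __name__ == "__main__":
    # The test ids below are placeholders, not real Launchpad tests.
    seed, order = shuffled_test_ids(["test_a", "test_b", "test_c"])
    print("seed=%d" % seed)
    for test_id in order:
        print(test_id)
```

If the seed is recorded in the build log, a failure that only appears under one ordering can be reproduced by passing that seed back in.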

Projects/ParallelTesting/Checkpoint-2012-03-21 (last edited 2012-03-22 16:20:07 by flacoste)