Overview
Parallel testing would be nice. There's a bunch of things to do to make it work. See the LEP for constraints/goals/resourcing.
Known bugs/issues: parallel test bugs
Design sketch
- bin/test --parallel runs N test runners with subunit, where N is the number of cores, and the tests are partitioned across runners. (implemented)
- Layers dynamically allocate/deallocate resources such as:
  - dbnames (implemented)
  - config files (implemented)
  - librarian work dir
  - librarian ports
  - keyserver work area
  - soyuz work area
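The partitioning and per-runner allocation described above can be sketched roughly as follows. This is only an illustration, not the bin/test implementation: the round-robin strategy, the function names, and the "testdb"/"testrunner" name prefixes are all assumptions made for the example.

```python
import os


def partition(test_ids, workers=None):
    """Split test ids round-robin into one bucket per worker."""
    # Assumption: one worker per core, as in the design sketch.
    workers = workers or os.cpu_count() or 1
    buckets = [[] for _ in range(workers)]
    for index, test_id in enumerate(test_ids):
        buckets[index % workers].append(test_id)
    return buckets


def worker_resources(worker, base_db="testdb", base_config="testrunner"):
    """Hypothetical per-worker names, so runners do not share globals."""
    return {
        "dbname": "%s_%d" % (base_db, worker),
        "config": "%s_%d" % (base_config, worker),
    }
```

Each runner would then set up its layers using only the names returned for its own worker number.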
Things that need specialist knowledge:
- Dynamically allocating ports for zope (ports 8085 and 9025 specifically), which can then be fed back into e.g. zcml files/launchpad.conf.
- Buildmaster slave tests hard code the xmlrpc port to 8221 everywhere.
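One common way to allocate a port dynamically is to bind to port 0 and let the kernel choose. The sketch below shows only that general technique; the function name is hypothetical, and how the result would be fed into zcml files or launchpad.conf is not covered here.

```python
import socket


def allocate_port(host="127.0.0.1"):
    """Ask the kernel for a currently unused TCP port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))  # port 0 means "pick any free port"
        return sock.getsockname()[1]
    finally:
        sock.close()
```

Note that the port is released before the caller uses it, so another process could in principle grab it in between; a real fixture would want to retry or hold the socket open.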
LXC containers and parallel testing
LXC containers combined with aufs (http://en.wikipedia.org/wiki/Aufs) offer a pretty cheap way to get solid isolation: a great big hammer of a workaround for our existing globals (shared work dirs etc). William has put together a proof of concept, and Robert has made that generic (https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/807351/+attachment/2251192/+files/lxc-start-aufs). That, combined with an updated .testr.conf (a TODO is to offer profiles for testr) like:
[DEFAULT]
test_command=lxc-start-aufs $LP_LXC_BASE $PWD xvfb-run $PWD/bin/test --subunit $IDOPTION $LISTOPT
test_id_option=--load-list $IDFILE
test_list_option=--list
will let testr run tests in a temporary container (e.g. testr run -- -t stories/gpg will fire up an aufs container and run the stories/gpg tests inside it).
Be sure to export LP_LXC_BASE with the name of your lxc base container.
See Running/LXC for info on setting up a base container.
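To make the .testr.conf above easier to read, here is a rough sketch of how the $IDOPTION, $IDFILE and $LISTOPT placeholders are understood to expand. It is an approximation for illustration only, not testr's own code; the function name and parameters are hypothetical, and $LP_LXC_BASE and $PWD are deliberately left alone because the shell expands them from the environment.

```python
def expand_command(template, id_file=None, listing=False,
                   id_option="--load-list $IDFILE", list_option="--list"):
    """Approximate the placeholder expansion for a test_command line."""
    # $IDOPTION is only filled in when a list of test ids is supplied.
    idoption = id_option.replace("$IDFILE", id_file) if id_file else ""
    # $LISTOPT is only filled in when the runner is asked to list tests.
    listopt = list_option if listing else ""
    command = template.replace("$IDOPTION", idoption)
    command = command.replace("$LISTOPT", listopt)
    # Collapse the whitespace left behind by empty placeholders.
    return " ".join(command.split())
```

With the test_command shown above, listing tests would end in "--subunit --list", while running a chosen subset would end in "--subunit --load-list" followed by the id file.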
Caveats
- If the base container is running it will be a disaster. Don't try.
- aufs does not seem to permit deletes in some circumstances (bug 729338), so test fixtures which start by deleting a directory tree will fail if the directory tree exists. Known cases:
  - /var/tmp/testkeyserver.test
  - /var/lib/postgresql/8.4/main/postmaster.pid
  - /var/tmp/bazaar.launchpad.dev/mirrors
  - and conversely some need a tree:
  File "/home/robertc/source/launchpad/lp-branches/working/lib/canonical/testing/layers.py", line 1775, in startSMTPServer
    handler = logging.FileHandler(log_file)
  File "/usr/lib/python2.6/logging/__init__.py", line 819, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib/python2.6/logging/__init__.py", line 838, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 2] No such file or directory: '/var/tmp/mailman/logs/smtpd'
- this was because buildmailman had not been run in the base container.
- If we leak a child process with a shared stdout/stderr, sshd will not terminate, which will cause the testr test runner to look like it has hung (bug 820726). sudo pkill memcached can be used to work around this.
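The path-related caveats above could be checked before starting a run with something like the following. This is a hypothetical helper, not part of Launchpad: the function name and the root parameter (used to check a container's filesystem from outside, or a scratch tree in tests) are assumptions, and the path lists simply repeat the known cases listed above.

```python
import os

# Paths that fixtures try to delete; aufs may refuse, so they should be gone.
MUST_BE_ABSENT = [
    "/var/tmp/testkeyserver.test",
    "/var/lib/postgresql/8.4/main/postmaster.pid",
    "/var/tmp/bazaar.launchpad.dev/mirrors",
]
# Directories that must already exist (created by buildmailman).
MUST_EXIST = ["/var/tmp/mailman/logs"]


def preflight(absent=MUST_BE_ABSENT, present=MUST_EXIST, root="/"):
    """Return a list of human-readable problems; empty means all clear."""
    problems = []
    for path in absent:
        if os.path.lexists(os.path.join(root, path.lstrip("/"))):
            problems.append("remove before running: " + path)
    for path in present:
        if not os.path.isdir(os.path.join(root, path.lstrip("/"))):
            problems.append("missing directory: " + path)
    return problems
```

An empty result only says these known cases are fine; it does not detect the leaked child process problem.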
Workflow
- You need a temp directory in your source tree to work around bug 808557:
mkdir temp
- Edit outside the container in your normal work area
- Start the base container to do maintenance: make schema, bin/buildout
lxc-start -n $basename -d
- ssh to it
- make schema
- bin/buildout
- shut it down (e.g. with lxc-stop -n <name>, or poweroff -n, or your preferred method).
- Note that lxc is fragile at the moment: you may need to manually shut down postgresql before stopping lxc to get it to shut down cleanly.
- Run tests with testr:
All tests:
TEMP=$(pwd)/temp testr run --parallel
Some tests:
TEMP=$(pwd)/temp testr run --parallel -- -t stories/gpg
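The last step of the workflow can also be wrapped in a small script. The sketch below is a hypothetical convenience wrapper (the function names are made up for the example); it only creates the temp directory, sets TEMP, and invokes the same testr commands shown above.

```python
import os
import subprocess


def testr_command(test_filter=None):
    """Build the testr invocation, optionally limited to some tests."""
    command = ["testr", "run", "--parallel"]
    if test_filter:
        # Everything after "--" is passed through to bin/test.
        command += ["--", "-t", test_filter]
    return command


def run_tests(source_tree, test_filter=None):
    """Run testr from source_tree with TEMP pointing at its temp dir."""
    temp_dir = os.path.join(source_tree, "temp")
    os.makedirs(temp_dir, exist_ok=True)  # same effect as "mkdir temp"
    env = dict(os.environ, TEMP=temp_dir)
    return subprocess.call(testr_command(test_filter), cwd=source_tree, env=env)
```

For example, run_tests(os.getcwd(), "stories/gpg") would correspond to the "Some tests" command, assuming testr is on the PATH and LP_LXC_BASE is already exported.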