Diff for "PolicyAndProcess/Downtime"

Differences between revisions 4 and 5
Revision 4 as of 2009-09-17 22:21:34
Size: 4278
Editor: kiko
Comment: wording and edits
Revision 5 as of 2009-09-23 20:58:42
Size: 5144
Editor: bac
Comment:
Line 20: Line 20:
  * OOPS report
  * Merge proposal
  * OOPS reports
  * Merge proposals
Line 28: Line 28:
  in PQM.   in PQM. (Monday 00:00 UTC)
Line 30: Line 30:
  * Update the `#launchpad-dev` topic to list him as release-manager.   * Update the `#launchpad-dev` topic to state we are in 'Release Critical' and to list the release manager.
Line 35: Line 35:
    continuously to ensure that the list of release blockers is up to date.     continuously to ensure that the list of release blockers is up-to-date.  (We need to explore a
    work-around to retire this wiki page and do the management in Launchpad.)
Line 37: Line 38:
    All bugs that are likely to cause lots of OOPSes, time outs or prevent     All bugs that are likely to cause lots of OOPSes, time-outs or prevent
Line 40: Line 41:
    It's a good idea to subscribe yourself to the page.     It's a good idea to subscribe yourself to the page.  (Currently broken.)
Line 44: Line 45:
  * Review release-critical merge proposals.   * Review release-critical merge proposals. The policy should be:
     * All RC candidates go through the normal review process.
     * After code and UI review the MP is left in 'Needs Review' state.
     * A new review of type 'release-critical' is added to the MP and assigned to the release manager.
     * If the MP is approved for 'release-critical', the review is marked 'Approve' and the state of the MP is set to 'Approved'.
Line 49: Line 54:
  * Request that landing to the `devel` branch be closed. (All changes    should on the last day be merged through `db-devel`.)   * Request that landing to the `devel` branch be closed, 24 hours before the scheduled release.  All changes should on the last day be merged through `db-devel`.
Line 58: Line 62:
  * Remind people that all changes need to be in buildbot for '''6 hours'''
  before the roll-out time.
  * With PQM remaining open, have the LOSAs stop buildbot and set it do manual runs.
  * Remind people that all changes need to be in buildbot for '''9 hours'''
  before the roll-out time. The LOSAs require two hours of pre-release preparation and we need
  to allow for two complete buildbot cycles. (9 = 2 + 2 * 3.5)
Line 79: Line 85:
  * The release-manager need to select the next release manager.   * The release-manager needs to select the next release manager.
Line 84: Line 90:
  merge proposal on Launchpad. The release manager simply adds a review of type
  `release-critical` to the merge proposal.
  merge proposal on Launchpad. The engineer adds a review of type
  `release-critical` to the merge proposal and ensures it is in the 'Needs Review' state.

  • Process Name: Release Manager Rotation Process

  • Process Owner: Francis Lacoste

  • Parent Process/Activity: None

  • Supported Policy: None

Process Overview

Each cycle, a different engineer takes the role of release manager. The release manager coordinates with the release team and all team leads to ensure that the tree is ready for the roll-out and that all critical bugs are either fixed in the tree or worked around.

Back-up release managers are the two RMs from the previous two cycles.

Release Manager inputs

  • Email and IRC messages from engineers and team leads.
  • OOPS reports
  • Merge proposals

Activities

Before the roll-out

  • At the beginning of week 4 (Monday 00:00 UTC), make sure that release-critical has been turned on in PQM.
  • Update the #launchpad-dev topic to state we are in 'Release Critical' and to list the release manager.

  • Maintain the list of Current Rollout Blockers (CRBs).

    • The release manager should poll the team leads and QA engineers continuously to ensure that the list of release blockers is up-to-date. (We need to explore a work-around to retire this wiki page and do the management in Launchpad.) All bugs that are likely to cause lots of OOPSes, time-outs or prevent several users from working are good CRB candidates. It's a good idea to subscribe yourself to the page. (Currently broken.)
  • Make sure that developers are assigned to all problems we want to fix.
  • Review release-critical merge proposals. The policy should be:
    • All RC candidates go through the normal review process.
    • After code and UI review the MP is left in 'Needs Review' state.
    • A new review of type 'release-critical' is added to the MP and assigned to the release manager.
    • If the MP is approved for 'release-critical', the review is marked 'Approve' and the state of the MP is set to 'Approved'.
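
The review policy above can be read as a small state machine. The following is a minimal sketch of that reading, not Launchpad's API: the `MergeProposal` and `Review` classes, the function names, and the rule that every normal review must carry an 'Approve' vote are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

RC_REVIEW = "release-critical"


@dataclass
class Review:
    reviewer: str
    vote: Optional[str] = None  # None while pending, "Approve" once granted


@dataclass
class MergeProposal:
    # Hypothetical stand-in for a Launchpad merge proposal.
    status: str = "Needs Review"
    reviews: Dict[str, Review] = field(default_factory=dict)  # keyed by review type


def request_release_critical(mp: MergeProposal, release_manager: str) -> None:
    """Add the release-critical review once the normal reviews are done."""
    normal = [r for kind, r in mp.reviews.items() if kind != RC_REVIEW]
    if not normal or any(r.vote != "Approve" for r in normal):
        raise ValueError("RC candidates must pass the normal review process first")
    if mp.status != "Needs Review":
        raise ValueError("the MP must be left in 'Needs Review'")
    mp.reviews[RC_REVIEW] = Review(reviewer=release_manager)


def approve_release_critical(mp: MergeProposal, release_manager: str) -> None:
    """Mark the release-critical review 'Approve' and the MP 'Approved'."""
    rc = mp.reviews.get(RC_REVIEW)
    if rc is None or rc.reviewer != release_manager:
        raise ValueError("no release-critical review assigned to this release manager")
    rc.vote = "Approve"
    mp.status = "Approved"
```

In this sketch the MP only reaches 'Approved' through the release manager's review, which mirrors the intent that nothing lands during release-critical without that sign-off.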

On the day before the roll-out

  • Request that landing to the devel branch be closed 24 hours before the scheduled release. On the last day, all changes should be merged through db-devel.

On the day of the roll-out

  • Chase up Current Rollout Blockers and any other pending release-critical fixes.

  • With PQM remaining open, have the LOSAs stop buildbot and set it to do manual runs.
  • Remind people that all changes need to be in buildbot for 9 hours before the roll-out time. The LOSAs require two hours of pre-release preparation and we need to allow for two complete buildbot cycles. (9 = 2 + 2 * 3.5)

  • In the case of failures, it's best to roll out the last known good build rather than delay the release. The cut-off point for deciding which revision to roll out is 2 hours before the scheduled release.
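
The two deadlines above follow directly from the roll-out time. As a rough sketch, using only the figures stated in the text (2 hours of LOSA preparation, two buildbot cycles of 3.5 hours each, and a 2-hour revision cut-off), they could be computed like this. The constant and function names are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

# Figures taken from the text above; the names are illustrative only.
LOSA_PREP_HOURS = 2.0        # pre-release preparation required by the LOSAs
BUILDBOT_CYCLE_HOURS = 3.5   # duration of one complete buildbot cycle
BUILDBOT_CYCLES = 2          # number of cycles to allow for
REVISION_CUTOFF_HOURS = 2.0  # when to decide which revision to roll out


def landing_deadline(rollout: datetime) -> datetime:
    """Latest time a change can enter buildbot and still make the roll-out."""
    lead_hours = LOSA_PREP_HOURS + BUILDBOT_CYCLES * BUILDBOT_CYCLE_HOURS  # 9.0
    return rollout - timedelta(hours=lead_hours)


def revision_cutoff(rollout: datetime) -> datetime:
    """Time by which the revision to roll out must be chosen."""
    return rollout - timedelta(hours=REVISION_CUTOFF_HOURS)
```

For a hypothetical roll-out at 09:00 UTC, `landing_deadline` gives 00:00 UTC the same day and `revision_cutoff` gives 07:00 UTC.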

After the roll-out

  • With the QA engineers, review the OOPS reports.
    • All common OOPSes are candidates for more release-critical fixes and scheduling another roll-out.
  • Prepare and schedule any necessary re-roll.
  • When a re-roll is needed, follow the same activities as before the original roll-out.
  • Open the tree for the next cycle once the released version is fine.
  • The release-manager needs to select the next release manager.

Release critical policy

  • To apply for a release-critical approval, you must have a reviewed merge proposal on Launchpad. The engineer adds a review of type release-critical to the merge proposal and ensures it is in the 'Needs Review' state.

  • Good candidates for release-critical approval are issues found during QA that are bound to create OOPSes and time-outs or otherwise significantly inconvenience our end-users.
  • Apart from special exceptions discussed with the project lead, only bug fixes should be granted release-critical approval.
  • If there is no way for the developer to QA their change on staging through the normal update procedure before the roll-out, it's recommended to request a cowboy of the branch on staging to QA it before approval.
  • For the second roll-out (a.k.a. the re-roll), any change requiring database changes should go through the project lead, since DB updates seriously increase the length of the upgrade window.

Scheduling

  • Engineers apply in advance for one cycle.
  • They are selected by the previous release manager. Once selected, their name is put on the Launchpad Production Status page.

  • The actual roll-out time is determined by the release-manager's location:

      Location        Roll-out time
      Americas        00:00 UTC
      Europe          09:00 UTC
      Asia/Pacific    00:00 UTC

  • No engineer should apply for the role more than twice a year.
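
The scheduling rules above, together with the back-up rule from the Process Overview, are simple enough to express as lookups. The sketch below is one possible encoding. The function names are made up for illustration, and counting the "twice a year" limit against a list of past release managers is an assumption about how the rule would be tracked.

```python
from datetime import time
from typing import List

# Roll-out times from the table above, keyed by release manager location.
ROLLOUT_TIME_UTC = {
    "Americas": time(0, 0),
    "Europe": time(9, 0),
    "Asia/Pacific": time(0, 0),
}
MAX_TERMS_PER_YEAR = 2


def rollout_time_utc(location: str) -> time:
    """Roll-out time (UTC) for the release manager's location."""
    return ROLLOUT_TIME_UTC[location]


def backup_release_managers(history: List[str]) -> List[str]:
    """Release managers of the previous two cycles, most recent first.

    `history` lists past release managers oldest first and excludes
    the current one.
    """
    return history[-2:][::-1]


def may_apply(engineer: str, managers_past_year: List[str]) -> bool:
    """True if the engineer has held the role fewer than twice in the past year."""
    return managers_past_year.count(engineer) < MAX_TERMS_PER_YEAR
```

Keeping the roll-out times in one mapping means a change to the table only has to be made in one place.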

PolicyAndProcess/Downtime (last edited 2011-06-06 22:02:02 by flacoste)