LEP/FastDowntime

This LEP has been implemented. See the related documentation:

Fast downtime

Rather than extended periods (typically 60-90 minutes) of downtime, have short downtime windows multiple times a week.

Contact: RobertCollins
On Launchpad: https://bugs.launchpad.net/launchpad-project/+bugs?field.tag=fastdowntime
Future related work on Launchpad: https://bugs.launchpad.net/launchpad-project/+bugs?field.tag=fastdowntime-later

Rationale

Our development cycle times correlate very highly with schema changes. Technical limitations in our environment mean that applying schema changes requires disconnecting all clients for a period of time. By keeping this period short and designing our schema changes carefully, we can dramatically simplify the way that we do downtime (most of the time), resulting in less overall downtime and faster delivery of features (with less churn on developer focus).

The basis for this change has been raised and hammered out on the stakeholders list; coding can start while further fine-tuning is done on the -users list.

See also Database/LivePatching, which documents some implementation issues as well as things we can do totally live (like adding new indices).

Stakeholders

All the LP stakeholders, particularly OEM, who depend on LP to do daily releases.

User stories

developer-make-change

As a developer
I want to change Launchpad's schema without waiting 4 weeks
so that I can fix a bug / improve functionality for users.

When a developer has a DB patch, they can choose to try to deploy it in a fixed downtime window. They will broadly follow these steps:

Constraints and Requirements

Must

Nice to have

There are a lot of bells and whistles we could do, but they will be the focus of future completely distinct work: we want to deliver the core functionality as rapidly and reliably as possible.

Must not

Out of scope

Subfeatures

Success

How will we know when we are done?

We can reliably deploy schema changes 24 hours after they land in devel, with < 5 minutes downtime.
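As a rough illustration, the criterion above could be checked per deployment along the following lines. This is only a sketch: the `SchemaDeployment` record and its field names are hypothetical, and it assumes that "24 hours after they land" means the change is live no later than 24 hours after landing in devel.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds taken from the success criterion above.
MAX_LAND_TO_DEPLOY = timedelta(hours=24)
MAX_DOWNTIME = timedelta(minutes=5)


@dataclass
class SchemaDeployment:
    """Hypothetical record of one schema change rollout."""
    landed_in_devel: datetime
    downtime_started: datetime
    downtime_ended: datetime

    @property
    def downtime(self) -> timedelta:
        # How long clients were disconnected.
        return self.downtime_ended - self.downtime_started

    @property
    def land_to_deploy(self) -> timedelta:
        # Time from landing in devel until the change is live.
        return self.downtime_ended - self.landed_in_devel


def meets_success_criterion(d: SchemaDeployment) -> bool:
    """True if downtime is under 5 minutes and the change is live
    within 24 hours of landing (assumed reading of the criterion)."""
    return d.downtime < MAX_DOWNTIME and d.land_to_deploy <= MAX_LAND_TO_DEPLOY
```

For example, a patch that lands at 10:00 and is applied the next morning in a 4 minute window would pass, while the same patch applied in a 90 minute window would not.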

How will we measure how well we have done?

The project lead has cycle-time graphs which reflect long cycle times for DB-related projects. Their cycle time should come way down: the further it comes down, the better this project has succeeded.

Thoughts?

Put everything else here. Better out than in.

LEP/FastDowntime (last edited 2011-12-22 18:22:22 by gary)