Diff for "BugTriage"

Differences between revisions 14 and 15
Revision 14 as of 2011-05-03 10:15:46
Size: 11501
Editor: henninge
Comment: We don't use wishlist.
Revision 15 as of 2011-05-17 12:39:40
Size: 9868
Comment:
Line 1: Line 1:
||<tablestyle="width: 100%;" colspan=3 style="background: #2a2929; font-weight: bold; color: #f6bc05;">This page is about triaging Launchpad-related bugs. For general information on handling bug reports, see [[BugHandling]]. If you have any questions, [[Help|ask for help]] right away. ||
= Triaging Launchpad bugs =
Line 3: Line 3:
= Launchpad Bug Triage =
||<tablestyle="float:right; font-size: 0.9em; width:40%; background:#F1F1ED; margin: 0 0 1em 1em;" style="padding:0.5em;"><<TableOfContents>>||
Line 5: Line 5:
== What is bug triage? ==
Our triage process is basically this: make sure that ''Critical'' and ''High'' bugs are correctly marked.
Line 7: Line 7:
Triage is the act of sorting bugs into different priority groups.
There are many conflicting sorts - everyone has their pet bug that
should be 'first'. The sort order we choose is from the project's
perspective: we try to balance the needs of our users.
We want:
Line 12: Line 9:
So, bug triage is: '''sorting bugs by importance-to-the-project''',
and these are the influences we try to strike a balance between in
assessing that importance:
 * Things affecting launchpad project health.
 * Things affecting stakeholders
 * Things affecting other users
 * ''Critical'' bugs to be those that need attention before all others. Right now: OOPSes, timeouts, regressions, stakeholder-escalated bugs.
 * The ''High'' bugs list to be around six months deep. Many parts of Canonical are on a six month cycle and fitting in with that is convenient.
Line 19: Line 12:
When we have triaged a bug, it has status '''triaged''' and an
importance other than '''unknown'''.

== Why triage ==

This may be obvious, but having just a big bucket of open bugs isn't
very efficient: there are more genuinely important issues to fix than
engineers, and as such engineers will forget what things are urgent
and what aren't.

Secondly, each of the groups of users whose needs we're trying to
compromise between are interested in when things will get done. By
sorting the bugs we provide a proxy metric for when tasks will be
worked on.

== How much triage is needed? ==

The world is dynamic and constantly changing; as such any sort we come
up with for our bugs will be outdated pretty quickly. We could make
the sort complete (so all bugs are ranked) and constantly refresh it.
However this is inefficient: the only times the sort actually matters
are:
 * when a new bug is being selected to work on (by project importance).
 * when a user is taking a decision based on how long until the bug is
 likely to be worked on. For instance, they might decide to work up a
 patch, or whether to use Launchpad at all.

So how much sorting is enough? Two interesting metrics are freshness
and completeness.

If the sort is too old, bugs will be indicated as 'should be next to
work on' that are not valid as that any more. Our priorities may
change month to month but they rarely change faster than that : so we
can tolerate things being months (or more) stale.

The sort is complete enough if the answers to 'what is an important
bug to work on now' and questions that users may ask (like 'how long
till this will be worked on') get answers accurate enough... and how
accurate do we need?

Well that's a tradeoff, but we think the answers are accurate enough
if:
 * users can see that we care about performance, regressions,
 usability and polish
 * engineers selecting 'next bug to work on' based on the triage sort
 usually pick things that are the most useful thing to the
 project/stakeholders/users; that is that inconsequential stuff is
 tackled after consequential stuff

== Bug Importance ==

Bug importance in Launchpad is where we record the result of the
triage process; we have 5 buckets we can use in Launchpad:
critical/high/medium/low/wishlist.

We don't actually ever block a release based on having a particular
importance bug - we block releases based on having regressions, which
any commit can have - and we mark that on the bug mapping to the
commit.

The buckets combine to give a partial sort: bugs in the critical
bucket are sorted before bugs in the high bucket.

We can choose to use some or all of these 5 buckets.

How many do we need? A good way to answer that is to consider our
hypothetical complete, fresh sort, and consider how many slices we'd
need to make in it to answer questions well; we also need to consider
what would change to those slices when things change (such as new
things coming that sort to the front).

Also buckets have a cost : we need a ruleset for triage that will let
us assign bugs to buckets: every bucket makes the heuristics more
complex.

Given that we have a freshness tolerance for most bugs of some months,
that we don't want to update many bugs when a single bug shuffles in
front, and that we have more bugs coming in than we fix, we
need three or perhaps four buckets:

 * A topmost bucket that is generally empty and crisis bugs go into.
 * A default bucket that bugs we haven't picked out as being important
 enough to sort above any other specific bug go into.
 * [optional] a bucket for bugs that are reasonably important but not
 extremely so
 * And a bucket containing bugs which are within the first 6 months of work

We map these buckets into:
 * critical : generally empty, bugs that need to jump the queue go here.
 * high: bugs that are likely to get attention within 6 months
 * low: All other bugs.
 * We don't use wishlist.

This has a clear tension: time-till-we-start-work is a good metric for
what bucket to put in, but given a bug with some symptoms how do we
decide what bucket it should go into.

To address this tension we use two things:
 * A quarterly review of the bugs in the high bucket, to stop it overflowing.
 * Some heuristics for sorting bugs

== Quarterly review ==

This is pretty simple - we re-triage bugs with high importance to see
if things have changed and they should be downgraded. For upgrades we
assume that user prompting will cause us to upgrade them.

== Triage guidelines ==

These guidelines describe the rules we use to sort bugs - and from
that sort we assign bugs to buckets. We broadly want:
 * queue jumping bugs to be in the critical bucket. (OOPS, timeouts,
 regressions, stakeholder-escalated bugs are all examples of queue
 jumping bugs)
 * the high bucket to be about 6 months deep - many parts of Canonical
 are on a 6-month cycle and fitting in with that is convenient

The quarterly review is responsible for shrinking the high bucket if
it's too full.

What we need to do then in assessing the bucket for a bug is to do
*enough* sorting on it to see if it's a queue jumper, or if it's more
important than the least important bug currently in the high bucket.
Beyond that, all bugs are in the low bucket.

If a bug is a regression : if the thing *was* working and now isn't,
we sort it higher. We're currently discussing having a policy that
regressions are critical, which if implemented will make these queue
jumpers (critical bucket).

If the bug is one that has been escalated via the Launchpad
stakeholder process, it is a queue jumper (critical bucket).

OOPS and timeout bugs also jump the queue: performance is very important
to our stakeholders and OOPS dramatically affect our ability to
operate and maintain Launchpad as well as being a very negative
experience when encountered. The [[https://dev.launchpad.net/PolicyAndProcess/ZeroOOPSPolicy|ZeroOopsPolicy]]
contains details on this.

For things like browser support, when a new browser is released but
the vendor is in our supported-browser-set, we should treat issues as
regressions and so they will be queue jumpers.

Beyond these rules a bug is more important than another bug if fixing
it will improve Launchpad more than fixing the other bug.
Discretion and a feel for what's in the bug database will help a lot
here, as will awareness of our userbase and their needs. One sensible
heuristic is to look at 5-10 existing high bugs, and if the new bug is
less important than all of them, mark it low (it's probably less
important than all existing high bugs).

Engineers have discretion to decide any particular bug should be
sorted higher (or lower) than it has been; some change requests are
very important to many of our users while still not big enough to need a
dedicated feature-squad working on them (so these bugs may be high).
When two engineers disagree,
or if someone in the management chain disagrees, common sense and
courtesy should be used in resolving the disagreement.
We use a [[#quarterly|quarterly review]] to shrink the ''High'' list if it looks like more than six months of work.
Line 180: Line 16:
Visit [[https://bugs.launchpad.net/launchpad-project/+bugs?field.importance:list=Unknown&field.importance:list=Undecided|unknown/undecided importance bugs]] and
[[https://bugs.launchpad.net/launchpad-project/+bugs?field.status:list=NEW&field.status:list=INCOMPLETE_WITH_RESPONSE&field.status:list=INCOMPLETE_WITHOUT_RESPONSE&field.status:list=CONFIRMED|untriaged status bugs]]
These are the questions we ask when triaging bug reports about Launchpad:
Line 183: Line 18:
For each bug:
 * See if there are any duplicates by having a bit of a look around,
 search your memory etc. If you find a duplicate, mark the
 '''newer''' bug as a duplicate of the '''older''' bug (unless there
 is a compelling reason to use the newer bug as the master). Consider
 updating the description and tags of the '''older''' bug to help make
 it clearer. We use the '''older''' bug by default because we
 (roughly) work through bugs in the same bucket in date order.
 * If the bug is unrelated to Launchpad, move it somewhere appropriate.
 * If the bug is something we won't do at all, mark it as won't fix.
 * If it's an operational request, convert it to a question.
 * apply the guidelines in 'Triage Guidelines' to get a bucket for the
 bug and set the bug importance to that bucket.
 * If the bug status is 'Incomplete', check that the filer was asked
 to clarify something; if they were and haven't replied in a month,
 close the bug. Otherwise either ask them to clarify something, or set
 the bug to Triaged if they have clarified whatever was needed.
 * If the bug status is New, set it to triaged.
 1. '''Is this a bug in Launchpad?''' If not, move it to the appropriate project and move to the next bug.
 1. '''Is it a duplicate?''' If there is a duplicate, mark the newer bug as a duplicate of the older bug ([[#duplicates|read more about duplicates]]).
 1. '''Is it something we'll never do?''' If yes, mark it as ''Won't Fix''.
 1. '''Is it an operational request?''' If yes, convert it to a question.
 1. '''When are we likely to fix this?''' Set the importance to show when we'll get to fixing this bug ([[#importance|read more about choosing an importance]]).
 1. '''Does the report have enough detail?''' If we couldn't replicate or otherwise begin work on the bug with information provided, request further information from the reporter and mark it as ''Incomplete''. If someone has already asked for more info and the reporter has replied, change the status from ''Incomplete'' to ''Triaged''.
 1. '''Is the bug ready for a developer to fix?''' If yes, set the status to ''Triaged''.
 
As you might expect, we give a triaged bug the ''Triaged'' status.
Line 202: Line 28:
== Assignment ==
If you're uncertain what importance to give a bug, chat with another engineer. If there's a disagreement, let common sense and courtesy take priority.
Line 204: Line 30:
Bug triage does not involve assigning an engineer. Engineers should
only be assigned to bugs that are ''in progress''. Even critical bugs
do not need an engineer assigned: operational incidents are not
tracked in the bug database, though critical bugs may be generated as
followup work to be done; those bugs are then in the front-section of
the queue, but that's all that is needed.
Need help? [[Help|Talk to someone]].
Line 211: Line 32:
== Selecting bugs to work on ==
== Quick links ==
Line 213: Line 34:
The bug database holds the /project/ importance set of bugs. However
individual or squad work-queues may be quite different. For instance,
we have 3 squads working on features at any one time, 2 on
maintenance. Generally speaking squads on feature-rotation will ignore
'importance' in selecting what to work on - they will be working on a
feature and creating bugs as appropriate to create discussion points
and todo items for that feature.
||<tablestyle="width: 60%;" style="background: #2a2929; font-weight: bold; color: #f6bc05;">All of Launchpad||[[https://bugs.launchpad.net/launchpad-project|All]]||[[https://bugs.launchpad.net/launchpad-project/+bugs?search=Search&field.status=New|New]]||[[https://bugs.launchpad.net/launchpad-project/+bugs?field.status:list=NEW&field.status:list=INCOMPLETE_WITH_RESPONSE&field.status:list=INCOMPLETE_WITHOUT_RESPONSE&field.status:list=CONFIRMED|Untriaged bugs with no importance]]||[[https://bugs.launchpad.net/launchpad-project/+bugs?field.status:list=NEW&field.status:list=INCOMPLETE_WITH_RESPONSE&field.status:list=INCOMPLETE_WITHOUT_RESPONSE&field.status:list=CONFIRMED|Untriaged bugs that have a status]]||[[https://bugs.launchpad.net/launchpad-project/+bugs?field.searchtext=&orderby=-importance&search=Search&field.status%3Alist=TRIAGED&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_supervisor=&field.bug_commenter=&field.subscriber=&field.tag=&field.tags_combinator=ANY&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.has_branches.used=&field.has_branches=on&field.has_no_branches.used=&field.has_no_branches=on&field.has_blueprints.used=&field.has_blueprints=on&field.has_no_blueprints.used=&field.has_no_blueprints=on|Triaged]]||[[https://bugs.launchpad.net/launchpad-project/+bugs?search=Search&field.importance=Critical&field.status=New&field.status=Incomplete&field.status=Confirmed&field.status=Triaged&field.status=In+Progress&field.status=Fix+Committed|Critical]]||
||<style="background: #2a2929; font-weight: bold; color: #f6bc05;">Launchpad itself||[[https://bugs.launchpad.net/launchpad|All]]||[[https://bugs.launchpad.net/launchpad/+bugs?search=Search&field.status=New|New]]||[[https://bugs.launchpad.net/launchpad/+bugs?field.status:list=NEW&field.status:list=INCOMPLETE_WITH_RESPONSE&field.status:list=INCOMPLETE_WITHOUT_RESPONSE&field.status:list=CONFIRMED|Untriaged bugs with no importance]]||[[https://bugs.launchpad.net/launchpad/+bugs?field.status:list=NEW&field.status:list=INCOMPLETE_WITH_RESPONSE&field.status:list=INCOMPLETE_WITHOUT_RESPONSE&field.status:list=CONFIRMED|Untriaged bugs that have a status]]||[[https://bugs.launchpad.net/launchpad/+bugs?field.searchtext=&orderby=-importance&search=Search&field.status%3Alist=TRIAGED&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_supervisor=&field.bug_commenter=&field.subscriber=&field.tag=&field.tags_combinator=ANY&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.has_branches.used=&field.has_branches=on&field.has_no_branches.used=&field.has_no_branches=on&field.has_blueprints.used=&field.has_blueprints=on&field.has_no_blueprints.used=&field.has_no_blueprints=on|Triaged]]||[[https://bugs.launchpad.net/launchpad/+bugs?search=Search&field.importance=Critical&field.status=New&field.status=Incomplete&field.status=Confirmed&field.status=Triaged&field.status=In+Progress&field.status=Fix+Committed|Critical]]||
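The search links in the table above are ordinary query strings, so they can be rebuilt programmatically. The following is a minimal sketch, assuming only Python's standard library; the helper name `bug_search_url` is hypothetical and not part of Launchpad. It reproduces the ''Critical'' link for the `launchpad` project from the table.

```python
from urllib.parse import urlencode

BASE = "https://bugs.launchpad.net"


def bug_search_url(project, params):
    """Build a +bugs search URL (hypothetical helper).

    `params` is a list of (field, value) pairs; a list value is
    emitted as one repeated parameter per item.
    """
    return f"{BASE}/{project}/+bugs?{urlencode(params, doseq=True)}"


# Same fields and values as the "Critical" quick link above.
critical_url = bug_search_url(
    "launchpad",
    [
        ("search", "Search"),
        ("field.importance", "Critical"),
        (
            "field.status",
            ["New", "Incomplete", "Confirmed", "Triaged",
             "In Progress", "Fix Committed"],
        ),
    ],
)
print(critical_url)
```

Passing `"launchpad-project"` instead of `"launchpad"` gives the link for all of Launchpad.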
Line 221: Line 37:
The Launchpad maintenance squads however will usually be working from
the bug database - picking bugs up to work on based on their ''triaged
importance''. So for maintenance squads, they should simply look in
each bucket in order - critical, high, low - and from within that
bucket take one of the oldest bugs - one that seems interesting to
them at the time. Crucially though, all bugs in the critical bucket
should have someone or some squad working on them before any bugs in
the high bucket are picked up and worked on, and likewise for low.
<<Anchor(importance)>>
= Importance =
Line 230: Line 40:
Community work will often ignore our bug triage and focus on itch
scratching - and this also applies to patches done by Launchpad
engineers in their personal and slack time: the selection logic for
picking a bug only applies to effort being put in as part of their
primary duties. That is, it's always totally ok to fix that low
priority bug that's really annoying you, whether you're a user of
Launchpad or a developer. A bug fix is a bug fix!
We use three of Launchpad's bug importances and give each a specific meaning.

||<tablestyle="width: 60%;" rowstyle="background: #2a2929; font-weight: bold; color: #f6bc05;">~+Importance+~||~+Meaning+~||
||<style="font-weight: bold; color: #e01010;"> ~+{{attachment:bug-critical.png}} Critical+~||Bugs that need to jump the queue. When all is well, we should have no Critical bugs.||
||<style="color: #f96413; font-weight: bold;">~+{{attachment:bug-high.png}} High+~||Bugs that are likely to get attention in the next six months.||
||<style="color: #d1d03c; font-weight: bold;">~+{{attachment:bug-low.png}} Low+~||All other bugs.||

The importance of a particular bug report reflects the priorities of the Launchpad project. Individuals working on Launchpad may have different priorities. ([[#selecting|Read more about selecting bugs to work on]])

<<Anchor(critical)>>
== Critical ==

Any bug marked ''Critical'' takes priority over all other bugs.

At present, timeouts, OOPSes (thanks to our [[https://dev.launchpad.net/PolicyAndProcess/ZeroOOPSPolicy|zero OOPS policy]]), regressions (including supported-browser issues) and stakeholder escalations are all marked as ''Critical''. Other types of bug may also be ''Critical''; Francis or Robert will expect you to justify marking any other type of bug as ''Critical''.

If all is well with Launchpad, there should be no ''Critical'' bugs.

<<Anchor(high)>>
== High ==

These are bugs that we believe we will work on in the next six months.

<<Anchor(low)>>
== Low ==

We mark as ''Low'' any bug that we recognise as legitimate but that we have no plans to fix. This is not the same as planning not to fix the bug; it means that we don't know when we will fix it, if at all.

== Others ==

We do not use ''Medium'' or ''Wishlist''. This is primarily to avoid giving false hope to people who are interested in a bug that is neither ''Critical'' nor ''High'': if it does not have one of these importances, we think it is unlikely we will fix it in the next six months.

= Assigning bugs =

We do not assign bugs as part of the triage process. Only ''In progress'' bugs should be assigned to someone.

Even ''Critical'' bugs do not need an assignee, unless they are being worked on. Being at the top of the queue is all we need for ''Critical'' bugs to get the attention they require.

<<Anchor(selecting)>>
= Selecting bugs to work on =

If you are working on Launchpad in your own time you'll most likely want to fix those bugs that matter to you, regardless of what importance the Launchpad project gives them. That's great and we welcome all bug fixes; we encourage you to look at [[FixBugs|our page about fixing bugs]] first.

Members of Canonical's Launchpad team will select bugs depending on whether they're in a maintenance or feature squad.

Generally speaking, squads on feature-rotation will consider the importance of a bug only after filtering for work that applies directly to their current feature.

Maintenance squads, however, will usually be working from the bug database: picking bugs based on their triaged importance. They should look at each importance in order &mdash; critical, high, low &mdash; and from within that bucket take one of the oldest bugs. Crucially though, there should be no ''Critical'' bugs before they start work on ''High'' bugs. Similarly, ''Low'' bugs should get attention only when there are no ''Critical'' and no ''High'' bugs.

<<Anchor(quarterly)>>
= Quarterly review =

Four times a year, we put all of the ''High'' bugs back through the triage process. This lets us make sure that all those bugs really should be ''High'' and to take account of anything that has changed since they were last triaged.

= Resolving disputes =

Beyond these rules a bug is more important than another bug if fixing it will improve Launchpad more than fixing the other bug.

Discretion and a feel for what's in the bug database will help a lot here, as will awareness of our userbase and their needs. One sensible heuristic is to look at five to ten existing ''High'' bugs and, if the new bug is less important than all of them, mark it ''Low'' as it's probably less important than all existing ''High'' bugs.

Engineers have discretion to decide any particular bug should be sorted higher (or lower) than it has been; some change requests are very important to many of our users while still not big enough to need a dedicated feature-squad working on them.

When two engineers disagree, or if someone in the management chain disagrees, common sense and courtesy should be used in resolving the disagreement.

Triaging Launchpad bugs

Our triage process is basically this: make sure that Critical and High bugs are correctly marked.

We want:

  • Critical bugs to be those that need attention before all others. Right now: OOPSes, timeouts, regressions, stakeholder-escalated bugs.

  • The High bugs list to be around six months deep. Many parts of Canonical are on a six month cycle and fitting in with that is convenient.

We use a quarterly review to shrink the High list if it looks like more than six months of work.

How to triage

These are the questions we ask when triaging bug reports about Launchpad:

  1. Is this a bug in Launchpad? If not, move it to the appropriate project and move to the next bug.

  2. Is it a duplicate? If there is a duplicate, mark the newer bug as a duplicate of the older bug (read more about duplicates).

  3. Is it something we'll never do? If yes, mark it as Won't Fix.

  4. Is it an operational request? If yes, convert it to a question.

  5. When are we likely to fix this? Set the importance to show when we'll get to fixing this bug (read more about choosing an importance).

  6. Does the report have enough detail? If we couldn't replicate or otherwise begin work on the bug with information provided, request further information from the reporter and mark it as Incomplete. If someone has already asked for more info and the reporter has replied, change the status from Incomplete to Triaged.

  7. Is the bug ready for a developer to fix? If yes, set the status to Triaged.

As you might expect, we give a triaged bug the Triaged status.
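The questions above form a short decision sequence. The sketch below is one hedged way to write it down in Python; the `Report` fields and the action strings are hypothetical names chosen for illustration, not Launchpad data structures. Each field stands for the triager's answer to one question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Report:
    """Hypothetical record of a triager's answers for one bug report."""
    about_launchpad: bool = True
    duplicate_of: Optional[int] = None  # id of the older bug, if any
    will_never_do: bool = False
    operational_request: bool = False
    enough_detail: bool = True


def triage_action(report: Report) -> str:
    """Return the action the questions above lead to, in order."""
    if not report.about_launchpad:
        return "move to the appropriate project"
    if report.duplicate_of is not None:
        return f"mark as duplicate of bug {report.duplicate_of}"
    if report.will_never_do:
        return "set status to Won't Fix"
    if report.operational_request:
        return "convert to a question"
    if not report.enough_detail:
        # Importance is still set; the report waits on the reporter.
        return "set importance, request more information, set status to Incomplete"
    return "set importance, set status to Triaged"
```

For example, `triage_action(Report(enough_detail=False))` returns the Incomplete action, while a report with every default answer ends up Triaged.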

If you're uncertain what importance to give a bug, chat with another engineer. If there's a disagreement, let common sense and courtesy take priority.

Need help? Talk to someone.

Quick links

All of Launchpad: All | New | Untriaged bugs with no importance | Untriaged bugs that have a status | Triaged | Critical

Launchpad itself: All | New | Untriaged bugs with no importance | Untriaged bugs that have a status | Triaged | Critical

Importance

We use three of Launchpad's bug importances and give each a specific meaning.

Importance | Meaning

Critical | Bugs that need to jump the queue. When all is well, we should have no Critical bugs.

High | Bugs that are likely to get attention in the next six months.

Low | All other bugs.

The importance of a particular bug report reflects the priorities of the Launchpad project. Individuals working on Launchpad may have different priorities. (Read more about selecting bugs to work on)

Critical

Any bug marked Critical takes priority over all other bugs.

At present, timeouts, OOPSes (thanks to our zero OOPS policy), regressions (including supported-browser issues) and stakeholder escalations are all marked as Critical. Other types of bug may also be Critical; Francis or Robert will expect you to justify marking any other type of bug as Critical.

If all is well with Launchpad, there should be no Critical bugs.

High

These are bugs that we believe we will work on in the next six months.

Low

We mark as Low any bug that we recognise as legitimate but that we have no plans to fix. This is not the same as planning not to fix the bug; it means that we don't know when we will fix it, if at all.

Others

We do not use Medium or Wishlist. This is primarily to avoid giving false hope to people who are interested in a bug that is neither Critical nor High: if it does not have one of these importances, we think it is unlikely we will fix it in the next six months.
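The three importances can be read as a small rule. Here is a rough sketch of that rule, assuming each bug carries a set of free-form labels; the label names in `QUEUE_JUMPERS` and the `likely_within_six_months` flag are illustrative assumptions, and the sketch does not cover the case where another type of bug is justified as Critical.

```python
from enum import Enum


class Importance(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    LOW = "Low"


# Hypothetical labels for the kinds of bug that currently jump the queue.
QUEUE_JUMPERS = frozenset(
    {"oops", "timeout", "regression", "stakeholder-escalation"}
)


def choose_importance(kinds, likely_within_six_months):
    """Pick an importance from the bug's kinds and a triager's estimate."""
    if QUEUE_JUMPERS & set(kinds):
        return Importance.CRITICAL
    if likely_within_six_months:
        return Importance.HIGH
    return Importance.LOW
```

The six-month estimate remains a human judgement; the function only records where that judgement lands.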

Assigning bugs

We do not assign bugs as part of the triage process. Only In progress bugs should be assigned to someone.

Even Critical bugs do not need an assignee, unless they are being worked on. Being at the top of the queue is all we need for Critical bugs to get the attention they require.

Selecting bugs to work on

If you are working on Launchpad in your own time you'll most likely want to fix those bugs that matter to you, regardless of what importance the Launchpad project gives them. That's great and we welcome all bug fixes; we encourage you to look at our page about fixing bugs first.

Members of Canonical's Launchpad team will select bugs depending on whether they're in a maintenance or feature squad.

Generally speaking, squads on feature-rotation will consider the importance of a bug only after filtering for work that applies directly to their current feature.

Maintenance squads, however, will usually be working from the bug database: picking bugs based on their triaged importance. They should look at each importance in order — critical, high, low — and from within that bucket take one of the oldest bugs. Crucially though, there should be no Critical bugs before they start work on High bugs. Similarly, Low bugs should get attention only when there are no Critical and no High bugs.
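The selection rule for maintenance squads can be sketched as follows. This is an illustration only: the `Bug` record is hypothetical, and the list passed in is assumed to hold open bugs that nobody is working on yet.

```python
from dataclasses import dataclass
from datetime import date

# Buckets in the order a maintenance squad looks at them.
ORDER = ["Critical", "High", "Low"]


@dataclass
class Bug:
    """Hypothetical minimal bug record."""
    id: int
    importance: str
    created: date


def next_candidates(bugs, how_many=3):
    """Return the oldest bugs from the most important non-empty bucket."""
    for importance in ORDER:
        bucket = sorted(
            (b for b in bugs if b.importance == importance),
            key=lambda b: b.created,
        )
        if bucket:
            # Offer a few of the oldest; the engineer picks one of them.
            return bucket[:how_many]
    return []
```

Returning a handful of candidates rather than a single bug mirrors the wording "one of the oldest bugs": the choice within that handful is left to the engineer.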

Quarterly review

Four times a year, we put all of the High bugs back through the triage process. This lets us make sure that all those bugs really should be High and to take account of anything that has changed since they were last triaged.

Resolving disputes

Beyond these rules a bug is more important than another bug if fixing it will improve Launchpad more than fixing the other bug.

Discretion and a feel for what's in the bug database will help a lot here, as will awareness of our userbase and their needs. One sensible heuristic is to look at five to ten existing High bugs and, if the new bug is less important than all of them, mark it Low as it's probably less important than all existing High bugs.
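The sampling heuristic can be written as a small function. This is a sketch under stated assumptions: `more_important(a, b)` is a hypothetical callback standing in for the triager's judgement that fixing `a` helps Launchpad more than fixing `b`, and the sample is drawn at random, which the heuristic itself does not require.

```python
import random


def suggest_importance(new_bug, high_bugs, more_important,
                       sample_size=10, rng=random):
    """Compare a new bug against a sample of existing High bugs.

    Returns "Low" if the new bug is less important than every sampled
    High bug, "High" if it beats at least one of them, and None when
    there are no High bugs to compare against.
    """
    if not high_bugs:
        return None  # no basis for comparison; needs a human call
    sample = rng.sample(list(high_bugs), min(sample_size, len(high_bugs)))
    if any(more_important(new_bug, existing) for existing in sample):
        return "High"
    return "Low"
```

A result of "High" is only a suggestion: it says the bug is worth comparing more carefully, not that it belongs in the six-month list.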

Engineers have discretion to decide any particular bug should be sorted higher (or lower) than it has been; some change requests are very important to many of our users while still not big enough to need a dedicated feature-squad working on them.

When two engineers disagree, or if someone in the management chain disagrees, common sense and courtesy should be used in resolving the disagreement.

BugTriage (last edited 2022-04-19 09:36:39 by lgp171188)