Diff for "Foundations/QA/OOPSToolsMiniSprint"

Differences between revisions 9 and 19 (spanning 10 versions)
Revision 9 as of 2010-09-23 20:32:22
Size: 2438
Editor: ursinha
Comment:
Revision 19 as of 2010-10-01 19:26:20
Size: 2424
Editor: matsubara
Comment:
Line 1: Line 1:
Mini OOPS Tools sprint, taking place in Campinas, SP, Brazil, 20-24 September 2010. = OopsTools Sprint =
Line 3: Line 3:
== We're discussing ==  * Where: Campinas, SP, Brazil
 * When: 20-24 September 2010
Line 5: Line 6:
 * Bug Bug:461269: oops reports should be grouped by oops signature not exception type and exception value (better grouping of oopses; better relation oopses-real problems)
 * How to move this information to be easily accessible via web
 * Improve oops-tools to handle queries for a given pageid
Ursula and Diogo got together for a week to discuss improvements for https://edge.launchpad.net/oops-tools/.
One of the goals for this sprint was to have Ursula and Diogo hacking together
so Ursula can become more comfortable hacking on oops-tools.
Line 9: Line 10:
=== How to reduce all OOPS emails into one with grouped information === == Topics discussed ==
Line 11: Line 12:
Send only one email to the list including information broken down by team:     * how to get rid of the reports sent to the launchpad@ list, or at most send a single email (e.g. https://pastebin.canonical.com/37802/)
    * improve the content of the reports
    * provide a web UI so developers can generate a customized report containing only the oopses that interest them
    * how to fix bug Bug:461269 in a way that new oops attributes can be used to uniquely identify an infestation
    * change pageid to become a first class object rather than an oops attribute, so we can make queries and build reports for them.
Line 13: Line 18:
 * stats section is global, for all production instances
 * each team section shows top exception and top timeout and the percentage of oops reports the team is responsible for.
    * maybe show top offending pageid (i.e. the pageid with the greatest number of oopses) instead?
 * all team sections show link to a full team report
 * remove unnecessary sections from the report, such as soft time out, informational only, user generated errors
 * provide a way to find oopses that would appear in those sections through the web ui
 * remove unnecessary oopses from the report (e.g. bug Bug:540890 and Bug:251896)
== User Stories ==
Line 21: Line 20:
{{{     * Deryck wants to see all OOPSes related to the Bugs team, but without the checkwatches noise.
    * Danilo wants to see all OOPSes related to Translations in a single report.
    * Robert wants to see reports grouped by pageid
    * Julian doesn't want to receive any more email
    * Francis, Robert and Jono want to see the overall state of the production instances
    * Gary wants the connection between infestations and Launchpad bug reports to be very reliable (i.e. once the tool is taught about a false positive, it should do the right thing the next time)
Line 23: Line 27:
Subject: Oops report for 2010-09-22 == Action items ==
Line 25: Line 29:
= Stats for 2010-09-22 =     * Bug Bug:652350: change ErrorSummary object to accept sections so it can be built dynamically
    * Bug Bug:652351: web ui so developers can generate reports customized to what they need (http://ubuntuone.com/p/HvI/)
    * Bug Bug:652356: page id should become a first class object
    * Bug Bug:652354: put the exception value normalization code into the database
    * Bug Bug:461269: new oops attributes, such as pageid, should be used to uniquely identify an infestation
    * --(File RT to have lp-production-configs on devpad automatically updated)-- RT #41653
    * Bug Bug:592355: team based oops summaries should use the infestation team information to better group oopses
Line 27: Line 37:
* 10000 Exceptions
* 50000 Time Outs
== Bugs fixed during the sprint ==
Line 30: Line 39:

== Bugs 20 % ==
Full report: https://lp-oops.canonical.com/summary/?team=Bugs&date=2010-09-22

 * 230 ConjoinedBugTaskError: 'Foo bar is foobared'
 * 440 TimeOut: 'some broken page'

== Code 50 % ==
Full report: https://lp-oops.canonical.com/summary/?team=Code&date=2010-09-22

 * 100 TooNewRecipeError: 'recipe is foobared'
 * 1140 TimeOut: 'some broken page'


== Foundations 10 % ==
Full report: https://lp-oops.canonical.com/summary/?team=Foundations&date=2010-09-22

 * 330 LibrarianDiskError: 'librarian is down'
 * 666 TimeOut: 'some other page is broken'

}}}
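
The mock-up above shows, per team, its share of all oops reports plus its top exception and top timeout. A minimal Python sketch of how such a summary could be computed follows. The function name `summarize_by_team`, the tuple layout of the input, and the use of the string "TimeOut" to tell timeouts apart are assumptions made for illustration; they are not taken from the oops-tools code.

```python
from collections import Counter, defaultdict


def summarize_by_team(oopses):
    """Build a per-team summary from (team, exc_type, exc_value) tuples.

    Hypothetical helper: the input layout is an assumption, not the
    oops-tools data model.
    """
    oopses = list(oopses)
    total = len(oopses)
    by_team = defaultdict(Counter)
    for team, exc_type, exc_value in oopses:
        by_team[team][(exc_type, exc_value)] += 1

    summary = {}
    for team, counts in by_team.items():
        team_total = sum(counts.values())
        # Split timeouts from other exceptions (assumed marker: "TimeOut").
        timeouts = Counter({k: v for k, v in counts.items() if k[0] == "TimeOut"})
        errors = Counter({k: v for k, v in counts.items() if k[0] != "TimeOut"})
        summary[team] = {
            # Percentage of all oops reports this team is responsible for.
            "share": round(100 * team_total / total),
            "top_exception": errors.most_common(1)[0] if errors else None,
            "top_timeout": timeouts.most_common(1)[0] if timeouts else None,
        }
    return summary
```

Each team entry carries only the fields shown in the mock-up, so the single email stays short and the full detail can live behind the "Full report" link.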

 * Better team reports
   * PageId to become an object, not just an attribute
     * Find out PageId 'owners'
     * Use django admin interface to set 'orphan' PageIds teams
     * Will have to migrate data to the new model
   * Change how an oops' team is calculated:
     * Prefix > PageId > vhost
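
Reading "Prefix > PageId > vhost" as an order of precedence (an interpretation, since the notes do not spell it out), the team lookup could be sketched in Python as below. The function `resolve_team`, the dictionary-based oops record, and the three mapping arguments are hypothetical names introduced for illustration.

```python
def resolve_team(oops, prefix_teams, pageid_teams, vhost_teams):
    """Return the team owning an oops, or None if no rule matches.

    Assumed precedence: oops prefix first, then pageid, then vhost.
    All names here are illustrative, not the oops-tools API.
    """
    lookups = (
        (oops.get("prefix"), prefix_teams),
        (oops.get("pageid"), pageid_teams),
        (oops.get("vhost"), vhost_teams),
    )
    for key, mapping in lookups:
        team = mapping.get(key)
        if team is not None:
            return team
    # No rule matched: an 'orphan' that someone has to assign by hand.
    return None
```

An oops for which this returns None corresponds to the 'orphan' case above, which the notes propose handling through the django admin interface.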


== We're doing ==

 * Bug Bug:540890: exclude robot posts from reports


== We did ==

 * Bug Bug:612354: fix oops-tools bootstrapping
 * Bug Bug:251896: oops-tools should filter out not found errors referred from non-local domains
    * Bug Bug:612354: fix oops-tools bootstrapping
    * Bug Bug:251896: oops-tools should filter out not found errors referred from non-local domains

OopsTools Sprint

  • Where: Campinas, SP, Brazil
  • When: 20-24 September 2010

Ursula and Diogo got together for a week to discuss improvements for https://edge.launchpad.net/oops-tools/. One of the goals for this sprint was to have Ursula and Diogo hacking together so Ursula can become more comfortable hacking on oops-tools.

Topics discussed

  • how to get rid of the reports sent to the launchpad@ list, or at most send a single email (e.g. https://pastebin.canonical.com/37802/)

  • improve the content of the reports
  • provide a web UI so developers can generate a customized report containing only the oopses that interest them
  • how to fix bug 461269 in a way that new oops attributes can be used to uniquely identify an infestation

  • change pageid to become a first class object rather than an oops attribute, so we can make queries and build reports for them.

User Stories

  • Deryck wants to see all OOPSes related to the Bugs team, but without the checkwatches noise.
  • Danilo wants to see all OOPSes related to Translations in a single report.
  • Robert wants to see reports grouped by pageid
  • Julian doesn't want to receive any more email
  • Francis, Robert and Jono want to see the overall state of the production instances
  • Gary wants the connection between infestations and Launchpad bug reports to be very reliable (i.e. once the tool is taught about a false positive, it should do the right thing the next time)

Action items

  • Bug 652350: change ErrorSummary object to accept sections so it can be built dynamically

  • Bug 652351: web ui so developers can generate reports customized to what they need (http://ubuntuone.com/p/HvI/)

  • Bug 652356: page id should become a first class object

  • Bug 652354: put the exception value normalization code into the database

  • Bug 461269: new oops attributes, such as pageid, should be used to uniquely identify an infestation

  • File RT to have lp-production-configs on devpad automatically updated: RT #41653

  • Bug 592355: team based oops summaries should use the infestation team information to better group oopses
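
To illustrate the idea behind bugs 461269 and 652354, here is a minimal Python sketch of an oops signature that combines the exception type, a normalized exception value and the pageid. The normalization rules (replacing quoted strings and digit runs with placeholders), the use of SHA-1, and the names `normalize_value` and `oops_signature` are assumptions for illustration only. The action item proposes moving normalization into the database, which this sketch does not attempt.

```python
import hashlib
import re


def normalize_value(value):
    """Replace variable parts of an exception value with placeholders.

    Assumed rules: quoted strings become $STR, digit runs become $INT.
    """
    value = re.sub(r"'[^']*'|\"[^\"]*\"", "$STR", value)
    value = re.sub(r"\d+", "$INT", value)
    return value


def oops_signature(exc_type, exc_value, pageid):
    """Hash the attributes assumed to identify one infestation."""
    parts = "\n".join([exc_type, normalize_value(exc_value), pageid or ""])
    return hashlib.sha1(parts.encode("utf-8")).hexdigest()
```

With a signature like this, two oopses that differ only in an object id fall into the same infestation, while the same exception raised on a different pageid is tracked separately.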

Bugs fixed during the sprint

  • Bug 612354: fix oops-tools bootstrapping

  • Bug 251896: oops-tools should filter out not found errors referred from non-local domains
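
The filtering described in bug 251896 can be pictured with the following Python sketch. It is not the fix that landed in oops-tools. The domain "launchpad.net", the dictionary keys "type" and "referrer", the "NotFound" type name, and the decision to keep not-found oopses that have no referrer are all assumptions.

```python
from urllib.parse import urlsplit

LOCAL_DOMAIN = "launchpad.net"  # assumption: what counts as a local domain


def is_local_referrer(referrer, local_domain=LOCAL_DOMAIN):
    """True if the referrer URL points at the local domain or a subdomain."""
    if not referrer:
        return False
    host = urlsplit(referrer).hostname or ""
    return host == local_domain or host.endswith("." + local_domain)


def should_report(oops):
    """Keep everything except not-found errors referred from other sites."""
    if oops.get("type") != "NotFound":
        return True
    referrer = oops.get("referrer")
    if not referrer:
        # Assumption: without a referrer we cannot blame an external link.
        return True
    return is_local_referrer(referrer)
```

The point of the filter is that a broken link on an external site is not something Launchpad developers can fix, whereas a broken link on a local page is.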

Foundations/QA/OOPSToolsMiniSprint (last edited 2010-10-01 22:12:01 by gary)