Create a cache of statistics for bugtasks
Calculating structural aggregates of bugs in Ubuntu is overly expensive, and the cost grows as more bugs are opened. This shows up as:
- Bug portlets update up to 60 minutes after changes happen
- When they update, they sometimes time out: the bug summary portlets (series task counts, milestone task counts, untriaged counts and so on) are slow to calculate
This LEP is about introducing an aggregate store which will let those counts be calculated more cheaply, at a potential loss of accuracy. Doing them more cheaply will allow:
- Immediate portlet updates
- No timeouts on bug portlets (for another 4-5 years)
Contact: RobertCollins
On Launchpad: bug 758587
List discussions: initial thread
As a User
I want Launchpad bug portlets to be fast
so that they show up quickly
This is primarily a technical change, but it has one distinct and possibly unsettling user-visible change. Specifically, because of our model for visibility of private bugs, the summary table can overcount visible private bugs.
The overcounting happens when a user has multiple subscriptions to a private bug: they will see the bug counted once per subscription. For instance, if someone on the Ubuntu security team files a security bug on Ubuntu, they will see their 'open bug' portlet count go up by 2: once for their direct subscription, and once for their indirect subscription via the Ubuntu security team.
This has the following properties:
- Public bug counts will be completely accurate.
- Counts of 0 will always show as zero.
- The error rate for developers with access to private bugs is about 1 in 200. [We will show 201 when the actual figure should be 200].
- We should be able to render the existing portlets in 1/2 second.
While it's not ideal to ever show erroneous information, we have shown erroneous data in the past and still do (the 60 minute memcache time mentioned above).
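The overcounting described above can be illustrated with a minimal sketch. It assumes a summary keyed by "who can see this", with one entry per subscription for private bugs; the names `summary_rows` and `visible_count` and the data layout are hypothetical and are not Launchpad's actual schema.

```python
from collections import Counter

def summary_rows(bugs):
    """Build hypothetical summary rows keyed by viewer (None means public).

    A private bug contributes one row per subscription that grants access.
    """
    rows = Counter()
    for bug in bugs:
        if bug["private"]:
            for subscriber in bug["subscribers"]:
                rows[subscriber] += 1
        else:
            rows[None] += 1
    return rows

def visible_count(rows, user, teams):
    """Sum the rows visible to a user directly, via their teams, or publicly."""
    principals = {None, user} | set(teams)
    return sum(n for viewer, n in rows.items() if viewer in principals)

bugs = [
    {"private": False, "subscribers": []},
    # One private bug, subscribed both directly and via a team.
    {"private": True, "subscribers": ["alice", "ubuntu-security"]},
]
rows = summary_rows(bugs)

# alice can really see 2 bugs, but the private one is counted twice.
alice_count = visible_count(rows, "alice", ["ubuntu-security"])
# bob has no access to the private bug and sees only the public one.
bob_count = visible_count(rows, "bob", [])
# With no bugs at all, the count is zero: overcounting never invents bugs.
empty_count = visible_count(summary_rows([]), "alice", ["ubuntu-security"])
```

In this sketch the public count stays exact and a true count of zero stays zero, because a bug is only ever overcounted when the user can already see it.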
Rationale
Open bugs for Ubuntu constitute 1/7th of the entire bug database, but all bugs for Ubuntu constitute 1/2 of the bug database. As a result, many aggregate queries end up processing the entire bug database (which requires about 2 million rows to be processed).
Introducing a table that preaggregates bugs which are identical from the perspective of the aggregations we do will let us provide broad aggregate statistics cheaply.
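As a rough sketch of the preaggregation idea, rows that are identical for aggregation purposes can be collapsed into one row carrying a count, and portlet queries then sum over the much smaller summary. The column choice (target, status, milestone) and the function names here are assumptions for illustration only, not the proposed schema.

```python
from collections import Counter

# Hypothetical bug task rows: (target, status, milestone).
bugtasks = [
    ("ubuntu", "New", None),
    ("ubuntu", "New", None),
    ("ubuntu", "Triaged", "11.04"),
    ("ubuntu", "Fix Released", "11.04"),
    ("ubuntu", "New", "11.04"),
]

def build_summary(rows):
    """Collapse rows that are identical for aggregation into counted rows."""
    return Counter(rows)

def count_by_status(summary, status):
    """Answer an aggregate query from the summary instead of the raw rows."""
    return sum(n for (_, s, _), n in summary.items() if s == status)

summary = build_summary(bugtasks)
new_count = count_by_status(summary, "New")
```

The saving comes from the summary having one row per distinct combination rather than one row per bug task, so the work per query scales with the number of combinations and not with the number of bugs.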
In principle we can query private bugs separately, but that would require optimising those queries relative to what we do today, in addition to using the aggregate table for public bugs.
Stakeholders
Users.
Constraints and Requirements
Must
Nice to have
Must not
Show a non-zero count where a zero count is expected: this would stop people from driving bug counts to zero and would be very frustrating.
Out of scope
Complete accuracy and precision in the short term. In the medium to long term we can revisit this and see whether a schema can be devised to count private bugs accurately and quickly.
Subfeatures
Success
How will we know when we are done?
How will we measure how well we have done?