DevelopmentMeeting20090820

   1 <matsubara>	#startmeeting
   2 <MootBot>	Meeting started at 10:00. The chair is matsubara.
   3 <MootBot>	Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
   4 <matsubara>	Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues. 
   5 <matsubara>	[TOPIC] Roll Call 
   6 <MootBot>	New Topic:  Roll Call
   7 <sinzui>	me
   8 <matsubara>	Not on the Launchpad Dev team? Welcome! Come "me" with the rest of us! 
   9 <gary_poster>	me
  10 <Ursinha>	me
  11 <danilos>	me
  12 <matsubara>	stub, cprov, herb, rockstar, intellectronica: hi
  13 <cprov>	me
  14 <rockstar>	ni!
  15 <mthaddon>	me
  16 <intellectronica>	me
  17 <matsubara>	hi mthaddon 
  18 <mthaddon>	matsubara: herb won't be attending these meetings any more since he's no longer a LOSA
  19 <Ursinha>	it's true
  20 <matsubara>	mthaddon, indeed!
  21 <matsubara>	let me update the page
  22 <Chex>	hello!
  23 <mthaddon>	matsubara: most likely Chex will be his replacement (given he's on the same timezone that herb was on)
  24 <Ursinha>	hi Chex, welcome!
  25 <matsubara>	mthaddon, all right thanks
  26 <matsubara>	hi Chex, welcome
  27 <Chex>	all: thank you
  28 <stub>	moo
  29 <matsubara>	ok, everyone is here
  30 <matsubara>	[TOPIC] Agenda 
  31 <MootBot>	New Topic:  Agenda
  32 <intellectronica>	hi Chex, welcome
  33 <matsubara>	 * Actions from last meeting
  34 <matsubara>	 * Oops report & Critical Bugs & Broken scripts
  35 <matsubara>	 * Operations report (mthaddon/herb/spm)
  36 <matsubara>	 * DBA report (stub)
  37 <matsubara>	[TOPIC] * Actions from last meeting
  38 <Ursinha>	matsubara, you may want to s/flacoste/gary_poster in that page
  39 <MootBot>	New Topic:  * Actions from last meeting
  40 <matsubara>	Ursinha, already done
  41 <Ursinha>	matsubara, thanks
  42 <Andre_Gondim>	me
  43 <matsubara>	  * ursinha to chase mars about OOPS-1307J16 and file a bug about it
  44 <matsubara>	  * matsubara to file a bug for OOPS-1315A253
  45 <matsubara>	    * Filed https://launchpad.net/bugs/413706
  46 <matsubara>	  * sinzui to file bugs for OOPS-1318S626, OOPS-1321EB223 and OOPS-1318EA4
  47 <matsubara>	  * gary_poster to chase librarian-gc failure and report back to the list
  48 <matsubara>	  * matsubara to ask stub to email the dba report to the list
  49 <matsubara>	    * stub sent the dba report to the list
  50 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=1307J16
  51 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=1315A253
  52 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=1318S626
  53 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=1321EB223
  54 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=1318EA4
  55 <ubottu>	Ubuntu bug 413706 in launchpad-foundations "InvalidURIError using %s as the search term in the global search" [Undecided,New]
  56 <matsubara>	hi Andre_Gondim, welcome
  57 <Andre_Gondim>	thanks =]
  58 <matsubara>	hi sinzui, did you file those bugs?
  59 <matsubara>	Ursinha, no news about that oops? shall I remove the action item?
  60 <Ursinha>	matsubara, do that, I'll file a bug if that happens again
  61 <matsubara>	Ursinha, thanks
  62 *	sinzui has no screen
  63 <matsubara>	re: the librarian-gc failure, it was disabled that week, that's why we had a script failure email to the list
  64 <gary_poster>	stub is working on that as his next task
  65 <mthaddon>	I think there's a CP pending approval for that
  66 <sinzui>	matsubara: I did file bugs
  67 <matsubara>	gary_poster, mthaddon: cool. thanks
  68 <stub>	The next bit of work on the librarian may be related - depends on what happens with the cherry pick and test run ;)
  69 <Ursinha>	gary_poster, this is bug 410576, right?
  70 <sinzui>	OOPS-1315A253 is soyuz
  71 <ubottu>	Launchpad bug 410576 in launchpad-foundations "Librarian-gc discovered file missing from disk" [Critical,Triaged] https://launchpad.net/bugs/410576
  72 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=1315A253
  73 <matsubara>	sinzui, thanks. if you have them handy, could you priv msg them to me?
  74 *	gar0t0 (n=gar0t0@unaffiliated/gar0t0) has joined #launchpad-meeting
  75 <sinzui>	bug 413174
  76 <ubottu>	Launchpad bug 413174 in launchpad-registry "API AssertionError creating a release" [Low,Triaged] https://launchpad.net/bugs/413174
  77 <gary_poster>	Ursinha, that's not my understanding.  hm, that's a dupe.
  78 <Ursinha>	gary_poster, a dupe? is there another?
  79 <Ursinha>	this one is set as Critical... I'll talk about it in the next section :)
  80 <sinzui>	matsubara: OOPS-1318EA4 is new. It relates to another bug that I intend to fix in 3.0 I will file and assign it
  81 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=1318EA4
  82 <matsubara>	thanks sinzui 
  83 <gary_poster>	Ursinha: either dupe or related: bug 413749
  84 <ubottu>	Bug 413749 on http://launchpad.net/bugs/413749 is private
  85 <Ursinha>	gary_poster, let me see
  86 <Ursinha>	matsubara, you can move to the next section and we keep discussing there
  87 <matsubara>	ok, thanks Ursinha and gar0t0 
  88 <matsubara>	err
  89 <matsubara>	gary_poster, 
  90 <gary_poster>	:-)
  91 <matsubara>	[TOPIC] * Oops report & Critical Bugs & Broken scripts
  92 <MootBot>	New Topic:  * Oops report & Critical Bugs & Broken scripts
  93 <matsubara>	there you go Ursinha 
  94 <Ursinha>	okay
  95 <Ursinha>	+branches timeout has a fix already committed, and also that horrible 'specications' bug is fix committed as well
  96 <Ursinha>	so, two issues to ask: foundations and registry
  97 <Ursinha>	sinzui, I can see a lot of these ExpatErrors, that are bug 403606, did barry say something about fixing that?
  98 <Ursinha>	gary_poster, bug 410576 is Critical but I see there's no activity for almost a week now, is that really critical?
  99 <ubottu>	Launchpad bug 403606 in launchpad-registry "ExpatError errors should be handled to not generate the OOPSes" [High,Triaged] https://launchpad.net/bugs/403606
 100 <ubottu>	Launchpad bug 410576 in launchpad-foundations "Librarian-gc discovered file missing from disk" [Critical,Triaged] https://launchpad.net/bugs/410576
 101 <Ursinha>	(in this meantime, I'll check bug 413749
 102 <ubottu>	Bug 413749 on http://launchpad.net/bugs/413749 is private
 103 <Ursinha>	)
 104 *	JoaoSantana (n=joao@200.165.133.50) has joined #launchpad-meeting
 105 <gary_poster>	Ursinha: I believe it is high: afaik, the criticality is what mthaddon describes in his comments to that issue.  This is what stub is going to next.
 106 <sinzui>	Ursinha: barry has not provided any insight into the issue yet. I cannot estimate it
 107 <stub>	It's critical because it is part of the impending librarian collapse.
 108 <sinzui>	matsubara: bug #41648
 109 <mthaddon>	gary_poster: it's critical - LP will blow up in 20 days or so if it's not fixed (as the librarian will run out of space)
 110 <ubottu>	Launchpad bug 41648 in acpi "Sleep and hibernate fail on Acer Ferrari 3400" [Medium,Fix released] https://launchpad.net/bugs/41648
 111 <matsubara>	sinzui, hmm that doesn't look like a lp bug
 112 <sinzui>	matsubara: bug #416483
 113 <ubottu>	Launchpad bug 416483 in launchpad-registry "deletion of series and milestone must remove structural subscriptions" [High,Triaged] https://launchpad.net/bugs/416483
 114 <matsubara>	cool. thanks sinzui!
 115 <sinzui>	^ points to the related bug too
 116 <gary_poster>	mthaddon, stub: (procedural, apologies) what does critical mean then?  I thought it meant drop everything, while afaict this is a do it within 10 days?
 117 <Ursinha>	gary_poster, mthaddon, we have two bugs here, bug 410576 and bug 413749
 118 <ubottu>	Launchpad bug 410576 in launchpad-foundations "Librarian-gc discovered file missing from disk" [Critical,Triaged] https://launchpad.net/bugs/410576
 119 <ubottu>	Bug 413749 on http://launchpad.net/bugs/413749 is private
 120 <Ursinha>	gary_poster, that's my question as well
 121 <mthaddon>	gary_poster: I think if we know it's going to blow up all of LP in a short period of time, that's critical
 122 <gary_poster>	afaik 413749 is the (a?) symptom of 410576.  stub, mthaddon, can you please correct me?
 123 <stub>	gary_poster: It is my top priority, as we need to know the genuine rate of disk consumption for the librarian so we can accurately predict when new disk has to be purchased and installed by, or soyuz has to decrease their consumption by
 124 <gary_poster>	stub thank you
 125 <mthaddon>	gary_poster: it's related, but fixing the librarian-gc will buy us more time, not fix it forever
 126 <gary_poster>	ok, gotcha
 127 <gary_poster>	So Ursinha, it is critical, and we should be moving to in progress, at least, within a day or so.
 128 <Ursinha>	great gary_poster, thanks a lot
 129 <matsubara>	Ursinha, anything else re: oops and critical bugs?
 130 <Ursinha>	sinzui, could you poke barry again about that bug? I can do that as well if you want :)
 131 <sinzui>	I will
 132 <Ursinha>	thanks a lot sinzui
 133 <cprov>	stub: we have to adjust the removal of BPRs to be more aggressive.
 134 <danilos>	cprov: can you (i.e. Soyuz team) provide data flacoste asked for in https://bugs.edge.launchpad.net/launchpad-foundations/+bug/413749 so we've got raw numbers there as well?
 135 <ubottu>	Error: This bug is private
 136 <stub>	cprov: Bug 413749 has a soyuz task, so you may want to triage it.
 137 <cprov>	danilos: sure, I can try.
 138 <matsubara>	garbo-hourly failed on the 17th even after spm adjusted the check to 12 hours. stub do you know what's up?
 139 <mthaddon>	cprov: any idea of how much space that would buy us?
 140 <ubottu>	Bug 413749 on http://launchpad.net/bugs/413749 is private
 141 <stub>	matsubara: I wasn't aware of that.
 142 <cprov>	mthaddon: can't tell exactly, but I issue the queries for estimating few other scenarios than 1 month quarantine for BPR files
 143 <mthaddon>	ok
 144 <matsubara>	there's a "Scripts failed to run: loganberry:garbo-hourly" email sent to the list on the 17th. could you investigate and reply to that email?
 145 <matsubara>	stub, ^
 146 <Ursinha>	cprov, can you follow up later on that bug then, please?
 147 <cprov>	Ursinha: sure
 148 <Ursinha>	thanks cprov
 149 <matsubara>	[action] cprov to follow up on bug 413749
 150 <MootBot>	ACTION received:  cprov to follow up on bug 413749
 151 <ubottu>	Bug 413749 on http://launchpad.net/bugs/413749 is private
 152 <ubottu>	Bug 413749 on http://launchpad.net/bugs/413749 is private
 153 <matsubara>	[action] stub to investigate garbo-hourly failure after spm adjusted script checking to 12h
 154 <MootBot>	ACTION received:  stub to investigate garbo-hourly failure after spm adjusted script checking to 12h
 155 <matsubara>	[action] sinzui to poke barry about ExpatError OOPSes (bug 403606)
 156 <MootBot>	ACTION received:  sinzui to poke barry about ExpatError OOPSes (bug 403606)
 157 <ubottu>	Launchpad bug 403606 in launchpad-registry "ExpatError errors should be handled to not generate the OOPSes" [High,Triaged] https://launchpad.net/bugs/403606
 158 <sinzui>	done
 159 *	sinzui eagerly awaits an assessment
 160 <matsubara>	cool
 161 <matsubara>	I think that's all for this section
 162 <matsubara>	thanks everyone
 163 <Ursinha>	thanks a bunch sinzui
 164 <Ursinha>	and everyone else :)
 165 <Ursinha>	do ahead matsubara
 166 <Ursinha>	*go
 167 <matsubara>	[TOPIC] * Operations report (mthaddon/Chex/spm)
 168 <MootBot>	New Topic:  * Operations report (mthaddon/Chex/spm)
 169 <danilos>	mbarnett for the agenda as well? :)
 170 <mthaddon>	:)
 171 <Chex>	- Buildbot now hosted from the DC
 172 <Chex>	 - Multiple Cherry Picks this past week
 173 <Chex>	 - Will be beginning to implement recommendations from SplitIt Sprint before too long
 174 <Chex>	 - Codebrowse needed restarting more than usual this week (see IncidentLog)
 175 <Chex>	 - Incident with edge rollout breaking as one app server refused to stop, and interaction with the session DB being trashed - see Incident Report and most likely discussed earlier in the meeting
 176 <Chex>	 - LOSA sprint this week to get new LOSAs (Chex, mbarnett) up to speed
 177 <matsubara>	danilos, good catch. thanks
 178 <Chex>	and that's it for us, unless there are any questions??
 179 <gary_poster>	yay buildbot in DC! :-)
 180 <danilos>	yeah, great stuff, looking forward to everything else that enables :)
 181 <danilos>	(like the production branch in buildbot *grin*)
 182 <matsubara>	thanks Chex 
 183 <matsubara>	[TOPIC] * DBA report (stub)
 184 <MootBot>	New Topic:  * DBA report (stub)
 185 <stub>	Our disk usage is going steadily up. Nothing alarming yet, but it did prompt me to turn on the long-running-transaction killer. Non-system transactions running over 3 hours will now be killed. This should alleviate database bloat, which adversely affects everything. It will also stop processes that block on long running transactions from blocking too long (like the garbo).
 186 <stub>	I've bumped up the default statistics target to 250. We have twice over the last several months had a query chewing up huge amounts of disk space in temporary tables, and my best guess as to why is bad query plans. The higher statistics target should make this less likely.
 187 <stub>	Done.
 188 *	Ursinha misses the oot thing
 189 <Ursinha>	questions for stub?
 190 <danilos>	stub: ok, so that means that fixing langpack exporter is now critical for us, right?
 191 <stub>	danilos: I can turn it off if necessary. I'm not sure what effect it has on the langpack export.
 192 <stub>	Will all of them be affected?
 193 <stub>	oot
 194 <danilos>	stub: most of the runs will
 195 <Ursinha>	hehe
 196 <danilos>	stub: I've made it critical for us, it should be a simple fix, it'll only require cherrypicking
 197 <stub>	danilos: ok. I'd like that issue raised to high or critical. I'll turn the check to 8 hours which will cover the current longest transaction I'm seeing in the graphs.
 198 <stub>	k
 199 <danilos>	stub: it was high and scheduled for 3.0, now it's scheduled for asap :)
 200 <stub>	Please add a note to the CP request that the limit needs to be put back.
 201 <danilos>	stub: sure, thanks
 202 <Ursinha>	thanks stub
 203 <Ursinha>	and danilos
 204 <matsubara>	thanks stub and danilos 
 205 <stub>	danilos: Bug number?
 206 <danilos>	stub: bug 411697
 207 <ubottu>	Launchpad bug 411697 in rosetta "Language pack export has very long running transactions" [Critical,Triaged] https://launchpad.net/bugs/411697
 208 <matsubara>	* In-team handling of OOPSes (Danilo) 
 209 <danilos>	ok, a long paste follows
 210 *	matsubara hands the mic to danilos 
 211 <danilos>	Breaking news from the team leads call!  Read all about it!
 212 <danilos>	Many of the duties Diogo and Ursula had you spoiled with (like trawling OOPS summaries and error logs and matching/filing relevant bugs) are what QA contacts in each team should do (generally, it was considered that this is what they should have been doing anyway).
 213 <danilos>	According to Gary, Diogo is happy to continue maintaining oops-tools (and relevant infrastructure, which will stay in Foundations turf), but everybody else is invited to contribute and take interest in the tools if they want something added.
 214 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=tools
 215 <danilos>	Similarly, if someone finds it hard to go through numerous places to see all the possible problems (i.e. going through several OOPS summaries, error-reports list, etc), they are welcome to improve our infrastructure for aggregating these.
 216 <danilos>	I am personally hoping that once we pick a release manager for 3.0, (s)he'll take care that all QA contacts are on top of their game. Perhaps we can have Ursula and Diogo continue as is until RM for 3.0 is appointed.
 217 <danilos>	Any suggestions on what should change in the format of the meeting to make sure this is not a regression compared to what we do today?
 218 <gary_poster>	(eh, that summary came out in such a way that I feel I should have talked with matsubara first.  sorry, matsubara, and feel free to correct the summary about your personal position)
 219 <matsubara>	gary_poster, it's correct :-)
 220 <danilos>	gary_poster: (I was just being careful not to put words in matsubara's mouth, I should have talked to him first, but there just wasn't the time between the teamleads call and this meeting :)
 221 *	gmb (n=gmb@i-83-67-31-25.freedom2surf.net) has joined #launchpad-meeting
 222 <gary_poster>	cool :-)
 223 <danilos>	anyway, how should the meetings be run from now on? matsubara, you want to keep running them?
 224 <gary_poster>	+1 if you are willing matsubara
 225 <matsubara>	danilos, yes, I talked to francis about it and Ursinha and I will still run the production meeting
 226 <cprov>	+1
 227 <danilos>	anybody else has any comments? everybody, this means more work for you and less for matsubara, Ursinha :)
 228 <Ursinha>	+1 from me
 229 <stub>	How do teams claim an oops? The benefit of a central monitor and this meeting is when teams disagree on who the problem belongs to.
 230 <matsubara>	but it'd be nice to have help from the QA contacts doing the daily oops analysis and help with triage
 231 <danilos>	stub: that's for the release manager to worry about IMO, but in general, we should be having bug attached to all the OOPSes
 232 <stub>	Who creates the bugs?
 233 <Ursinha>	danilos, that's the idea
 234 <Ursinha>	stub, it depends
 235 <Ursinha>	stub, for instance, afaik, translations has been creating its own bugs for some time now
 236 <Ursinha>	checking the summaries daily
 237 <danilos>	stub: in general, we might be able to improve tools to split summaries by vhost initially
 238 <stub>	I'm just wondering how we avoid them being dropped on the floor because, say, translations thinks an oops is a foundations issue and vice versa.
 239 <Ursinha>	danilos, matsubara has the idea of using page ids
 240 <Ursinha>	for splitting
 241 <Ursinha>	*had
 242 <danilos>	Ursinha: right, that might be a good one as well
 243 <stub>	splitting the reports into areas of responsibility would address my concern I think.
 244 <danilos>	Ursinha: actually, it's perfect
 245 <cprov>	okay, running the risk to sound like an idiot,  who are the current QA contacts ? TLs ?
 246 <Ursinha>	cprov, the people that attend this meeting
 247 <stub>	TLs until they delegate ;)
 248 <matsubara>	cprov, everyone who attends this meeting weekly
 249 <danilos>	cprov: it means it's you! :)
 250 <matsubara>	cprov, actually it's bigjools, but he's away today
 251 <cprov>	fantastic! thanks.
 252 <Ursinha>	danilos, :P, bigjools actually
 253 <danilos>	heh, ok... in general, I think this is best done by a team lead
 254 <danilos>	(and soon enough, I'll be replacing henninge as the translations QA contact)
 255 <Ursinha>	danilos, it was TL's call when they pointed the QA contacts
 256 <danilos>	Ursinha: I know
 257 <gary_poster>	hm.  question.  if we *all* trawl oops, is that a collective time loss?
 258 <Ursinha>	but that can be changed for this new experiment
 259 <Ursinha>	gary_poster, if we separate per teams, not that much
 260 <Ursinha>	I believe
 261 <danilos>	so, matsubara, can we have an action for me to discuss with Ursinha and you how we can split OOPS reports into per-team summaries?
 262 <Ursinha>	per "teams"
 263 <gary_poster>	oh I see
 264 <danilos>	gary_poster: right, see above
 265 <gary_poster>	ok thanks
 266 <matsubara>	[action] danilos, Ursinha and matsubara to discuss oops summaries split per team
 267 <MootBot>	ACTION received:  danilos, Ursinha and matsubara to discuss oops summaries split per team
 268 <danilos>	matsubara: thanks
 269 <Ursinha>	gary_poster, we in fact have a new feature on oops-tools that associates a bug with an exception type (matsubara correct me if I'm wrong here)
 270 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=tools
 271 <Ursinha>	this helps a lot
 272 <danilos>	ubottu: thanks for nothing (just so you don't get used to praise only)
 273 <ubottu>	Error: I am only a bot, please don't think I'm intelligent :)
 274 <Ursinha>	sometimes you freak me out ubottu
 275 <Ursinha>	anyway
 276 <Ursinha>	:)
 277 <danilos>	anyway, that's all settled afaiac
 278 <matsubara>	gary_poster, Ursinha: now we have a feature on oops-tools that once an oops is linked to a bug, subsequent oopses of that same type are already linked to the bug report
 279 <ubottu>	https://lp-oops.canonical.com/oops.py/?oopsid=tools
 280 <Ursinha>	gary_poster, if you click the oops, most of them have a bug associated, on top left
 281 <danilos>	we'll be reporting back, everything stays as is until we've got better oops reports, but do expect changes soon
 282 <matsubara>	makes analysis much easier
 283 <Ursinha>	bug report?
 284 <matsubara>	next step is to add that info to the summary
 285 <gary_poster>	heh.  ah I see cool
 286 <gary_poster>	thanks Ursinha, matsubara
 287 <matsubara>	all right. thanks danilos for bringing this up
 288 <Ursinha>	ah, I got that
 289 <matsubara>	and thanks everyone
 290 <Ursinha>	thanks everyone
 291 <matsubara>	Thank you all for attending this week's Launchpad Production Meeting. See https://dev.launchpad.net/MeetingAgenda for the logs. 
 292 <matsubara>	#endmeeting 
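
Editorial note on the DBA report: the long-running-transaction killer stub describes (non-system transactions running over 3 hours are killed, with the limit later relaxed to 8 hours for the langpack export) can be illustrated with a minimal sketch. Everything named here is hypothetical: the `Transaction` record, the `SYSTEM_USERS` set and the `transactions_to_kill` function are illustrations, not the script stub actually runs, and the sketch only selects candidates rather than talking to a database.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transaction:
    """Hypothetical snapshot of one open database transaction."""
    pid: int
    user: str
    started: datetime


# Assumption: system transactions are recognised by the database user name.
SYSTEM_USERS = frozenset({"postgres"})


def transactions_to_kill(transactions, now,
                         max_age=timedelta(hours=3),
                         system_users=SYSTEM_USERS):
    """Return the non-system transactions that have been open longer than max_age."""
    return [
        t for t in transactions
        if t.user not in system_users and now - t.started > max_age
    ]
```

Raising `max_age` to `timedelta(hours=8)` corresponds to the temporary limit stub mentions; a real killer would then terminate the backend for each returned `pid`.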

DevelopmentMeeting20090820 (last edited 2009-08-21 13:42:32 by matsubara)