## Template for LP Production Meeting logs. Just paste xchat log below and the format IRC line will take care of formatting correctly
#format IRC
#startmeeting
Meeting started at 09:04. The chair is matsubara.
Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues.
me
[TOPIC] Roll Call
New Topic: Roll Call
me
sorry for being late, I was on my stand up meeting
stub, Chex, gary_poster, rockstar, bigjools, danilo_: Hi
matsubara: hi
me
matsubara: TLs are in a call.... :/
noodles775: can you cover me please?
bigjools, right, by the end of this meeting sinzui will propose a change for the production meeting so it won't clash anymore
bigjools: OK. ta
bigjools, we've got an agenda item to move the call, so you should be in for at least that
ok, so let's move on. Chex, rockstar and stub can join in later
[TOPIC] Agenda
* Actions from last meeting
* Oops report & Critical Bugs & Broken scripts
* Operations report (mthaddon/Chex/spm/mbarnett)
* DBA report (stub)
* Proposed items
New Topic: Agenda
* stub (n=stub@canonical/launchpad/stub) has joined #launchpad-meeting
[TOPIC] * Actions from last meeting
New Topic: * Actions from last meeting
* Ursinha to send one email to lp list explaining the qa-tags experiment
* matsubara to chase someone from code team about bug 480000
* matsubara to chase code people about code script failures (create-merge-proposals, branch puller and update branches)
* matsubara to ask someone from code about bug 485318
* emailed tim about these
* Chex to follow up with thumper about the multiple git import failures on the importd
* matsubara to file a bug for OOPS-1420ED1047
* there was a bug filed for this already.
Bug 484368
* sinzui to investigate failure on the mirror prober (The script 'distributionmirror-prober' didn't run on 'loganberry' between 2009-11-23 06:07:04 and 2009-11-23 12:07:04 (last seen 2009-11-23 04:54:10.444057))
* matsubara to ask gary about python2.5 update and get back to losas
* francis emailed the list and gary about this
* matsubara to ask stub to contact losas about load increase on wildcherry
* emailed stub about this one
Launchpad bug 480000 in launchpad-code "OOPS deleting a branch" [Low,Triaged] https://launchpad.net/bugs/480000
Launchpad bug 485318 in launchpad-code "POSTToNonCanonicalURL error using bazaar client" [Wishlist,Triaged] https://launchpad.net/bugs/485318
https://lp-oops.canonical.com/oops.py/?oopsid=1420ED1047
Launchpad bug 484368 in rosetta "LocationError: 'top_projects_and_packages_to_translate'" [High,Triaged] https://launchpad.net/bugs/484368
I don't recall an email explaining the qa-tags experiment but Ursula did show up a wiki page for me before leaving on vacation
matsubara: the script was delayed, ran fine later. The same thing happened to the PRF. it is running fine
danilo_, do you know if that email was done?
sinzui, thanks for checking
s/done/sent/
* salgado is now known as salgado-lunch
mthaddon, around?
I guess people are too busy with other stuff
let's move on
[action] * Ursinha to send one email to lp list explaining the qa-tags experiment
ACTION received: * Ursinha to send one email to lp list explaining the qa-tags experiment
[action] * Chex to follow up with thumper about the multiple git import failures on the importd
ACTION received: * Chex to follow up with thumper about the multiple git import failures on the importd
[TOPIC] * Oops report & Critical Bugs & Broken scripts
New Topic: * Oops report & Critical Bugs & Broken scripts
danilo_, is https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1425EA795 and https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1429EB593 related to bug https://bugs.edge.launchpad.net/rosetta/+bug/484368?
https://lp-oops.canonical.com/oops.py/?oopsid=1425EA795
https://lp-oops.canonical.com/oops.py/?oopsid=1429EB593
Ubuntu bug 484368 in rosetta "LocationError: 'top_projects_and_packages_to_translate'" [High,Triaged]
sinzui, https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1430F2574
https://lp-oops.canonical.com/oops.py/?oopsid=1430F2574
sinzui, is this registry or foundations?
foundations
matsubara: high/critical
matsubara: This may be caused by the replication delay
all right. I'll file a bug for it and ask gary_poster to take a look
[action] matsubara to file a high/critical bug for OOPS-1430F2574
ACTION received: matsubara to file a high/critical bug for OOPS-1430F2574
https://lp-oops.canonical.com/oops.py/?oopsid=1430F2574
https://lp-oops.canonical.com/oops.py/?oopsid=1430F2574
rockstar, https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1427EA45 I've seen 4 occurrences of this oops last week, is this a known issue? some bad data? worth a bug?
https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536 shouldn't this one be a NotFound rather than a NotFoundError?
https://lp-oops.canonical.com/oops.py/?oopsid=1427EA45
https://lp-oops.canonical.com/oops.py/?oopsid=1426EC1536
I guess I'll have to email tim about those as well
OOPS-1430F2574 : I agree that this is probably replication
https://lp-oops.canonical.com/oops.py/?oopsid=1430F2574
[action] matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1427EA45 I've seen 4 occurrences of this oops last week, is this a known issue? some bad data? worth a bug?
https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536 shouldn't this one be a NotFound rather than a NotFoundError?
ACTION received: matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1427EA45 I've seen 4 occurrences of this oops last week, is this a known issue? some bad data? worth a bug?
https://lp-oops.canonical.com/oops.py/?oopsid=1427EA45
https://lp-oops.canonical.com/oops.py/?oopsid=1426EC1536
https://lp-oops.canonical.com/oops.py/?oopsid=1427EA45
damn
[action] matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536 shouldn't this one be a NotFound rather than a NotFoundError?
ACTION received: matsubara to email tim about https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1426EC1536 shouldn't this one be a NotFound rather than a NotFoundError?
https://lp-oops.canonical.com/oops.py/?oopsid=1426EC1536
https://lp-oops.canonical.com/oops.py/?oopsid=1426EC1536
we have 5 critical bugs, 4 of them fix committed and 1 in progress
so all good in that area
no script failures since last week (well, only PRF but that's fine per sinzui)
[TOPIC] * Operations report (mthaddon/Chex/spm/mbarnett)
New Topic: * Operations report (mthaddon/Chex/spm/mbarnett)
let's move to the next topic as there's no losa around
[TOPIC] * DBA report (stub)
New Topic: * DBA report (stub)
We have had two incidents where appserver requests have sent the load on the main database server over 100 in some sort of a feedback loop we dubbed the DB Death Spiral.
We think we tracked down the trigger - the page the load balancers used to detect if Launchpad is up accessed the session database, and our session machinery becomes a bottleneck under load. What we hope is the immediate fix lands tomorrow - stopping that page from accessing the database. I have plans of offloading the bulk of the session machinery work to memcache so it should stop becoming a bottleneck under load, but that is work for the next cycle or two.
We also managed to have replication issues, because when it rains it pours. Both times were to do with adding a new replica into the cluster. The first time, it turned out some events were left around that should have been cleared up, causing conflicts. So when one of our replicas tried to confirm it had seen an event, it found the confirmation was already there so it aborted.
The second one, today, removing the replica from the cluster hadn't quite succeeded so replication lag on the cluster was increasing. This wasn't noticed or was ignored, and we attempted to re-add the database back into a heavily lagged cluster. This needed recovering. I don't think users were affected today.
* danilo_ has quit (Read error: 110 (Connection timed out))
And that is all I've typed so far ;)
stub, should I expect to see lots of OOPSes in the reports about this replication lag issue?
I've got a bug open to add some more safety belts to our helpers to catch these cases.
matsubara: Hopefully not. I'm not sure though.
stub, all right. I'll let you know if I spot anything
thanks stub
[TOPIC] * Proposed items
New Topic: * Proposed items
# Move the production meeting one hour later to avoid clash with other meetings (sinzui)
please
I am in another meeting right now
I'm +1 on the change. it'd be actually better for me to have the meeting at 16UTC
how about the others?
I'm assuming that bigjools is also +1 for the same reason.
and danilo too
+1
+1
That is getting nuts for me, but I can do the report by email just as easily as typing it up here.
on the other hand, what do you think about not having this meeting at all anymore? do you think it's useful or the format could be changed? I see lots of people missing this meeting or not paying much attention...
stub, your section and the losas section are the ones that interest me the most :-)
stub, reports by email are fine by me, not sure about others
I tend to think email would be a better forum rather than playing Chinese whispers.
yeah
[action] matsubara to talk to TL about not having the LP production meeting anymore or change its format
ACTION received: matsubara to talk to TL about not having the LP production meeting anymore or change its format
and for the next one, let's try to have it at 16UTC. I'll email the QA contacts to let everyone know.
[action] matsubara to email Qa contacts about next LP prod. meeting at 16UTC
ACTION received: matsubara to email Qa contacts about next LP prod. meeting at 16UTC
At that hour, it will be drunk from a gogo bar :)
[action] matsubara to email losas about their weekly report
ACTION received: matsubara to email losas about their weekly report
hehe
and I think that's all for today
Thank you all for attending this week's Launchpad Production Meeting. See https://dev.launchpad.net/MeetingAgenda for the logs.
Thanks matsubara
#endmeeting
Meeting finished at 09:34.