DevelopmentMeeting20091022

   1 <matsubara> #startmeeting
   2 <MootBot> Meeting started at 10:00. The chair is matsubara.
   3 <MootBot> Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
   4 <gary-sprint> me
   5 <gary-sprint> :-D
   6 <matsubara> Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues. 
   7 <matsubara> [TOPIC] Roll Call 
   8 <MootBot> New Topic:  Roll Call
   9 <sinzui> me
  10 <gary-sprint> uh, me
  11 <Ursinha> me
  12 <matsubara> rockstar, Chex, bigjools, allenap: hi
  13 <allenap> me
  14 <matsubara> apologies from stub
  15 <rockstar> me
  16 <mthaddon> me
  17 <bigjools> me
  18 <matsubara> [TOPIC] Agenda 
  19 <MootBot> New Topic:  Agenda
  20 <mthaddon> matsubara: I'm sitting in for Chex this meeting as he's working on U1 stuff
  21 <matsubara>  * Actions from last meeting
  22 <matsubara>  * Oops report & Critical Bugs & Broken scripts
  23 <matsubara>  * Operations report (mthaddon/Chex/spm/mbarnett)
  24 <matsubara>  * DBA report (stub)
  25 <matsubara>  * Proposed items
  26 <matsubara> thanks mthaddon 
  27 <matsubara> [TOPIC] * Actions from last meeting
  28 <MootBot> New Topic:  * Actions from last meeting
  29 <matsubara> * matsubara to file a bug on oops-tools to recognize new oops prefixes and sort out conflicting prefixes with losas
  30 <matsubara> * Chex to check app server logs and apache logs to see if it can shed any light in the high load issue.
  31 <matsubara> * adeuring to check with gmb about checkwatches failure
  32 <matsubara> * danilos to check bug 438039, assess if it's really critical. if it is, land a fix, if it's not, update the importance
  33 <matsubara> * bigjools to investigate update-cache failure and reply back to the list
  34 <ubottu> Launchpad bug 438039 in rosetta "bzr branch import script oopses sometimes" [Critical,Fix released] https://launchpad.net/bugs/438039
  35 <danilos> matsubara: the bug tells you what it was :)
  36 <danilos> oh, I forgot to 'me' myself
  37 <matsubara> I'll finish up my action today 
  38 <matsubara> thanks danilos 
  39 <bigjools> I chatted to curtis and as far as we can tell it was caused by something else holding a transaction/table open
  40 <bigjools> not much I can do
  41 <matsubara> gmb replied to checkwatches failure email. it was a hung process which was killed and service resumed 
  42 <sinzui> Since the PRF ran the following days, I believe it was a long running process that worried our watching proc
  43 <matsubara> bigjools, thanks for checking. I don't see new emails from that script failing so I take it it's working normally
  44 <bigjools> yep
  45 <matsubara> mthaddon, any luck investigating the high loading issue?
  46 <matsubara> s/loading/load/
  47 <mthaddon> matsubara: I wasn't aware that was something we were following up on - not sure what the latest is, but I guess part of it plays into the new SplitIt stuff
  48 <mthaddon> i.e. we've just brought a whole bunch of new servers online so we need to see what effect this has on the overall load of the system
  49 <matsubara> all right. I'll take that item off the list and if high load shows up in the graphs we can pursue further
  50 <mthaddon> k
  51 <matsubara> thanks all, moving on
  52 <matsubara> [TOPIC] * Oops report & Critical Bugs & Broken scripts
  53 <MootBot> New Topic:  * Oops report & Critical Bugs & Broken scripts
  54 <Ursinha> it's me
  55 <Ursinha> gary_poster, bug 331990, can we CP it?
  56 * sinzui stares at Ursinha
  57 <ubottu> Launchpad bug 331990 in launchpad-foundations "The inline editor widget reports a JSON error when saving non-ASCII characters" [High,Fix committed] https://launchpad.net/bugs/331990
  58 <Ursinha> s/gary_poster/gary-sprint/
  59 <gary-sprint> Ursinha: I do not have CP-foo.
  60 <Ursinha> allenap, can we have a fix for bug 438802 and maybe CP it?
  61 <matsubara> gary-sprint, is this a matter of updating the lazr.restful lib used by lpnet?
  62 <Ursinha> allenap, also, we have bug 438985, it's in progress but without activity for some time
  63 <ubottu> Launchpad bug 438802 in malone "UnicodeDecodeError changing 'Assigned to' field when summary contains non-ascii" [High,Triaged] https://launchpad.net/bugs/438802
  64 <Ursinha> allenap, and bug 458180, that's BugTask index timeouts
  65 <ubottu> Launchpad bug 438985 in malone "Trying to make myself as bug supervisor of my project oopses" [High,In progress] https://launchpad.net/bugs/438985
  66 <ubottu> Launchpad bug 458180 in malone "BugTask:+index timing out" [High,Triaged] https://launchpad.net/bugs/458180
  67 <Ursinha> sinzui, I've filed bug 458169 and bug 458189, the timeouts on Milestone and DistroSeries index pages
  68 <Ursinha> rockstar, can we have a fix for bug 442981?
  69 <ubottu> Launchpad bug 458169 in launchpad-registry "Distroseries:+index page timing out" [High,Triaged] https://launchpad.net/bugs/458169
  70 <ubottu> Launchpad bug 458189 in launchpad-registry "Milestone:+index pages timing out" [Undecided,New] https://launchpad.net/bugs/458189
  71 <ubottu> Launchpad bug 442981 in launchpad-code "launchpad-project/+activereviews is OOPSing with TypeError (dup-of: 457541)" [High,Triaged] https://launchpad.net/bugs/442981
  72 <ubottu> Launchpad bug 457541 in launchpad-code "Active code reviews for Loggerhead OOPSes on edge" [High,Fix released] https://launchpad.net/bugs/457541
  73 <gary-sprint> Ursinha: maybe I misunderstood.  are you asking for CP-blessing or for CP-shepherding?  If the latter, sure, we can shepherd.
  74 <Ursinha> gary-sprint, shepherding
  75 <sinzui> Ursinha: I replied that I believe they are dups of 455812
  76 <sinzui> brad is already working on it
  77 <Ursinha> bug 455812
  78 <ubottu> Launchpad bug 455812 in launchpad-registry "distroseries milestone timeout" [High,Triaged] https://launchpad.net/bugs/455812
  79 <Ursinha> hmmm
  80 <gary-sprint> matsubara: not sure, will ask leonardr.
  81 <Ursinha> sinzui, I'll mark it as a dupe then, thanks
  82 <sinzui> not yet
  83 <rockstar> Ursinha, the fix for that bug is closing it as a duplicate of bug 457541
  84 <ubottu> Launchpad bug 457541 in launchpad-code "Active code reviews for Loggerhead OOPSes on edge" [High,Fix released] https://launchpad.net/bugs/457541
  85 <Ursinha> oh
  86 <matsubara> [action] gary to talk to leonardr about cherry picking lazr.restful updates on lpnet for bug 331990
  87 <rockstar> Ursinha, also, that bug is Fix Released.
  88 <MootBot> ACTION received:  gary to talk to leonardr about cherry picking lazr.restful updates on lpnet for bug 331990
  89 <ubottu> Launchpad bug 331990 in launchpad-foundations "The inline editor widget reports a JSON error when saving non-ASCII characters" [High,Fix committed] https://launchpad.net/bugs/331990
  90 <gary-sprint> +1
  91 <Ursinha> rockstar, it still happens, how come?
  92 <rockstar> Ursinha, so yes, you may have it before it was asked.
  93 <sinzui> I have assigned the distroseries +index to edwin. I think EdwinGrubbs and bac will find this is the same problem
  94 <sinzui> The oopses of the two new bugs look like the oopses I have been tracking in the older bug
  95 <rockstar> Ursinha, doesn't oops for me.
  96 <Ursinha> rockstar, so the summaries are lying :)
  97 <rockstar> Ursinha, does this url oops for you? https://edge.launchpad.net/launchpad-project/+activereviews
  98 <Ursinha> rockstar, well, it's loading... I'll keep my eye on it and if needed reopen it, right?
  99 <Ursinha> rockstar, thanks
 100 <Ursinha> allenap, hi :)
 101 <allenap> Ursinha: I talk to deryck about getting bug 438802 fixed, and gmb about bug 438985.
 102 <ubottu> Launchpad bug 438802 in malone "UnicodeDecodeError changing 'Assigned to' field when summary contains non-ascii" [High,Triaged] https://launchpad.net/bugs/438802
 103 <ubottu> Launchpad bug 438985 in malone "Trying to make myself as bug supervisor of my project oopses" [High,In progress] https://launchpad.net/bugs/438985
 104 <Ursinha> allenap, thanks
 105 <allenap> Ursinha: Bug 458180 is a perennial problem.
 106 <ubottu> Launchpad bug 458180 in malone "BugTask:+index timing out" [High,Triaged] https://launchpad.net/bugs/458180
 107 <Ursinha> allenap, I see the main offender is bug #1
 108 <ubottu> https://bugs.launchpad.net/ubuntu/+bug/1 (Timeout)
 109 <Ursinha> yes, *sigh*
 110 <allenap> Ursinha: Yeah, it always is :)
 111 <rockstar> Ursinha, yes, but I can't see how it'd get reopened.  It was bad data, we fixededed the database records.
 112 <danilos> Ursinha: and you just made it worse with a reference now :)
 113 <Ursinha> danilos, yes, just to prove my point :P
 114 <matsubara> allenap, it still happens in other bugs too as per jono's email to launchpad-dev about ubottu timing out
 115 <Ursinha> allenap, there are some oopses not caused by #1
 116 <Ursinha> thanks allenap
 117 <allenap> matsubara: Okay, as someone said, perhaps it's the +text interface.
 118 <Ursinha> gary-sprint, the "buildbot failure in Launchpad on jscheck", is it severe?
 119 <matsubara> allenap, I briefly trawled the summaries and there are some soft timeouts on +text, but soft timeouts shouldn't be affecting ubottu
 120 <allenap> matsubara, Ursinha: We need to do something more drastic to get the bug page quicker I think. Caching, etc, and that's coming alone. We've done a lot of the other things we can think of, but I'll discuss it with the team.
 121 <allenap> matsubara: That's interesting.
 122 <allenap> s/alone/along/
 123 <Ursinha> I see some emails from francis and rockstar talking about it, is there something that can or needs to be done?
 124 <matsubara> allenap, perhaps those timeouts are not being logged as OOPSes? similar to 500 we see eventually from apache
 125 <Ursinha> gary-sprint, ^
 126 <matsubara> to the 500 errors I mean
 127 <rockstar> Ursinha, gary-sprint, it is my belief that windmill sucks.
 128 <gary-sprint> Ursinha: it does not appear to be a problem in the basic buildbot setup at the moment.    There are failures in the tests.  This doesn't seem to be a foundations issue AFAICT.  Björn may very well be able to help when he returns
 129 <allenap> matsubara: Okay, I'm not sure what you mean, but we can talk about it after the meeting.
 130 <Ursinha> right gary-sprint, thanks for the info
 131 <matsubara> allenap, sure thing. I'll find the bug I'm referring to
 132 <allenap> matsubara: Thanks.
 133 <matsubara> [action] allenap and matsubara to talk about the timeouts on bug pages
 134 <MootBot> ACTION received:  allenap and matsubara to talk about the timeouts on bug pages
 135 <Ursinha> right, I'm done here
 136 <gary-sprint> rockstar: that's probably a given.  The more interesting question is whether it sucks worse than the alternatives.  My impression is no, but a champion could fight for an alternate view,
 137 <gary-sprint> .
 138 <Ursinha> allenap, is it possible to ask for a cp for bug 438802 when it's fixed?
 139 <ubottu> Launchpad bug 438802 in malone "UnicodeDecodeError changing 'Assigned to' field when summary contains non-ascii" [High,Triaged] https://launchpad.net/bugs/438802
 140 <rockstar> gary-sprint, sadly, there is no better alternative. Windmill sucks less than anything else out there.
 141 * salgado is now known as salgado-lunch
 142 <gary-sprint> :-)
 143 <allenap> Ursinha: Sure.
 144 <Ursinha> anmar was having problems yesterday with bugs with chinese chars, I think it's worth doing a CP
 145 <Ursinha> thanks allenap
 146 <allenap> Ursinha: np, thank you :)
 147 <Ursinha> :)
 148 <matsubara> ok, two fix committed critical bugs 
 149 <matsubara> rockstar, we had some failures on the update_preview_diffs script
 150 <matsubara> on the 19th
 151 <rockstar> matsubara, yeah, we're currently in the process of fixing the various oopses that script creates.
 152 * thumper has quit (Remote closed the connection)
 153 <matsubara> rockstar, ok. can you give me the bug numbers after the meeting?
 154 <rockstar> matsubara, there are many.
 155 <matsubara> gladly we have an oops tag to filter those :-)
 156 <matsubara> rockstar, I'll ping you after the meeting
 157 <Ursinha> thanks everyone
 158 <matsubara> I think that's all for this section. thanks everyone
 159 <matsubara> [TOPIC] * Operations report (mthaddon/Chex/spm/mbarnett)
 160 <MootBot> New Topic:  * Operations report (mthaddon/Chex/spm/mbarnett)
 161 <mthaddon> SplitIt is the big one this week - now complete with exception of Auth DB split.
 162 <mthaddon> New App servers brought online after haproxy throttling of connections, we're watching how things are progressing
 163 <mthaddon> A number of CPs done this week
 164 <mthaddon> Is everyone clear on the new CP process?
 165 <mthaddon> Shipit now managed by ISD, and CPs to be approved by nigelp
 166 <mthaddon> Some app servers dying, loggerhead dying, poppy died once - is there a process for reviewing the Incident Log?
 167 <mthaddon> That's about it
 168 <matsubara> mthaddon, last I heard Francis was the one to champion the Incident Log process.
 169 <mthaddon> matsubara: basically we want to be sure someone's reviewing it to look for operational trends in production
 170 <bigjools> did he mention making trivial wiki edits for codebounce so we don't get email for those?
 171 <matsubara> ideally we won't need that codebounce all the time :-)
 172 <Ursinha> matsubara, +1 :)
 173 <mthaddon> bigjools: if we have to get alerts and go through the whole restart, edit wiki nightmare, you can put up with a few wiki edit notifications :)
 174 * matsubara looks at rockstar 
 175 <bigjools> mthaddon: well, no I don't :)
 176 <danilos> mthaddon: the concern is that we may learn to ignore it unless we can filter stuff out *we* can't do anything about
 177 <rockstar> mthaddon, I'm subscribed and get the pleasure of seeing every time you restart loggerhead.
 178 <matsubara> any news about the codebrowser dying all the time?
 179 <bigjools> what danilos said
 180 <danilos> mthaddon: specifically, translations or soyuz team can't help much with codebrowse restarts
 181 <rockstar> matsubara, we are bringing someone on to look into the codebrowse issue.  That's all I know.  We certainly don't have the bandwidth currently to do it.
 182 <mthaddon> fwiw, I usually do trivial that one - I guess maybe the other losas don't - will mention it
 183 <danilos> bigjools: just as an example, and these are very, very common
 184 <sinzui> I believe there is a plan to put people to work on loggerhead
 185 <danilos> mthaddon: in general, anything else shouldn't be a trivial edit, and codebrowse should, that would help old men like bigjools deal with their email :)
 186 <bigjools> ha
 187 <danilos> sinzui: yeah, as flacoste mentioned today, I think we are having a contract that starts today or tomorrow
 188 <matsubara> that's great news
 189 <danilos> mthaddon: but, do note that most team leads are subscribed to LPIncidentLog, and if one isn't, feel free to poke them about it
 190 <mthaddon> danilos: k, thx
 191 <matsubara> that's all mthaddon ?
 192 <mthaddon> yep
 193 <matsubara> all right. thanks everyone
 194 <matsubara> [TOPIC] * DBA report (stub)
 195 <MootBot> New Topic:  * DBA report (stub)
 196 <matsubara> stub is on vacation and looks like the db is fine
 197 <matsubara> AFAICT
 198 <matsubara> so let's move on.
 199 <matsubara> [action] matsubara to talk to stub about the DBA report when he gets back
 200 <MootBot> ACTION received:  matsubara to talk to stub about the DBA report when he gets back
 201 <matsubara> [TOPIC] * Proposed items
 202 <MootBot> New Topic:  * Proposed items
 203 <matsubara> no new proposed items
 204 <matsubara> and I think that's all for today
 205 <matsubara> Thank you all for attending this week's Launchpad Production Meeting. See https://dev.launchpad.net/MeetingAgenda for the logs. 
 206 <matsubara> #endmeeting 

DevelopmentMeeting20091022 (last edited 2009-10-22 16:01:30 by matsubara)