DevelopmentMeeting20090409

   1 <matsubara> #startmeeting
   2 <MootBot> Meeting started at 10:00. The chair is matsubara.
   3 <MootBot> Commands Available: [TOPIC], [IDEA], [ACTION], [AGREED], [LINK], [VOTE]
   4 <matsubara> Welcome to this week's Launchpad Production Meeting. For the next 45 minutes or so, we'll be coordinating the resolution of specific Launchpad bugs and issues. 
   5 <matsubara> [TOPIC] Roll Call 
   6 <MootBot> New Topic:  Roll Call
   7 <sinzui> me
   8 <matsubara> Not on the Launchpad Dev team? Welcome! Come "me" with the rest of us! 
   9 <Ursinha> me
  10 <herb> me
  11 <intellectronica> me
  12 <matsubara> rockstar, hi
  13 <matsubara> al-maisan, hi
  14 * stub (n=stub@canonical/launchpad/stub) has joined #launchpad-meeting
  15 <matsubara> flacoste, hi
  16 <danilos> me (so-so)
  17 <matsubara> ok, foundations, code and soyuz missing. they can join in later
  18 <matsubara> [TOPIC] Agenda 
  19 <MootBot> New Topic:  Agenda
  20 <matsubara>  * Actions from last meeting
  21 <matsubara>  * Oops report & Critical Bugs 
  22 <matsubara>  * Operations report (mthaddon/herb/spm)
  23 <matsubara>  * DBA report (stub)
  24 <matsubara> [TOPIC] * Actions from last meeting
  25 <MootBot> New Topic:  * Actions from last meeting
  26 <rockstar> me
  27 <cprov> me
  28 <matsubara>     * matsubara to file a bug about the missing select permissions that delayed the rollout
  29 <matsubara>         * https://bugs.edge.launchpad.net/launchpad-foundations/+bug/353926
  30 <matsubara>     * cprov to look up soyuz bugs 353568
  31 <matsubara>     * matsubara to include francis suggestion to bug 353530 and ursinha to summarize what spm told her
  32 <matsubara>         * matsubara commented on the bug.
  33 <matsubara>     * salgado to debug and fix bug 353863
  34 <matsubara>     * sinzui to email the list how we should address critical bugs on unmaintained apps (e.g. blueprint)
  35 <matsubara>     * matsubara to talk to mrevell to announce a maintenance in the DB for about 10 min outage in the next 2 weeks. ask mrevell to talk to stub about it
  36 <matsubara>         * matsubara emailed mrevell about this.
  37 <ubottu> Launchpad bug 353568 in soyuz "ubuntu/source/package/+index timing out" [High,Fix released] https://launchpad.net/bugs/353568
  38 <ubottu> Launchpad bug 353530 in malone "OOPS filing a bug using the email interface " [Undecided,Fix released] https://launchpad.net/bugs/353530
  39 <ubottu> Error: This bug is private
  40 <ubottu> Launchpad bug 353863 in launchpad-registry "TypeError when finishing creating user account in lpnet" [Critical,Fix released] https://launchpad.net/bugs/353863
  41 <sinzui> matsubara: not done
  42 <Ursinha> the info I had was useless to the bug report
  43 <Ursinha> very superficial and not helpful, so I didn't add
  44 <flacoste> me
  45 <matsubara> cprov and salgado bugs are fix released. so that's done
  46 <matsubara> sinzui, do you want me to add another action for your item?
  47 <matsubara> for next week
  48 <sinzui> matsubara: please do
  49 <matsubara> [action] sinzui to email the list how we should address critical bugs on unmaintained apps (e.g. blueprint)
  50 <MootBot> ACTION received:  sinzui to email the list how we should address critical bugs on unmaintained apps (e.g. blueprint)
  51 <matsubara> [TOPIC] * Oops report & Critical Bugs 
  52 <MootBot> New Topic:  * Oops report & Critical Bugs
  53 <matsubara> go ahead Ursinha 
  54 <Ursinha> all right!
  55 <Ursinha> one puzzle for losas/stub, three bugs for foundations, three bugs for registry
  56 <Ursinha> flacoste, bug 354593, bug 353926, openid resetting password, bug 358498
  57 <Ursinha> sinzui: bug 357307, bug 358486, bug 358492
  58 <Ursinha> herb/stub: we're having *lots* of oopses like
  59 <Ursinha> https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1194D1005. I've sent one email to lp list and spoken with jtv about that, something is killing the db connections, so when a request tries to re-use a connection that died it oopses like that. So herb/stub: do you know what can be possibly happening to the db?
  60 <ubottu> Launchpad bug 354593 in launchpad-foundations "SSO exceptions views need proper branding" [High,Triaged] https://launchpad.net/bugs/354593
  61 <ubottu> Bug 353926 on http://launchpad.net/bugs/353926 is private
  62 <ubottu> Launchpad bug 358498 in launchpad-foundations "AssertionError OOPS on openid when resetting password " [Undecided,New] https://launchpad.net/bugs/358498
  63 <ubottu> Launchpad bug 357307 in launchpad-foundations "TypeError when creating new account in lpnet" [Undecided,New] https://launchpad.net/bugs/357307
  64 <ubottu> Launchpad bug 358486 in launchpad-registry "AttributeError when user is confirming new account and LP is checking if it was a suspended one" [Undecided,New] https://launchpad.net/bugs/358486
  65 <ubottu> Launchpad bug 358492 in launchpad-foundations "ProgrammingError OOPS resetting password" [Undecided,New] https://launchpad.net/bugs/358492
  66 <ubottu> https://devpad.canonical.com/~jamesh/oops.cgi/1194D1005
  67 <sinzui> Ursinha: Account and LoginToken are Foundations issues
  68 <herb> Ursinha: looking at the oops now...
  69 <Ursinha> sinzui, looking..
  70 <Ursinha> sinzui, thanks for changing that
  71 <stub> Ursinha: The OOPS is showing that the database reconnection isn't working as it should. Why that connection died isn't on that OOPS - it would have happened on the previous request handled by that thread.
  72 <Ursinha> so that's flacoste's too
  73 * sinzui did it a few minutes ago
  74 <flacoste> sinzui, for AuthToken: +resetpassword, do you think salgado could look into this one?
  75 <sinzui> flacoste: Not this week, he is gone. I can take the +resetpassword in a few hours
  76 <flacoste> Ursinha: for the branding bug, how often does it happen?
  77 <flacoste> Ursinha: reworking these templates is going to happen, but it's not easy to fix now
  78 <stub> Ursinha: We have watchdogs that kill bad connections. I don't recall seeing any reaped connections from the appservers recently.
  79 <Ursinha> flacoste, about 6,7 a day
  80 <Ursinha> stub, we had 2 thousand oopses like this one I showed
  81 <flacoste> Ursinha: actually, i think these are related to the DisallowedStore error i'm seeing
  82 <flacoste> Ursinha: because these links don't appear on normal SSO pages
  83 <flacoste> stub: what's the permission bug in 358492?
  84 <flacoste> nm, i know what this is about
  85 <Ursinha> stub, jtv said he saw a lot of "administrator terminated connection" errors on lp-errors-report list
  86 <stub> When?
  87 <Ursinha> yesterday
  88 <Ursinha> I couldn't find them
  89 <stub> Anyway - the reason the OOPS count is so high is the appserver isn't recovering like it should.
  90 <flacoste> stub: could you look at this bug tomorrow?
  91 <stub> I can look at the oops - I don't know if there is a bug yet (I think there is - not sure though)
  92 <Ursinha> for the db killing spree no, there isn't
  93 <stub> What db killing spree?
  94 <Ursinha> at least I didn't open one, I'll do now
  95 <Ursinha> ah
  96 <Ursinha> I mean, the oopses
  97 <Ursinha> the lots of oopses because the appserver isn't recovering like it should
  98 <stub> ok
  99 <Ursinha> anyway, it's that bug you're talking about flacoste?
 100 <flacoste> yes
 101 <Ursinha> I'll open one and let you know
 102 <Ursinha> that's all for me
 103 <Ursinha> we have one critical bug, in progress
 104 <Ursinha> so, if matsubara has nothing else to say, oops section is closed
 105 <matsubara> [action] ursinha to file a bug about "appserver isn't recovering like it should causing too many oopses"
 106 <MootBot> ACTION received:  ursinha to file a bug about "appserver isn't recovering like it should causing too many oopses"
 107 <Ursinha> thanks sinzui, flacoste, stub and herb
 108 <Ursinha> and matsubara, of course
 109 <matsubara> intellectronica, can you move bug 269538 from fix committed to fix released?
 110 <ubottu> Launchpad bug 269538 in bugzilla-launchpad/bugzilla-3.2 "Compilation error in plugin when authenticating" [Critical,Fix committed] https://launchpad.net/bugs/269538
 111 <matsubara> or at least chase why it's not fix released yet?
 112 <matsubara> that bug has been in fix committed for ages
 113 <intellectronica> matsubara: i have no idea what's going on with that. i'll talk to gmb about it
 114 <matsubara> thanks intellectronica 
 115 <matsubara> thanks everyone
 116 <matsubara> let's move on
 117 <matsubara> [action] intellectronica to talk to gmb about bug 269538
 118 <MootBot> ACTION received:  intellectronica to talk to gmb about bug 269538
 119 <ubottu> Launchpad bug 269538 in bugzilla-launchpad/bugzilla-3.2 "Compilation error in plugin when authenticating" [Critical,Fix committed] https://launchpad.net/bugs/269538
 120 <matsubara> [TOPIC] * Operations report (mthaddon/herb/spm)
 121 <MootBot> New Topic:  * Operations report (mthaddon/herb/spm)
 122 <herb> 2009-04-04 - Launchpad experienced an outage most likely due to hitting some connection limits on the DB. Some users may have experienced issues for up to 90 minutes.
 123 <herb> 2009-04-08 - Deployed r7947 to soyuz and xmlrpc servers.
 124 <herb> Bug 156453 and bug 118625 continue to be problematic for us. Just want to make sure I'm keeping it on your radar.
 125 <ubottu> Launchpad bug 156453 in loggerhead "production loggerhead branch leaks memory" [Critical,In progress] https://launchpad.net/bugs/156453
 126 <ubottu> Launchpad bug 118625 in launchpad-bazaar "codebrowse sometimes hangs" [High,Triaged] https://launchpad.net/bugs/118625
 127 <herb> That's all for this week, unless there are questions.
 128 <rockstar> herb, I have something to report here again!
 129 <herb> woohoo!
 130 <rockstar> herb, so we've identified the real memory pig.  Unfortunately, it won't be trivial to change.
 131 <matsubara> cool!
 132 <rockstar> herb, so we know where the issue is, and now we just need to schedule about two weeks and re-write loggerhead.
 133 <herb> haha
 134 <matsubara> hehe
 135 <flacoste> the problem is that he is serious :-/
 136 <herb> mine was a laugh of despair
 137 <matsubara> rockstar, so are you tackling that for 2.2.4 and maybe 2.2.5?
 138 <sinzui> flacoste: I believe the problem with +resetpassword is that it sends logintokens to users who have not setup a person yet.
 139 <rockstar> matsubara, well, I doubt it'll be 2.2.4, because mwhudson is on leave for so much of it.
 140 <rockstar> matsubara, what really needs to happen is that we need to be sequestered again for a week to do nothing but fix it.
 141 <flacoste> sinzui: sounds about right, that shouldn't happen :-)
 142 <sinzui> flacoste: I'll get this fixed today
 143 <matsubara> by we, you mean you and mwhudson or the whole code team?
 144 <flacoste> sinzui: thanks a lot
 145 <rockstar> matsubara, mwhudson and I.
 146 <rockstar> matsubara, we got some really good work done at the Pycon sprints last week.
 147 <matsubara> there's all hands and uds coming, maybe during that?
 148 <matsubara> anyway, that's beyond the scope of this meeting.
 149 <matsubara> I think that's all. anything else for herb?
 150 <matsubara> [TOPIC] * DBA report (stub)
 151 <MootBot> New Topic:  * DBA report (stub)
 152 <matsubara> thanks herb 
 153 <stub> Can you describe the memory leak?
 154 <herb> thanks matsubara
 155 <stub> During the last rollout, one of the database patches turned out to be relying on database row ordering for some data migration, with the end result being some newly created rows on the slaves had different primary key values to the master and each other.
 156 <stub> This caused replication to block later when changes to the data on the master could not be duplicated on the slaves due to constraint violations, alerting us to the problem. We rebuilt the slave databases to correct the problem (the safest way of recovering the situation).
 157 <stub> The corruption was not noticeable to end users and did not infect the master, as only the internal database ids were affected.
 158 <stub> I was hoping to switch our master to the 16 core box, but public holidays and illness have put a hold on that this week.
 159 <stub> On the 6th and 7th, some batch jobs erroneously had their database connections terminated. Sorry about that. It is unlikely this was end user visible.
 160 <stub> echo... echo...
 161 <sinzui> oi oi
 162 <matsubara> stub, you're coordinating the downtime announcement with mrevell, right?
 163 <stub> I will
 164 <matsubara> stub, ok. thanks.
 165 <matsubara> anything else for stub?
 166 <matsubara> thanks stub.
 167 <matsubara> I think that's all for today.
 168 <matsubara> Thank you all for attending this week's Launchpad Production Meeting.
 169 <matsubara> #endmeeting
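
In the DBA report above, stub describes a database patch that relied on row ordering during a data migration, so newly created rows ended up with different primary key values on each replica. The log does not show the patch itself, so the following is only a minimal sketch of that general failure mode. The function `migrate`, the row values, and the "heap order" lists are all hypothetical and stand in for a real database scan.

```python
from itertools import count


def migrate(rows, order_by=None):
    """Hypothetical migration: give each scanned row a new serial id.

    Without order_by, rows are processed in whatever order the scan
    returns them, which a database does not guarantee to be the same
    on every replica.
    """
    scanned = sorted(rows, key=order_by) if order_by else list(rows)
    serial = count(1)  # stands in for a sequence / serial column
    return {row: next(serial) for row in scanned}


# Same logical rows, but stored in a different physical order on each node
# (assumed here for illustration).
master_heap = ["alice", "bob", "carol"]
slave_heap = ["carol", "alice", "bob"]

# Migration that depends on scan order: the new ids diverge between nodes.
unordered_master = migrate(master_heap)
unordered_slave = migrate(slave_heap)

# Migration with an explicit, stable ordering: the new ids agree.
ordered_master = migrate(master_heap, order_by=str)
ordered_slave = migrate(slave_heap, order_by=str)

print("unordered ids match:", unordered_master == unordered_slave)
print("ordered ids match:  ", ordered_master == ordered_slave)
```

In this toy model the ids agree on both nodes only when the migration sorts on a stable key before assigning them, which is the property an unordered scan does not provide.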

DevelopmentMeeting20090409 (last edited 2009-04-09 15:48:29 by matsubara)