= QA Shepherd project developer notes =

Design notes for the [[https://launchpad.net/qa-shepherd|Shepherd project]]. Algorithm details can be found on the [[QAProcessContinuousRollouts]] page.

== User stories ==

 * As a LOSA, I want to read an HTML report that tells me what the latest fully QA'd revision of `stable` is, so that I can deploy that revision to production.
 * As an operator, I want to run a command-line script that tells me which landings the shepherd knows about, and what the shepherd thinks the most recent deployable revision is, so that I can deploy that revision manually.

----

 * As a user, I want to read an HTML report that tells me whether my branch has been promoted from QA to production, so I do not have to rely on yet more email to tell me what is going on. (Ursinha, gary)
   * ''We probably want to publish this on https://qa.launchpad.net''
   * ''As a quicker version we can rsync the report to some static location.''

----

 * As an operator and developer, I want to read in the log files the detailed reasons that the shepherd took a particular action, so that debugging the tool is easier. (mars)
   * ''The log should record both the current state of the change sources and the state transitions that the shepherd executes based on that state.''
 * As an operator, I want to run the parts of the rollout process individually and on demand, so that resolving problems is easier. (mars)
   * ''This implies a small sharp script for each part: finding branches, doing promotion, reporting the system state.''
 * As an operator, I want to run any script without worrying whether another copy of the script is already running and stomping on the data, so that on-demand runs and maintenance are safer and easier.
   * ''This implies a single-instance lock, like a PID file.''
 * As an operator, I want a loud warning if a PID file is more than X minutes old, so I have some advance warning that a hang or crash is blocking further updates. (mars)
 * As a user, I want to be warned once a day by email that my branch is ready for QA, so that I remember to do it. (mars, Ursinha)
 * As a user, I want a message sent to a mailing list when a branch has been sitting in QA for more than X days, so that someone can roll back the branch and unblock updates. (mars, Ursinha)
   * ''This implies a QA policy with a grace period.''
 * As an operator, I want to toggle a switch that keeps the scripts from running, so that I can do maintenance and updates without hunting down and disabling a bunch of cron scripts. (mars)
   * ''Probably dropping a maintenance.txt file on disk would do this.''
 * As a user, I want the log file and HTML reports to tell me when updates were aborted because a maintenance.txt is in place, so that I know why updates aren't happening. (mars)
 * As an operator, I want the log file and HTML reports to tell me how long ago the maintenance.txt file was created, so I know if someone forgot to remove the file. (mars)
 * As an operator, I would like the script to be deployed as the lpqateam user, so that I can find all of the parts of the QA process in the same place. (Ursinha, matsubara, gary)
 * As an operator, I want a single push-button command to set maintenance mode, update the code, then remove maintenance mode again, so that I am less likely to make mistakes when deploying a new version. (mars)
   * ''Need both maintenance.txt and some way to wait for executing scripts to finish their work before proceeding.''

=== Other notes ===

 * The QA tagger can pass the revno, branch name, and linked bugs through to the shepherd using its database. This saves the shepherd from doing the work.
 * As a developer, I want the qa-tagger to run in the same interpreter as the shepherd, so that I do not have to write tricky inter-process communication code.
 * If the tagger passes only the in-QA revisions through to the shepherd, then the shepherd does not have to store any persistent state. The shepherd can simply (re)sort the entire list of in-QA revisions and write its report.
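
A rough sketch of how the maintenance switch and single-instance stories above might fit together in one shared helper. Nothing here is existing shepherd code: the names `check_can_run`, `single_instance`, `file_age_minutes`, the `shepherd.pid` file name, and the 30-minute threshold are all placeholders chosen for illustration, and the file's modification time is assumed to be a good enough stand-in for "when maintenance.txt was created".

```python
import os
import time
from contextlib import contextmanager

# All of these names and values are hypothetical placeholders.
MAINTENANCE_FILE = "maintenance.txt"
PID_FILE = "shepherd.pid"
STALE_AFTER_MINUTES = 30  # the "X minutes" from the user story


def file_age_minutes(path, now=None):
    """Return the age of `path` in minutes, or None if it does not exist.

    Age is measured from the file's mtime, which is assumed to approximate
    its creation time.
    """
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None
    if now is None:
        now = time.time()
    return (now - mtime) / 60.0


def check_can_run(workdir, now=None):
    """Return (ok, reason); the reason is meant for the log and the report."""
    age = file_age_minutes(os.path.join(workdir, MAINTENANCE_FILE), now)
    if age is not None:
        return False, "aborted: maintenance.txt in place for %d minutes" % int(age)
    age = file_age_minutes(os.path.join(workdir, PID_FILE), now)
    if age is not None:
        if age > STALE_AFTER_MINUTES:
            return False, "WARNING: PID file is %d minutes old, possible hang or crash" % int(age)
        return False, "aborted: another run is in progress"
    return True, "ok"


@contextmanager
def single_instance(workdir):
    """Hold the PID file for the duration of a run.

    O_EXCL makes the creation atomic, so a second copy of the script gets
    FileExistsError instead of stomping on shared data.
    """
    path = os.path.join(workdir, PID_FILE)
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as pid_file:
        pid_file.write(str(os.getpid()))
    try:
        yield path
    finally:
        os.remove(path)
```

Each small script would call `check_can_run()` first, log the reason either way, and then do its work inside `with single_instance(workdir):`. A push-button deploy command could reuse the same pieces: create maintenance.txt, poll until the PID file disappears, update the code, then remove maintenance.txt.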