= Remove the hosted area in codehosting =

''Remove the distinction between the hosted and mirrored area''

'''On Launchpad:''' ''https://blueprints.edge.launchpad.net/launchpad-code/+spec/remove-hosted-area''

'''As a ''' developer using Launchpad's code hosting<<BR>>
'''I want ''' changes I make to branches over bzr+ssh to be available over HTTP as soon as I make them<<BR>>
'''so that ''' interacting with Launchpad involves less waiting<<BR>>

'''As a ''' LOSA<<BR>>
'''I want ''' codehosting to not use twice as much disk space as it needs to<<BR>>
'''so that ''' I don't have to buy and install more disks so often<<BR>>

== Rationale ==

Doing this mainly wastes fewer resources, and so benefits our sysadmins. Although this is mostly an architectural change, it should make Launchpad simpler to use by removing an obscure concept users need to understand.

== Stakeholders ==

 * Canonical Admins, I guess. I haven't talked to them for a while.

== Constraints ==

No functional change. Use less disk space than the current implementation.

== Success ==

''How will we know when we are done?''

When we can delete the /srv/bazaar.launchpad.net/push-branches directory on crowberry.

''How will we measure how well we have done?''

This is a pretty binary thing :-)

== Thoughts? ==

How do we prevent abuse of Launchpad as a file hosting service?

 * I guess we don't care too much. Some kind of monitoring for abuse, perhaps. There shouldn't be a security risk here, as the branch data is only served over http (or some ssh-backed protocol), not https, and the interesting Launchpad cookies are marked secure.

What do we do if someone uploads a branch reference to, or a branch stacked on, somewhere cheeky?

 * We have to be careful about opening branches. We can probably do this by making it easy to open branches the safe way (i.e. by having IBranch.getBzrBranch DTRT).

Places that open branches for writing:

 * codehosting, sort of -- the Launchpad-specific stuff is all at the transport level though
 * the puller
 * createmergeproposaljob -- the bundle -> branch + merge proposal stuff
 * the translations export to branch stuff

Places that open for reading:

 * all of the above
 * codebrowse
 * translations import
 * scanner
 * other places??

Although there would be no need to have a puller for hosted branches, in the sense that there will be no revisions to pull, there are still some things that need doing: the stacked-on URL may need massaging and certainly needs recording, a scanner job needs creating, and a few fields like last_mirrored_id may need updating (if anything still cares after this work is done).
''However'', we can probably do this immediately, in the codehosting process itself and by extending the XML-RPC call the puller currently makes to trigger a puller run.

The above-mentioned stacked-on URL massaging means that, for the first time, we'll be directly editing the data the user has uploaded. I don't think this is a big issue though.

In general, the changes will make hosted branches less like mirrored and imported branches, but I think the current similarities are a bit artificial.

The fact that the mirrored area contains a copy of the branch has occasionally allowed us to easily recover from the version in the hosted area getting trashed somehow.

The codehosting vfs won't actually change that much, although perhaps it makes sense to rename hosted_transport and mirror_transport to ro_transport and rw_transport or something.

Various parts of Launchpad that wait until the branch has been pulled before doing things should be changed to not do that.

The branch-distro.py script would get much simpler and likely faster, and wouldn't clog up the puller for 6 hours after it runs.

reclaim_branch_space will be simpler too.

Lots of places that set up branches for integration testing should become easier to understand.

It's hard to come up with a way of doing this work that can be landed in small-ish branches. I think a sane implementation plan is to have a pipeline that works component by component, probably in this order:

 1. modify codehosting
 1. modify puller
 1. fix fallout

Requirements from talking with the strategist:

 1. talk to IS, starting with spm
 1. testing plan
 1. no regression on error display
 1. make sure MP code doesn't fall over on broken branches
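
To make the "somewhere cheeky" concern from the Thoughts section a bit more concrete, here is a minimal sketch of the kind of check a safe branch-opening helper might consult before following a branch reference or a stacked-on URL. It is illustrative only: `is_safe_location`, `TRUSTED_HOSTS` and `TRUSTED_SCHEMES` are hypothetical names, the trusted values are assumptions rather than agreed policy, and it does not show how IBranch.getBzrBranch or Bazaar itself would be wired up.

```python
from urllib.parse import urlsplit

# Assumptions for illustration only: the real policy is not decided here.
TRUSTED_SCHEMES = frozenset({"http", "bzr+ssh"})
TRUSTED_HOSTS = frozenset({"bazaar.launchpad.net"})


def is_safe_location(location,
                     trusted_schemes=TRUSTED_SCHEMES,
                     trusted_hosts=TRUSTED_HOSTS):
    """Return True if `location` is somewhere we are willing to open.

    `location` is a branch reference target or a stacked-on URL taken
    from user-uploaded branch data, so it must be treated as untrusted.
    """
    try:
        parts = urlsplit(location)
    except ValueError:
        # Malformed URLs (e.g. a broken IPv6 literal) are never safe.
        return False

    if not parts.scheme:
        # No scheme: accept only absolute paths inside codehosting.
        # Reject "//host/path" forms and anything that climbs via "..".
        if parts.netloc or not location.startswith("/"):
            return False
        return ".." not in location.split("/")

    if parts.scheme not in trusted_schemes:
        return False
    return parts.hostname in trusted_hosts
```

Under these assumed values, `is_safe_location("/~user/project/trunk")` is accepted, while `file:///etc/passwd`, `../../somewhere` and URLs on untrusted hosts are rejected. Putting the decision in one function fits the idea of making the safe way the easy way: every place that opens branches, for reading or writing, could share the same policy.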