Diff for "ArchitectureGuide/Git"


Differences between revisions 2 and 3
Revision 2 as of 2019-07-24 14:46:00
Size: 2956
Editor: cjwatson
Comment: update README link
Revision 3 as of 2019-07-25 10:01:11
Size: 3109
Editor: twom
Comment:
Line 13:

Revision 2: The storage backend is basically a more secure version of git-daemon, speaking our extended git:// protocol, invoking git (the reference implementation) in a locked-down environment for maximum performance. It also knows how to invoke git in stateless-rpc mode to cope with the translation of the tunneled smart HTTP protocol, and can invoke a callback on eg. ref changes.

Revision 3: The storage backend is basically a more secure version of git-daemon, speaking our extended git:// protocol, invoking git (the reference implementation) in a locked-down environment for maximum performance. It also knows how to invoke git in stateless-rpc mode to cope with the translation of the tunneled smart HTTP protocol, and can invoke a callback via an RPC hook system for notifying of ref changes. This also allows per ref permissions checking, enabling 'protected branches' and advanced repository permission checks.

Details

(See also https://git.launchpad.net/turnip/tree/README)

Git Pack

Three layers: storage, virt, frontend. Serves fetch and push requests directly to clients. Each layer is horizontally scalable within prodstack.

The frontend is composed of three separate components, each serving a single protocol: Smart HTTP, SSH (using Launchpad’s existing Twisted SSH infrastructure), or the native Git pack protocol. Smart HTTP and SSH call into Launchpad for authentication, but this layer performs no authorisation or path translation. It just processes any credentials and proxies the request down to the virt midend after converting it to a version of the git:// protocol, slightly extended to include authentication and HTTP information.
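
As a rough illustration of what "a version of the git:// protocol, slightly extended" could look like on the wire, the sketch below frames a request the way the standard git:// protocol does (a pkt-line holding the command, the path, and NUL-terminated key=value parameters) and simply adds extra parameters. The standard framing and the host parameter come from Git's pack protocol; the x-authenticated-user key is a hypothetical name chosen for this example, not necessarily what the frontend actually sends.

```python
from typing import Mapping


def encode_pkt_line(payload: bytes) -> bytes:
    """Frame payload as a Git pkt-line: 4 hex digits of total length, then data."""
    length = len(payload) + 4  # the length prefix counts its own 4 bytes
    if length > 65520:
        raise ValueError("payload too large for a single pkt-line")
    return f"{length:04x}".encode("ascii") + payload


def encode_request(command: bytes, path: bytes, params: Mapping[str, str]) -> bytes:
    """Build a git:// style request: 'command path NUL key=value NUL ...'.

    Standard git:// clients send only 'host'; any other key used here is an
    illustrative assumption about how extra information might be carried.
    """
    body = command + b" " + path + b"\0"
    for key, value in params.items():
        if "=" in key or "\0" in key or "\0" in value:
            raise ValueError("invalid parameter: %r" % key)
        body += key.encode("utf-8") + b"=" + value.encode("utf-8") + b"\0"
    return encode_pkt_line(body)


if __name__ == "__main__":
    packet = encode_request(
        b"git-upload-pack",
        b"/~alice/project",
        {
            "host": "git.example.test",
            "x-authenticated-user": "alice",  # hypothetical extension key
        },
    )
    print(packet)
```

Carrying the extra information as additional parameters keeps the request parseable by anything that already understands the git:// framing, which is one plausible reason to extend the protocol rather than invent a new one.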

The virt midend proxies the extended git:// protocol to the storage backend, making API calls to Launchpad to translate human-readable paths to internal ID-based ones, and to authorise access. It will later also implement sharding and replication.
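
The translate-then-authorise step can be sketched as follows. This is a minimal illustration, not the midend's real code: the translate callable stands in for the Launchpad API call, and Translation, AccessDenied and resolve_request are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Translation:
    """Result of a (hypothetical) path lookup in Launchpad."""
    internal_path: str  # ID-based path used by the storage backend
    writable: bool      # whether this user may push


class AccessDenied(Exception):
    """Raised when the request must not be forwarded to storage."""


# Of the standard Git services, only git-receive-pack modifies a repository.
WRITE_COMMANDS = frozenset({"git-receive-pack"})

Translator = Callable[[str, Optional[str]], Optional[Translation]]


def resolve_request(command: str, public_path: str,
                    user: Optional[str], translate: Translator) -> str:
    """Return the internal path to proxy to, or raise AccessDenied."""
    result = translate(public_path, user)
    if result is None:
        # Unknown and invisible repositories are treated alike here, so the
        # reply does not reveal whether a private repository exists.
        raise AccessDenied("repository not found")
    if command in WRITE_COMMANDS and not result.writable:
        raise AccessDenied("push not permitted")
    return result.internal_path


def fake_translate(path: str, user: Optional[str]) -> Optional[Translation]:
    """Stand-in for the API call, for demonstration only."""
    if path == "/~alice/project":
        return Translation("/1234", writable=(user == "alice"))
    return None


if __name__ == "__main__":
    print(resolve_request("git-upload-pack", "/~alice/project", None, fake_translate))
```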

The storage backend is essentially a hardened version of git-daemon. It speaks our extended git:// protocol and invokes git (the reference implementation) in a locked-down environment for maximum performance. It can also invoke git in stateless-rpc mode to handle the tunneled smart HTTP protocol, and it can invoke callbacks through an RPC hook system to notify of ref changes. The hook system also allows per-ref permission checks, enabling 'protected branches' and more advanced repository permission checks.
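
To make the idea of per-ref permission checks concrete, here is a small sketch of the kind of decision a hook could make once it has been told which actions the pushing user may perform on which refs. The rule format (glob pattern plus a set of allowed actions, first match wins, unmatched refs unrestricted) and all names are assumptions made for the example; the real rules come from Launchpad over RPC and may be shaped quite differently.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence, Set, Tuple

# Git uses an all-zero object ID to mean "no object" in a ref update.
ZERO_ID = "0" * 40

RefUpdate = Tuple[str, str, str]   # (ref name, old object ID, new object ID)
Rule = Tuple[str, Set[str]]        # (glob pattern, actions allowed for this user)


def classify_update(old_id: str, new_id: str) -> str:
    """Name the action a ref update represents."""
    if old_id == ZERO_ID:
        return "create"
    if new_id == ZERO_ID:
        return "delete"
    # Telling a fast-forward from a forced push needs an ancestry check
    # against the repository, which this sketch leaves out.
    return "push"


def rejected_refs(updates: Iterable[RefUpdate], rules: Sequence[Rule]) -> List[str]:
    """Return the refs whose requested action is not allowed by the rules."""
    rejected = []
    for ref, old_id, new_id in updates:
        action = classify_update(old_id, new_id)
        for pattern, allowed in rules:
            if fnmatchcase(ref, pattern):
                if action not in allowed:
                    rejected.append(ref)
                break  # first matching rule wins
    return rejected


if __name__ == "__main__":
    rules = [
        ("refs/heads/master", {"push"}),   # protected: no create or delete
        ("refs/heads/release/*", set()),   # frozen for this user
    ]
    updates = [
        ("refs/heads/master", "a" * 40, ZERO_ID),
        ("refs/heads/feature", ZERO_ID, "d" * 40),
    ]
    print(rejected_refs(updates, rules))
```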

Git API

Parallel to the pack stack, sharing the same architecture and nodes.

Provides high-level repository APIs to the application. Wraps pygit2 to expose functionality similar to the Bazaar smartserver calls that we rely on today. Called into by new parts of the existing Launchpad application for log viewing, repository browsing, repository writes, etc.

cgit (temporary)

Parallel to the Pack and API stacks, sharing the same backend nodes.

Provides short-term repository browsing until the native implementation is sufficient. Slightly hacked up to refer to repositories by their virtual names, rather than the by-ID URLs used internally.

Launchpad integration

The existing launchpad.net frontends will serve native, SSH and smart HTTP clients. The native pack protocol and SSH are easy, as they run on different ports from the rest of the system and can just be haproxy’d.

Smart HTTP is more difficult. We don’t want to proxy long-running requests through the webapp, but we can’t easily detect Git requests, because the initial info/refs request has no special Content-Type or Accept header. The actual long-running requests do have a special Content-Type, however, so we can ask Apache to mod_rewrite them directly to the Git stack. That leaves only info/refs to be proxied through the webapp, which should be quick and safe.
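
The routing rule described here can be expressed as a small predicate. The sketch below is only an illustration of the decision in Python, not the Apache configuration itself; the two media types are the ones Git's smart HTTP protocol uses for its POST requests, while the function name is made up for the example.

```python
from typing import Optional

# Content types of the long-running smart HTTP requests (fetch and push).
GIT_RPC_CONTENT_TYPES = frozenset({
    "application/x-git-upload-pack-request",
    "application/x-git-receive-pack-request",
})


def goes_direct_to_git_stack(method: str, content_type: Optional[str]) -> bool:
    """Return True if the request can bypass the webapp entirely.

    Everything else, including the initial GET of info/refs, carries no
    distinguishing header and so still reaches the webapp.
    """
    # Ignore any media type parameters and letter case.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    return method.upper() == "POST" and media_type in GIT_RPC_CONTENT_TYPES


if __name__ == "__main__":
    print(goes_direct_to_git_stack("POST", "application/x-git-upload-pack-request"))
    print(goes_direct_to_git_stack("GET", None))
```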

Bitbucket rewrites on User-Agent, and we think GitHub does for non-.git URLs too (explaining why bzr-git can’t read https://github.com/foo/bar, but https://github.com/foo/bar.git works).

ArchitectureGuide/Git (last edited 2019-07-25 10:01:11 by twom)