[jgit-dev] pubsub work coming to JGit
- From: Shawn Pearce <spearce@xxxxxxxxxxx>
- Date: Fri, 15 Jun 2012 17:02:17 -0700
Matthias asked what Ian's changes are about; here is my weak attempt at describing them...

Ian is working on adding a pubsub feature to Git. In its simplest form, a client has a group of repositories that are already local and that the client wants to keep current. For example, I have the EGit and JGit repositories on my workstation. I want those to always have the latest master that has been submitted on git.eclipse.org available to me.

The pubsub client will be a JGit process running in the background on my desktop, maintaining a persistent TCP socket with the git.eclipse.org server. When this connection starts, the client tells the server which repositories it wants (e.g. "jgit/jgit, egit/egit"). Whenever changes that the client is interested in are made at the server, data is pushed directly to the client over this persistent TCP socket.

This is really built not for the small-ish egit/jgit case, but for the Android case, where there are 400+ repositories constantly changing at the server end. By registering subscriptions, clients can be informed of updates, rather than polling for them with big for loops around git fetch commands.

It also really helps with a large number of clients. Instead of computing deltas to update each client from its current position to the server's branch tip, the server creates a pack once, at the time of the branch update, to go from the current branch value to the new branch value, and then distributes that pack to all interested clients. Most of these packs are going to be small enough that they can be held in memory in the server and dumped out through an NIO/select/poll type of distribution to the clients. This saves a lot of server resources.

For clients it means they might only be seconds behind the server at any given time, so doing a `git pull origin master` is no longer network bound, but instead just has to update the local working directory.
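To make the "create the pack once, then distribute it" idea concrete, here is a rough sketch. It is not Ian's code and not JGit API: `PubSubSketch`, `Subscriber`, `subscribe`, `onBranchUpdate` and `buildPack` are all hypothetical names, the "pack" is just a placeholder byte array, and delivery is an in-memory list where a real server would write to each client's socket.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of server-side fan-out; none of these names are JGit API. */
public class PubSubSketch {

    /** Stand-in for a connected client; a real server would hold a socket channel here. */
    public static final class Subscriber {
        private final String name;
        private final List<byte[]> received = new ArrayList<>();

        public Subscriber(String name) {
            this.name = name;
        }

        public String name() {
            return name;
        }

        /** In this sketch "sending" just records the pack that was delivered. */
        void send(byte[] pack) {
            received.add(pack);
        }

        public List<byte[]> received() {
            return received;
        }
    }

    // Repository name -> clients that registered interest in it.
    private final Map<String, Set<Subscriber>> byRepo = new HashMap<>();
    private int packsBuilt = 0;

    /** Called once when the connection starts, with the repositories the client wants. */
    public void subscribe(Subscriber s, String... repos) {
        for (String repo : repos) {
            byRepo.computeIfAbsent(repo, k -> new LinkedHashSet<>()).add(s);
        }
    }

    /** Placeholder for real pack generation (old branch value to new branch value). */
    private byte[] buildPack(String repo, String oldId, String newId) {
        packsBuilt++;
        return (repo + " " + oldId + ".." + newId).getBytes(StandardCharsets.UTF_8);
    }

    /** Builds one pack per branch update and hands the same bytes to every subscriber. */
    public int onBranchUpdate(String repo, String oldId, String newId) {
        Set<Subscriber> subs = byRepo.getOrDefault(repo, Collections.emptySet());
        if (subs.isEmpty()) {
            return 0; // nobody is interested, so no pack is built
        }
        byte[] pack = buildPack(repo, oldId, newId);
        for (Subscriber s : subs) {
            s.send(pack);
        }
        return subs.size();
    }

    public int packsBuilt() {
        return packsBuilt;
    }
}
```

The point of the sketch is only the cost model: the work in `buildPack` happens once per branch update, no matter how many clients are subscribed, and every client receives the same bytes.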
For remote distributed offices we are considering building a proxy in the office that knows how to aggregate subscriptions upstream and fan out the data to its clients. This means a distributed office might only need to have the data sent to it once, rather than N times for N workstations. By making the proxy just a stream duplicator, it has no state and does not really need to worry about the security of the data it holds; it's all transient in RAM. It also doesn't need to worry about doing `git gc` on the proxy, as the proxy isn't really a Git repository. It's just a forwarding service.

The initial implementation is going into JGit, hopefully before Ian finishes his internship with us. :-)
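The office proxy idea can be sketched the same way. Again this is only an illustration under assumptions: `ProxySketch`, `Downstream`, `subscribe`, `onUpstreamPack` and `upstreamRequests` are hypothetical names, and the upstream connection is modelled as a list of repository names the proxy would ask the server for. The proxy remembers who asked for what, but it never keeps the pack bytes after forwarding them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a forwarding proxy; none of these names are JGit API. */
public class ProxySketch {

    /** Stand-in for a workstation connected to the proxy. */
    public interface Downstream {
        void deliver(String repo, byte[] pack);
    }

    // Repository name -> local clients interested in it.
    private final Map<String, List<Downstream>> clients = new HashMap<>();
    // Repositories the proxy has subscribed to upstream, one entry per repository.
    private final List<String> upstreamRequests = new ArrayList<>();

    /** Registers a local client; only the first request for a repository goes upstream. */
    public void subscribe(Downstream client, String repo) {
        List<Downstream> list = clients.get(repo);
        if (list == null) {
            list = new ArrayList<>();
            clients.put(repo, list);
            upstreamRequests.add(repo); // aggregated: sent upstream once per repository
        }
        list.add(client);
    }

    /** Duplicates one upstream pack to every interested client, storing nothing. */
    public int onUpstreamPack(String repo, byte[] pack) {
        List<Downstream> list = clients.getOrDefault(repo, Collections.emptyList());
        for (Downstream d : list) {
            d.deliver(repo, pack);
        }
        return list.size();
    }

    public List<String> upstreamRequests() {
        return Collections.unmodifiableList(upstreamRequests);
    }
}
```

With N workstations subscribed to the same repository, `upstreamRequests()` still contains that repository once, which is the "sent to the office once, rather than N times" property described above.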