Using git to manage a website

toroid.org

172 points by achew22 15 years ago · 51 comments

veb 15 years ago

Is it bad that I simply SSH into the server and do a "git pull"?

  • gvb 15 years ago

    To elaborate on dolinsky and the article, if your web server is doing a "git pull", it means it has ssh access into your workstation. If someone breaks into your web server, this means that they have ssh access into your workstation as well by simply using the keys on your web server. This is bad, very bad.

    If you push to your web server, only your public key is exposed if your web server is compromised.

    • russell_h 15 years ago

      Not necessarily. If you run an ssh-agent locally and configure ForwardAgent to 'yes' for connections to your web server, you can ssh to your server and use ssh from it without actually putting your private key on it.

      I'd still recommend pushing to a server though.
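
      For reference, agent forwarding is just a per-host option in your workstation's ~/.ssh/config (host name below is a placeholder):

          # ~/.ssh/config on the workstation
          Host webserver.example.com
              ForwardAgent yes

      The server then asks your local agent to sign authentication requests, so the private key never leaves your workstation. The usual caveat: anyone with root on that server can use the forwarded agent socket while you're connected, so it's still weaker than not forwarding at all.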

    • cookiecaper 15 years ago

      I don't know if he meant that he git pulls from his workstation. I git push out to a bare repository on my server, and then I ssh in and git pull from the local bare repository into the project's working directory on my server. This doesn't leave the keys for my workstation on the server, but I still have to log in and git pull in the wd.
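
      A rough sketch of that workflow, with made-up paths:

          # on the workstation
          git push server:/home/me/repos/site.git master

          # on the server
          cd /var/www/site            # the working directory
          git pull /home/me/repos/site.git master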

  • dolinsky 15 years ago

    Good / bad are relative terms. Neither situation would be recommended for a site with a code base that pulls from multiple resources / pushes to multiple servers on every release, but for a single server environment this could suffice.

    As for the difference between your method and the OP's, he describes it here:

    > This is more convenient than defining your workstation as a remote on the server, and running "git pull" by hand or from a cron job, and it doesn't require your workstation to be accessible by ssh.
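
    In other words, the work happens in a hook on the server when you push; the article's version is essentially a one-line post-receive hook in the bare repository, something along these lines:

        #!/bin/sh
        # hooks/post-receive in the bare repo on the server (chmod +x)
        GIT_WORK_TREE=/var/www/www.example.org git checkout -f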

  • rmc 15 years ago

    One advantage of the article's method is that it works if your workstation doesn't have a public IP address or is behind NAT / a firewall. It also works if you move around. With the article's method, you could have your laptop at home update your website, then go down to the local coffee shop and update your website from there.

  • steveklabnik 15 years ago

    Only once you have more than one server.

kevinburke 15 years ago

I use Fabric with Mercurial to achieve a similar effect by typing "fab prod deploy" - see Steve Losh's blog posts or bitbucket.org/kevinburke/goodmorningcmc

  • rapind 15 years ago

    You can do the same with Capistrano. However, for a very basic static site it does sound simpler to just use a post-receive hook rather than having a script ssh in and do a pull.

  • StavrosK 15 years ago

    Yep, same here. I wonder why more people don't do it; my fabric scripts basically push to production, and then fabric logs in and updates, restarts services, etc.

  • trusko 15 years ago

    That's my preferred way of deployment as well. fabric + git

glenngillen 15 years ago

It's probably worth checking out Nesta CMS (http://effectif.com/nesta) if you're interested in doing this.

Sinatra in front, but all your posts are managed by Git and can be Markdown/Textile/HAML (or anything supported by Tilt iirc). Push it to Heroku if you want easy/free hosting.

Takes care of publishing an RSS feed, tags/categorisation, and a bunch of other nifty things beyond just generating a static site.

Example: http://blog.peepcode.com/

(disclaimer: I used to work with Graham who created it, but I genuinely think it's awesome and use it for almost every site I build now)

Pyrodogg 15 years ago

I've been using Joe Maller's write-up[1] as a guide for a while.

I see this strategy removes the need for a second repo on the server. Other than removing a layer, which would save space and generally lower the likelihood of errors, are there significant pros/cons to either method?

[1] http://joemaller.com/990/a-web-focused-git-workflow/

georgecmu 15 years ago

Can I use git to manage a Drupal-managed website?

  • fungi 15 years ago

    That's what we do, in conjunction with http://drupal.org/project/features; bugs aside, it works well.

    But what you want in code and what you want in the DB will vary from page to page and feature to feature, so we often just manually redo stuff (in the dev env and then again in production) because deploying changes via code is more effort than it's worth.

    • georgecmu 15 years ago

      Managing the codebase is straightforward -- that's what git's designed for. My question was more about the actual site contents, which will reside in the database. Can you use git for tracking changes to the contents?

      • fungi 15 years ago

        > Managing the codebase is straightforward

        Not always; often stuff is stored in the DB that we want to work on in dev and then deploy to production.

        But it would be nice to be able to export a selection of nodes to code and preserve their IDs, content, and metadata. Maybe you'll find a Features contrib module for it (http://drupal.org/taxonomy/term/11478), but I've never heard of one.

      • davegan 15 years ago

        No, git will not track database changes. Features, however, can handle much of the configuration (that is typically contained in the database) in code, which can then be managed with git.

        If you want to create content on your dev or staging site and push it to production, check out the deploy module: http://drupal.org/project/deploy

        Tracking changes to nodes in production is probably best left to Drupal's built-in node versioning system. Diff (http://drupal.org/project/diff) is a handy module for tracking changes between revisions, and if you want a slightly more advanced workflow, check out revisioning (http://drupal.org/project/revisioning) and workflow (http://drupal.org/project/workflow). Good luck!

      • bricestacey 15 years ago

        Probably not. You're better off logging your MySQL queries and filtering out the UPDATE, INSERT, and DELETE queries for the tables you're interested in.
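
        A very rough sketch of that, assuming MySQL's general query log is enabled and written to a file (paths and names are placeholders):

            # keep only the data-changing statements for later inspection/replay
            grep -E 'Query[[:space:]]+(INSERT|UPDATE|DELETE)' /var/log/mysql/query.log \
                > content-changes.log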

        • RobGR 15 years ago

          I would advise against that. There are too many places in the mess of data and configuration tables that might refer to specific auto-increment ID columns, where things will get out of sync. Basically you are trying to do MySQL replication on specific tables only, but there are too many relationships among all the tables.

          If it is a small amount of content, such as a handful of pages for a brochure site, putting them in a feature might be best.

          If it is a site where customers or visitors generate content, then periodically re-initialize your dev and staging environments with copies of the live DB, running them through a script to anonymize all the user info, change passwords, etc.

          If you want to create content on your dev and move it live, then doing a full database copy and moving it live, with settings that are particular to live overridden in settings.php, is probably the best way. I still advise against it, though; basically you are re-launching the site for every change.

          If you truly want to be able to create content in multiple different places and move it to live, then probably the best thing is to explicitly program and configure for that, by specifying feeds of the content and setting up the different sites to ingest each other's feeds.

        • georgecmu 15 years ago

          Interesting suggestion. Do you know if anyone is doing it this way?

  • ElbertF 15 years ago

    Why not? Where I work we use Drupal almost exclusively and use Git for everything.

    • georgecmu 15 years ago

      Isn't the content stored in a database rather than static pages? Do you do regular commits of your database files?

      • jdbeast00 15 years ago

        If this were possible and useful, Drupal wouldn't have Features. The whole point is that what's in the database can't be versioned easily.

soult 15 years ago

That's similar to my private website (soultcer.com). I use git to create and store the content, and a wiki as the content management system. It's nice to work on your website this way; all you need to deploy is a git push. If I make a mistake or someone vandalizes the wiki, reverting is easy.

clickable: http://www.soultcer.com/

Edit: In case you are interested, the wiki software was written by a friend and is open source: https://github.com/patrikf/ewiki

chalst 15 years ago

The general approach of putting together your website on one machine, the coding machine, and publishing it to your server is one I use, although I prefer a complex build on my design machine followed by an rsync to the server.
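
The publish step in that setup is basically one command (paths and host are placeholders):

    make build                        # or whatever the build step is
    rsync -avz --delete build/ me@server:/var/www/example.com/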

Check out Chronicle (http://www.steve.org.uk/Software/chronicle/). I've put up an HN thread (http://news.ycombinator.com/item?id=2186798).

jbrennan 15 years ago

This is basically how I run my publishing engine (you can read more about it at the "Colophon" part of this article: http://nearthespeedoflight.com/article/about_the_redesign ).

I wanted to teach myself Ruby and I figured this was a great way to maintain a site, as I'm terrible with both SQL and security. Git solves security and DataMapper solves the SQL, and my Ruby lubricates the rest.

  • soult 15 years ago

    You could use git directly for metadata instead of relying on some JSON file. Git will tell you when an article was created and by whom. It will also tell you when and by whom each edit was made.
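
    For example (file name is made up):

        # who edited the file and when, newest first; the last line is its creation
        git log --format='%ai %an: %s' -- posts/hello.md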

    • jbrennan 15 years ago

      True. But my metadata needs go beyond simply who wrote the article; I also have things like tags, pubdate, update, etc.

      My system actually allows for the metadata to be embedded in the main article file, but it ended up being simpler for me to split the files most of the time (the editing app I wrote does this for me, so I mostly forget about it now).

dedward 15 years ago

I've played with this - in the end I ended up using custom scripts and/or Capistrano scripts (along with git of course) to handle actual deployments. It provided more control and more features, while still letting me leverage git.

  • bigiain 15 years ago

    Yeah, sounds familiar. I've been cooking up a way to manage a locally mastered static HTML website hosted on Amazon CloudFront. I may well use it as an opportunity to learn git (over SVN)...

trusko 15 years ago

I have a question. This is nice for simple HTML, but how about deploying a site where you have to migrate the database, etc.? This approach wouldn't work there. I use fabric with git right now; it works well.

  • jefe78 15 years ago

    One solution that occurred to me was to set up a mirrored database (assuming MySQL). Another idea is to set up a MySQL dump script or something similar.

    Just a couple of ideas :)
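
    The dump-script idea can be as small as a nightly cron job that snapshots the database into a git-tracked directory (database name and paths are made up):

        # --skip-extended-insert writes one row per INSERT, so the diffs stay readable
        mysqldump --single-transaction --skip-extended-insert mysite \
            > /srv/site-db/mysite.sql
        cd /srv/site-db && git add mysite.sql && git commit -m "nightly db snapshot"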

    • trusko 15 years ago

      I think it is a good idea for static sites; for everything else you would be rediscovering fabric and similar tools.

buckwild 15 years ago

What about using git to back up an entire hard drive? Maybe using bitbucket (or github, if you don't mind sharing your data with the world).

Living on the cloud has never been so easy :-D

boyter 15 years ago

I do something similar, but I keep the "central" repository in another location on the server and pull from that.

  • andrewcamel 15 years ago

    That's what I do, except I use svn rather than git.

    • boyter 15 years ago

      Nice to hear I am not alone on this. The advantage I find is that it makes my deployments more explicit, and my pushes to and from the repository can happen as often as I feel like.

      With git that's a moot point in the linked article, so long as you are using branches for everything and remembering to push them as well, but sometimes I just want to fix something quickly and do so without a branch. I don't care what people say; when you start storing a lot of stuff in git, a branch can take some time to process.

      • steveklabnik 15 years ago

        You're not forced to use branches with git. They're just so easy that there's no reason not to.

JamieEi 15 years ago

So in other words Heroku minus all that pesky code?
