Show HN: I git commit my home directory every night

github.com

9 points by LackOfGravitas 2 years ago · 6 comments

simonblack 2 years ago

Why is your home folder as large as 1.2TB?

What do you keep in there? It's a lot easier and more space-saving to keep the 'Write Once, Keep Forever' stuff like music, e-books, photos, source files, videos, etc. in a dedicated 'archive' folder.

That archive folder gets rsync'ed to multiple backups on a daily basis. The rsync pass only takes a few minutes for around 4 TB because it only copies files that have been added or changed since the last run.

My /home folder only contains personal documents like tax stuff and other records, plus whatever I'm actually working on at present, plus things like personal app binaries and config files. I try to keep it under 12 GB; I just looked and it's currently at 11 GB. That complete snapshot gets backed up daily. Snapshots are thinned progressively over months, with some retained permanently.

  • LackOfGravitasOP 2 years ago

    Fair. I do have an archive folder in the home directory. Also, probably 70% of the space is in a large workspace/ folder holding various programming projects and open-source tools. There are also a lot of art and graphics files, some raw and some the result of processing pipelines. Much of it is static, but not quite in the "Write Once, Keep Forever" category.

    I definitely admit that all these could be on other drives, but in this case I have found it easier to just have everything together and make sure folders are organized intelligently. At least in this case, the overhead of having separate backup procedures for different types of data is more than the marginal overhead of simply snapshotting everything.

    That said, I do have an 8 TB photography folder that isn't part of this snapshot routine.

rajatarya 2 years ago

Can you share any dedupe results as you've been running this? How long does it take nightly?

  • LackOfGravitasOP 2 years ago

    Currently it takes probably 20-30 minutes to run, which includes all the deduplication. The first run took a bit longer, but still under an hour. For my home folder, which is about 1.2 TB and I think around 1.4 million files, the git repository itself is a few hundred MB, and the local data store, on a zstd-compressed ZFS pool on my TrueNAS server, is around 400 GB. The dedup store grows pretty much only with new content, so it works perfectly here for snapshots.
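For readers curious what the nightly step looks like mechanically, here is a bare-bones sketch using plain git in a throwaway directory (the linked project presumably layers its chunk-level dedup on top of something like this; the identity and commit-message format are made up):

```shell
# Bare-bones nightly snapshot: stage everything, commit with a date
# stamp. Run here against a temp dir standing in for $HOME.
set -e
REPO=$(mktemp -d)
cd "$REPO"
git init -q
git config user.email "snapshot@example.invalid"  # hypothetical identity
git config user.name "snapshot-bot"

echo "draft" > notes.txt

git add -A
git commit -qm "snapshot $(date +%F)"

# One commit per night; count the snapshots so far.
git rev-list --count HEAD
```

In real use the commit line would sit in a cron job or systemd timer, with a .gitignore excluding caches and other churn.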

skadamat 2 years ago

Interesting idea -- any size limitations?

  • LackOfGravitasOP 2 years ago

    I think the only size limitation would be the size of the data store -- my 1.2 TB home folder with 1.2 million or so files becomes around a 400 GB data store on my NAS and around 200 MB for the actual repo. I know git slows down as the number of files grows, but I haven't noticed any issues yet.
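One easy way to keep an eye on repo size as it grows is git's own object-store accounting, demoed here in a throwaway repo:

```shell
# Build a tiny repo, pack it, and report object-store totals.
set -e
REPO=$(mktemp -d)
cd "$REPO"
git init -q
git config user.email "snapshot@example.invalid"  # hypothetical identity
git config user.name "snapshot-bot"

echo "data" > file.txt
git add -A
git commit -qm "init"

git gc --quiet        # pack loose objects so size-pack is meaningful
git count-objects -vH # human-readable totals, including size-pack
```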
