A backup rotation filter for the Unix shell

git.sr.ht

97 points by closeneough 6 years ago · 32 comments

> To list backups that will should be kept use the --invert option.

May I suggest changing the name to something more direct? Calling it "--invert" means the user must think through what the default sense of the test is, then negate that in their mind. Not exactly a tough mental task, but people do make careless errors.

Perhaps something like "--list=keep" and "--list=discard" (with the default being "discard").

Also, typos:

"will make prunef to keep" --> "will make prunef keep"

"list backups that will should be" --> "list backups that should be"

closeneoughOP 6 years ago

Thanks for the input and for pointing out the typos. They are fixed now.
I will consider changing the invert flag, but I'm not that happy with something like "--list=...". There will be only two modes with discard being the default one. So imho there should be only one flag to switch to the non-default mode.
- closeneoughOP 6 years ago
  
  I've decided on --list-kept. --invert will still be supported so that existing scripts don't break.
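A minimal sketch of how the two spellings could share one code path. This is not prunef's actual parser or rotation algorithm: `parse_mode` and `rotate` are hypothetical names, and a plain keep-the-newest-N rule stands in for the real policy.

```shell
# parse_mode [FLAGS...]: print "discard" (the default) or "keep".
# --invert is accepted as an older spelling of --list-kept.
parse_mode() {
    mode=discard
    for arg in "$@"; do
        case "$arg" in
            --list-kept|--invert) mode=keep ;;
        esac
    done
    printf '%s\n' "$mode"
}

# rotate N [FLAGS...]: read backup names from stdin, one per line.
# Assumes the names sort chronologically (e.g. an ISO date in the name).
# Prints everything except the N newest, or only the N newest in keep mode.
rotate() {
    keep=$1
    shift
    mode=$(parse_mode "$@")
    sort -r | awk -v keep="$keep" -v mode="$mode" '
        NR <= keep { if (mode == "keep") print; next }
        mode == "discard" { print }
    '
}
```

With three names dated 01 to 03 on stdin, `rotate 2` prints only the oldest, while `rotate 2 --list-kept` and `rotate 2 --invert` both print the two newest.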

yjftsjthsd-h 6 years ago

Handy looking tool:)

Meta: I'm excited to see sr.ht starting to pop up in the wild like this:) I hope this is part of it starting to take off.

juped 6 years ago

It's very refreshing not to have some web 7.0 (or whatever we're up to now) site trying to "engage" me when all I want is to look at a source code repository.
brobot182 6 years ago

The layout, at least on mobile, could use some work
- ddevault 6 years ago
  
  I know this was a week ago, but on the off-chance you see this, I spent today improving responsiveness throughout. Let me know if you encounter any more issues.
- leni536 6 years ago
  
  What problems do you find? I can't find any.
  - OJFord 6 years ago
    
    Not GP, but it gives me about 5-10% horizontal scroll, for the benefit of the last few letters of 'contributors' and background to 'summary'.
    It's fine though, I leave most blame at iOS' door. Everything looks too big on this temporary iPhone SE, and I'm not allowed to zoom out, default to Firefox, or use any extension or 'content blocker' in it even though it's forced to use Safari to render. (/Rant..)
    
    saagarjha 6 years ago
    
    > It's fine though, I leave most blame at iOS' door.
    I think this is just some missing CSS to handle this case.
    > Everything looks too big on this temporary iPhone SE, and I'm not allowed to zoom out, default to Firefox, or use any extension or 'content blocker' in it even though it's forced to use Safari to render.
    On my very much not temporary iPhone SE, I can use content blockers and zoom out…
    
    OJFord 6 years ago
    
    Not in Firefox you can't.
    And I mean 'zoom' out from the default, e.g. on desktop I browse most sites at 80% in FF, some 67 or 50, fewer at 100.
    On my in-for-repair Android phone, the smallest system UI/font setting is smaller, and FF is allowed extensions and to use its own renderer, so I have control over that.
    Everything just seems like I'm using a largified accessibility mode. And typing - impossible to place the cursor mid-word? So if suggested corrections are wrong, no choice but to delete and re-type the whole thing. And no select all? So if I decide not to post such a rant, I have to fumble with the two cursors and move one to each end myself.
  - brobot182 6 years ago
    
    It plain looks bad. But also, it requires more scrolling to get to the meat of the project
    
    battery_cowboy 6 years ago
    
    Looking bad is not objective. I love the Sourcehut look, and the fact that I have no issues using it on literally any device. I can run my whole infrastructure from Sourcehut and Linode, together, on my crappy phone because they have very mobile friendly sites.

speedgoose 6 years ago

I use rotate-backups with temporary files and your script is an interesting alternative.

https://pypi.org/project/rotate-backups/

inshadows 6 years ago

FYI the interface format seems to be inspired by borg-prune[1].

[1] https://borgbackup.readthedocs.io/en/stable/usage/prune.html

closeneoughOP 6 years ago

Yes, this was a starting point for me. I referenced borg prune in my initial readme, but I dropped the reference because the algorithm works differently and I wanted to avoid confusion.

usr1106 6 years ago

The original title reads "... for your Unix shell", suggesting that the code is portable to many shells. I have not verified that claim.

"... for the Unix shell" makes little sense. When I used Unix my shell was csh, later tcsh. Nowadays my shell is bash in most cases and dash in some more limited environments. Either way, "the Unix shell" does not exist.

epx 6 years ago

Did something similar but employed a Fibonacci sequence, using the hour as the unit (but could be a minute, or a second), to groom a collection of snapshots maintaining a sensible timeline: https://epxx.co/logbook/entries/fibo_en.html
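One plausible reading of the Fibonacci idea, sketched under assumptions: snapshot ages are given in whole hours, and the newest snapshot is kept in each window whose upper bound is a Fibonacci number (1, 2, 3, 5, 8, 13, ...). The linked write-up may differ in its details, and `fib_thin` is a hypothetical name.

```shell
# fib_thin: read snapshot ages in hours (one integer per line) on stdin
# and print the ages to keep: the newest snapshot in each window
# [0,1], (1,2], (2,3], (3,5], (5,8], (8,13], ...
fib_thin() {
    sort -n | awk '
        BEGIN { hi = 1; prev = 1 }
        {
            age = $1
            # Advance to the Fibonacci window that contains this age.
            while (age > hi) {
                next_hi = hi + prev
                prev = hi
                hi = next_hi
                seen = 0
            }
            # Ages arrive in ascending order, so the first hit is the newest.
            if (!seen) { print age; seen = 1 }
        }
    '
}
```

The windows widen with age, so recent history stays dense while older history thins out, which is what gives the timeline its shape.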

bArray 6 years ago

If anybody else is using a similar method, I think it's also good to encrypt the backups:

    tar -zcvf - <SRC_BACKUP> | gpg -c --batch --pinentry-mode loopback --passphrase <PASSWORD> -o <DEST_BACKUP>.gz.gpg

In my experience the encryption part doesn't add any extra time on a modern machine, with the spinning disk being the slowest cog.

NieDzejkob 6 years ago

There's a lot of complexity in gpg that's unnecessary for this use case; I'd use age instead.
https://age-encryption.org/
- yjftsjthsd-h 6 years ago
  
  Sure, but I trust GPG. Has age been audited? Has it been attacked in the wild and held?
- voidmain 6 years ago
  
  gpg is widely agreed to be a train wreck, but age is not generally suitable for encrypted backups because it doesn't offer any form of authentication (so an attacker can replace a backup with a malicious one).
  - cyphar 6 years ago
    
    Can you elaborate on the threat model a little bit? I'm struggling to understand how you can protect against this when the attacker both knows the relevant encryption key (whether public or secret, depending on whether the crypto is asymmetric) and has write access to the backup location.
    If you sign the backups with some distinct key of the backup server, why wouldn't the attacker have access to those keys too? (In the above scenario they already have access to the keys that the backup server is using for encryption.)
    I know there's an open issue in age for adding authentication[1], so there clearly is some threat this would protect against but I can't figure it out.
    [1]: https://github.com/FiloSottile/age/issues/59

djsumdog 6 years ago

This is awesome. I currently use duplicity for backups to BackBlaze, but for my DB backups I just place files in a directory and I've been pruning old ones manually for a while. Something like this, that lets me specify the file format, would be perfect!

stevekemp 6 years ago

For database backups I've always done the simplest thing: I take a daily dump of "unchanging" databases:
* /var/backup/db/db1/monday.sql
* /var/backup/db/db1/tuesday.sql
* /var/backup/db/db1/wednesday.sql
For databases that change more frequently I instead back up every 1, 3, or 4 hours as appropriate:
* /var/backup/db/db2/monday/00.sql
* /var/backup/db/db2/monday/04.sql
The appeal of this is that I always have "local" backups, and I don't need to consider rotation at all: each slot gets the most recent copy when it runs, and I have an alert/alarm to make sure files are recent enough that things aren't broken. I appreciate that if your database dumps are 600GB each, or something similarly sized, you'd waste a lot of space, but for small things the simplicity of this approach is a good win.
(These get copied offsite as part of the backup of the whole filesystem. In the past I used to back up only some stuff; that failed the first time I tried to restore a mailserver and didn't have /var/lib/mailman archived! These days I explicitly back up "/" excluding only /tmp, /proc, /sys, and /dev.)
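The naming scheme and the freshness alarm described above could be sketched roughly like this. The base directory mirrors the example paths; the function names are hypothetical, and `find -mmin` is a GNU/BSD extension rather than POSIX.

```shell
# dump_path DB [HOUR]: print the slot today's dump should overwrite.
# Because the name only encodes weekday (and optionally hour), each run
# replaces the copy from a week earlier and no rotation step is needed.
dump_path() {
    db=$1
    hour=${2:-}
    # LC_ALL=C keeps weekday names in English regardless of locale.
    day=$(LC_ALL=C date +%A | tr '[:upper:]' '[:lower:]')
    if [ -n "$hour" ]; then
        printf '/var/backup/db/%s/%s/%s.sql\n' "$db" "$day" "$hour"
    else
        printf '/var/backup/db/%s/%s.sql\n' "$db" "$day"
    fi
}

# is_fresh FILE MAX_MINUTES: succeed only if FILE exists and was modified
# less than MAX_MINUTES minutes ago. Intended as the check behind an alert.
is_fresh() {
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}
```

A daily job would write its dump to `"$(dump_path db1)"`, an hourly one to `"$(dump_path db2 "$(date +%H)")"`, and monitoring would raise an alarm whenever `is_fresh` fails for the slot that should have just been written.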

ggm 6 years ago

Could you leverage the information in ZFS snapshot metadata to drive things like this? The inherent copy-on-write semantics seem plausibly related.

(ZFS snapshots make awesome backups too, but unlike tar they are inherently tied to ZFS.)

FrancoisBosun 6 years ago

Similar concept, in Ruby, acting as a filter in a pipeline, by me:

https://github.com/francois/surrender

anonsivalley652 6 years ago

PSA: Be sure to test every backup completely before marking a backup job as successful, or it's not a backup.
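As a bare minimum, and assuming the backups are gzip-compressed tar archives, an integrity check might look like the sketch below. `verify_backup` is a hypothetical helper; it only proves the archive is readable end to end, which is weaker than the full restore test the PSA asks for.

```shell
# verify_backup FILE: succeed only if FILE is a readable .tar.gz archive.
verify_backup() {
    # gzip -t checks the compressed stream (including its CRC) without extracting.
    gzip -t "$1" 2>/dev/null || return 1
    # tar -t reads every member header, catching a truncated or malformed archive.
    tar -tzf "$1" >/dev/null 2>&1 || return 1
}
```

A backup job could end with `verify_backup "$archive" || exit 1`, so that a corrupt archive fails the job instead of silently joining the rotation.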

aquabeagle 6 years ago

tarsnap desperately needs something like this included with it.

aorth 6 years ago

Also see tarsnapper. I've been using it for years.
https://github.com/miracle2k/tarsnapper
pronoiac 6 years ago

You might like this (self-plug): https://github.com/pronoiac/tarsnap-cron
closeneoughOP 6 years ago

I was also missing such a feature. That's why I built this.
