Filesystem Tagging
by Alex Lance
2025-12-07, 1207 words, tech
I found an interesting way to tag my old digital photo albums. No database or index files. The metadata is directly attached to each file and backed up with my normal rsync jobs.
It turns out that many filesystems support extended attributes (xattrs), and you can use these to stick interesting metadata onto a file. In my case, arbitrary tags. Join me as I take you into the exciting nightmare that has been me categorizing 23 years of photos over the last week. I'm not a hoarder, you're a hoarder.
Let's jump in with an example. There are two main commands: getfattr ("get file attribute") and setfattr ("set file attribute"). They can be installed on Debian by installing the attr package. Here's how they run:
Eg: display any tags that currently exist in a file "photo.jpg". There's no output, because this file doesn't have any tags yet.
Try and set a tag named "colour" to the value "red" for photo.jpg. That fails: it doesn't like using a top-level name like "colour" without a namespace prefix.
There are four top-level namespaces (security, system, trusted and user); one of them, "user", is for our diabolicalness. Setting "user.colour" instead runs successfully.
Now let's check it for any tags... Hooray, now it has a tag.
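The four steps of that walk-through could look something like the sketch below. It is not a transcript of the original session: tag_demo is a made-up helper name, and it assumes the attr package is installed and the file sits on a filesystem with user xattrs enabled.

```shell
#!/bin/bash
# Sketch of the walk-through above, wrapped in a function so it can be
# pointed at any file. tag_demo is a hypothetical name, not a real tool.
tag_demo() {
    local file=$1

    # 1. Dump every user.* attribute; an untagged file prints nothing.
    getfattr -d "$file"

    # 2. A bare name such as "colour" has no namespace prefix, so this fails.
    setfattr -n colour -v red "$file" || echo "rejected: needs a namespace" >&2

    # 3. The same tag inside the "user" namespace is accepted.
    setfattr -n user.colour -v red "$file" || return 1

    # 4. Dump again; the new attribute should now be listed.
    getfattr -d "$file"
}
```

Run it as tag_demo photo.jpg, where photo.jpg is any file you don't mind tagging.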
(spoiler: I wrote a wrapper script to help automate this en masse)
The problem
That was one piece of the pie. But there are more pieces. One might suggest too many pieces.
I've been backing up my files for decades now. But they're a mess. Different files named in different ways, stored in different folders. A floordrobe of filing.
The first step, break some eggs: make some folders, one new folder per year, eg 2025.
Then some malarkey to move all the untagged files into their correct year folder. Note: this required a lot of manual intervention. Here's a script that generates mv commands.
Apart from ignoring a few paths, the script has two main considerations: files with the right suffix that were modified (i.e. likely created) within the particular year, and files with the year embedded in the filename (thank goodness most cameras follow a naming convention like this).
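move.sh
#!/bin/bash
# Exit on errors, unset variables and failed pipes; echo each command as it runs
set -euxo pipefail
# Directory containing this script, with symlinks resolved
SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
# Parent of the script directory (quoted so paths with spaces survive)
cwd=$(dirname "${SCRIPT_DIR}")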
Invoked like: generate a script containing a list of mv commands, then (btw) manually review the file before executing it.
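Only the top of move.sh appears above, so here is a minimal sketch of how the two considerations (modified year, year in the filename) might be turned into mv commands. Everything in it is an assumption: the function names year_from_name, year_for_file and emit_moves are made up, the .jpg/.png suffixes and the YYYYMMDD filename pattern are guesses, preferring the filename year over the modified year is just one possible choice, and it relies on GNU find and GNU date.

```shell
#!/bin/bash
# Hypothetical sketch: print (not run) mv commands that sort photos into
# per-year folders. Assumes GNU find and GNU date.

# A year embedded in the file name, e.g. IMG_20250314_1.jpg.
# Prints the year, or fails if the name has no plausible 8-digit date.
year_from_name() {
    local name=${1##*/}
    if [[ $name =~ (^|[^0-9])((19|20)[0-9]{2})[01][0-9][0-3][0-9] ]]; then
        echo "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

# Prefer the year in the name, fall back to the modification year.
year_for_file() {
    year_from_name "$1" || date -r "$1" +%Y
}

# Emit one shell-quoted mv command per photo found under the given directory.
emit_moves() {
    local src=$1 file year
    find "$src" -type f \( -iname '*.jpg' -o -iname '*.png' \) -print0 |
    while IFS= read -r -d '' file; do
        year=$(year_for_file "$file") || continue
        printf 'mkdir -p %q && mv -n %q %q\n' "$year" "$file" "$year/"
    done
}
```

Something like emit_moves ~/unsorted > moves.sh writes the commands to a file (both names are placeholders) that can be read through and then run with bash moves.sh. Nothing is moved until that last step, which keeps the manual review in the loop.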
Tagging with Gwenview
Ok! So now we've got millions of files sitting in per-year folders. Let the tagging commence. Just one problem, how to tag each file?
The open source application Gwenview allows you to select multiple files and then edit the tags for all of those photos in one hit. It stores the values for the tags in the extended file attribute name user.xdg.tags. Which is sort of an unofficial convention - but it'll do!
Useful Gwenview shortcuts include:
Ctrl-click -> select multiple individual files
Shift-click -> select runs of sequential files (very useful!)
Ctrl-t -> open the tagging dialog
And as you'll see below I also wrote a script to edit tags in bulk too.
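Because Gwenview keeps its tags in user.xdg.tags, the same attribute can be read and written from the shell. The sketch below assumes the value is a single comma-separated string (worth confirming with getfattr on a file you tagged in Gwenview); csv_add, get_xdg_tags and add_xdg_tag are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical helpers for the user.xdg.tags attribute. Assumes the value is
# a comma-separated list such as "xmas2025,steve".

# Print the tag list $1 with tag $2 appended, unless it is already present.
csv_add() {
    local tags=$1 tag=$2
    case ",$tags," in
        *",$tag,"*) echo "$tags" ;;          # already tagged, leave as is
        *) echo "${tags:+$tags,}$tag" ;;     # append, no leading comma if empty
    esac
}

# Print the current tag list of a file (empty if the attribute is not set).
get_xdg_tags() {
    getfattr --only-values -n user.xdg.tags "$1" 2>/dev/null
}

# Add one tag to a file, keeping whatever tags it already has.
add_xdg_tag() {
    local file=$1 tag=$2
    setfattr -n user.xdg.tags -v "$(csv_add "$(get_xdg_tags "$file")" "$tag")" "$file"
}
```

Reading the existing value before writing matters here: setfattr replaces the whole attribute, so writing only the new tag would wipe the ones Gwenview already stored.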
Supporting Software
Couple of tricks with other tools that are worth mentioning when working with files that have tags.
rsync has 2 handy flags:
One ensures the attributes are synced around too when moving or backing up files.
The other is useful when moving files from your phone (with its filesystem that does not support extended file attributes) to your archive. It helps ensure you don't overwrite a file you've already tagged previously, by only copying the file over if its contents have changed.
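Those two behaviours match rsync's --xattrs (-X) and --checksum (-c) options, so assuming those are the flags meant, the two jobs might look like the sketch below. The function names are made up for illustration.

```shell
#!/bin/bash
# Sketch only: assumes the two flags meant above are --xattrs and --checksum.

# Back up an archive, carrying extended attributes (and so tags) along.
# -a = archive mode, --xattrs (-X) = preserve extended attributes.
backup_with_tags() {
    rsync -a --xattrs "$1" "$2"
}

# Pull files from a source without xattr support. --checksum (-c) compares
# file contents, so files whose contents are unchanged are not re-copied.
import_untagged() {
    rsync -rt --checksum "$1" "$2"
}
```

For example, backup_with_tags ~/photos/ /mnt/backup/photos/ for the backup job and import_untagged /mnt/phone/DCIM/ ~/photos/incoming/ for the phone import (all four paths are placeholders).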
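vim has a configuration option which you must set if you're editing a previously tagged file and you don't want to lose the tags when you hit save: set backupcopy=yes. With that setting vim overwrites the original file in place, rather than renaming it and writing a brand new file that starts life without any attributes.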
Tag: a wrapper script
Lastly, I ended up writing a wrapper around getfattr and setfattr for working with multiple files. It's just called "tag" and you can grab it from over here: Tag.
tag -v or --view fires up Gwenview to see images that have a particular tag. Eg:
Show photos from Xmas 2025
Show Xmas 2025 photos that don't have Steve in them
Show all photos in the current directory that don't have any tags at all (VERY useful for making sure you have tagged everything)
tag -s or --search uses the same syntax for tags as --view, but instead of opening up Gwenview it just prints the list of files that match. Handy for command substitution, eg: mv $(tag -s roof) roof-folder/
tag -d or --dump dumps out all the tags that are in use in the current directory, along with the tallies:
$ tag -d
122 gumtree
74 theatre
53 camping
47 dibs
28 therats
27 art
18 mum
18 dad
9 roof
9 car
tag -e or --edit actually adds or removes tags from some files. Eg:
Add the tag xmas2025 to all the files in the "xmas" folder
Or remove a tag (i.e. a tag typo in this example)
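To give a feel for what sits under a wrapper like this, here is a rough sketch of how searching and dumping could be built on getfattr. It is not the real tag script: file_tags, search_tag, tally and dump_tags are hypothetical names, and it assumes the tags live in user.xdg.tags as a comma-separated list.

```shell
#!/bin/bash
# Hypothetical sketch, not the real "tag" script. Assumes tags live in
# user.xdg.tags as a comma-separated list.

# Print the tags of one file, one per line (nothing if it has none).
file_tags() {
    getfattr --only-values -n user.xdg.tags "$1" 2>/dev/null | tr ',' '\n'
}

# Print each of the given files that carries the tag $1.
search_tag() {
    local tag=$1 file
    shift
    for file in "$@"; do
        file_tags "$file" | grep -qxF -- "$tag" && echo "$file"
    done
    return 0
}

# Read one tag per line on stdin, print tallies with the most used tag first.
tally() {
    sort | uniq -c | sort -rn
}

# Tally every tag in use across the given files.
dump_tags() {
    local file
    for file in "$@"; do
        file_tags "$file"
        echo    # terminate the last tag, the raw value has no newline
    done | grep -v '^$' | tally
}
```

With these loaded, search_tag roof *.jpg lists the matching files and dump_tags *.jpg prints a tally in the same count-then-tag shape as the output above. Matching whole lines with grep -x keeps a search for "car" from also hitting "cartoon".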
In conclusion
Some tags that I've found useful: the street names and suburbs of places I've lived. People's names or surnames, and whether a particular photo is sensitive/personal. Also whether something is a photo of a document, whether it's related to a workplace, or whether it was an artwork. Oh and lots of photos of my soulful whippet, who is missed.
Hopefully this is useful for you. If I had to do it over again I definitely would. But it was a lot of work. The whole process has simplified my backups though, and made it possible to quickly find files that are useful, and also delete a bunch of stuff that didn't need to be kept.
Now that so many things are tagged - I wonder if I could use that info to train a not-internet-connected thingo to tag things automatically in the future...