dskDitto
dskDitto is a fast, parallel duplicate-file detector with a sleek TUI that lets you review, keep, or safely delete redundant files.
Features
- Concurrent directory walker tuned for large trees and multi-core systems
- Targeted mode to search for duplicates of a single file
- Multiple output modes: TUI, bullet lists, or text-friendly dumps
- Optional automated duplicate removal with confirmation safety rails
- Profiling toggles and micro-benchmarks for power users
Install
Install straight from source using Go 1.22+:
go install github.com/jdefrancesco/dskDitto/cmd/dskDitto@latest
This places the binary at $(go env GOPATH)/bin/dskDitto (~/go/bin/dskDitto by default), or in $GOBIN if that variable is set.
Build From Source
Ensure you have:
- go (1.22+)
- gosec (install via go install github.com/securego/gosec/v2/cmd/gosec@latest)
git clone https://github.com/jdefrancesco/dskDitto
cd dskDitto
make
The resulting binary lives in bin/dskDitto. Add it to your $PATH or run it from the repo root.
Install the built binary somewhere on your path (defaults to /usr/local/bin) with:
sudo make install PREFIX=/usr/local/bin
Override PREFIX (for example make install PREFIX=$HOME/.local/bin) if you prefer a user-local install and want to skip sudo.
Usage
dskDitto [options] PATH...
Common flags:
| Flag | Description |
|---|---|
| `--version` | Print the current version and exit |
| `--no-banner` | Skip the startup banner |
| `--profile <file>` | Write a CPU profile to the given file |
| `--time-only` | Exit immediately after the scan, printing only the elapsed time |
| `--min-size <bytes>` | Ignore files smaller than the provided size |
| `--max-size <bytes>` | Skip files larger than the provided size (default 4 GiB) |
| `--hidden` | Include dot files and dot-directories |
| `--exclude <path>` | Exclude a path from scanning (repeatable; excludes descendants) |
| `--no-symlinks` | Skip symbolic links |
| `--empty` | Include zero-byte files |
| `--include-vfs` | Include virtual filesystem directories such as /proc or /dev |
| `--current` | Restrict the scan to only the specified paths (no recursion) |
| `--depth <levels>` | Limit recursion to `<levels>` directories below the starting paths |
| `--dups <count>` | Only show groups that contain at least `<count>` files |
| `--text`, `--bullet` | Render duplicates without launching the TUI |
| `--remove <keep>` | Operate on duplicates, keeping the first `<keep>` entries per group |
| `--link` | With `--remove`, convert extra duplicates to symlinks instead of deleting them |
| `--file <path>` | Only report duplicates of the given file |
| `--hash <algo>` | Select hash algorithm: `sha256` (default) or `blake3` |
| `--csv-out <file>` | Write duplicate groups to CSV |
| `--json-out <file>` | Write duplicate groups to JSON |
| `--fs-detect <path>` | Print the filesystem type that contains `<path>` |
| `--color-safe` | Use a high-compatibility TUI theme that avoids custom colors (best for problematic terminal themes) |
Press Ctrl+C at any time to abort a scan. When duplicates are removed or converted, a confirmation dialog prevents accidental mass changes.
Duplicate removal and symlink conversion
dskDitto never deletes or rewrites anything unless you explicitly ask it to with --remove.
- Dry / interactive modes: by default (or with --text/--bullet) the tool only reports duplicates.
- Delete extras: use --remove <keep> to delete all but <keep> files in each duplicate group.
- Convert extras to symlinks: combine --remove <keep> --link to replace extra duplicates with symlinks pointing at one kept file per group.
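One way to picture the --remove <keep> behaviour is as a split of each duplicate group into the entries that stay and the extras that get deleted or linked. The Go sketch below illustrates that idea only: splitGroup is a hypothetical helper, not dskDitto's code, and clamping keep so that at least one copy survives is an assumption of the sketch rather than documented behaviour.

```go
package main

import "fmt"

// splitGroup is a hypothetical helper illustrating "--remove <keep>":
// the first keep paths of a duplicate group are retained, and the rest are
// returned as extras to delete (or to replace with symlinks under --link).
// Assumption of this sketch: keep is clamped to [1, len(group)] so that at
// least one copy of the content always survives.
func splitGroup(group []string, keep int) (kept, extras []string) {
	if keep < 1 {
		keep = 1
	}
	if keep > len(group) {
		keep = len(group)
	}
	return group[:keep], group[keep:]
}

func main() {
	group := []string{"/a/file.txt", "/b/file-copy.txt", "/c/file.txt"}
	kept, extras := splitGroup(group, 1)
	fmt.Println("keep:  ", kept)
	fmt.Println("extras:", extras)
}
```

Which entry counts as "first" depends on how the tool orders each group, which this sketch does not model.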
In the TUI you can also convert the currently marked files into symlinks: mark the duplicates you want to replace, then press L and enter the confirmation code. Each group’s symlinks will point at one unmarked file in that group.
On Unix-like systems, multiple hard links to the same underlying file are treated as a single entry during scanning: dskDitto hashes the content once and does not report those hard-link paths as separate space-wasting duplicates.
When using --link, the on-disk layout after the operation looks like this for a group of 3 identical files and --remove 1 --link:
/path/to/keep/file.txt # original file kept
/path/to/dup/file-copy.txt -> /path/to/keep/file.txt (symlink)
/another/location/file.txt -> /path/to/keep/file.txt (symlink)
In the TUI, files that are symlinks are annotated with a [symlink] suffix so you can see which entries were converted.
Single-file duplicate search
Use --file /path/to/original.ext to hash a specific file first, then scan the provided directories for other files with identical content. If no duplicates are found in those directories, dskDitto exits cleanly; otherwise, all reporting/removal/export modes are limited to that single duplicate group (with the original file listed first).
Hash algorithms
By default, dskDitto uses SHA-256 for content hashing:
- SHA-256 (--hash sha256): a conservative, widely supported choice with strong collision resistance.
- BLAKE3 (--hash blake3): often significantly faster on modern CPUs. On macOS, however, SHA-256 is highly optimized and outperforms BLAKE3 most of the time, so SHA-256 remains the default for now.
Examples
Scan your home directory and interactively review duplicates:
dskDitto $HOME
Exclude a directory (or file) from scanning:
dskDitto --exclude $HOME/Library/Caches $HOME
Exclude multiple paths in one scan (repeat --exclude):
dskDitto \
  --exclude $HOME/Library/Caches \
  --exclude $HOME/.cache \
  --exclude $HOME/Downloads \
  $HOME
List duplicates for scripting or grepping, without launching the TUI:
dskDitto --text ~/Pictures ~/Movies | grep "\.jpg$"
Find and safely delete duplicates larger than 100 MiB, keeping one copy per group:
dskDitto --min-size 100MiB --remove 1 /mnt/big-disk
Shrink a media library by converting duplicates into symlinks instead of deleting them:
dskDitto --remove 1 --link ~/Media
Export duplicate information to CSV or JSON for offline analysis:
dskDitto --csv-out dupes.csv ~/Photos
dskDitto --json-out dupes.json ~/Projects
Recipes
- Clean a downloads folder but keep one copy of each installer:
  dskDitto --min-size 10MiB --remove 1 ~/Downloads
- Deduplicate a photo drive while preserving directory layout with symlinks:
  dskDitto --remove 1 --link /Volumes/photo-archive
- Hunt for big redundant media files only:
  dskDitto --min-size 500MiB --text ~/Movies ~/TV
- Use BLAKE3:
  NOTE: On macOS, BLAKE3 currently performs worse than SHA-256, which is why SHA-256 remains the default for the time being. The BLAKE3 implementation may improve in the future and could eventually outperform SHA-256.
  dskDitto --hash blake3 --min-size 10MiB --text /mnt/big-disk
- Feed duplicate groups into another tool via CSV:
  dskDitto --csv-out dupes.csv /data
Configuration
- Log level: set DSKDITTO_LOG_LEVEL to debug, info, warn, etc.
- Default options: wrap dskDitto in a shell alias or script with your favorite defaults.
- Profiling: supply --pprof host:port to expose Go's pprof endpoints while the tool runs.
Screenshots
dskDitto rendered as a table
TUI for interactively selecting files to remove or keep
Confirmation window keeps you from deleting the wrong files
Legacy UI shots
Development
make debug          # Create development build
make test           # go test ./...
make bench          # run benchmarks (adds -benchmem)
make bench-profile  # capture cpu.prof and mem.prof into the repo root
make pprof-web      # launch go tool pprof with HTTP UI for the latest profile
Contributing
Issues and PRs are welcome. Open an issue if you have ideas for improvements, new output modes, or performance tweaks.
License
This project is released under the Apache license. See LICENSE for details.





