frecenfile
frecenfile computes frecency scores for files in Git repositories. Frecency combines the frequency and recency of events.
This is useful as a heuristic for finding relevant or trending files when all you have to work with is the commit history.
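The exact scoring formula is not spelled out here. As a rough illustration only, one common way to combine frequency and recency is to give each event a weight that decays exponentially with age and sum the weights. The sketch below is in Python; the `frecency` function and the 30-day half-life are hypothetical and are not taken from frecenfile.

```python
import math


def frecency(event_ages_days, half_life_days=30.0):
    """Sum one exponentially decayed weight per event.

    event_ages_days: ages of the events (for example, commits that
        touched a file), in days.
    half_life_days: hypothetical tuning value, not taken from frecenfile.
    """
    decay = math.log(2) / half_life_days
    return sum(math.exp(-decay * age) for age in event_ages_days)


# A file touched three times this week versus one touched five times
# roughly a year ago.
recent = frecency([1, 2, 3])
stale = frecency([300, 310, 320, 330, 340])
print(recent > stale)
```

Under this scheme a handful of recent changes can outweigh a larger number of old ones, which is the behavior a "trending files" heuristic needs.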
Performance
frecenfile is highly scalable, producing sorted output within milliseconds for mid-sized repositories and processing the entire commit history of Linux in under a minute. Processing the last 3000 commits of the Linux repository takes about a second.
For most purposes, the results should be easily cacheable.
Cache
frecenfile stores a per-repo cache in the OS cache directory. You can override the location
with FRECENFILE_CACHE_DIR. If the cache directory is not writable, frecenfile falls back to a
temporary cache or no-cache mode instead of failing.
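The fallback order described above can be sketched as follows. This is an illustration in Python, not frecenfile's actual code: `first_writable` and `resolve_cache_dir` are hypothetical helpers, and `~/.cache/frecenfile` is an assumed default, since the real OS cache directory varies by platform.

```python
import os
import tempfile


def first_writable(candidates):
    """Return the first directory that can be created and written to, else None."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
            # Creating a throwaway file proves the directory is writable.
            with tempfile.TemporaryFile(dir=path):
                pass
            return path
        except OSError:
            continue
    return None  # the caller would then run without a cache


def resolve_cache_dir(env=os.environ):
    """Pick a cache directory: override or default first, then a temp location."""
    override = env.get("FRECENFILE_CACHE_DIR")
    # Assumed default location, for illustration only.
    default = os.path.join(os.path.expanduser("~"), ".cache", "frecenfile")
    fallback = os.path.join(tempfile.gettempdir(), "frecenfile")
    primary = override if override else default
    return first_writable([primary, fallback])
```

Calling `resolve_cache_dir()` returns a usable directory, or `None` when nothing is writable, which corresponds to the no-cache mode.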
Git history
By default, frecenfile processes the last 3000 commits; this can be changed with the --max-commits
flag. Processing far more commits is rarely useful, since "trending" files
are unlikely to be buried deep in the commit history. Processing fewer commits is unlikely
to be needed for performance reasons, but may suit some use cases.
📦 Installation
🚀 Usage
Score every file in the current repo, highest first
Only list paths, omit scores
Restrict analysis to certain directories
frecenfile --paths src tests
Sort oldest/least-touched files first
Example output
12.9423 src/lib.rs
9.3310 src/analyze.rs
2.7815 README.md