GitHub - mjgil-rust/ntfs-recover: worked on my drive

NTFS data recovery tool that brute-force scans raw disks for MFT records, reconstructs the directory tree, and extracts files — even when the Master File Table is corrupted and the volume can't be mounted.

The Problem

When an NTFS volume's MFT is corrupted — both the primary $MFT and its mirror $MFTMirr — standard tools like chkdsk, ntfsfix, and TestDisk can't repair or mount it. But individual MFT file records are scattered across the disk and are usually still intact. This tool finds them.

How It Works

Recovery happens in four phases:

  1. Scan — Reads the entire raw disk at every 1024-byte boundary looking for FILE signatures. Validates each candidate record, applies fixup arrays, and saves results to a resumable .rntfs index file.

  2. Rebuild — Parses the $FILE_NAME and $DATA attributes from every found record. Reconstructs the full directory tree by following parent references. Handles duplicate records, orphaned files, extension records for large/fragmented files, and circular references.

  3. Extract — Reads file data from the raw disk using decoded data runs (or copies resident data directly from the MFT record for small files). Writes everything to an output directory preserving the original folder structure.

  4. Repair — Reconstructs the $MFT's own data run list from scan results, then patches it in-place into the existing MFT record 0 on disk. Only the data runs are replaced — all other attributes ($ATTRIBUTE_LIST, $BITMAP, $FILE_NAME, $STANDARD_INFORMATION) are preserved. The tool automatically detects when the MFT spans multiple records (via $ATTRIBUTE_LIST pointing to extension records like record 15), and only reconstructs the portion belonging to record 0. Once the data runs are valid again, ntfsfix can correct the $MFTMirr and the volume becomes mountable.

Each phase saves its state so the expensive scan doesn't need to be repeated. Scanning is resumable — if interrupted, it picks up where it left off.
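
As a rough illustration of the scan phase, the sketch below looks for FILE signatures on 1024-byte boundaries and applies the update sequence array (fixup) to a candidate record. It is not the tool's actual scanner: the function names `find_candidates` and `apply_fixup` are hypothetical, and the sketch assumes 1024-byte records with 512-byte sectors. The header offsets used (update sequence offset at byte 4, entry count at byte 6) follow the standard NTFS file record layout.

```rust
/// MFT file record size assumed in this sketch (the 1024 bytes named above).
const RECORD_SIZE: usize = 1024;
/// Fixups protect the last two bytes of every 512-byte sector.
const SECTOR_SIZE: usize = 512;

/// Bounds-checked little-endian u16 read.
fn read_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// Offsets (relative to `chunk`) of aligned records that start with "FILE".
pub fn find_candidates(chunk: &[u8]) -> Vec<usize> {
    chunk
        .chunks_exact(RECORD_SIZE)
        .enumerate()
        .filter(|(_, rec)| rec.starts_with(b"FILE"))
        .map(|(i, _)| i * RECORD_SIZE)
        .collect()
}

/// Verifies and applies the update sequence array in place.
/// Returns `None` (leaving the record untouched) if the record is malformed or torn.
pub fn apply_fixup(record: &mut [u8]) -> Option<()> {
    let usa_off = read_u16(record, 4)? as usize; // offset of the update sequence
    let usa_count = read_u16(record, 6)? as usize; // 1 (the USN) + one entry per sector
    if usa_count < 2 || (usa_count - 1) * SECTOR_SIZE != record.len() {
        return None;
    }
    let usn = read_u16(record, usa_off)?;
    // First pass: every sector tail must carry the USN, every saved value must be readable.
    for i in 1..usa_count {
        read_u16(record, usa_off + 2 * i)?;
        if read_u16(record, i * SECTOR_SIZE - 2)? != usn {
            return None;
        }
    }
    // Second pass: put the original bytes back at the end of each sector.
    for i in 1..usa_count {
        let saved = read_u16(record, usa_off + 2 * i)?;
        let tail = i * SECTOR_SIZE - 2;
        record[tail..tail + 2].copy_from_slice(&saved.to_le_bytes());
    }
    Some(())
}
```

Checking all sector tails before modifying anything means a torn record is rejected without being half-rewritten, which suits a best-effort scanner.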

Install

Build with cargo build --release; the binary is then at target/release/rust-ntfs-recover. No external libraries or FUSE mounts required.

Usage

Full recovery pipeline

# Step 1: Scan the raw partition for MFT records
sudo rust-ntfs-recover scan /dev/nvme1n1p1

# Step 2: Browse what was found
rust-ntfs-recover list /dev/nvme1n1p1

# Step 3: Extract everything
sudo rust-ntfs-recover extract /dev/nvme1n1p1 -o /mnt/recovered

One-shot recovery

# Scan + rebuild + extract in one command
sudo rust-ntfs-recover extract /dev/nvme1n1p1 -o /mnt/recovered

Selective recovery

# Recover only photos and videos
sudo rust-ntfs-recover extract /dev/nvme1n1p1 -o /mnt/recovered \
    -p "*.jpg" -p "*.png" -p "*.mp4" -p "*.mkv"

# List all files over 1MB
rust-ntfs-recover list /dev/nvme1n1p1 --min-size 1M

# Dry run — see what would be extracted without writing anything
sudo rust-ntfs-recover extract /dev/nvme1n1p1 -o /mnt/recovered --dry-run

# Flatten output (no subdirectories)
sudo rust-ntfs-recover extract /dev/nvme1n1p1 -o /mnt/recovered --no-dirs

# Include deleted files
sudo rust-ntfs-recover extract /dev/nvme1n1p1 -o /mnt/recovered --include-deleted

Repair the MFT (make volume mountable again)

The --reconstruct flag tells the repairer to build a new data run list from the scan results rather than using the (corrupted) runs already in record 0:

# Dry run first — see the fragment comparison and what would be written
sudo rust-ntfs-recover repair /dev/nvme1n1p1 --reconstruct --dry-run

# Repair record 0 (backs up original automatically)
sudo rust-ntfs-recover repair /dev/nvme1n1p1 --reconstruct

# Custom backup path
sudo rust-ntfs-recover repair /dev/nvme1n1p1 --reconstruct \
    --backup /safe/location/mft_record0_original.bin

# Also fix the $MFTMirr
sudo rust-ntfs-recover repair /dev/nvme1n1p1 --reconstruct --fix-mirror

After repair, run ntfsfix to sync the mirror and clear the journal, then mount:

sudo ntfsfix /dev/nvme1n1p1
sudo mount -t ntfs-3g -o ro /dev/nvme1n1p1 /mnt/recovered

How repair works

The repairer patches the existing record 0 in-place. It reads the record from disk, locates the unnamed $DATA attribute, replaces only the data run bytes within that attribute, and writes the record back. All other attributes ($ATTRIBUTE_LIST, $BITMAP, $FILE_NAME, $STANDARD_INFORMATION) and the record header are preserved. This is critical for MFTs that span multiple records via $ATTRIBUTE_LIST — the tool detects the VCN boundary from record 0's $DATA header and only reconstructs fragments within that range.
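
To make "replaces only the data run bytes" more concrete, here is a minimal sketch of how a list of MFT fragments could be encoded back into NTFS data run bytes. It is an illustration under stated assumptions, not the repairer's code: `encode_runs` is a hypothetical name, fragments are assumed to be (starting cluster, cluster count) pairs in file order, and sparse runs are left out.

```rust
/// Smallest number of bytes (at least 1) that holds `v` as an unsigned value.
fn unsigned_len(v: u64) -> usize {
    let bits = 64 - v.leading_zeros() as usize;
    std::cmp::max(1, (bits + 7) / 8)
}

/// Smallest number of bytes that holds `v` in two's complement.
fn signed_len(v: i64) -> usize {
    for n in 1..8 {
        let shift = 8 * n - 1;
        if v >= -(1i64 << shift) && v <= (1i64 << shift) - 1 {
            return n;
        }
    }
    8
}

/// Encodes (start_lcn, cluster_count) fragments as an NTFS data run list.
/// Assumes every LCN fits in an i64, which holds for any real volume.
pub fn encode_runs(fragments: &[(u64, u64)]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev: i64 = 0;
    for &(lcn, count) in fragments {
        // Each run stores its start relative to the previous run's start.
        let delta = (lcn as i64).wrapping_sub(prev);
        let len_size = unsigned_len(count);
        let off_size = signed_len(delta);
        // Header byte: high nibble = offset field size, low nibble = length field size.
        out.push(((off_size as u8) << 4) | (len_size as u8));
        out.extend_from_slice(&count.to_le_bytes()[..len_size]);
        out.extend_from_slice(&delta.to_le_bytes()[..off_size]);
        prev = lcn as i64;
    }
    out.push(0); // end-of-list marker
    out
}
```

For example, a single fragment of 0x18 clusters starting at cluster 0x5634 encodes to the bytes 21 18 34 56 00.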

Restoring from backup

The backup is a raw 1024-byte copy of the original record 0. To restore:

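# In this example MFT record 0 is at byte offset 0x4000 (cluster 4 * 4096-byte clusters),
# so seek = 0x4000 / 1024 = 16. Adjust seek if your $MFT starts at a different cluster.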
sudo dd if=mft_record0_backup.bin of=/dev/nvme1n1p1 bs=1024 seek=16 conv=notrunc
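
The seek value depends on where the $MFT starts on a given volume. The sketch below derives the byte offset of record 0 from a raw NTFS boot sector, using the standard field positions (bytes per sector at 0x0B, sectors per cluster at 0x0D, $MFT starting cluster at 0x30). The function name `mft_record0_offset` is hypothetical and the code is only an illustration of the arithmetic, not part of the tool.

```rust
/// Bounds-checked little-endian u16 read.
fn le_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// Bounds-checked little-endian u64 read.
fn le_u64(buf: &[u8], off: usize) -> Option<u64> {
    let b = buf.get(off..off.checked_add(8)?)?;
    let mut a = [0u8; 8];
    a.copy_from_slice(b);
    Some(u64::from_le_bytes(a))
}

/// Byte offset of MFT record 0 within the partition, or `None` if the
/// boot sector is too short or its geometry fields look invalid.
pub fn mft_record0_offset(boot: &[u8]) -> Option<u64> {
    let bytes_per_sector = le_u16(boot, 0x0B)? as u64;
    let sectors_per_cluster = *boot.get(0x0D)? as u64;
    let mft_lcn = le_u64(boot, 0x30)?;
    // Values above 0x80 use a different encoding for very large clusters;
    // this sketch simply rejects them.
    if bytes_per_sector == 0 || sectors_per_cluster == 0 || sectors_per_cluster > 0x80 {
        return None;
    }
    mft_lcn.checked_mul(bytes_per_sector * sectors_per_cluster)
}
```

With 512-byte sectors, 8 sectors per cluster, and the $MFT at cluster 4, this yields 0x4000; dividing by the 1024-byte block size gives the seek=16 used in the dd command above.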

Works with disk images too

rust-ntfs-recover scan /path/to/partition.img
rust-ntfs-recover extract /path/to/partition.img -o ./recovered

Command Reference

rust-ntfs-recover [OPTIONS] <COMMAND>

Commands:
  scan        Scan disk for MFT records
  list        List found files (requires prior scan)
  extract     Extract files to output directory
  repair      Repair $MFT record 0 using scan results

Global Options:
  -s, --scan-file <FILE>         Save/load scan results [default: ./scan.rntfs]
      --cluster-size <BYTES>     Override cluster size auto-detection
      --threads <N>              Scanning threads [default: num_cpus]
  -v, --verbose                  Verbose output

Extract Options:
  -o, --output <DIR>             Output directory [default: ./recovered]
  -p, --pattern <GLOB>           File pattern filter (repeatable)
      --min-size <SIZE>          Minimum file size (e.g., 1K, 1M, 1G)
      --max-size <SIZE>          Maximum file size
      --list-only                List matching files without extracting
      --no-dirs                  Flatten output directory structure
      --include-deleted          Include deleted file records
      --dry-run                  Show what would be extracted

Repair Options:
      --reconstruct              Rebuild data runs from scan results (required for corrupted runs)
      --dry-run                  Show what would be written without modifying disk
      --backup <PATH>            Custom path for record 0 backup
      --fix-mirror               Also repair $MFTMirr
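
The <SIZE> arguments accept suffixed values such as 1K, 1M, and 1G. A minimal sketch of that kind of parser follows. It assumes binary multiples (1K = 1024 bytes), which the reference above does not state, and `parse_size` is a hypothetical name rather than the tool's own function.

```rust
/// Parses sizes like "512", "1K", "1M", "1G" into a byte count.
/// Assumption: suffixes are binary multiples (K = 1024).
pub fn parse_size(input: &str) -> Option<u64> {
    let s = input.trim();
    let (digits, multiplier) = match s.chars().last()? {
        // The suffix is a single ASCII byte, so slicing it off is safe.
        'K' | 'k' => (&s[..s.len() - 1], 1u64 << 10),
        'M' | 'm' => (&s[..s.len() - 1], 1u64 << 20),
        'G' | 'g' => (&s[..s.len() - 1], 1u64 << 30),
        _ => (s, 1u64),
    };
    // Reject empty or non-numeric input and values that overflow u64.
    digits.parse::<u64>().ok()?.checked_mul(multiplier)
}
```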

Architecture

~3,500 lines of Rust across five modules and a custom NTFS parser:

src/
├── main.rs              CLI (clap) and phase orchestration
├── scanner.rs           Phase 1 — raw disk scan with parallel chunk processing
├── rebuilder.rs         Phase 2 — MFT record parsing and directory tree reconstruction
├── extractor.rs         Phase 3 — file data extraction using data runs
├── repairer.rs          Phase 4 — MFT record 0 reconstruction and disk write
├── types.rs             Shared data structures (ScannedRecord, FileEntry, DirectoryTree)
└── parser/
    ├── boot_sector.rs   NTFS boot sector / BPB parsing (cluster size detection)
    ├── mft_record.rs    MFT record header validation and parsing
    ├── fixup.rs         Update sequence array (fixup) application
    ├── attributes.rs    Attribute header iteration (resident + non-resident)
    ├── filename.rs      $FILE_NAME attribute parsing (0x30)
    ├── data_runs.rs     Data run decoding and encoding (run-length encoded cluster maps)
    ├── standard_info.rs $STANDARD_INFORMATION parsing (timestamps)
    ├── attribute_list.rs $ATTRIBUTE_LIST parsing (extension records)
    └── error.rs         ParseError type and bounds-checked read helpers

Why a custom parser?

The existing Rust ntfs crate expects a well-formed, mountable NTFS volume — it reads the boot sector, locates $MFT via its data run list, and traverses top-down. That's exactly the path that's broken when the MFT is corrupted. This tool needs to:

  • Scan raw disk at every 1024-byte boundary (no valid MFT entry point)
  • Parse records that might be partially corrupted (best-effort, not strict validation)
  • Skip broken attributes and continue with the next one
  • Handle negative absolute offsets in data runs gracefully (stop and return partial results)

All parsing takes &[u8] slices with bounds checking everywhere. No unsafe code. No panics on untrusted data.
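
The data run requirement can be sketched as follows: a bounds-checked decoder that returns whatever runs were valid before the first problem, including a run whose absolute cluster would go negative. The `Run` type and `decode_runs` name are assumptions made for this example and are not taken from the tool's source.

```rust
/// One decoded extent. `lcn` is `None` for a sparse (zero-filled) run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Run {
    pub lcn: Option<u64>,
    pub clusters: u64,
}

/// Decodes a data run list, returning the runs that were valid before the first problem.
pub fn decode_runs(bytes: &[u8]) -> Vec<Run> {
    let mut runs = Vec::new();
    let mut pos = 0usize;
    let mut prev_lcn: i64 = 0;
    while let Some(&header) = bytes.get(pos) {
        if header == 0 {
            break; // end-of-list marker
        }
        let len_size = (header & 0x0F) as usize;
        let off_size = (header >> 4) as usize;
        if len_size == 0 || len_size > 8 || off_size > 8 {
            break; // implausible field sizes: treat as corruption
        }
        let end = pos + 1 + len_size + off_size;
        let fields = match bytes.get(pos + 1..end) {
            Some(f) => f,
            None => break, // run is truncated
        };
        let (len_bytes, off_bytes) = fields.split_at(len_size);
        // Length is an unsigned little-endian integer.
        let clusters = len_bytes.iter().rev().fold(0u64, |acc, &b| (acc << 8) | b as u64);
        if off_size == 0 {
            runs.push(Run { lcn: None, clusters });
        } else {
            // Offset is signed and relative to the previous run: sign-extend it.
            let seed: i64 = if (off_bytes[off_size - 1] & 0x80) != 0 { -1 } else { 0 };
            let delta = off_bytes.iter().rev().fold(seed, |acc, &b| (acc << 8) | b as i64);
            let lcn = match prev_lcn.checked_add(delta) {
                Some(v) if v >= 0 => v,
                _ => break, // negative absolute offset: keep partial results
            };
            prev_lcn = lcn;
            runs.push(Run { lcn: Some(lcn as u64), clusters });
        }
        pos = end;
    }
    runs
}
```

Returning a partial `Vec` instead of an error lets the caller still extract the leading part of a file whose run list is damaged further along.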

Scan File Format

Scan results are saved in a custom binary format (.rntfs, version 4) with a 64-byte header containing disk geometry and scan progress, followed by an MFT fragment table and length-prefixed bincode batches of records. This enables:

  • Resumable scans — if interrupted, the scan continues from the last flushed offset
  • Separate phases — scan once, then list/extract as many times as needed
  • Progress tracking — the header stores how far the scan has gotten
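
The length-prefixed batch layout is what makes a truncated file recoverable. The sketch below shows the general idea with a 4-byte little-endian length prefix and opaque payloads. Both choices are assumptions for illustration: the real .rntfs prefix width, header, and bincode payloads are not reproduced here, and `write_batch` and `read_batches` are hypothetical names.

```rust
/// Appends one batch as a 4-byte little-endian length followed by the payload.
pub fn write_batch(out: &mut Vec<u8>, payload: &[u8]) {
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
}

/// Returns every complete batch, stopping quietly at the first truncated one.
pub fn read_batches(mut data: &[u8]) -> Vec<&[u8]> {
    let mut batches = Vec::new();
    while data.len() >= 4 {
        let len = u32::from_le_bytes([data[0], data[1], data[2], data[3]]) as usize;
        let rest = &data[4..];
        if len > rest.len() {
            break; // the last batch was cut off mid-write
        }
        let (payload, tail) = rest.split_at(len);
        batches.push(payload);
        data = tail;
    }
    batches
}
```

A scan interrupted while writing its final batch loses only that batch; everything flushed before it is still readable.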

Performance

  • Scans in 64 MB chunks using parallel processing (rayon) across 1 MB sub-chunks
  • MFT fragment map parsed from $MFT record 0 for accurate logical record number resolution
  • Files sorted by first data run cluster before extraction to minimize disk seeking
  • Progress bars with ETA via indicatif
  • Targets ~500 MB/s sequential read throughput on NVMe
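
The seek-minimizing sort can be pictured with a small sketch. The `PendingFile` type and `order_for_extraction` function are hypothetical, and placing files without any on-disk run (for example resident files) at the end is an assumption of this example, not documented behavior.

```rust
/// A file queued for extraction. `first_lcn` is `None` when the file has
/// no on-disk data run, for example when its data is resident.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingFile {
    pub path: String,
    pub first_lcn: Option<u64>,
}

/// Orders files by the cluster of their first data run so reads move
/// forward across the disk; files without runs sort last.
pub fn order_for_extraction(files: &mut [PendingFile]) {
    files.sort_by_key(|f| f.first_lcn.unwrap_or(u64::MAX));
}
```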

Edge Cases Handled

  • Duplicate MFT record numbers (prefers in-use over deleted, then highest sequence number)
  • Orphaned files with missing parents (placed under __ORPHANED__/)
  • Circular parent references (detected and broken with __CYCLE__/)
  • Resident data (small files stored directly in the MFT record)
  • Sparse data runs (zero-filled regions)
  • Extension records for large/fragmented files spanning multiple MFT entries
  • Compressed and encrypted file flags (detected and flagged, not decoded)
  • Filename conflict resolution (appends _1, _2, etc.)
  • Filename sanitization for the output filesystem
  • Corrupted boot sector (falls back to defaults: 512-byte sectors, 4096-byte clusters)
  • Truncated or corrupted scan files (reads as many valid batches as possible)
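
Three of these rules (duplicate resolution, orphans, and cycles) can be sketched in a few lines. The types and function names below are hypothetical and simplified: `Candidate` carries only the fields the rule needs, and `entries` maps a record number to its parent record number and file name. Record 5 is the NTFS root directory.

```rust
use std::collections::{HashMap, HashSet};

/// NTFS root directory record number.
const ROOT: u64 = 5;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub record_number: u64,
    pub in_use: bool,
    pub sequence: u16,
}

/// Keeps one candidate per record number: in-use beats deleted,
/// then the highest sequence number wins.
pub fn dedupe(candidates: Vec<Candidate>) -> HashMap<u64, Candidate> {
    let mut best: HashMap<u64, Candidate> = HashMap::new();
    for c in candidates {
        let keep = match best.get(&c.record_number) {
            Some(cur) => (c.in_use, c.sequence) > (cur.in_use, cur.sequence),
            None => true,
        };
        if keep {
            best.insert(c.record_number, c);
        }
    }
    best
}

/// Builds a relative path by following parent references up to the root.
/// A missing parent yields an __ORPHANED__/ prefix, a loop a __CYCLE__/ prefix.
pub fn build_path(record: u64, entries: &HashMap<u64, (u64, String)>) -> String {
    let mut parts: Vec<&str> = Vec::new();
    let mut seen = HashSet::new();
    let mut cur = record;
    let prefix = loop {
        if cur == ROOT {
            break None;
        }
        if !seen.insert(cur) {
            break Some("__CYCLE__"); // already visited: parent chain loops
        }
        match entries.get(&cur) {
            Some((parent, name)) => {
                parts.push(name.as_str());
                cur = *parent;
            }
            None => break Some("__ORPHANED__"), // parent record was never found
        }
    };
    if let Some(p) = prefix {
        parts.push(p);
    }
    parts.reverse();
    parts.join("/")
}
```

Comparing `(in_use, sequence)` tuples works because `false < true` in Rust, so an in-use record always outranks a deleted one regardless of sequence number.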

Limitations

  • Does not decompress NTFS-compressed files (LZNT1)
  • Does not decrypt EFS-encrypted files
  • Does not parse $INDEX_ROOT / $INDEX_ALLOCATION B-trees (rebuilds directories from parent references instead)
  • No $LogFile journal replay or USN journal analysis

Dependencies

Crate                          Purpose
memmap2                        Memory-mapped I/O
rayon                          Parallel scanning
clap                           CLI argument parsing
indicatif                      Progress bars
serde + bincode + serde_json   Serialization (scan cache + tree export)
glob                           File pattern matching
anyhow                         Error handling
log + env_logger               Logging

Notably absent: ntfs crate, byteorder (uses stdlib from_le_bytes), nom/binrw (manual parsing for better error recovery control).

Testing

# Unit tests (parser, data runs, size parsing, filename sanitization)
cargo test

# Integration test image (requires root for mount)
sudo ./tests/create_test_image.sh /tmp/ntfs_test.img
sudo cargo run --release -- scan /tmp/ntfs_test.img
cargo run --release -- list /tmp/ntfs_test.img
sudo cargo run --release -- extract /tmp/ntfs_test.img -o /tmp/ntfs_recovered

License

See repository for license details.