Show HN: Redact Anything: AI-powered PDF redaction that deletes the text

redactanything.com

2 points by hvpandya 13 days ago · 0 comments · 2 min read

I built a tool to redact sensitive information from PDFs: https://redactanything.com

  The problem: Most "redaction" tools just draw black boxes over text. The text is still in the PDF and can be recovered by copy-pasting or running any text extractor. Adobe's actual redaction works, but it's manual and slow.
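To make the overlay problem concrete, here is a toy sketch. It assumes a hand-written, uncompressed content stream with a single literal string; `extractShownText` is a hypothetical helper for illustration, not part of any PDF library, and it ignores escapes, hex strings, and `TJ` arrays.

```typescript
// Toy content stream: a text-showing operator (Tj) followed by a filled
// black rectangle (re + f) painted over the same area of the page.
const overlaid = [
  "BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET",
  "0 0 0 rg 72 695 150 16 re f",
].join("\n");

// Hypothetical helper: collect every literal string shown with Tj.
function extractShownText(stream: string): string[] {
  const shown: string[] = [];
  const pattern = /\(([^)]*)\)\s*Tj/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(stream)) !== null) {
    shown.push(match[1] ?? "");
  }
  return shown;
}

// The rectangle hides the text visually, but the string is still in the stream.
console.log(extractShownText(overlaid));
```

The black box only adds drawing operators after the text. Nothing in the stream that produced the text has changed, which is why an extractor still finds it.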

  What this does:
  - Upload a PDF (stays in your browser, never hits a server)
  - AI detects PII: names, SSNs, emails, phone numbers, addresses, dates, etc.
  - You review what it found and approve/reject
  - Download a PDF where the text is permanently removed from the content stream
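The detect, review, and remove flow above can be sketched roughly as follows. This is not the tool's code: two regular expressions stand in for the NER model (an assumption that covers only US-style SSNs and simple emails), the names `PiiDetection`, `detectPii`, and `removeApproved` are made up for illustration, and the sketch works on plain extracted text rather than on a PDF.

```typescript
// A detected span of sensitive text, as character offsets into the page text.
type PiiDetection = { kind: string; value: string; start: number; end: number };

// Regex stand-ins for the NER model (assumption: SSNs and simple emails only).
const PII_PATTERNS: { kind: string; source: string }[] = [
  { kind: "ssn", source: "\\b\\d{3}-\\d{2}-\\d{4}\\b" },
  { kind: "email", source: "[\\w.+-]+@[\\w-]+(?:\\.[\\w-]+)+" },
];

function detectPii(pageText: string): PiiDetection[] {
  const found: PiiDetection[] = [];
  for (const { kind, source } of PII_PATTERNS) {
    // Fresh regex per call, so no lastIndex state is shared between runs.
    const pattern = new RegExp(source, "g");
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(pageText)) !== null) {
      found.push({ kind, value: m[0], start: m.index, end: m.index + m[0].length });
    }
  }
  return found.sort((a, b) => a.start - b.start);
}

// Remove only the spans the reviewer approved, working right to left so
// that earlier offsets stay valid while the string shrinks.
function removeApproved(pageText: string, approved: PiiDetection[]): string {
  let out = pageText;
  for (const d of approved.slice().sort((a, b) => b.start - a.start)) {
    out = out.slice(0, d.start) + out.slice(d.end);
  }
  return out;
}

const sample = "Contact jane@example.com, SSN 123-45-6789.";
const detections = detectPii(sample);
// The reviewer rejects the email detection and approves only the SSN.
const approved = detections.filter((d) => d.kind === "ssn");
console.log(removeApproved(sample, approved)); // "Contact jane@example.com, SSN ."
```

Keeping detection and removal as separate steps is what makes the review step possible: nothing is deleted until a detection has been approved.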

  Technical details:
  - Frontend: React + PDF.js for rendering + pdf-lib for manipulation
  - NER model: Hugging Face Transformers (Xenova/bert-base-NER) running server-side
  - OCR for scanned docs
  - The actual redaction removes text operators from the PDF content stream, not just overlays
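As a rough illustration of what removing text operators means, here is a heavily simplified sketch. It assumes an uncompressed content stream that uses only `(literal) Tj` with no escapes or nested parentheses; real PDFs also use `TJ` arrays, hex strings, font-specific encodings, and compressed streams, none of which are handled here. `removeTextOperators` is a hypothetical name and does not reflect how the tool or pdf-lib does it.

```typescript
// Hypothetical helper: delete every "(literal) Tj" operator whose string
// operand contains one of the approved target strings.
function removeTextOperators(stream: string, targets: string[]): string {
  const showText = /\(([^)]*)\)\s*Tj/g;
  return stream.replace(showText, (whole: string, shown: string): string => {
    const hit = targets.some((t) => shown.indexOf(t) !== -1);
    return hit ? "" : whole; // drop the operand and operator together
  });
}

const page = [
  "BT /F1 12 Tf 72 720 Td (Patient intake form) Tj ET",
  "BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET",
].join("\n");

const cleaned = removeTextOperators(page, ["123-45-6789"]);
console.log(cleaned);
```

This sketch drops the whole string, so the "SSN:" label on the same operator goes too. After the operator is gone, a black rectangle can still be drawn for visual effect, but there is no longer any text underneath it to recover.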

  I built this because I needed to redact medical records and legal docs. Enterprise tools cost thousands, Adobe is $20/mo and manual, free tools are sketchy. Settled on $2.99/doc as a middle ground.

  Limitations I'm aware of:
  - AI detection isn't perfect (that's why there's a review step)
  - Doesn't handle all PDF edge cases (encrypted, malformed, etc.)
  - Names in non-Western formats need work

  Would appreciate feedback on the detection accuracy and any edge cases you find.
