OpenAI shipped privacy-filter, a 1.5B PII tagger you can run locally

Earlier this month OpenAI published a model called privacy-filter on Hugging Face under an Apache 2.0 license. It is not a chatbot. It is not an image generator. It is a 1.5B-parameter token classifier that reads through text and tags eight categories of personal information: names, emails, phone numbers, physical addresses, account numbers, dates, URLs, and what the model card calls "secrets" (API keys, access tokens, passwords, and similar credentials).

That is a narrow job. It is also exactly the job you want a small, local model to do before you paste anything sensitive into a cloud LLM.

What is actually new here

Local PII detection is not new. spaCy ships with a named-entity recognizer. Microsoft Presidio has been around for years. Regex catches emails and SSNs in a few lines of Python. So the question is fair: why does another tagger matter?
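For reference, the regex baseline really is a few lines. The sketch below is illustrative only: the two patterns are deliberately simple assumptions, not production-grade validators, and the `find_pii` helper is a hypothetical name.

```python
import re

# Deliberately simple patterns; real email and SSN formats have more edge cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def find_pii(text):
    """Return (label, start, end, matched_text) tuples sorted by position."""
    hits = []
    for label, pattern in (("EMAIL", EMAIL_RE), ("SSN", SSN_RE)):
        for m in pattern.finditer(text):
            hits.append((label, m.start(), m.end(), m.group()))
    return sorted(hits, key=lambda h: h[1])

sample = "Contact jane.doe@example.com, SSN 123-45-6789."
for hit in find_pii(sample):
    print(hit)
```

This finds well-formed strings and nothing else, which is the gap a learned tagger is meant to close.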

A few reasons stand out.

It is genuinely open. Apache 2.0 weights, full model card, runs locally. No API key, no rate limits, no telemetry back to OpenAI. You can fine-tune it, redistribute it, or ship it inside a commercial app without negotiating a license.

It is small enough to run on a laptop. Quantized to 4 or 6 bits, it runs in a browser via WebGPU, on a Mac via Apple’s MLX framework, and on commodity x86 boxes via ONNX Runtime. No GPU cluster required. RedactDesk loads the MLX build at app launch and uses it for every redaction pass on macOS 14 and later.

It understands context, not just patterns. Regex finds a number that looks like a phone number. A classifier of this size learns that "account ending in 4421" in a bank email is a partial account number, while the same string in a recipe is not. That difference is the whole reason we stopped shipping pure regex redactors.
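A small sketch of why patterns alone fall short: the rule below (a hypothetical pattern, written only for this illustration) flags the same four digits in a bank email and in a recipe, because it has no way to see the surrounding meaning.

```python
import re

# Hypothetical pattern-only rule: flag "ending in" followed by four digits.
PARTIAL_ACCOUNT_RE = re.compile(r"\bending in (\d{4})\b")

bank_email = "Your payment from the account ending in 4421 has posted."
recipe = "Use the flour from the batch ending in 4421 on the label."

# The regex cannot tell the two contexts apart; it reports a hit in both.
for text in (bank_email, recipe):
    m = PARTIAL_ACCOUNT_RE.search(text)
    print(bool(m), m.group(1) if m else None)
```

A context-aware classifier is in a position to treat the first as a partial account number and leave the second alone.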

The intended use case is preprocessing. The model card’s own examples show it being used to sanitize text before sending it to a cloud AI service, including OpenAI’s own. That last part is the one I keep coming back to.
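As a rough sketch of that preprocessing step, the snippet below swaps tagged spans for placeholders before a prompt leaves the machine. It assumes the tagger's output has already been converted to `(start, end, label)` character offsets. That span format, the label names, and the `span_of` and `redact` helpers are assumptions for illustration, not the model's actual output schema.

```python
def span_of(text, needle, label):
    """Locate `needle` in `text` and return a (start, end, label) span."""
    start = text.index(needle)
    return (start, start + len(needle), label)

def redact(text, spans):
    """Replace each (start, end, label) span with a [LABEL] placeholder.

    Spans are assumed to be non-overlapping character offsets into `text`.
    """
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        out.append(text[cursor:start])  # keep the text before the span
        out.append(f"[{label}]")        # substitute the placeholder
        cursor = end
    out.append(text[cursor:])           # keep the tail after the last span
    return "".join(out)

prompt = "Summarize the email from Jane Doe (jane@example.com)."
# Stand-in for tagger output; a real run would take these from the model.
spans = [
    span_of(prompt, "Jane Doe", "NAME"),
    span_of(prompt, "jane@example.com", "EMAIL"),
]
print(redact(prompt, spans))
```

Placeholders rather than deletions keep the sentence structure intact, so the cloud model can still follow who did what.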

OpenAI shipped a tool for not sending data to OpenAI

The first practical use of privacy-filter is helping people pre-redact prompts before pasting them into ChatGPT, Claude, or Gemini. OpenAI knows this. It is in the model card.

Whether that is a clever positioning move, a regulatory hedge against EU AI Act enforcement, or a real alignment with user interest is up for debate. The artifact itself is not. The weights are public, the license is permissive, and the model works. We get to use it without having to guess at the motive.

The eight categories, and what they cover

Here is what the tagger looks for, in plain terms. RedactDesk surfaces the same eight categories in the macOS UI so you can review every span before it is burned into the PDF.

  • Names - given names, surnames, and full names of people. Not company names.
  • Emails - any email address, including ones split by line breaks or with display names attached.
  • Phone numbers - US, international, and uncommon formats. Catches numbers written as words too.
  • Addresses - street, city, postal code, country. Often the noisiest category in real documents.
  • Account numbers - bank accounts, credit cards, IBAN, policy numbers, member IDs.
  • Dates - dates of birth and other dates that can identify someone in combination with other fields.
  • URLs - personal URLs, profile links, file-share links that act as bearer tokens.
  • Secrets - API keys, OAuth tokens, passwords, AWS access keys, anything that looks like a credential.
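If you are wiring these categories into your own pipeline, one way to represent them is an enum plus a simple policy. The label strings and the `ALWAYS_REDACT` policy below are hypothetical. They are not the model's actual tag names and not how RedactDesk behaves.

```python
from enum import Enum

class Category(Enum):
    # Label strings are illustrative, not the model's actual tag names.
    NAME = "name"
    EMAIL = "email"
    PHONE = "phone"
    ADDRESS = "address"
    ACCOUNT_NUMBER = "account_number"
    DATE = "date"
    URL = "url"
    SECRET = "secret"

# Hypothetical policy: redact these without asking, send the rest to review.
ALWAYS_REDACT = {Category.SECRET, Category.ACCOUNT_NUMBER, Category.EMAIL}

def needs_review(category):
    """Return True if a span of this category should go to a human first."""
    return category not in ALWAYS_REDACT

for category in Category:
    print(category.value, "review" if needs_review(category) else "auto-redact")
```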

Who this changes the math for

Before this release, "just run a local PII model" was a decent answer if you had a data engineer on staff. It was not a decent answer if you were a lawyer trying to summarize a deposition, a therapist drafting a treatment letter, or a journalist working a source document. Those workflows defaulted to either pasting raw content into a cloud chatbot and hoping, or not using AI at all.

Three groups get a real new option:

  • Lawyers, therapists, doctors, and journalists who handle privileged documents and currently cannot use cloud AI for that work.
  • Indie developers building browser extensions, IDE plugins, and document tools who can now ship a pre-redaction step without per-token API costs.
  • Engineers at companies with strict data egress rules, who can run the classifier inside the VPC and never let raw text leave.

The shared thread is trust. The "your prompts may be used as training data" concern now has a clean technical mitigation that does not require trusting any single vendor’s privacy policy. The fix runs on your machine.

What it does not do

It is not magic, and it is not a replacement for review. Some honest limitations from a few weeks of using it inside RedactDesk:

  • Context-dependent PII still slips through. A first name on its own with no surrounding context can be missed. A nickname inside a quoted message is hit-or-miss.
  • False positives on ambiguous words. Words like "June" or "May" sometimes get tagged as dates even when they are names or month references that do not identify anyone.
  • It is not adversarially robust. Someone who wants to trick the tagger can, for example by spelling an email as "jane (at) example dot com". This is a preprocessing tool, not a security control against motivated insiders.
  • Tables and forms are harder than prose. Spans that rely on column headers for meaning lose context once flattened to plain text. RedactDesk works around this by giving the model the PDF’s structural layout, not just a text dump.

The right way to use it is as a first pass that catches the obvious stuff so a human can spend their attention on the rest. That is how RedactDesk uses it: the model proposes spans, you confirm or reject in the review pane, and only then does the redaction get burned into the PDF.

How RedactDesk uses the model

RedactDesk is a free, open-source Mac app the Elephas team built around exactly this model. The flow is short:

  1. You drop a PDF onto the app.
  2. The MLX build of privacy-filter runs locally and tags every span across the eight categories.
  3. You review the proposed redactions in a side pane and toggle anything you want to keep or remove.
  4. The app burns the redactions into a new PDF. The original is untouched and nothing leaves the machine.
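The propose, review, burn loop above can be sketched on plain text. This is not RedactDesk's code: the `Proposal` type and the `apply_redactions` helper are hypothetical, and it works on a string rather than a PDF.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    start: int
    end: int
    category: str
    accepted: bool = True  # the reviewer can toggle this off

def propose(text, needle, category):
    """Stand-in for the tagger: build a proposal for a known substring."""
    start = text.index(needle)
    return Proposal(start, start + len(needle), category)

def apply_redactions(text, proposals):
    """Return a new string with accepted spans replaced by block characters."""
    chars = list(text)
    for p in proposals:
        if p.accepted:
            # Same-length replacement keeps every other offset valid.
            chars[p.start:p.end] = "█" * (p.end - p.start)
    return "".join(chars)

original = "Call Jane at 555-0100 in June."
proposals = [
    propose(original, "Jane", "name"),
    propose(original, "555-0100", "phone"),
    propose(original, "June", "date"),
]
proposals[2].accepted = False  # reviewer decides the month identifies no one

redacted = apply_redactions(original, proposals)
print(redacted)
```

The original string is never mutated, which mirrors the "original is untouched" guarantee in step 4.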

Then you paste the redacted PDF into ChatGPT, Claude, or Gemini with the parts that matter still readable and the parts that identify anyone replaced with solid black bars.

Try it

If you want the model itself, the weights and model card are on Hugging Face. If you want a Mac app that wraps it in a redaction workflow with review and export, download RedactDesk. Source is on GitHub under MIT.

For a free baseline that runs on your own hardware and ships under a license that does not get in the way, this is a serious release.