Filtering spam with GPT4o-mini for $0.00008 per email
It doesn’t make sense to pay for spam filtering anymore

I self-host my mail, but I get flooded with spam. I run 4 mail exchangers, all with Postfix + RSpamD. Here’s a look at recently blocked junk on one of my inbound relays:

These were blocked by a mix of:
- old-school blacklists
- looking for currency in the subject line
- fuzzy matching against online databases
- bad reverse DNS
- ASN-level blocking to deny spammy hosts like ColoCrossing from talking to my infra
All old school stuff that usually works. But a lot was still getting through:

These were usually from trash gTLDs like .digital and .today with good DKIM and DMARC, but I don’t want to block all gTLDs.
RSpamD + GPT
RSpamD 3.9 comes with a GPT module. I created an API key on OpenAI and enabled the module on all four of my MX servers. I did this by adding the below to /etc/rspamd/local.d/gpt.conf:
YAML
# Managed by Ansible role `rspamd`
enabled = true;
type = "openai";
api_key = "xxx";
model = "gpt-4o-mini";
max_tokens = 500;
temperature = 0.5;
timeout = 10s;
autolearn = true;
top_p = 0.8;
url = "https://api.openai.com/v1/chat/completions";I now have RSpamD assigning a weight from -2 to +5 from a call to GPT4o-mini with the message:

My token usage per email is incredibly tiny (about 500 tokens per email):

Let’s say tokens are 97% input vs 3% output, leading to an average price of $0.1635/1M tokens on GPT4o-mini given current OpenAI pricing.
Those 1M tokens buys me about 2,000 email scans (500 tokens per message), or $0.00008175 per email.
For a large firm handling 5k emails per day, using RSpamD + GPT will cost you $149/year, albeit there’s no nice GUI for IT to release quarantined emails from. But it does highlight how expensive the SaaS space is for anything “AI security” related, when it’s functionally increidbly simple to do behind the scenes.
Perhaps Mimecast and others could elaborate on their “AI”, because if you are solely trying to combat spam and phishing, I think RSpamD + OpenAI calls is a viable solution.

“Contact sales”? Per-user pricing? No thanks.