Building an idempotent email API with River unique jobs
riverqueue.com
Idempotency is so underrated; all APIs should support it, in my opinion.
If you only need this for transactional email, you could use MailPace, we support idempotency out of the box: https://mailpace.com/features/idempotent-email-api
Email providers don’t usually provide idempotent API endpoints because they have no way to enforce that the email will be sent exactly once downstream.
The best you can do with email is at least once (and you can’t even really do that because of spam filtering), so you should just build that assumption into your app. If 1 out of 100,000 emails gets sent twice, no one is going to care (as long as you include information the user can use to dedupe manually).
Maybe adding an idempotent send endpoint reduces that from 1/100,000 to 1/150,000 or whatever, but is that worth the extra complexity? Is dropping your duplicate email rate from some very small number to some smaller number going to make enough money to pay for the extra dev time and the extra lifetime maintenance burden? Almost certainly not.
You could make that argument for lots of services that have external side effects, but that’s about what happens after the service has been asked to do a thing (to send an email in this case).
However, the fact that an action may be duplicated after the provider has been asked to do a thing does not eliminate the value of the provider being able to deduplicate the incoming request and avoid multiple identical tasks on their end. Without API level idempotency, a single email on the client’s end could turn into many redundant emails at the service provider’s side, each of which could then be subject to those same subsequent duplications at the SMTP layer. And even then, providers can use the Message-Id header to provide idempotency in delivery, as many do.
This is an unavoidable consequence of distributed systems where the client may not know if the server ever received or processed the request, and it may also occur due to client-side bugs or retries within their own software.
In other words, API level idempotency can help eliminate all duplication prior to the API; depending on the service, the provider may also be able to eliminate duplication afterward as well. So it’s strictly better than not having it, really not that difficult to implement, and makes it easier for integrators to build a robust integration with you.
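To make "deduplicate that incoming request" concrete, here is a minimal sketch of what the provider side could look like. Everything in it is an assumption for illustration: `IdempotentMailer`, `SendResult`, and the in-memory dict are hypothetical stand-ins, not any real provider's API, and a real service would keep the keys in durable storage with an expiry.

```python
import threading
from dataclasses import dataclass


@dataclass
class SendResult:
    message_id: str
    duplicate: bool  # True when the request was recognised as a retry


class IdempotentMailer:
    """Hypothetical in-memory stand-in for a provider's send endpoint."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = {}   # idempotency key -> message id of the first request
        self.outbox = []  # emails actually queued for delivery

    def send(self, idempotency_key, to, subject, body):
        # Check and record the key under one lock so two concurrent retries
        # cannot both slip past the check.
        with self._lock:
            if idempotency_key in self._seen:
                # A retry: return the original result, queue nothing new.
                return SendResult(self._seen[idempotency_key], duplicate=True)
            message_id = f"<{len(self.outbox) + 1}@example.invalid>"
            self.outbox.append((message_id, to, subject, body))
            self._seen[idempotency_key] = message_id
            return SendResult(message_id, duplicate=False)
```

A client that timed out and retried with the same key, say `"order-1001-receipt"`, would get the original message id back and only one email would be queued. This only covers duplication before the API; whatever happens downstream at the SMTP layer is a separate problem.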
> makes it easier for integrators to build a robust integration with you
No, don't say 'easier'. It makes it possible to build a robust integration. We need to stop with this notion that omitting idempotency from an API just makes things "more difficult" to develop. Without idempotency, you guarantee that the resulting system is "difficult" to use and full of nasty issues that are waiting for the right conditions to collapse the entire house of cards you've built.
So many SaaS providers have never even heard of idempotency, let alone design it into their APIs. Many people believe you can just sprinkle it on as a library without having to think about it.
All APIs with multiple distributed servers must support idempotency. Refuse to do business with any organisation that does not design this into its APIs!
Hah, I agree, "easier" is too soft :)
> Without API level idempotency, a single email on the client’s end could turn into many redundant emails at the service provider’s side, each of which could then be subject to those same subsequent duplications at the SMTP layer.
Ok so now there’s a 1/100 million chance that the client gets 3 duplicate emails.
I’m not arguing that idempotency is never important. The most popular blog post I’ve ever written is about the Two Generals' Problem and how idempotency can help.
I’m arguing in this specific instance it doesn’t matter.
As far as I was aware, duplicate message-id headers aren’t deduped by every client, but if they are being used for deduplication, just expose that in your API and let the caller set it.
> as long as you include information the user can use to dedupe manually
There is an existing standard for this in the form of the message-id header.
I'm confused by this discussion as this back/forth makes it sound like this is some kind of little-known e-mail feature, but every single e-mail must have this unique-id, so how is this problem even coming up in the first place?
The last time I looked into this, which was admittedly a long time ago, message-id wasn’t used by all clients for deduping and you had to assume that it wouldn’t be, so you had to include manual deduping information (order number, etc.).
If you don't generate the message ID far enough upstream, the "same" email will be sent with two different unique IDs.
AFAIK, SMTP requires the message-id header, so this "upstream" of which you speak must be outside of the scope of email infrastructure. In this case, looking at this article, the API for sending an email using broken-apart components (a body, recipients, and a subject) is missing what should be a required parameter for at least a fragment of the message-id, if not the entire thing. Like, this code must be generating a new/random message-id internally for the messages it constructs for them to even be sendable... that, or the API it is itself using to send mail is itself broken by not exposing this detail.
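For what it's worth, generating the message-id "far enough upstream" could be sketched like this, using only the Python standard library. The helper names `message_id_for` and `build_email`, the `example.invalid` domain, and the idea of hashing a caller-supplied key are all assumptions for illustration; nothing here is taken from the article's code or from any provider's API.

```python
import hashlib
from email.message import EmailMessage


def message_id_for(idempotency_key: str, domain: str = "example.invalid") -> str:
    """Derive a stable Message-ID from a caller-supplied key (hypothetical scheme)."""
    # Same key in, same id out, so a retried send carries the same Message-ID.
    digest = hashlib.sha256(idempotency_key.encode("utf-8")).hexdigest()[:32]
    return f"<{digest}@{domain}>"


def build_email(idempotency_key, sender, to, subject, body):
    """Assemble a message whose Message-ID is fixed by the idempotency key."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject
    msg["Message-ID"] = message_id_for(idempotency_key)
    msg.set_content(body)
    return msg
```

Building the "same" email twice with the key `"order-1001-receipt"` yields two messages with an identical Message-ID header, which is what any downstream deduplication would need. Whether a given mail client or provider actually dedupes on that header is a separate question, as noted upthread.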