Why you shouldn’t use Redis as a rate limiter: Part 1 of 2


Andrew Miller

(This is the first of a two part series on rate limiting with Redis. This part will look at possible implementations, and the second part will look at performance)

My colleague Eric has informed me that many companies are now using Redis to implement rate limiting, and has witnessed serious businesses doing this first-hand. “Redis?”, I thought. “Isn’t that that thing for caching your slow HTTP page generations?”. I was curious as to how or why people implement this on an in-memory database, so I investigated the issue myself. I spent two days learning Redis, then contemplated how one could use its primitives to implement a rate limiter in an effective way.

I only set out to show the slowness of a performance-critical algorithm built on a database, but after a while I realized that none of the Redis solutions I came up with actually looked very solid, and I grew frustrated when the solutions I found on Google didn’t either. The code I’ll show is Rust, but you can skip it: after each Rust snippet I provide a listing of the equivalent plain commands you can run in redis-cli.

I first came up with this code:

Attempt 1: Fixed Window

fn up_rate_idiom(redis_con: &mut redis::Connection) -> RedisResult<u32> {
    let opts = redis::SetOptions::default()
        .conditional_set(redis::ExistenceCheck::NX)
        .with_expiration(redis::SetExpiry::EX(1));
    let () = redis_con.set_options("rate", 0u32, opts)?;
    redis_con.incr("rate", 1u32)
}

This is equivalent to the Redis commands:

SET rate 0 NX EX 1
INCR rate

Here’s what the code does:

  1. Atomically create a key called “rate” with the value 0, and expiry in one second from now. If the “rate” key already exists at this moment, it is not created or modified.
  2. Increment the value of the “rate” key, and return the new value.

The documentation of INCR, and of every other Redis command used in this article, can be found on redis.io [REDISIO-INCR].

Your clients (such as HTTP front end servers) would invoke these Redis commands each time a subject who should be rate limited makes a request (you can have keys with different names for different subjects, e.g. “rate:user:john”, “rate:user:bob”, “rate:users”, “rate:admins”).

Seems elegant and idiomatic to me. We don’t have to manage time manually, because we can make the value holding the rate expire automatically after a second (maybe that’s why people like Redis for this application). The first downside is that a client could submit all of his requests for a given second at the end of that second, and then submit all of his requests for the next second at the start of that next second, which is not what we want.
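To make that boundary behavior concrete, here is a minimal in-memory sketch of the fixed-window algorithm (plain Rust, not the Redis version above; names and structure are mine):

```rust
// Minimal in-memory fixed-window counter (an illustration of the
// algorithm only, not the Redis implementation): allows `limit`
// requests per `window`-second window; the counter resets at each
// window boundary.
struct FixedWindow {
    limit: u32,
    window: u64,         // window length in seconds
    current_window: u64, // index of the window we are counting in
    count: u32,          // requests seen in the current window
}

impl FixedWindow {
    fn new(limit: u32, window: u64) -> Self {
        FixedWindow { limit, window, current_window: 0, count: 0 }
    }

    // `now_ms` is the request time in milliseconds since some epoch.
    fn allow(&mut self, now_ms: u64) -> bool {
        let w = now_ms / 1000 / self.window;
        if w != self.current_window {
            // A new window has started: the count resets to zero.
            self.current_window = w;
            self.count = 0;
        }
        self.count += 1;
        self.count <= self.limit
    }
}
```

With a limit of 10 per second, ten requests at t = 0.99 s and ten more at t = 1.01 s are all allowed: twenty requests in 20 ms, double the nominal rate.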

The code is completely broken: it issues the two commands one by one to Redis, over the network. By the time the INCR arrives — and the higher the latency the more likely this is — the key could have expired. When the INCR arrives, it will then create the key, with no expiry. And then when step 1 executes again, it will not set the expiry either, because the key already exists. The “rate” value will just increase forever.

This can be fixed with a transaction¹:

  let opts = redis::SetOptions::default()
      .conditional_set(redis::ExistenceCheck::NX)
      .with_expiration(redis::SetExpiry::EX(1));
- let () = redis_con.set_options("rate", 0u32, opts)?;
- redis_con.incr("rate", 1u32)
+ let (k,) = redis::pipe()
+     .atomic()
+     .set_options("rate", 0u32, opts).ignore()
+     .incr("rate", 1u32).query(redis_con)?;
+ Ok(k)
  }

This is equivalent to the Redis commands:

MULTI
SET rate 0 NX EX 1
INCR rate
EXEC

Now the INCR succeeds if and only if the potential expiry setting succeeds.

This is already a second red flag. We needed a transaction to do something that is supposedly trivial for this database product. The library we’re using here also does not support transactions in a cluster (other clients do, e.g. the Python redis library), but that’s beside the point.

The first red flag was that we didn’t even consider implementing the gold standard token bucket algorithm, because that would require two variables. In fact, doing any real algorithm here would just be like taking an algorithm that works perfectly fine and moving its variables into the cloud. I briefly considered token bucket, but did not see any clear way to do this idiomatically. “But Lua”, you say? We’ll get to that.

¹ This actually works because time is “paused” at the start of the transaction. Otherwise the same bug could have struck a second way.

Attempt 2: Fixed Window (time-based key)

The next attempt is another fixed window, but in line with what most people on the web are suggesting. You name your key based on the current time and INCR it, then set it to EXPIRE some safe amount of time after the window ends. You get one new key every window, so a previous window malfunctioning won’t affect the current one.

fn up_rate_timekey(redis_con: &mut redis::Connection) -> RedisResult<u32> {
    use std::time::{SystemTime, UNIX_EPOCH};
    let time = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
    /* Window period of 3 seconds. We multiply by 3 after the divide
       so the key name is still a valid UNIX timestamp, for easier analysis. */
    let buf = format!("rate_{}", time.as_secs() / 3 * 3);
    let rate: u32 = redis_con.incr(&buf, 1)?;
    let _: u32 = redis_con.expire(&buf, 10)?;
    Ok(rate)
}

Equivalent Redis commands:

INCR rate_1754317189
EXPIRE rate_1754317189 10

Then, for the sake of performance, they have only the request that creates the key run the EXPIRE command:

fn up_rate_timekey(redis_con: &mut redis::Connection) -> RedisResult<u32> {
    use std::time::{SystemTime, UNIX_EPOCH};
    let time = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
    let buf = format!("rate_{}", time.as_secs() / 3 * 3);
    let rate: u32 = redis_con.incr(&buf, 1)?;
-   let _: u32 = redis_con.expire(&buf, 10)?;
+   if rate == 1 {
+       let _: u32 = redis_con.expire(&buf, 10)?;
+   }
    Ok(rate)
}

Equivalent Redis commands:

INCR rate_1754317189
(if INCR returned 1, also execute the following)
EXPIRE rate_1754317189 10

This feels really bad because requests made at the same time by clients with differing latencies and clock offsets¹ could land in different windows. I worked out the table of use cases vs hazards, and it seems most use cases are actually fine (for very subtle reasons), but that still took many hours.

If the EXPIRE command fails to reach Redis (the Redis server or client crashes, the network cuts out, etc.), the key will never expire. This won’t happen too often (except when your network is saturated), so it’s not likely to exhaust your memory, but it’s still a big red flag.

For implementations that just use the current minute to name the key (e.g. rate_48), when an EXPIRE fails and the minute cycles back an hour later, you could have a doubled rate for that entire minute. This is where the typical engineer says, “oh! that bug’s because of cosmic rays”. No, it’s because you built your rate limiter on strange database operations, using a recipe from Google.

¹ offsets from a local clock to the time of an NTP server or atomic clock. Even with working NTP, offsets are said to be tens of milliseconds, or even 100 ms in some cases [NTP FAQ, 2012] [NTP, 2022].

“Sliding log window”

Many websites suggested this one: You log the timestamp of every request, and constantly delete the ones older than your chosen period of time (the window). I’m not even going to implement this because it won’t have good performance. Redis has ordered sets which are a useful data structure for this. Items are ordered by score, so you set the score of each item to the current time. You have an operation, ZREMRANGEBYSCORE, which can be used to delete all the items that are older (lower score) than the window.

It can be done without race conditions, at least.

($NOW is the current UNIX time)
($T is the current time minus the window. e.g. $NOW - 1)
($UNIQUE is a unique value)

ZREMRANGEBYSCORE rate 0 $T

(returns the current rate)
ZCARD rate

(if the request was rejected, don't run the next command,
or a malicious subject could use up all your RAM. Of course
now you need more round trips)
ZADD rate $NOW $UNIQUE

$UNIQUE is the identity of the element you’re adding to the set. Some authors just use the UNIX time for it, which is a bug because then you’ll get tons of duplicate entries the moment someone starts hammering your service. All but one of the duplicates will be discarded and the allowed rate will be much higher than the limit.
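One workable scheme for $UNIQUE (a sketch of mine, not taken from any of the cited articles): combine the timestamp with a process-wide counter. In a deployment with multiple front-end servers you would also mix in a host or process id, since the counter alone is only unique within one process.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide counter used to disambiguate requests that arrive
// within the same clock tick.
static COUNTER: AtomicU64 = AtomicU64::new(0);

// Build a sorted-set member that stays unique even when many requests
// share the same timestamp. `now_ns` is the current time in nanoseconds.
fn unique_member(now_ns: u128) -> String {
    let c = COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("{}-{}", now_ns, c)
}
```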

Like with fixed windows, you still can’t enforce a true rate with the “sliding log window”.
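The same bookkeeping those Redis commands perform can be sketched in plain Rust with a VecDeque of timestamps (an in-memory illustration of the algorithm, not the Redis version), which also makes the O(N)-memory-per-subject cost visible: one entry per allowed request in the window.

```rust
use std::collections::VecDeque;

// In-memory sliding log: keeps one timestamp per allowed request.
struct SlidingLog {
    limit: usize,
    window_ms: u64,
    log: VecDeque<u64>, // timestamps of allowed requests, oldest first
}

impl SlidingLog {
    fn new(limit: usize, window_ms: u64) -> Self {
        SlidingLog { limit, window_ms, log: VecDeque::new() }
    }

    fn allow(&mut self, now_ms: u64) -> bool {
        // ZREMRANGEBYSCORE equivalent: drop entries older than the window.
        while let Some(&t) = self.log.front() {
            if now_ms - t >= self.window_ms {
                self.log.pop_front();
            } else {
                break;
            }
        }
        // ZCARD equivalent: count what is left.
        if self.log.len() >= self.limit {
            return false; // rejected requests are NOT logged, per the note above
        }
        // ZADD equivalent: record this allowed request.
        self.log.push_back(now_ms);
        true
    }
}
```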

The problem with windows

Imagine the simplest website — a forum — that takes 200 milliseconds to generate a page. For simplicity let’s say the server has no multithreading and runs in just one process. The popular pages are cached, so they don’t have to be generated, but users will often browse deep into the forum and open lots of uncached posts; for instance, posts made 5 years ago about whatever topic they’re researching. These users doing research are likely to press Control + click on 10 or more posts to open them in new tabs in a short period of time, and then look at them one by one after. So limiting users to 60 requests / minute (or one request every second) sounds reasonable.

With a window (fixed or sliding), a user could request 60 pages in one short burst. Perhaps they’re using userscripts or restoring their browser session. This would tie up the server for 60 × 200 milliseconds, which is 12 seconds. No other user would be able to view uncached pages for the next 12 seconds.

Basically, a token bucket would enforce the actual rate: users would only be able to make one request every second. But it also allows a configurable amount of burst. The token bucket has a parameter b, which if set to 5 would let a user make 5 requests in one short burst every once in a while, allowing the desirable use cases above (opening a bunch of links in new tabs) while blocking the “sorry, we can’t handle that” one (fetching tons of old posts in a script).
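For comparison, here is what the token bucket itself looks like when its two variables live in ordinary memory (a sketch of mine, with refill rate r and burst b; this is the algorithm that the Lua-script approach discussed later would have to reproduce inside Redis):

```rust
// Plain in-memory token bucket sketch: `r` tokens are added per
// second, up to a burst capacity of `b`. Each allowed request spends
// one token.
struct TokenBucket {
    r: f64,      // tokens added per second
    b: f64,      // maximum tokens (burst capacity)
    tokens: f64, // current token count
    last_ms: u64, // time of the last update, in milliseconds
}

impl TokenBucket {
    fn new(r: f64, b: f64) -> Self {
        TokenBucket { r, b, tokens: b, last_ms: 0 }
    }

    fn allow(&mut self, now_ms: u64) -> bool {
        // Refill based on elapsed time, capped at the burst size.
        let elapsed = (now_ms - self.last_ms) as f64 / 1000.0;
        self.tokens = (self.tokens + elapsed * self.r).min(self.b);
        self.last_ms = now_ms;
        // Spend one token if one is available.
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Note the shape of the update: `tokens` must be read and refilled based on `last_ms` before deciding whether to spend a token. It is exactly this read-then-update over two variables that the simple Redis primitives can’t express atomically.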

That’s not a thorough analysis, but I think it’s safe to say that even after paying the O(N) memory overhead of the “sliding log window”, you’re still not getting any advantage over a token bucket; you just chose it because it’s what can be conveniently implemented on top of Redis.

There are also potential incentives for users to spend their entire remaining allowance at once per window, if we consider broader use cases like API rate limiting. And if all your subjects’ windows start and end at the same time, you could see requests herding in at the start and end of each window.

Token Bucket and Lua scripts

It seems there’s no way to do a token bucket with the main Redis primitives such as SET, EXPIRE, and INCR. It requires two variables, and the client has to read one before choosing how to update the other, which would mean pausing the whole database while a client carries this out. So what people do is execute code inside Redis, using modules or scripts. Using a module here is dubious: you’re now just loading a C program (shared library) into Redis; why not just load it into an actual computer? So let’s look at “scripts” instead. Scripts are pieces of Lua code executed inside Redis.

The database is paused while a script runs. Scripts call Redis commands by writing redis.call("the command", ...). They always have an ARGV that contains parameters passed to the script invocation command (EVAL or EVALSHA), and KEYS that contains the list of keys it should use. Here’s an example of what scripts look like:

local x = tonumber(redis.call("GET", KEYS[1]))
redis.call("SET", KEYS[1], x * 100)
local y = redis.call("INCR", KEYS[1])
redis.call("SET", KEYS[2], ARGV[1])
redis.call("EXPIRE", KEYS[2], tonumber(ARGV[2]))

Now it’s probably time to ask ourselves why we are here. We wanted a rate limiter and now we’re learning a niche programming language just to execute some code in a database that has nothing relevant to our task but a roundabout way of storing an int into RAM.

This concludes my quest to discover how or why rate limiters are implemented in Redis. As you can see, every easy way to implement a rate limiter on top of Redis is shaky at best. The only acceptable solution is to write the whole thing in a Lua script, in a way that defeats the purpose of Redis. Perhaps there are pragmatic reasons, such as convenience if you already have Redis set up, or a cloud platform strongly geared toward doing things this way, but that’s it. Stay tuned for PART TWO, where I’ll demonstrate that having Redis involved in a rate limiter at all is costing you 10x in performance.

Appendix 1: Every solution on Google is bad

To demonstrate how absolutely frustrating it was to even find consensus on how Redis-based rate limiters are implemented, I made¹ a table of the top Google results for “redis rate limiter”, listing the types of their rate limiters and their problems. Many have major defects, and I list them in Appendix 2.

After going through all 25 of these, I realized that most or all of them fail to support Redis clusters. Supporting clusters would require the {} hash-tag notation in key names, to ensure that the multiple keys used in a transaction land on the same shard. But we will see in PART TWO that Redis clusters on average hardware can only handle around 100K requests / second / counter anyway.

Terminology used in the tables:

  • Synchronized time windows: the starts and ends of all subjects’ windows line up, as detailed in “The problem with windows”
  • Lost EXPIRE bug: as detailed in “Attempt 2: Fixed Window (time-based key)”


Google results for “redis rate limiter”


Interesting posts I found through other means


Official sources I could find from perusing redis.io

¹ Region set to US, personalization of search results disabled, web results only. I crossed out videos, small personal posts, and anything that looked like an official Redis post. I then made a separate table analyzing the foremost official Redis posts I could find by perusing redis.io.

Appendix 2: Defects in published solutions

Many of the analyzed solutions had major defects.

Defect in [PEAKSCALE, 2023]

ZADD is run even when the rate is exceeded, so clients can consume an unlimited amount of memory and kill Redis.

Defect in [CLASSDOJO, 2015]

A user can use up the entire database’s memory by spamming requests, because an entry is made in the ordered set even when the request is rejected.

Defect in [REDISIO-JAVA-REA-LUA]

Every time a subject makes a request, the key’s expiry is extended, so a subject making steady requests will eventually be falsely blocked. Their code:

-- rateLimiter.lua
local key = KEYS[1]
local requests = tonumber(redis.call('GET', key) or '-1')
local max_requests = tonumber(ARGV[1])
local expiry = tonumber(ARGV[2])

if (requests == -1) or (requests < max_requests) then
    redis.call('INCR', key)
    redis.call('EXPIRE', key, expiry)
    return false
else
    return true
end

Defect in [INFOWORLD, 2017]

A herd of requests will all pass line 4 and all be allowed. Their code:

 1  public boolean arrival(String cell){
 2
 3      // check if the last message exists.
 4      long ttl = jedis.ttl(key);
 5      if(ttl > 0){
 6          return false;
 7      }
 8
 9      // The key lives through the period defined by the interval
10      if(key != null && cell != null){
11          jedis.setex(key, interval, cell);
12          return true;
13      }
14      return false;
15  }

Defect in [REDISIO-RL]

Their code:

1  def request_is_limited(red: Redis, redis_key: str, redis_limit: int, redis_period: timedelta):
2      if red.setnx(redis_key, redis_limit):
3          red.expire(redis_key, int(redis_period.total_seconds()))
4      bucket_val = red.get(redis_key)
5      if bucket_val and int(bucket_val) > 0:
6          red.decrby(redis_key, 1)
7          return False
8      return True

Aside from the common lost EXPIRE bug, it has a race condition: A subject will be blocked forever whenever this sequence of events happens:

  • Line 4: the key is read into bucket_val and is positive
  • Line 5: branch taken
  • The key expires
  • Line 6: the key is decremented, which creates it
  • The setnx on line 2 will never set the key again, and never return True again, and so line 3 will never execute again

Appendix 3: Development Environment

For the code I wrote, I ran it with Redis 8.0.3 [https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/apt/] on Ubuntu 24 on a c5.large AWS instance. I used rustc 1.87.0 and the crate redis 0.32.4.

Errata

“Defect in [LYFT, 2017]”

(Corrected 2025-08-06) I mistakenly thought there was a major defect in this code: that EXPIRE would be rerun every time a request is made, delaying the expiry of the key. It turns out the key is named based on the current time; they are using the “fixed window (time-based key)” method, so I was wrong. The code I looked at:

*pipeline = client.PipeAppend(*pipeline, result, "INCRBY", key, hitsAddend)
*pipeline = client.PipeAppend(*pipeline, nil, "EXPIRE", key, expirationSeconds)

“Defect in [REDISIO-DOTNET-FW]”

(Corrected 2025-08-06) I made the same mistake here as with [LYFT, 2017]. The code I looked at:

local key = KEYS[1]
local max_requests = tonumber(ARGV[1])
local expiry = tonumber(ARGV[2])
local requests = redis.call('INCR', key)
redis.call('EXPIRE', key, expiry)
if requests < max_requests then
    return 0
else
    return 1
end

References

(oldest noticed version used as date)

[NTP FAQ, 2012] https://www.ntp.org/ntpfaq/NTP-s-algo/#5131-how-accurate-will-my-clock-be

[NTP, 2022] https://www.eecis.udel.edu/~mills/exec.html

[NPM-RLR, 2016] https://www.npmjs.com/package/rate-limit-redis

[GO-RR, 2015] https://github.com/go-redis/redis_rate

[UPSTASH, 2022] https://upstash.com/docs/redis/sdks/ratelimit-ts/overview

[INNOQ, 2024] https://www.innoq.com/en/blog/2024/03/distributed-rate-limiting-with-spring-boot-and-redis/

[PY-RRL, 2023] https://pypi.org/project/redis-rate-limiters/

[ASPNETCORE-RRL, 2022] https://github.com/cristipufu/aspnetcore-redis-rate-limiting

[RAMP, 2022] https://builders.ramp.com/post/rate-limiting-with-redis

[LEVELUP, 2020] https://levelup.gitconnected.com/implementing-a-sliding-log-rate-limiter-with-redis-and-golang-79db8a297b9e

[REDHAT, 2024] https://developers.redhat.com/articles/2022/03/29/develop-basic-rate-limiter-quarkus-and-redis

[PEAKSCALE, 2023] https://www.peakscale.com/redis-rate-limiting/

[FIGMA, 2017] https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/

[LYFT, 2017] https://eng.lyft.com/announcing-ratelimit-c2e8f3182555

[STRIPE, 2017] https://stripe.com/blog/rate-limiters

[CLASSDOJO, 2015] https://engineering.classdojo.com/blog/2015/02/06/rolling-rate-limiter/

[REDISIO-INCR] https://redis.io/docs/latest/commands/incr/

[REDISIO-JAVA-FW] https://redis.io/learn/develop/java/spring/rate-limiting/fixed-window

[REDISIO-JAVA-REA] https://redis.io/learn/develop/java/spring/rate-limiting/fixed-window/reactive

[REDISIO-JAVA-REA-LUA] https://redis.io/learn/develop/java/spring/rate-limiting/fixed-window/reactive-lua

[INFOWORLD, 2017] https://www.infoworld.com/article/2257527/how-to-use-redis-for-real-time-metering-applications.html

[REDISIO-RL] https://redis.io/learn/howtos/ratelimiting

[REDISIO-GLOSS] https://redis.io/glossary/rate-limiting/

[REDISIO-DOTNET-SW] https://redis.io/learn/develop/dotnet/aspnetcore/rate-limiting/sliding-window

[REDISIO-DOTNET-FW] https://redis.io/learn/develop/dotnet/aspnetcore/rate-limiting/fixed-window

[REDISIO-DOTNET-MIDL] https://redis.io/learn/develop/dotnet/aspnetcore/rate-limiting/middleware

[REDISIO-GCRA] https://redis.io/blog/redis-cell-rate-limiting-redis-module/