# taskwheel
A high-performance, generic Hierarchical Timing Wheel implementation in Go for efficient timer management at scale.
## Why TaskWheel?
When you need to manage thousands or even millions of timers (timeouts, TTLs, scheduled tasks), Go's standard `time.Timer` is excellent, but it faces challenges at that volume: every timer is a separate heap allocation tracked by the runtime, and the costs add up quickly.
### Example: How a Timing Wheel Efficiently Expires 10 Million TTL-Based Cache Keys
Traditional cache cleanup scans every key (O(n)), which works fine at small scale — but completely breaks down once you hit millions of entries.
Using a Timing Wheel, expiration becomes O(1) per tick — processing only the keys that are actually due right now.
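To see why per-tick work stays O(1), here is a minimal single-level sketch of the idea (types and names are illustrative, not taskwheel's internals): each key lands in the slot where it expires, and each tick drains exactly one slot.

```go
package sketch

import "time"

// wheel is a minimal single-level timing wheel, shown only to
// illustrate the O(1)-per-tick idea.
type wheel struct {
	buckets [][]string    // keys expiring in each slot
	current int           // slot index for the current tick
	tick    time.Duration // duration of one tick
}

// schedule places a key d ahead of now. This sketch assumes d
// fits within one rotation of the wheel.
func (w *wheel) schedule(key string, d time.Duration) {
	ticks := int(d / w.tick)
	slot := (w.current + ticks) % len(w.buckets)
	w.buckets[slot] = append(w.buckets[slot], key)
}

// advance moves one tick forward and returns only the keys that
// are due right now; every other key is never touched.
func (w *wheel) advance() []string {
	w.current = (w.current + 1) % len(w.buckets)
	due := w.buckets[w.current]
	w.buckets[w.current] = nil
	return due
}
```

A hierarchical wheel stacks several of these at coarser resolutions so that long timeouts don't need huge slot counts.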
#### Read Stall Comparison (10 Million Keys)
| Metric | Naive Scan | Timing Wheel |
|---|---|---|
| Avg Read Latency | 4.68 ms | 3.15 µs |
| Max Read Stall | 500 ms | ≈ 2 ms |
At 10 million keys, a naive cleanup can stall reads for half a second, while the Timing Wheel glides through them in microseconds.
Read the full story on Medium:
Killing O(n): How Timing Wheels Expire 10 Million Keys Effortlessly in Golang
## Installation

```bash
go get github.com/ankur-anand/taskwheel
```
## Quick Start
```go
package main

import (
	"fmt"
	"time"

	"github.com/ankur-anand/taskwheel"
)

func main() {
	// Create hierarchical timing wheel
	intervals := []time.Duration{10 * time.Millisecond, 1 * time.Second}
	slots := []int{100, 60}
	wheel := taskwheel.NewHierarchicalTimingWheel[string](intervals, slots)

	// Start the wheel
	stop := wheel.Start(10*time.Millisecond, func(timer *taskwheel.Timer[string]) {
		fmt.Printf("Timer fired: %s\n", timer.Value)
	})
	defer stop()

	// Schedule timers
	wheel.AfterTimeout("task1", "Process payment", 100*time.Millisecond)
	wheel.AfterTimeout("task2", "Send email", 500*time.Millisecond)

	time.Sleep(1 * time.Second)
}
```
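Assuming the usual hierarchical layout where each level spans `interval × slots`, this configuration gives a first wheel covering 10 ms × 100 = 1 s at 10 ms resolution and a second covering 1 s × 60 = 60 s at 1 s resolution, so timeouts up to about a minute land directly in a slot.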
## High-Throughput Usage (10,000+ timers/sec)
For production systems with high timer volumes, use `StartBatch()` with a worker pool:
```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"

	"github.com/ankur-anand/taskwheel"
)

func main() {
	intervals := []time.Duration{10 * time.Millisecond, 100 * time.Millisecond, time.Second}
	slots := []int{10, 100, 60}
	wheel := taskwheel.NewHierarchicalTimingWheel[string](intervals, slots)

	// Create worker pool
	workerPool := make(chan *taskwheel.Timer[string], 1000)
	var wg sync.WaitGroup
	numWorkers := runtime.NumCPU() * 2
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for timer := range workerPool {
				// Process timer
				processTask(timer)
			}
		}()
	}

	// Start with batch callback
	stop := wheel.StartBatch(10*time.Millisecond, func(timers []*taskwheel.Timer[string]) {
		for _, t := range timers {
			workerPool <- t
		}
	})

	// Schedule timers
	for i := 0; i < 10000; i++ {
		wheel.AfterTimeout(
			taskwheel.TimerID(fmt.Sprintf("task-%d", i)),
			fmt.Sprintf("Task %d", i),
			time.Duration(i)*time.Millisecond,
		)
	}

	time.Sleep(15 * time.Second)
	stop()
	close(workerPool)
	wg.Wait()
}

func processTask(timer *taskwheel.Timer[string]) {
	// Business logic here
}
```
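Handing timers off to a buffered channel keeps the batch callback short. Assuming the callback runs on the wheel's tick loop (the typical timing-wheel design), slow work inside it would delay subsequent ticks, so the worker pool absorbs bursts when many timers land on the same tick.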
## Performance Comparison
```
goos: darwin
goarch: arm64
pkg: github.com/ankur-anand/taskwheel
cpu: Apple M2 Pro

BenchmarkNativeTimers/1K_timers_100ms-10                  10    102317362 ns/op     136000 B/op      2000 allocs/op
BenchmarkNativeTimers/10K_timers_100ms-10                 10    113182400 ns/op    1360000 B/op     20000 allocs/op
BenchmarkNativeTimers/100K_timers_1s-10                    1   1093996708 ns/op   13600000 B/op    200000 allocs/op
BenchmarkTimingWheelAfterTimeout/1K_timers_100ms-10       10    110006242 ns/op     214488 B/op      1056 allocs/op
BenchmarkTimingWheelAfterTimeout/10K_timers_100ms-10      10    110001250 ns/op    1973784 B/op     10181 allocs/op
BenchmarkTimingWheelAfterTimeout/100K_timers_100ms-10      9    121230764 ns/op   18755629 B/op    101093 allocs/op
BenchmarkMemoryComparison/Native_10K_timers-10            10    111245146 ns/op    1336540 B/op     20001 allocs/op
BenchmarkMemoryComparison/TimingWheel_10K_timers-10       10    109956992 ns/op    1973784 B/op     10181 allocs/op
BenchmarkTimingWheel_Memory/Timers_100000-10              67     19544636 ns/op    9665561 B/op    600001 allocs/op
BenchmarkTimingWheel_Memory/Timers_1000000-10              6    199790202 ns/op   96065897 B/op   6000003 allocs/op
BenchmarkTimingWheel_Memory/Timers_10000000-10             1   1807187459 ns/op  960067416 B/op  60000009 allocs/op
BenchmarkNativeTimer_Memory/Timers_100000-10             140     12572099 ns/op   15329470 B/op    100001 allocs/op
BenchmarkNativeTimer_Memory/Timers_1000000-10             13    145825920 ns/op  151103566 B/op   1000001 allocs/op
BenchmarkNativeTimer_Memory/Timers_10000000-10             1   1309542667 ns/op 1705383936 B/op  10000002 allocs/op
```
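These numbers come from Go's standard benchmark output; to reproduce them with the usual tooling:

```bash
go test -bench . -benchmem ./...
```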
### Summary
| Workload | Metric | NativeTimers | TimingWheel | Difference |
|---|---|---|---|---|
| 1K timers (100ms) | Time/op | 102 ms | 110 ms | +8% slower |
| | Mem/op | 136 KB | 214 KB | +58% more |
| | Allocs/op | 2.0 K | 1.1 K | -47% fewer |
| 10K timers (100ms) | Time/op | 113 ms | 110 ms | -3% faster |
| | Mem/op | 1.36 MB | 1.97 MB | +45% more |
| | Allocs/op | 20.0 K | 10.2 K | -49% fewer |
| 100K timers (1s) | Time/op | 1.09 s | 0.12 s | -89% faster |
| | Mem/op | 13.6 MB | 18.8 MB | +38% more |
| | Allocs/op | 200.0 K | 101.1 K | -50% fewer |
At very small scales (≈1K timers), TimingWheel shows a slight overhead (+8% slower, more memory).
But once you reach 10K+ timers, it matches or beats native performance, consistently cuts allocations ~50%,
and at 100K timers it’s ~9× faster with half the allocations.
## Advanced Usage
### Custom Payload Types
```go
type Task struct {
	UserID   string
	Action   string
	Priority int
}

wheel := taskwheel.NewHierarchicalTimingWheel[Task](intervals, slots)

wheel.AfterTimeout("task1", Task{
	UserID:   "user123",
	Action:   "send_email",
	Priority: 1,
}, 5*time.Second)
```
### Priority-Based Processing
```go
// Requires "sort" from the standard library in addition to the
// imports shown in Quick Start.
stop := wheel.StartBatch(10*time.Millisecond, func(timers []*taskwheel.Timer[Task]) {
	// Sort by priority, highest first
	sort.Slice(timers, func(i, j int) bool {
		return timers[i].Value.Priority > timers[j].Value.Priority
	})
	for _, t := range timers {
		processTask(t)
	}
})
```
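Note that this ordering applies per batch: it ranks the timers that fired on the same tick, while timers expiring on different ticks are still delivered in expiry order.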
## License
MIT License - see LICENSE file for details
## Credits
Inspired by:
