In-memory, LSM-inspired, time-indexed multimap for Python.
Timelog stores many Python objects per timestamp, supports out-of-order ingest, and answers timestamp/range queries from a native C17 engine through a CPython extension. Current package version: 1.3.0.
Why Timelog
Timelog is built for timestamp-first workloads where the core operation is "everything in [t1, t2)".
It provides a native in-memory index with snapshot-consistent reads, out-of-order ingestion support, and sequenced range deletes.
At a high level, writes flow through mutable ingest state into immutable layers (memrun, L0, L1), while reads merge across layers with tombstone-aware filtering.
The design is LSM-inspired, but explicitly scoped to an embedded in-memory engine.
Use it when you want a local Python object index optimized for:
- append-heavy event streams,
- range scans over integer timestamps,
- retention via logical deletes/tombstones,
- concurrent snapshot readers over live Python objects,
- zero-copy timestamp views for analytics-style scans.
Installation
Install from PyPI:
pip install timelog-lib
Or with uv:
The distribution name is timelog-lib; the import namespace stays timelog:
from timelog import Timelog
Runtime Support
- Regular CPython 3.12-3.14.
- Isolated subinterpreters with a per-interpreter GIL.
- Free-threaded CPython 3.14t (Py_GIL_DISABLED=1) on the supported wheel set; importing Timelog does not re-enable the GIL.
- Typed package metadata is included (py.typed and _timelog.pyi).
The Python API remains single-writer at the instance level: writes and lifecycle operations must be externally serialized. Independent snapshot readers can run concurrently.
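One possible way to honour the single-writer rule from several threads is to funnel every write through a lock. The sketch below is illustrative only: SerializedWriter is a hypothetical helper that is not part of Timelog, and ListLog is a stand-in with an append(ts, obj) shape so the example runs without the package installed.

```python
import threading


class ListLog:
    """Stand-in for a log object; NOT the Timelog API, only an append(ts, obj) shape."""

    def __init__(self):
        self.rows = []

    def append(self, ts, obj):
        self.rows.append((ts, obj))


class SerializedWriter:
    """Hypothetical helper: serializes all writes to one log behind a lock."""

    def __init__(self, log):
        self._log = log
        self._lock = threading.Lock()

    def append(self, ts, obj):
        # Only one thread at a time may reach the underlying writer.
        with self._lock:
            self._log.append(ts, obj)


def run_demo(n_threads=4, per_thread=250):
    """Append from several threads through one writer; return the row count."""
    log = ListLog()
    writer = SerializedWriter(log)

    def worker(tid):
        for i in range(per_thread):
            writer.append(i, {"thread": tid, "i": i})

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(log.rows)
```

Lifecycle calls such as close() would need to go through the same lock; snapshot readers do not.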
What Changed in 1.2 and 1.3
1.2.0 rebuilt the CPython runtime boundary: _timelog now uses
multi-phase module initialization, module-local exceptions and heap types,
per-interpreter-safe state recovery, and explicit synchronization for the
supported free-threaded wheel family.
1.3.0 keeps that runtime contract and focuses on the hot user paths:
auto-timestamp append(obj) moved from Python into C, common positional
methods use lower-overhead dispatch, bulk_append() ingests typed timestamp
buffers directly, and core lower/upper-bound searches use a measured
size-gated branchless path.
Quickstart: Streaming
from timelog import Timelog

log = Timelog.for_streaming(time_unit="ms")

# Auto-timestamp append
log.append({"event": "boot"})

# Operator-style explicit timestamp append
log[1_700_000_000_000] = {"event": "tick"}

# Half-open range query [t1, t2)
rows = list(log[1_700_000_000_000:1_700_000_000_001])
print(rows)

log.close()  # deterministic cleanup; finalizer cleanup is best-effort
Quickstart: Correctness Semantics
from timelog import Timelog

log = Timelog(time_unit="ms")

log[10] = "A"
del log[5:15]  # delete [5, 15)
log[10] = "B"  # later insert at same ts

print(log[10])          # ['B']
print(list(log[0:20]))  # [(10, 'B')]

log.close()  # optional explicit cleanup
Timelog uses sequenced tombstones, so later inserts are not hidden by earlier deletes.
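The rule can be modelled in a few lines of pure Python. This is a toy sketch, not Timelog's internal representation: ToySequencedLog and its per-operation sequence counter are assumptions made for illustration. Each insert and delete takes the next sequence number, and a record is hidden only by a covering tombstone with a higher sequence number.

```python
from itertools import count


class ToySequencedLog:
    """Toy model of sequenced deletes; not Timelog's internal data structures."""

    def __init__(self):
        self.records = []      # (ts, seq, obj)
        self.tombstones = []   # (t1, t2, seq) for the half-open range [t1, t2)
        self._seq = count(1)   # one shared counter orders inserts and deletes

    def insert(self, ts, obj):
        self.records.append((ts, next(self._seq), obj))

    def delete(self, t1, t2):
        self.tombstones.append((t1, t2, next(self._seq)))

    def _visible(self, ts, seq):
        # Hidden only if a covering tombstone was issued AFTER the record.
        return not any(
            t1 <= ts < t2 and d_seq > seq for t1, t2, d_seq in self.tombstones
        )

    def query(self, t1, t2):
        hits = [
            (ts, seq, obj)
            for ts, seq, obj in self.records
            if t1 <= ts < t2 and self._visible(ts, seq)
        ]
        hits.sort(key=lambda r: (r[0], r[1]))
        return [(ts, obj) for ts, _, obj in hits]


toy = ToySequencedLog()
toy.insert(10, "A")   # seq 1
toy.delete(5, 15)     # seq 2 hides "A"
toy.insert(10, "B")   # seq 3 is newer than the delete, so it stays visible
result = toy.query(0, 20)
```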
Core Guarantees
- Time ranges are half-open: [t1, t2).
- Reads are snapshot-consistent.
- Concurrency model is single writer plus concurrent readers.
- Duplicate timestamps are allowed (multimap semantics).
- Write-path backpressure (TimelogBusyError) indicates the write was accepted; do not blind-retry the same write.
- close() discards all data. Timelog is in-memory; flush() improves open-instance visibility for readers, not durability.
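To make the half-open and multimap guarantees concrete, here is a minimal sorted-list model built on the standard bisect module. ToyTimeMultimap is a hypothetical name and the structure is only an illustration of the semantics, not how the native engine stores data.

```python
import bisect


class ToyTimeMultimap:
    """Sorted-list model of a time-indexed multimap; illustrative only."""

    def __init__(self):
        self._ts = []    # timestamps, kept sorted
        self._objs = []  # payloads, parallel to _ts

    def insert(self, ts, obj):
        # bisect_right keeps arrival order among equal timestamps and
        # places out-of-order arrivals at their sorted position.
        i = bisect.bisect_right(self._ts, ts)
        self._ts.insert(i, ts)
        self._objs.insert(i, obj)

    def range(self, t1, t2):
        # Half-open [t1, t2): t1 is included, t2 is excluded.
        lo = bisect.bisect_left(self._ts, t1)
        hi = bisect.bisect_left(self._ts, t2)
        return list(zip(self._ts[lo:hi], self._objs[lo:hi]))


m = ToyTimeMultimap()
for ts, obj in [(30, "c"), (10, "a1"), (20, "b"), (10, "a2")]:
    m.insert(ts, obj)
```

With this model, range(10, 20) returns both payloads stored at timestamp 10 and excludes the record at 20.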
What Timelog Is (and Isn’t)
Timelog is:
- an embedded, in-memory timestamp index,
- optimized for append-heavy ingest and time-range retrieval,
- implemented in C17 with first-party CPython bindings.
Timelog is not:
- a durable storage engine,
- a distributed TSDB,
- a SQL query engine.
close() discards all data — the engine is in-memory, so nothing survives it.
flush() matters while the log is OPEN: it materializes pending writes into
immutable segments so zero-copy views() readers can see them.
API Snapshot
Core Python facade surface:
- Constructors: Timelog(...), for_streaming(...), for_bulk_ingest(...), for_low_latency(...).
- Writes:
  - append(obj), append(obj, ts=...), append(ts, obj).
  - extend([(ts, obj), ...], mostly_ordered=..., insert_on_error=...).
  - bulk_append(timestamps, objects) for contiguous native-endian int64 buffers plus a same-length list/tuple of payloads.
  - log[ts] = obj, delete(t1, t2), delete(ts), cutoff(ts).
- Reads:
  - log[t1:t2], log[t1:], log[:t2], log[:].
  - log[ts] / at(ts).
  - named iterators: range, since, until, all, point/equal.
  - iterator helpers: len(it), next_batch(n), and it.view().
- Introspection and views:
  - stats(), busy_events, extend_skipped, retired_queue_len.
  - views(...) / page_spans(...) for zero-copy timestamp spans.
  - PageSpan.timestamps is a read-only memoryview; PageSpan.objects() lazily exposes the corresponding Python payloads.
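For readers unfamiliar with the buffer terms above, the following standard-library sketch shows what a contiguous native-endian int64 buffer and a read-only memoryview over it look like. It does not call Timelog; the array here merely stands in for the kind of timestamp buffer the list describes.

```python
from array import array

# Typecode 'q' is a signed 64-bit integer in native byte order.
timestamps = array("q", [1_700_000_000_000, 1_700_000_000_005, 1_700_000_000_009])

# A memoryview shares the array's memory; toreadonly() forbids writes through it.
view = memoryview(timestamps).toreadonly()


def rejects_writes(mv):
    """Return True if assigning through the view raises TypeError."""
    try:
        mv[0] = 0
    except TypeError:
        return True
    return False


# Scans over the view read the original buffer without copying it.
span_width = view[len(view) - 1] - view[0]
```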
See docs/python-api.md for the full behavior contract.
Lifecycle, Threading, and Backpressure
- Most users should write log = Timelog(...) or use a preset constructor and keep the object for the required scope. A context manager is available but not required.
- Explicit close() gives deterministic cleanup. If omitted, collection auto-closes on a best-effort basis.
- Do not call close() concurrently with other operations on the same instance.
- Release active iterators, PageSpan objects, object views, and exported memoryviews before closing; they hold snapshot pins.
- Background maintenance can run automatically (maintenance="background") or be controlled manually (maintenance="disabled" + flush()/compact()/maint_step()).
- TimelogBusyError on write operations means accepted write + pressure signal, not "write lost".
Architecture
Write Path Read Path
---------- ---------
append/extend/delete snapshot + query([t1, t2))
| |
v v
Memtable (mutable) <-------------------- Snapshot view
| seal
v
Memrun (immutable)
| flush
v
L0 Segments (overlap)
| compact
v
L1 Segments (windowed, non-overlap)
Reads plan sources across active + immutable layers, then run k-way merge with tombstone filtering based on sequencing/watermark state.
Flush and compaction bound read fan-out over time.
Deletes are logical tombstones; physical cleanup is deferred to maintenance.
flush() is a visibility operation, not durability: it publishes pending
writes into immutable in-memory segments so readers and zero-copy views() can
see them. close() always tears down the in-memory engine and discards all
records.
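The read path described above can be approximated with the standard heapq.merge. This is a simplified sketch under stated assumptions: each layer is a Python list of (ts, seq, obj) tuples already sorted by (ts, seq), tombstones are (d1, d2, seq) triples, and merged_read is a hypothetical function name. The native engine's planning and watermark handling are not modelled.

```python
import heapq


def merged_read(layers, tombstones, t1, t2):
    """Merge sorted layers and drop records hidden by later tombstones.

    layers:     list of lists of (ts, seq, obj), each sorted by (ts, seq)
    tombstones: list of (d1, d2, seq) covering the half-open range [d1, d2)
    """
    out = []
    # heapq.merge performs a lazy k-way merge over already-sorted inputs.
    for ts, seq, obj in heapq.merge(*layers, key=lambda r: (r[0], r[1])):
        if ts < t1:
            continue
        if ts >= t2:
            break  # merged output is sorted, so nothing later can match
        hidden = any(
            d1 <= ts < d2 and d_seq > seq for d1, d2, d_seq in tombstones
        )
        if not hidden:
            out.append((ts, obj))
    return out


memtable = [(12, 6, "late")]
l0 = [(10, 3, "B"), (40, 4, "z")]
l1 = [(10, 1, "A"), (20, 5, "x")]
tombstones = [(5, 15, 2)]  # issued after "A" but before "B"
rows = merged_read([memtable, l0, l1], tombstones, 0, 30)
```

The fewer layers there are to merge, the cheaper each read becomes, which is why flush and compaction matter for read fan-out.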
Performance at a Glance
Same-harness v1.3 A/B against the v1.2.0 wheel, Linux x86_64, pinned CPU,
CPython 3.13.12, median of 5:
| Operation | v1.2.0 | v1.3.0 | Change |
|---|---|---|---|
| append(obj) | 513.9 ns | 117.1 ns | 4.39x faster |
| append(ts, obj) | 352.1 ns | 103.9 ns | 3.39x faster |
| append(obj, ts=...) | 364.7 ns | 109.6 ns | 3.33x faster |
| point(ts) | 457.1 ns | 337.1 ns | 1.36x faster |
| equal(ts) | 548.8 ns | 429.3 ns | 1.28x faster |
| next_ts(ts) | 393.8 ns | 299.8 ns | 1.31x faster |
| range(t1, t2) | 575.9 ns | 458.0 ns | 1.26x faster |
| delete_range(t1, t2) | 18,059.6 ns | 13,289.3 ns | 1.36x faster |
| delete_before(ts) | 109.7 ns | 80.8 ns | 1.36x faster |
New v1.3 ingest fast path:
- bulk_append(np.int64 array, list): 113.3 ns/record on a 200k-record measured batch.
- In that benchmark, bulk_append was 2.23x faster than a post-v1.3 per-record append loop and 3.51x faster than extend(zip(...)).
Search-path optimization:
- Size-gated branchless lower/upper-bound search measured 1.9x-5.0x faster at gated sizes up to 262,144 records, and falls back to the neutral path for very large arrays where it no longer wins.
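As an illustration of the technique named above, here is a Python rendering of a branchless lower-bound search with a size gate. It is a sketch of the general algorithm, not the engine's C code: the function names are hypothetical, using 262,144 as the cut-over is an assumption taken from the largest gated size quoted above, and Python itself gains no speed from this form.

```python
import bisect

GATE = 262_144  # assumed cut-over, taken from the largest gated size quoted above


def lower_bound_branchless(a, x):
    """Index of the first element >= x, using arithmetic instead of if/else."""
    base, n = 0, len(a)
    while n > 1:
        half = n // 2
        # The comparison result (False/True -> 0/1) scales the step, so the
        # loop body has no data-dependent branch in a compiled rendering.
        base += half * (a[base + half - 1] < x)
        n -= half
    return base + int(n == 1 and a[base] < x)


def lower_bound(a, x):
    """Size-gated dispatch: branchless up to GATE, plain bisect beyond it."""
    if len(a) <= GATE:
        return lower_bound_branchless(a, x)
    return bisect.bisect_left(a, x)
```

An upper-bound variant follows by replacing both `<` comparisons with `<=`.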
Historical scale snapshot (2026-02-15, Linux x86_64, CPython 3.13.12,
dataset 11,550,000 rows):
- Batch ingest (A2): 191,105 records/sec.
- Full scan (B4): 18,088,679 records/sec.
- Append latency (K1, background): p99 = 672 ns.
- PageSpan iteration (F1): 1.48B timestamps/sec on the timestamp-only span path.
Results are workload-, configuration-, and hardware-dependent.
The current publishable benchmark framing is docs/performance.md; older
reports are retained as historical snapshots.
Methodology and context:
- docs/PERFORMANCE_METHODOLOGY.md
- docs/performance.md
- docs/benchmarks/bulk_append.md
- docs/benchmarks/max_delta_segments.md
- docs/BENCHMARK_1GB_7PCT_OOO_UNIX.md
- docs/BENCHMARK_REPORT.md
Complexity claims should be interpreted with stated assumptions. In practice:
- append path is amortized O(1) at memtable layer,
- point/range behavior approaches logarithmic seek + linear output scan when source fan-out is bounded by maintenance,
- delete cost depends on tombstone interval state.
Documentation
- Index: docs/index.md
- Release notes: docs/release-notes.md
- Python API: docs/python-api.md
- Configuration: docs/configuration.md
- Error and retry semantics: docs/errors-and-retry-semantics.md
- Performance methodology: docs/PERFORMANCE_METHODOLOGY.md
- PyPI/release operations: docs/pypi-release.md
License
MIT. See LICENSE.
Contributing
PRs are welcome. Run core validation locally:
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DTIMELOG_BUILD_PYTHON=ON -DTIMELOG_BUILD_PY_TESTS=ON
cmake --build build --target timelog_e2e_build --config Release -j 2
ctest --test-dir build -C Release --output-on-failure -R '^py_.*_tests$'
cmake -E env PYTHONPATH="$PWD/python" python -m pytest python/tests -q
Package build sanity:
python -m build
python -m twine check dist/*