lynxbase/lynxdb: Lightweight schema-on-read analytics in a single binary

LynxDB is a single-binary log analytics database. It works as a Unix-style pipe tool, a persistent server, or a distributed cluster, all through the same LynxFlow query engine.

LynxDB is in active development. APIs, storage format, and query behavior may change between releases.

Why LynxDB

Pipe mode: run analytics on stdin or local files with no daemon.
Server mode: ingest logs once, query indexed columnar storage repeatedly.
LynxFlow: a clean pipeline language with typed values, schema-on-read parsing, CTEs, joins, materialized views, arrays, objects, and time-series sugar.
Index-honest search: has is term-index search, contains is substring search, and matches is regex.
Drop-in ingest paths: Elasticsearch _bulk, OpenTelemetry OTLP, Splunk HEC, syslog, and raw HTTP ingest.
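
The difference between has, contains, and matches can be illustrated outside LynxDB. The Python sketch below is not LynxDB's implementation: the sample events, the tokenizer (lowercased runs of letters and digits), and the in-memory index layout are all assumptions made for illustration. It shows why a term lookup can skip scanning while substring and regex search cannot.

```python
import re
from collections import defaultdict

# Hypothetical sample events; not taken from LynxDB.
EVENTS = [
    "GET /api/users 200 connection ok",
    "POST /api/login 500 connection reset by peer",
    "GET /healthz 200 reconnecting to upstream",
]

def tokenize(line):
    # Assumed tokenizer: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", line.lower())

def build_term_index(events):
    # Map each term to the set of event ids that contain it as a whole token.
    index = defaultdict(set)
    for event_id, line in enumerate(events):
        for term in tokenize(line):
            index[term].add(event_id)
    return index

TERM_INDEX = build_term_index(EVENTS)

def has(term):
    # Term lookup: a single dictionary probe, matches whole tokens only.
    return sorted(TERM_INDEX.get(term.lower(), set()))

def contains(substring):
    # Substring search: scans every event, also matches inside tokens.
    return [i for i, line in enumerate(EVENTS) if substring in line]

def matches(pattern):
    # Regex search: scans every event with a compiled pattern.
    compiled = re.compile(pattern)
    return [i for i, line in enumerate(EVENTS) if compiled.search(line)]
```

In this sketch, has("connect") finds nothing because "connect" is not a whole token, while contains("connect") also hits "reconnecting". That gap is the reason for keeping the operators separate.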

Install

curl -fsSL https://lynxdb.org/install.sh | sh

Other options:

brew install lynxbase/tap/lynxdb
go install github.com/lynxbase/lynxdb/cmd/lynxdb@latest
docker run -p 3100:3100 ghcr.io/lynxbase/lynxdb server

Query Without A Server

Pipe any logs into lynxdb query and use the full LynxFlow engine in-process:

kubectl logs deploy/api | lynxdb query '
  where status >= 500
  | stats count() as errors, p95(duration_ms) as p95_ms by endpoint
  | sort -errors
  | head 10'

Query a local file directly:

lynxdb query --file access.log '
  from main status>=500
  | stats count() as count, dc(client_ip) as unique_ips by uri
  | sort -count
  | head 20'

Explore a large file cheaply:

lynxdb query --file app.ndjson '
  sample 1% seed=42
  | describe'

Run A Server

lynxdb server
lynxdb ingest nginx_access.log --source nginx

lynxdb query '
  from main[-1h] _source=nginx status>=500
  | every 5m by uri stats count() as errors fill=0
  | sort uri, _time'

from main[-1h] scopes the source and time range. Search terms immediately after from are source-level search sugar: status>=500, "connection reset", error, and field=* all desugar to typed LynxFlow predicates.
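
As a rough mental model of what the every 5m by uri stats count() as errors fill=0 stage in the query above produces, here is a minimal Python sketch. It assumes buckets are aligned to multiples of 300 seconds since the Unix epoch and that fill=0 emits a zero row for every empty bucket of every uri seen in the range. LynxDB's actual alignment and fill rules may differ, and every_5m_count is a hypothetical name.

```python
from collections import Counter

BUCKET_SECONDS = 5 * 60

def bucket_start(ts):
    # Floor a Unix timestamp (seconds) to the start of its 5-minute bucket.
    return ts - ts % BUCKET_SECONDS

def every_5m_count(events, start, end):
    """Count events per (uri, bucket) over [start, end), with 0 for empty buckets.

    events: iterable of (unix_ts, uri) pairs.
    """
    counts = Counter()
    uris = set()
    for ts, uri in events:
        if start <= ts < end:
            counts[(uri, bucket_start(ts))] += 1
            uris.add(uri)
    rows = []
    # Emit rows ordered by uri, then bucket time, like `sort uri, _time`.
    for uri in sorted(uris):
        for bucket in range(bucket_start(start), end, BUCKET_SECONDS):
            # Counter returns 0 for missing keys, which gives the zero fill.
            rows.append((uri, bucket, counts[(uri, bucket)]))
    return rows
```

Without the zero fill, a uri with no errors in a given window would simply be missing from the output, which makes charts and alert thresholds harder to reason about.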

LynxFlow In 60 Seconds

LynxFlow v2 is the only query language in LynxDB. The old SPL2 runtime was removed; legacy spellings now produce migration hints.

from nginx[-24h] "timeout" status>=500
| parse json
| extend route = url_strip_query(uri),
         latency_bucket = bucket(duration_ms, [0, 50, 100, 250, 500, 1000])
| stats count() as count,
        p95(duration_ms) as p95_ms,
        top_k(client_ip, 5) as top_clients
  by service, route, latency_bucket
| sort -count
| head 20
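
The bucket(duration_ms, [0, 50, 100, 250, 500, 1000]) call in the query above groups latencies into fixed ranges. The exact semantics are not spelled out here, so the Python sketch below assumes one plausible reading: each value maps to the largest edge that is less than or equal to it, and values below the first edge map to no bucket. Treat it as an illustration of histogram bucketing, not as LynxFlow's definition.

```python
from bisect import bisect_right

EDGES = [0, 50, 100, 250, 500, 1000]

def bucket(value, edges=EDGES):
    # Assumed semantics: return the largest edge <= value,
    # or None when the value falls below the first edge.
    position = bisect_right(edges, value)
    return edges[position - 1] if position > 0 else None
```

Under this assumption a 730 ms request lands in the 500 bucket and anything at or above 1000 ms shares the last bucket.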

Core stage names are intentionally explicit:

Old habit	LynxFlow v2
`eval x=...`	`extend x = ...`
`table a, b` / `fields a, b`	`keep a, b`
`stats count by host`	`stats count() by host`
`timechart count span=5m`	`every 5m stats count()`
`sort count desc`	`sort -count`
`head 10` / `tail 10`	unchanged

Useful idioms:

// Conditional aggregation
from main[-1h]
| stats count(where status >= 500) as errors,
        count() as total
  by service
| extend error_rate = errors * 100.0 / total
| sort -error_rate

// CTEs and joins
let $threats = from threat_feed | keep client_ip, threat_type;
let $failures = from auth[-24h] event="login_failed"
  | stats count() as failures by src_ip
  | rename src_ip as client_ip;

from $threats
| join type=inner on client_ip with $failures
| sort -failures

// Arrays inside one event
from traces[-1h]
| extend p95_span = array_reduce("p95", map(spans, s -> s.duration_ms)),
         slow_spans = array_count(spans, s -> s.duration_ms > 500)
| where slow_spans > 0
| keep _time, trace_id, service, p95_span, slow_spans
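
For readers new to pipeline languages, the conditional aggregation idiom above (count(where status >= 500) next to count(), followed by a derived error_rate) corresponds to a single pass over the events with two counters per group. The Python sketch below is an equivalent computation under assumed field names (service, status); error_rates is a hypothetical helper, not part of LynxDB.

```python
from collections import defaultdict

def error_rates(events):
    """events: iterable of dicts with 'service' and 'status' keys."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for event in events:
        service = event["service"]
        totals[service] += 1            # count() as total
        if event["status"] >= 500:
            errors[service] += 1        # count(where status >= 500) as errors
    rows = [
        {
            "service": service,
            "errors": errors[service],
            "total": totals[service],
            # Multiply by 100.0 so the division is floating point.
            "error_rate": errors[service] * 100.0 / totals[service],
        }
        for service in totals
    ]
    # Highest error rate first, like `sort -error_rate`.
    rows.sort(key=lambda row: row["error_rate"], reverse=True)
    return rows
```

Computing both counters in one pass avoids a second scan and a self-join, which is the main appeal of the where clause inside the aggregate.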

Features

LynxFlow v2 - one expression grammar, typed values, arrays/objects, lambdas, CTEs, joins, window functions, and visible sugar rewrites.
Full-text index - FST term dictionary, roaring bitmap postings, and bloom filters for segment skipping.
Columnar storage - custom .lsg segments with delta-varint timestamps, dictionary encoding, Gorilla XOR, and LZ4 compression.
Materialized views - stored partial aggregate states with automatic query rewrites and rollups.
Time-series helpers - every, gapfill, hist, latency, percentiles, streamstats, rank, dense_rank, ema, and delta.
Analytics stdlib - arg_max, top_k, value_counts, entropy, calendar functions, URL/IP helpers, JSON path helpers, and array reducers.
Operational modes - stdin/file mode, local server, Web UI, REST API, cluster mode, S3 tiering, syslog, and shipper-compatible ingest.
Sigma support - convert and run Sigma detections as LynxFlow queries; see docs/site/docs/sigma.
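
To give a feel for why delta-varint encoding suits timestamp columns, here is a minimal Python sketch of the general technique. It is not the .lsg on-disk format: it assumes non-negative integer timestamps in non-decreasing order, stores the first value and then successive differences, and writes each number as an unsigned LEB128 varint. The function names are hypothetical.

```python
def encode_varint(value):
    # Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last byte.
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_timestamps(timestamps):
    # Assumes non-negative, non-decreasing integers so every delta is >= 0.
    out = bytearray()
    previous = 0
    for ts in timestamps:
        out += encode_varint(ts - previous)
        previous = ts
    return bytes(out)

def decode_timestamps(data):
    # Reverse the encoding: rebuild each varint, then add it to the running total.
    timestamps = []
    previous = 0
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            previous += value
            timestamps.append(previous)
            value = 0
            shift = 0
    return timestamps
```

Log timestamps tend to arrive close together, so most deltas fit in one or two bytes instead of a fixed eight, and a general-purpose compressor such as LZ4 then works on an already compact stream.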

Comparison

	LynxDB	Splunk	Elasticsearch	Loki	ClickHouse
Deployment	Single binary	Standalone or distributed	Single node or cluster	Single binary or microservices	Single binary or cluster
Dependencies	None	-	JVM	Object storage in production	Keeper for replication
Query language	LynxFlow	SPL	Lucene DSL / ES|QL	LogQL	SQL
Pipe mode	Yes	No	No	No	Yes
Schema	Schema-on-read	Schema-on-read	Schema-on-write	Labels + line	Schema-on-write
Full-text index	FST + bitmaps	tsidx	Lucene	Label index only	Token bloom filters
License	Apache 2.0	Commercial	ELv2 / AGPL	AGPL	Apache 2.0

CLI Map

lynxdb query <query>         run a LynxFlow query
lynxdb server                start the HTTP server and Web UI
lynxdb ingest <file>         ingest local files into a server
lynxdb tail <query>          live tail query results
lynxdb shell                 interactive REPL
lynxdb explain <query>       show the logical/physical query plan
lynxdb fields <query>        inspect fields for matching events
lynxdb mv create/list        manage materialized views
lynxdb config                inspect and edit configuration
lynxdb status                show server status
lynxdb demo                  generate sample data
lynxdb grammar               print the LynxFlow grammar/cookbook

Run lynxdb <command> --help or see docs/site/docs/cli/overview.md for the full command map.

Configuration

No configuration is required for pipe mode or local use. Server defaults are documented in docs/site/docs/configuration.

Common overrides:

lynxdb server --data-dir /var/lib/lynxdb --addr 0.0.0.0:3100
LYNXDB_SERVER=http://localhost:3100 lynxdb query 'from main | stats count()'
lynxdb config init

Documentation

Contributing

Contributor workflow and PR guidelines live in CONTRIBUTING.md.

Feedback

LynxDB would not exist without the projects that inspired it:

Splunk - for the pipe-first log analytics model that inspired LynxFlow.
ClickHouse - for showing how much analytical performance a focused engine can deliver.
VictoriaLogs - for proving that operational log storage can be simple and efficient.
grep, awk, sed, jq - for the Unix style of composable data tools.

License

Apache 2.0