shared incident tracing
Your team.
One causal truth.
Before the war room starts guessing.
Monitoring tools help you investigate. Incidentary helps your team converge first.
When an alert fires, teams don't lack dashboards. They lack agreement. Incidentary captures the pre-alert causal chain and delivers it as a shared replayable artifact — so the room starts from one picture, not five.
pre-alert causal trace · assembled in < 2s
trace · onboarding-quickstart · 1.8s
service / operation · timeline (ms)
5 spans · 1 error · pre-alert
root cause:
payment-svc → pg_pool exhaustion
1 artifact
shared by the whole room
60 sec
pre-alert window captured
No lock-in
Open-source capture layer
before / after
Same incident. Different first minute.
Without Incidentary
- Alert fires
- Responders open different tools
- Each person sees a different symptom
- Cause and fallout get confused
- One engineer synthesizes the story for everyone else
- 10 to 20 minutes spent aligning before real debugging begins
With Incidentary
- Alert fires with a direct link to the shared trace
- The room opens one artifact
- Everyone sees what broke first, how it propagated, and where coverage is missing
- Responders align in minutes on shared evidence — not narration
- Datadog becomes the second step, not the first
how it works
Four steps. No black boxes.
01
instrument
Drop in the SDK. One middleware call wraps your HTTP handlers and propagates incident context automatically. No distributed config files. No sampling tuning. No OpenTelemetry collector to maintain.
import { incidentary } from '@incidentary/sdk-node';
app.use(incidentary.middleware());
02
ingest
Spans, errors, and structured logs flush over a persistent gRPC stream. No sampling. No dropped events at the boundary. No gaps caused by buffer timeouts.
// spans flushed automatically
// errors captured at boundary
// logs correlated by trace-id
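The no-drop guarantee above can be sketched as a queue that only clears on acknowledgment. This is an illustration under assumed names (`FlushQueue`, `Span`), not the SDK's actual gRPC transport:

```typescript
// Sketch of a no-drop flush queue: events stay pending until the
// stream acknowledges them, so nothing is sampled away or timed out.
type Span = { traceId: string; name: string };

class FlushQueue {
  private pending: Span[] = [];

  // `send` stands in for the persistent gRPC stream; returns true on ack
  constructor(private send: (batch: Span[]) => boolean) {}

  enqueue(span: Span): void {
    this.pending.push(span);
  }

  flush(): number {
    if (this.pending.length === 0) return 0;
    const batch = this.pending;
    if (this.send(batch)) {
      this.pending = []; // acknowledged: safe to clear
      return batch.length;
    }
    return 0; // not acknowledged: keep everything, retry on next flush
  }
}

let streamUp = false;
const q = new FlushQueue(() => streamUp);
q.enqueue({ traceId: 't1', name: 'GET /checkout' });
q.flush(); // stream down: batch retained, nothing dropped
streamUp = true;
const sent = q.flush(); // stream up: batch delivered
```

A real transport would flush on an interval and backpressure the producer; the invariant shown is that a failed send never discards events.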
03
assemble
The correlator builds a causal graph in real time. When anomaly thresholds breach, the pre-arm ring buffer locks: the 60 seconds before the alert are already captured and attached to the incident artifact.
// pre-arm triggered at T-90s
// root cause isolated T-20s
// runbook linked T-8s
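The pre-arm ring buffer described above can be sketched as a rolling window that freezes on anomaly. The names (`PreArmBuffer`, `CapturedEvent`) are illustrative, not the correlator's internals:

```typescript
// Sketch of a pre-arm ring buffer: keeps the last `windowMs` of events,
// then lock() freezes the window when anomaly thresholds breach.
type CapturedEvent = { ts: number; name: string };

class PreArmBuffer {
  private events: CapturedEvent[] = [];
  private locked = false;

  constructor(private windowMs: number) {}

  push(e: CapturedEvent): void {
    if (this.locked) return; // after lock, the pre-alert window is frozen
    this.events.push(e);
    const cutoff = e.ts - this.windowMs;
    while (this.events.length > 0 && this.events[0].ts < cutoff) {
      this.events.shift(); // evict events older than the rolling window
    }
  }

  // called when thresholds breach: the 60s before the alert are already here
  lock(): CapturedEvent[] {
    this.locked = true;
    return this.events;
  }
}

const buf = new PreArmBuffer(60_000);
buf.push({ ts: 0, name: 'checkout-svc latency up' });
buf.push({ ts: 70_000, name: 'payment-svc 5xx spike' }); // first event evicted
const frozen = buf.lock();
buf.push({ ts: 71_000, name: 'late event' }); // ignored: buffer is locked
```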
04
respond
The alert fires with a direct link. Your team opens one artifact — the complete causal trace, shared across the room. Not five dashboards. Not one senior engineer narrating to four others. One picture.
// alert fires
// one shared trace lands in Slack
// cause before dashboards, not after
solo mode
One SDK install.
Every dependency revealed.
Install the SDK on a single service. Incidentary observes every outbound call and surfaces uninstrumented dependencies as ghost services — services you depend on but don't have data from. One install reveals your entire dependency topology.
No teammates required. No config. The anomaly feed catches latency spikes and error bursts before they become incidents. The coverage scorecard shows you where to instrument next.
You get value in five minutes, not after the next outage.
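Ghost-service detection can be sketched as a classification over observed call edges. This is a hypothetical sketch (`classifyTopology` is not a real SDK function): any node that appears in the call graph but never reports its own spans is a ghost.

```typescript
// Sketch: outbound calls reveal nodes; nodes that never report telemetry
// of their own are surfaced as ghost services.
type Edge = { from: string; to: string };
type Node = { name: string; kind: 'instrumented' | 'ghost' };

function classifyTopology(edges: Edge[], reporting: Set<string>): Node[] {
  const names = new Set(edges.flatMap((e) => [e.from, e.to]));
  return [...names].map((name) => ({
    name,
    kind: reporting.has(name) ? 'instrumented' : 'ghost',
  }));
}

// One install on checkout-svc reveals three uninstrumented dependencies
const edges: Edge[] = [
  { from: 'payments', to: 'checkout-svc' },
  { from: 'checkout-svc', to: 'inventory' },
  { from: 'checkout-svc', to: 'shipping' },
];
const topology = classifyTopology(edges, new Set(['checkout-svc']));
```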
payments ← calls · checkout-svc
checkout-svc · calls → inventory
checkout-svc · calls → shipping
service coverage: 1 / 4 instrumented
instrumented · ghost service
1 service: See what your service depends on
2–5 services: Map your topology as you instrument
6–15 services: Cross-service incidents, clearly
15+ services: Team convergence and shared traces
pre-alert window
The 60 seconds before the alert.
Usually reconstructed at 3am.
Now already waiting.
Signal correlators watch your telemetry streams continuously. The moment anomalies appear, pre-arm sequences begin — assembling the causal path, linking related events, and tagging the break before the alert fires.
By the time PagerDuty wakes your team, the causal prelude is already rendered. Not a guess. Not an AI summary. A deterministic trace built from what your services actually reported to each other.
T-90s
checkout-svc latency ↑
T-72s
payment-svc 5xx rate spike
T-55s
db-pool exhaustion detected
T-38s
trace assembly started
alert fires → context ready
82:00
avg MTTR for distributed incidents
↓ < 1:30 to shared ground truth
mttr improvement
The war room used to start by figuring out what happened.
Now it starts by acting on what happened.
The typical war room spends the first 15-20 minutes just figuring out what happened — five engineers, five tools, five incomplete pictures. Incidentary collapses that convergence phase to under 90 seconds, because the trace is already assembled when the alert fires.
who it's for
Built for the teams who feel the pain of
distributed incidents.
just split the monolith?
You went from one service to three. Now incidents involve services you didn't even know called each other. Incidentary shows the causal chain across every boundary — before the war room starts guessing.
running distributed services?
When five engineers are looking at five dashboards, agreement takes longer than the fix. Incidentary delivers one shared artifact so the room converges before anyone opens a terminal.
expansion
The incident is the product demo.
One engineer shares a trace link in Slack. Teammates see the causal chain without installing anything. They notice the ghost service gaps — services where Incidentary knows a call was made but can't see inside. The product sells itself through its own gaps.
01 · one engineer installs: SDK on one service, 3 minutes. Ghost services and the anomaly feed appear immediately.
02 · first incident shared: A trace link lands in Slack. Teammates see the causal chain — and the ghost service gaps.
03 · teammates instrument: Ghost services become real services. The coverage scorecard tracks progress toward full visibility.
04 · team converges: Every incident starts from one shared artifact. MTTR drops. The coverage scorecard turns green.
platform
Every library. Zero config. One causal chain.
auto-instrumentation
The SDK detects libraries in your dependency tree and patches them at startup. No manual span creation. No config files. If OpenTelemetry already patched a library, the SDK skips it.
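The skip-if-already-patched behavior can be sketched as a double-patch guard. This is an illustration, not the SDK's real mechanism; `patchOnce` and the `db` object are hypothetical stand-ins for a driver like `pg`:

```typescript
// Sketch of a double-patch guard: a WeakSet marks wrapped functions so a
// library already instrumented by another tracer (e.g. OpenTelemetry)
// is left alone.
const patched = new WeakSet<Function>();

function patchOnce(
  target: Record<string, any>,
  method: string,
  wrap: (original: Function) => Function,
): boolean {
  const original = target[method];
  if (patched.has(original)) return false; // already instrumented: skip
  const wrapped = wrap(original);
  patched.add(wrapped);
  target[method] = wrapped;
  return true;
}

// Hypothetical library object standing in for a real database driver
const db = { query: (sql: string) => `rows for ${sql}` };
const captured: string[] = [];

const first = patchOnce(db, 'query', (orig) => (sql: string) => {
  captured.push(sql); // record a span-like event around the real call
  return orig(sql);
});
const second = patchOnce(db, 'query', (orig) => orig); // skipped: already patched

db.query('SELECT 1');
```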
node: express · fastify · koa · pg · ioredis · bullmq · amqplib · kafkajs · grpc
python: fastapi · flask · django · psycopg2 · asyncpg · celery · kombu
go: gin · echo · chi
dotnet: aspnetcore · httpclient · efcore · grpc · masstransit · lambda
25 libraries · 4 ecosystems · zero config required
database query capture
Query timing and connection metadata captured automatically. No parameters. No full query text. No sensitive data.
pg · ioredis · psycopg2 · asyncpg
queue instrumentation
Publish-consume pairs linked causally. Async workflows traced end-to-end without manual context propagation.
bullmq · amqplib · kafkajs · celery · kombu
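Causal linkage across a queue boundary can be sketched as trace context riding in message headers. The header name and types here are illustrative assumptions, not the SDK's wire format:

```typescript
// Sketch: the publisher injects its trace id into message headers; the
// consumer extracts it, so the consumer span joins the publisher's trace
// with no manual context propagation.
type Message = { headers: Record<string, string>; body: string };

function publish(queue: Message[], body: string, traceId: string): void {
  queue.push({ headers: { 'x-trace-id': traceId }, body }); // context injected
}

function consume(queue: Message[]): { body: string; traceId: string } {
  const msg = queue.shift()!;
  return { body: msg.body, traceId: msg.headers['x-trace-id'] }; // context extracted
}

const queue: Message[] = [];
publish(queue, 'charge order', 'trace-abc');
const received = consume(queue);
// the consumer's span now links back causally to trace-abc
```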
grpc: full causal linkage · all sdks
opentelemetry: zero-code ingest from collector
custom events: webhooks · jobs · custom ops
rest api: 10K req/min · cursor pagination
integrations
Plugs into the tools your team already uses.
slack
notifications + slash commands
Incident URL posted automatically. /incidentary slash command to open traces inline.
pagerduty
incident url in timeline
Webhook fires on alert. Causal trace URL injected into PagerDuty incident timeline.
opsgenie
webhook triggers
Webhook integration triggers artifact assembly. Link back into OpsGenie alert.
kubernetes
cluster events + topology
Helm install in one command. Watches 14 resource types — OOM kills, crash loops, evictions, node pressure, HPA scaling, deploy rollouts. Populates service topology from workload annotations. No SDK required on the cluster.
opentelemetry
zero-code ingest from existing collector
Send existing OTel spans to Incidentary via OTLP. No SDK install needed. Coexists with Incidentary SDKs in the same trace.
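Routing existing spans from a collector might look like the fragment below. The endpoint URL and header name are placeholders, not documented values; check your workspace settings for the real ones.

```yaml
# Hypothetical OpenTelemetry Collector fragment: forward existing traces
# to Incidentary over OTLP/HTTP alongside any exporters you already run.
receivers:
  otlp:
    protocols:
      http:

exporters:
  otlphttp/incidentary:
    endpoint: https://ingest.incidentary.example/otlp   # placeholder
    headers:
      x-api-key: ${INCIDENTARY_API_KEY}                 # placeholder header

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp/incidentary]
```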
shared links
no login · token-based · read-only
Paste in Slack, email, or Jira. Anyone with the link sees the trace. No account needed.
trust posture
privacy:
  data_boundary: metadata-only
  request_bodies: never captured
  query_parameters: never captured
  headers: never captured
completeness:
  labels: full | partial | low
  topology_aware: true
retention:
  windows: 14d | 30d | 90d
  deletion: hard delete at expiry
pre_arm:
  signals: 5xx rate · slow success · in-flight pileup · retry onset
  thresholds: configurable per service
quickstart
One middleware call.
No distributed config files. No sampling tuning. No OpenTelemetry collector to maintain. The SDK is a single middleware — it handles context propagation, event capture, and span flushing.
Your services keep running. Incidentary keeps watching.
checkout-svc/index.ts
import { incidentary } from '@incidentary/sdk-node';
import express from 'express';
const app = express();
// Wrap once — all routes instrumented
app.use(incidentary.middleware({
apiKey: process.env.INCIDENTARY_API_KEY,
serviceName: 'checkout-svc',
}));
app.post('/checkout', async (req, res) => {
// spans, errors, and slow queries captured automatically
const order = await processOrder(req.body);
res.json(order);
});
get started
Start in minutes.
The SDKs are yours. The infrastructure is ours.
The capture SDKs are Apache 2.0 licensed. Read every line of source. Fork freely. No proprietary agent. No lock-in at the instrumentation layer.
Incidentary runs as a managed cloud service. No infrastructure to provision, no database cluster to operate, no retention policies to tune. Install the SDK, point it at your workspace, and the shared causal trace is there when the next alert fires.
First 20 teams get a direct Slack channel with the founder for feature requests and priority support.