Show HN: SLOK – SLO composition with traffic-weighted service chains in K8s

1 points by lep_qq 24 days ago · 0 comments · 2 min read


I've been building a Kubernetes operator for SLO management (SLOK) and just shipped a feature I haven't seen elsewhere: WEIGHTED_ROUTES composition.

The problem

Most SLO tools treat a user journey as a simple chain: if any service fails, the whole thing fails. But real traffic doesn't work that way. In a checkout flow, 90% of users might skip the coupon service entirely. If the coupon service has 99.5% availability, does that really pull your checkout SLO down to 99.5%? No — because most of your users never touch it.

The model

WEIGHTED_ROUTES lets you describe which percentage of traffic flows through which service chain. Each chain is an implicit AND (all services in the chain must succeed). The composed error rate is:

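    e_total = 1 - Σ( weight_i × Π(1 - e_j) )

Here i ranges over the routes, weight_i is the fraction of traffic on route i, and j ranges over the services in that route's chain, each with error rate e_j.

For the checkout example (90% skip coupon, 10% use it):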

    e_total = 1 - ( 0.9 × (1 - e_base) × (1 - e_payments)
                  + 0.1 × (1 - e_base) × (1 - e_coupon) × (1 - e_payments) )

SLOK translates this formula directly into Prometheus recording rules wired into the standard multi-window burn rate pipeline.
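To make the checkout formula concrete, here is a minimal Python sketch of the same arithmetic. It is not SLOK's code (SLOK emits Prometheus rules); the function name `composed_error_rate` and the sample error rates are made up for illustration.

```python
from math import prod


def composed_error_rate(routes, error_rates):
    """Return e_total = 1 - sum(weight_i * prod(1 - e_j)) over all routes."""
    success = sum(
        weight * prod(1.0 - error_rates[name] for name in chain)
        for weight, chain in routes
    )
    return 1.0 - success


# Routes from the checkout example: (traffic weight, chain of services).
routes = [
    (0.9, ["base", "payments"]),
    (0.1, ["base", "coupon", "payments"]),
]

# Illustrative error rates, not measurements: coupon sits at 99.5% availability.
error_rates = {"base": 0.0005, "payments": 0.0005, "coupon": 0.005}

e_total = composed_error_rate(routes, error_rates)

# For comparison: the naive model where every request hits all three services.
e_naive = composed_error_rate([(1.0, ["base", "coupon", "payments"])], error_rates)

print(f"weighted availability: {1 - e_total:.4%}")
print(f"naive chain availability: {1 - e_naive:.4%}")
```

With these sample numbers the weighted result comes out at about 99.85%, while the naive three-service chain lands near 99.40%, which is the gap the post is describing.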

The YAML

    kind: SLOComposition
    spec:
      target: 99.9
      window: 30d
      objectives:
        - name: base
          ref: { name: checkout-base-slo }
        - name: payments
          ref: { name: payments-slo }
        - name: coupon
          ref: { name: coupon-slo }
      composition:
        type: WEIGHTED_ROUTES
        params:
          routes:
            - name: no-coupon
              weight: 0.9
              chain: [base, payments]
            - name: with-coupon
              weight: 0.1
              chain: [base, coupon, payments]
      alerting:
        burnRateAlerts:
          enabled: true

The operator generates the PrometheusRules automatically. You get burn rate alerts on the composed SLO, not just on individual services.
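For readers new to burn rate alerting, this is roughly what a burn rate check on the composed error rate amounts to, sketched in Python rather than PromQL. The 14.4 threshold with a long and a short window is the common convention from the Google SRE Workbook for a 30-day window. The post does not say which thresholds or windows SLOK uses, so treat those values and the function names as assumptions.

```python
def burn_rate(error_rate, target_percent):
    """Error rate relative to the error budget; 1.0 means spending exactly on budget."""
    budget = 1.0 - target_percent / 100.0
    return error_rate / budget


def should_page(long_window_error, short_window_error, target_percent, threshold=14.4):
    """Multi-window check: alert only if both windows burn faster than the threshold.

    The short window keeps the alert from firing after the problem has stopped.
    """
    return (
        burn_rate(long_window_error, target_percent) > threshold
        and burn_rate(short_window_error, target_percent) > threshold
    )


# Composed error rates over two windows (illustrative values), target 99.9 as in the YAML.
print(burn_rate(0.0015, 99.9))           # steady state: modestly over budget
print(should_page(0.02, 0.03, 99.9))     # both windows burning fast
print(should_page(0.02, 0.001, 99.9))    # long window still high, short window recovered
```

The only thing composition changes is the input: the error rate fed into the check is the composed e_total rather than a single service's error rate.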

Other things SLOK does

- AND_MIN composition (worst-case across services)
- Built-in SLI templates for http-availability, http-latency, kubernetes-apiserver
- Automatic error budget tracking exposed in .status
- Event correlation: when a burn rate spike is detected, SLOK creates an SLOCorrelation resource listing recent Deployments, ConfigMap changes, and cluster events that may have caused it, with an optional LLM-enhanced summary (Llama 3.3 70B via Groq)

WEIGHTED_ROUTES is alpha. Feedback on the API shape is welcome.

Repo: https://github.com/federicolepera/slok
