GitHub - runcycles/cycles-openclaw-budget-guard: Cycles budget and action guard for OpenClaw agents


OpenClaw plugin for budget-aware model and tool execution using Cycles.

Why use this plugin?

AI agents make autonomous decisions — calling models, invoking tools, retrying on failure — with no human in the loop. Without runtime enforcement, several things go wrong:

Runaway spend. A single agent stuck in a tool loop or retrying failed calls can burn through hundreds of dollars in minutes. Provider spending caps are account-wide and too coarse. Rate limits don't account for cost. In-app counters don't survive restarts or coordinate across concurrent agents.

Uncontrolled side-effects. An agent can send 100 emails, trigger 50 deployments, or call dangerous APIs with nothing to stop it. Cost limits alone don't help — some actions are consequential regardless of price.

Noisy neighbors. In multi-tenant or multi-user setups, one agent can consume the entire team or tenant budget, starving other users. Without per-user scoping, there's no isolation.

No session-level cost visibility. When an agent session ends, you have no idea what it spent, which tools it called most, or whether it was cost-efficient. Debugging cost overruns after the fact is painful.

Abrupt failure. When budget runs out, the agent crashes instead of adapting — switching to cheaper models, reducing output length, or disabling expensive tools.

This plugin addresses those failure modes by checking model and tool execution before it runs, then degrading or blocking when budget conditions require it. It also tracks session-level cost breakdowns, tool usage, and budget transitions for debugging and operations.

Beyond enforcement, the plugin monitors for problems as they develop:

  • Burn rate anomaly detection catches runaway tool loops — if spending spikes 3x above the session average, onBurnRateAnomaly fires immediately
  • Predictive exhaustion warnings estimate when budget will run out and fire onExhaustionForecast before it happens
  • Automatic retry with backoff on transient Cycles server errors (429/503/504 by default) prevents spurious denials under load
  • Reservation heartbeat auto-extends long-running tool reservations so cost tracking doesn't silently break
  • Observability via metricsEmitter (Datadog, Prometheus, Grafana, OTLP) and opt-in session event logs

In typical OpenClaw setups, you can add enforcement without changing agent logic.

For deeper background, see Why Rate Limits Are Not Enough and Runaway Agents and Tool Loops.

Overview

The plugin integrates with a live Cycles server to enforce budget boundaries during agent execution. It hooks into the OpenClaw plugin lifecycle to:

  • Reserve budget for model and tool calls using the reserve → commit → release protocol
  • Downgrade models when budget is low (configurable fallback chains)
  • Block execution when budget is exhausted (fail-closed by default)
  • Inject budget hints into prompts so the model is budget-aware
  • Detect budget transitions and fire callbacks/webhooks on level changes
  • Control tool access with allowlists, blocklists, and per-tool call limits
  • Apply graceful degradation strategies when budget is low
  • Retry denied reservations and transient server errors with configurable backoff
  • Keep long-running tools alive with automatic reservation heartbeat
  • Detect anomalies — burn rate spikes and predictive exhaustion warnings
  • Emit metrics to Datadog, Prometheus, Grafana, or any OTLP-compatible backend
  • Record an event log of every budget decision for debugging and compliance
  • Report unconfigured tools so you know which tools are using default cost estimates
  • Support dry-run mode for testing without a live Cycles server
  • Track per-tool cost breakdowns and session analytics with model cost reconciliation
  • Support multi-currency budgets with per-tool/model overrides
  • Support budget pools/hierarchies via parent budget visibility

The plugin uses the runcycles TypeScript client to communicate with a Cycles server.

Important: Budget exhaustion is enforced fail-closed by default, but Cycles server connectivity failures are handled fail-open — the plugin assumes healthy budget and allows execution to continue. See Fail-Open Behavior for details.

Prerequisites

  • OpenClaw >= 0.1.0 with plugin support
  • Node.js >= 20.0.0
  • A running Cycles server with:
    • A base URL (e.g. http://localhost:7878)
    • An API key
    • A tenant configured with a budget scope

If you don't have a Cycles server yet, see the Cycles quickstart to set one up. Alternatively, use dry-run mode to test without a server.

To see budget enforcement in action before wiring up your own agent, run the Cycles Runaway Demo — it shows the exact failure mode this plugin prevents, with a live before/after comparison.

Quick Start

1. Install the plugin

openclaw plugins install @runcycles/openclaw-budget-guard

For local development:

openclaw plugins install -l ./cycles-openclaw-budget-guard

2. Enable the plugin

openclaw plugins enable openclaw-budget-guard

3. Add minimal configuration

Add the following to your OpenClaw config file (typically openclaw.json or openclaw.config.json):

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "cyclesBaseUrl": "http://localhost:7878",
          "cyclesApiKey": "cyc_your_api_key_here",
          "tenant": "my-org"
        }
      }
    }
  }
}

That's it — the plugin uses sensible defaults for everything else. The agent will now enforce budget limits on every run.

Need an API key? API keys are created via the Cycles Admin Server (port 7979). See the deployment guide to create one, or see API Key Management for details.

4. (Optional) Keep secrets out of config files

Use OpenClaw's env var interpolation to avoid hardcoding API keys:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "cyclesBaseUrl": "${CYCLES_BASE_URL}",
          "cyclesApiKey": "${CYCLES_API_KEY}",
          "tenant": "my-org"
        }
      }
    }
  }
}

Then set the env vars in your shell or CI:

export CYCLES_BASE_URL="http://localhost:7878"
export CYCLES_API_KEY="cyc_your_api_key_here"

5. (Optional) Try dry-run mode

To test without a live Cycles server:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "cyclesBaseUrl": "http://unused",
          "cyclesApiKey": "unused",
          "dryRun": true,
          "dryRunBudget": 100000000
        }
      }
    }
  }
}

6. Verify it's working

After restarting OpenClaw, check the logs for:

  Cycles Budget Guard for OpenClaw v0.7.10
  https://runcycles.io
  tenant: my-org
  cyclesBaseUrl: http://localhost:7878
  ...

Run your agent and look for budget activity:

[openclaw-budget-guard] before_model_resolve: model=anthropic/claude-sonnet-4-20250514 level=healthy

If you see this, the plugin is actively checking budget on every model and tool call.

Understanding the cost model

The plugin uses a simple model: every model call and tool call reserves a fixed cost from the budget.

Currency. The default is USD_MICROCENTS: 1 unit = $0.00000001 (one hundred-millionth of a dollar, or one millionth of a cent). So:

| Amount (units) | USD equivalent |
|---|---|
| 100,000 | $0.001 (0.1 cents) |
| 1,000,000 | $0.01 (1 cent) |
| 10,000,000 | $0.10 (10 cents) |
| 100,000,000 | $1.00 |

Example. With a $5 budget (500,000,000 units):

  • anthropic/claude-opus at 1,500,000/call = ~333 calls before exhaustion
  • anthropic/claude-sonnet at 300,000/call = ~1,666 calls
  • web_search at 500,000/call = ~1,000 calls
  • lowBudgetThreshold: 10000000 triggers model downgrade when $0.10 remains
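
The arithmetic in this example can be reproduced with a few helper functions. This is a minimal sketch, not part of the plugin: the function names are hypothetical, and the only fact taken from the text is that 100,000,000 units equal $1.00.

```typescript
// Assumption from the conversion table: 100,000,000 USD_MICROCENTS = $1.00.
const UNITS_PER_USD = 100_000_000;

// Convert budget units to US dollars.
function unitsToUsd(units: number): number {
  return units / UNITS_PER_USD;
}

// Convert US dollars to budget units, rounded to a whole unit.
function usdToUnits(usd: number): number {
  return Math.round(usd * UNITS_PER_USD);
}

// Number of whole calls a budget covers at a fixed per-call cost.
function callsBeforeExhaustion(budgetUnits: number, costPerCall: number): number {
  if (costPerCall <= 0) throw new RangeError("costPerCall must be positive");
  return Math.floor(budgetUnits / costPerCall);
}

const fiveDollarBudget = usdToUnits(5); // 500,000,000 units
console.log(callsBeforeExhaustion(fiveDollarBudget, 1_500_000)); // 333
console.log(callsBeforeExhaustion(fiveDollarBudget, 300_000)); // 1666
console.log(callsBeforeExhaustion(fiveDollarBudget, 500_000)); // 1000
```

These helpers are handy when choosing lowBudgetThreshold: usdToUnits(0.10) gives the 10,000,000 units used as the default.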

Model names. OpenClaw passes model identifiers in provider/model format (e.g., openai/gpt-4o, anthropic/claude-sonnet-4-20250514). Your modelBaseCosts, modelFallbacks, and defaultModelName must use the same format — bare model names like gpt-4o won't match. The plugin automatically strips the provider prefix when returning modelOverride to OpenClaw, so you can use provider/model consistently in all config fields without double-prefixing issues.
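
To illustrate why bare model names fail to match, here is a small sketch of a cost lookup keyed by the full provider/model identifier. Both helpers are hypothetical and only mirror the behavior described above; the 500,000 fallback is the documented defaultModelCost.

```typescript
// Hypothetical helper: drop everything up to the first slash in "provider/model".
function stripProviderPrefix(modelId: string): string {
  const slash = modelId.indexOf("/");
  return slash === -1 ? modelId : modelId.slice(slash + 1);
}

// Look up a per-call cost by the exact key used in modelBaseCosts.
// Keys are compared verbatim, so "gpt-4o" does not match "openai/gpt-4o".
function lookupModelCost(
  modelId: string,
  modelBaseCosts: Record<string, number>,
  defaultModelCost: number = 500_000,
): number {
  const cost = modelBaseCosts[modelId];
  return cost === undefined ? defaultModelCost : cost;
}

const exampleCosts: Record<string, number> = { "openai/gpt-4o": 1_000_000 };
console.log(lookupModelCost("openai/gpt-4o", exampleCosts)); // 1000000
console.log(lookupModelCost("gpt-4o", exampleCosts)); // 500000 (falls back to the default)
console.log(stripProviderPrefix("openai/gpt-4o")); // gpt-4o
```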

Setting toolBaseCosts. Start with the default (100,000 units per call). After your first session, check the unconfiguredTools list in the session summary — it tells you which tools need explicit costs. For tools that call external APIs, estimate higher (500K-1M). For lightweight tools, estimate lower (10K-50K).

Full Configuration Example

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "enabled": true,
          "cyclesBaseUrl": "http://localhost:7878",
          "cyclesApiKey": "cyc_your_api_key_here",
          "tenant": "my-org",
          "budgetScope": { "app": "my-app" },
          "currency": "USD_MICROCENTS",
          "lowBudgetThreshold": 10000000,
          "exhaustedThreshold": 0,
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"],
            "openai/gpt-4o": "openai/gpt-4o-mini"
          },
          "modelBaseCosts": {
            "anthropic/claude-opus-4-20250514": 1500000,
            "anthropic/claude-sonnet-4-20250514": 300000,
            "openai/gpt-4o": 1000000,
            "openai/gpt-4o-mini": 100000
          },
          "toolBaseCosts": {
            "web_search": 500000,
            "code_execution": 1000000
          },
          "toolCallLimits": {
            "send_email": 10,
            "deploy": 3
          },
          "injectPromptBudgetHint": true,
          "maxPromptHintChars": 200,
          "failClosed": true,
          "logLevel": "info",
          "reservationTtlMs": 60000,
          "overagePolicy": "ALLOW_IF_AVAILABLE",
          "lowBudgetStrategies": ["downgrade_model"],
          "maxTokensWhenLow": 1024,
          "retryOnDeny": false,
          "dryRun": false
        }
      }
    }
  }
}

Config Presets

Common starting configurations for typical deployment scenarios.

Strict Enforcement

For production agents handling real spend. Blocks on exhaustion, downgrades models, caps tool calls:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "failClosed": true,
          "lowBudgetStrategies": ["downgrade_model", "disable_expensive_tools", "limit_remaining_calls"],
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"]
          },
          "modelBaseCosts": {
            "anthropic/claude-opus-4-20250514": 1500000,
            "anthropic/claude-sonnet-4-20250514": 300000,
            "anthropic/claude-haiku-4-5-20251001": 100000
          },
          "toolBaseCosts": {
            "web_search": 500000,
            "code_execution": 1000000
          },
          "toolCallLimits": {
            "send_email": 10,
            "deploy": 3
          },
          "maxRemainingCallsWhenLow": 5
        }
      }
    }
  }
}

Development / Testing

Dry-run mode with generous budget. No Cycles server needed:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "dev",
          "cyclesBaseUrl": "http://unused",
          "cyclesApiKey": "unused",
          "dryRun": true,
          "dryRunBudget": 500000000,
          "logLevel": "debug"
        }
      }
    }
  }
}

Cost-Conscious

Aggressive cost savings. Low thresholds, model downgrade with token limits, expensive tools disabled early:

{
  "plugins": {
    "entries": {
      "openclaw-budget-guard": {
        "config": {
          "tenant": "my-org",
          "lowBudgetThreshold": 5000000,
          "exhaustedThreshold": 100000,
          "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens", "disable_expensive_tools"],
          "maxTokensWhenLow": 512,
          "expensiveToolThreshold": 200000,
          "modelFallbacks": {
            "anthropic/claude-opus-4-20250514": "anthropic/claude-haiku-4-5-20251001",
            "openai/gpt-4o": "openai/gpt-4o-mini"
          }
        }
      }
    }
  }
}

Configure for your use case

Most users only need 5-10 config properties. Start with what you need:

I just want to stop runaway agents (3 required fields only):

{ "tenant": "my-org", "cyclesBaseUrl": "...", "cyclesApiKey": "..." }

The defaults (failClosed: true, lowBudgetThreshold: 10000000) will block agents that exhaust their budget and warn when it gets low.

I want cost-aware model selection — add:

{
  "modelFallbacks": { "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"] },
  "modelBaseCosts": { "anthropic/claude-opus-4-20250514": 1500000, "anthropic/claude-sonnet-4-20250514": 300000, "anthropic/claude-haiku-4-5-20251001": 100000 }
}

I want to cap dangerous tool calls — add:

{ "toolCallLimits": { "send_email": 10, "deploy": 3, "delete_data": 1 } }

I want observability — add:

{ "otlpMetricsEndpoint": "http://localhost:4318/v1/metrics" }

I want to catch runaway loops — add:

{ "burnRateAlertThreshold": 3.0, "onBurnRateAnomaly": "..." }

I want full debugging — add:

{ "enableEventLog": true, "logLevel": "debug" }

Config Reference

Core Settings

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Master switch; set to false to disable the plugin |
| cyclesBaseUrl | string | | Cycles server URL (required) |
| cyclesApiKey | string | | Cycles API key (required) |
| tenant | string | | Cycles tenant identifier (required) |
| budgetScope | object | | Scope segments for targeting a specific budget (e.g. { "workspace": "road", "app": "lane" }). See Budget Scoping. |
| budgetId | string | | Deprecated; use budgetScope instead. Equivalent to budgetScope: { "app": "<value>" }. |
| currency | string | USD_MICROCENTS | Default budget unit for all reservations |
| failClosed | boolean | true | Block model calls when budget is exhausted or reservation is denied (false = warn, allow, and track cost locally). See failClosed behavior. |
| logLevel | string | info | debug / info / warn / error |

Budget Thresholds

| Field | Type | Default | Description |
|---|---|---|---|
| lowBudgetThreshold | number | 10000000 | Remaining budget at or below this triggers "low" mode |
| exhaustedThreshold | number | 0 | Remaining budget at or below this triggers "exhausted" mode |

Note: exhaustedThreshold must be strictly less than lowBudgetThreshold.

Model Configuration

| Field | Type | Default | Description |
|---|---|---|---|
| modelFallbacks | object | {} | Map: model → fallback model or chain of fallbacks (string or string[]) |
| modelBaseCosts | object | {} | Map: model name → estimated cost per call |
| defaultModelCost | number | 500000 | Fallback cost when a model isn't in modelBaseCosts |
| defaultModelActionKind | string | llm.completion | Action kind for model reservations |
| modelCurrency | string | | Override currency for model reservations (defaults to currency) |

Tool Configuration

| Field | Type | Default | Description |
|---|---|---|---|
| toolBaseCosts | object | {} | Map: tool name → estimated cost per call |
| defaultToolActionKindPrefix | string | tool. | Prefix for tool action kinds (e.g. tool.web_search) |
| toolAllowlist | string[] | | Only these tools are permitted (supports * wildcards) |
| toolBlocklist | string[] | | These tools are blocked (supports * wildcards, takes precedence over allowlist) |
| toolCurrencies | object | | Map: tool name → currency override |
| toolReservationTtls | object | | Map: tool name → TTL override in ms |
| toolOveragePolicies | object | | Map: tool name → overage policy override |
| toolCallLimits | object | | Map: tool name → max invocations per session (e.g. {"send_email": 10}) |

Prompt Hints

| Field | Type | Default | Description |
|---|---|---|---|
| injectPromptBudgetHint | boolean | true | Inject budget status into the system prompt |
| maxPromptHintChars | number | 200 | Max characters for the injected budget hint |

Reservation Settings

| Field | Type | Default | Description |
|---|---|---|---|
| reservationTtlMs | number | 60000 | Default TTL for tool reservations (ms) |
| overagePolicy | string | ALLOW_IF_AVAILABLE | Default overage policy (REJECT, ALLOW_IF_AVAILABLE, ALLOW_WITH_OVERDRAFT) |
| snapshotCacheTtlMs | number | 5000 | How long to cache budget snapshots (ms) |

Low Budget Strategies

When budget drops to or below lowBudgetThreshold, the plugin applies degradation strategies to reduce spend. Strategies only activate when explicitly listed in lowBudgetStrategies. The default is ["downgrade_model"].

| Field | Type | Default | Description |
|---|---|---|---|
| lowBudgetStrategies | string[] | ["downgrade_model"] | Strategies to apply when budget is low. Each strategy below only takes effect when listed here. |
| maxTokensWhenLow | number | 1024 | Token limit hint (requires "reduce_max_tokens" in lowBudgetStrategies) |
| expensiveToolThreshold | number | | Cost threshold (requires "disable_expensive_tools" in lowBudgetStrategies) |
| maxRemainingCallsWhenLow | number | 10 | Max calls allowed (requires "limit_remaining_calls" in lowBudgetStrategies) |

downgrade_model — Switch to cheaper models when budget is low. Requires modelFallbacks to define the fallback chain. The plugin tries each candidate in order and picks the first one whose cost (from modelBaseCosts) fits within the remaining budget. If no candidate fits, the original model is used.

{
  "lowBudgetStrategies": ["downgrade_model"],
  "modelFallbacks": {
    "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"]
  },
  "modelBaseCosts": {
    "anthropic/claude-opus-4-20250514": 1500000,
    "anthropic/claude-sonnet-4-20250514": 300000,
    "anthropic/claude-haiku-4-5-20251001": 100000
  }
}

reduce_max_tokens — Append a token limit instruction to the system prompt hint (e.g., "Limit responses to 512 tokens"). This is advisory — the LLM may not obey it. Does not enforce a hard token cap at the API level.

{
  "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens"],
  "maxTokensWhenLow": 512
}

disable_expensive_tools — Block tools whose estimated cost exceeds a threshold. The threshold defaults to lowBudgetThreshold / 10 if not set explicitly. Tools are always hard-blocked (regardless of failClosed).

{
  "lowBudgetStrategies": ["downgrade_model", "disable_expensive_tools"],
  "expensiveToolThreshold": 200000,
  "toolBaseCosts": {
    "web_search": 500000,
    "code_execution": 1000000,
    "read_file": 50000
  }
}

In this example, web_search (500K) and code_execution (1M) would be blocked when budget is low, but read_file (50K) would still be allowed.
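
The blocking rule for this strategy can be sketched as a pure function. This is an illustration under stated assumptions, not the plugin's code: the names are hypothetical, "exceeds" is read as strictly greater than, and unlisted tools are assumed to use the 100,000-unit default estimate mentioned earlier.

```typescript
interface ExpensiveToolConfig {
  lowBudgetThreshold: number;
  expensiveToolThreshold?: number; // when unset, defaults to lowBudgetThreshold / 10
  toolBaseCosts: Record<string, number>;
  defaultToolCost: number; // assumed default estimate for unlisted tools
}

// Resolve the threshold, applying the documented default when it is not set.
function effectiveExpensiveThreshold(cfg: ExpensiveToolConfig): number {
  return cfg.expensiveToolThreshold === undefined
    ? cfg.lowBudgetThreshold / 10
    : cfg.expensiveToolThreshold;
}

// True when the tool would be blocked while budget is low.
function isBlockedAsExpensive(toolName: string, cfg: ExpensiveToolConfig): boolean {
  const listed = cfg.toolBaseCosts[toolName];
  const cost = listed === undefined ? cfg.defaultToolCost : listed;
  return cost > effectiveExpensiveThreshold(cfg);
}

const exampleToolConfig: ExpensiveToolConfig = {
  lowBudgetThreshold: 10_000_000,
  expensiveToolThreshold: 200_000,
  toolBaseCosts: { web_search: 500_000, code_execution: 1_000_000, read_file: 50_000 },
  defaultToolCost: 100_000,
};
console.log(isBlockedAsExpensive("web_search", exampleToolConfig)); // true
console.log(isBlockedAsExpensive("read_file", exampleToolConfig)); // false
```

Note that with expensiveToolThreshold left unset and the default lowBudgetThreshold, the effective threshold is 1,000,000, so under this reading a tool costing exactly 1,000,000 would not be blocked.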

limit_remaining_calls — Cap the total number of model + tool calls allowed while budget is low. Both model and tool calls decrement a shared counter. When the counter reaches zero, models respect failClosed (block or warn) while tools are always blocked.

{
  "lowBudgetStrategies": ["downgrade_model", "limit_remaining_calls"],
  "maxRemainingCallsWhenLow": 5
}

Important: Each strategy's config parameters (e.g., maxTokensWhenLow, expensiveToolThreshold, maxRemainingCallsWhenLow) are ignored unless the corresponding strategy is listed in lowBudgetStrategies. The plugin warns at startup if it detects this misconfiguration.

Strategies can be combined. They run in different hooks:

  • Model calls (before_model_resolve): downgrade_model → limit_remaining_calls
  • Tool calls (before_tool_call): disable_expensive_tools → limit_remaining_calls
  • Prompt build (before_prompt_build): reduce_max_tokens

Within each hook, an earlier strategy that blocks prevents later strategies from running.

A typical production config uses all four:

{
  "lowBudgetStrategies": ["downgrade_model", "reduce_max_tokens", "disable_expensive_tools", "limit_remaining_calls"],
  "maxTokensWhenLow": 512,
  "expensiveToolThreshold": 200000,
  "maxRemainingCallsWhenLow": 5
}

Retry on Deny

| Field | Type | Default | Description |
|---|---|---|---|
| retryOnDeny | boolean | false | Retry tool reservations after denial |
| retryDelayMs | number | 2000 | Delay between retries (ms) |
| maxRetries | number | 1 | Maximum retry attempts |

Dry-Run Mode

| Field | Type | Default | Description |
|---|---|---|---|
| dryRun | boolean | false | Use in-memory simulated budget (no Cycles server needed) |
| dryRunBudget | number | 100000000 | Starting budget for dry-run mode |

Cost Estimation

| Field | Type | Default | Description |
|---|---|---|---|
| costEstimator | function | | Custom callback (context) => number \| undefined for dynamic tool cost estimation |

The costEstimator receives a context object with toolName, durationMs, estimate, and result and should return the actual cost or undefined to use the estimate.

Budget Transitions

| Field | Type | Default | Description |
|---|---|---|---|
| onBudgetTransition | function | | Callback fired when budget level changes (e.g. healthy → low) |
| budgetTransitionWebhookUrl | string | | POST webhook URL for budget level transitions |

Per-User/Session Scoping

| Field | Type | Default | Description |
|---|---|---|---|
| userId | string | | User ID for budget scoping (can be overridden via ctx.metadata.userId) |
| sessionId | string | | Session ID for budget scoping (can be overridden via ctx.metadata.sessionId) |

Session Analytics

| Field | Type | Default | Description |
|---|---|---|---|
| onSessionEnd | function | | Callback with session summary at agent end |
| analyticsWebhookUrl | string | | POST webhook URL for session summary data |

Budget Scoping with budgetScope

By default, the plugin tracks all spend against the tenant-level budget. If you run multiple agents or applications under the same tenant, they share one budget pool — one agent can consume the entire budget and starve others.

budgetScope targets a specific budget in the Cycles scope hierarchy. It supports any combination of scope levels (workspace, app, workflow, agent, toolset):

tenant: "my-org"                        ← shared across all apps
  workspace: "team-a"                   ← team-level isolation
    app: "research-agent"               ← $5 budget
    app: "coding-agent"                 ← $10 budget

Step 1: Create and fund the budget in Cycles (via the Admin API):

curl -X POST "http://localhost:7979/v1/admin/budgets/fund?scope=tenant:my-org/workspace:team-a/app:research-agent&unit=USD_MICROCENTS" \
  -H "X-Cycles-API-Key: your-admin-key" \
  -H "Content-Type: application/json" \
  -d '{
    "operation": "CREDIT",
    "amount": 500000000,
    "idempotency_key": "fund-research-agent-001"
  }'

This creates the scope under tenant:my-org and funds it with 500,000,000 units ($5.00). The scope is created automatically if it doesn't exist.

Step 2: Set budgetScope in the plugin config:

{
  "tenant": "my-org",
  "budgetScope": {
    "workspace": "team-a",
    "app": "research-agent"
  }
}

The plugin then:

  • Queries balances filtered to workspace: "team-a", app: "research-agent"
  • Creates reservations scoped to all specified segments
  • Reports spend against that specific budget, not the tenant total

For simple app-only scoping, use just the app key:

{
  "tenant": "my-org",
  "budgetScope": { "app": "research-agent" }
}

Migration from budgetId: budgetId is deprecated but still works. "budgetId": "research-agent" is equivalent to "budgetScope": { "app": "research-agent" }. If both are set, budgetScope takes precedence.
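
The precedence rule and the scope string used in the funding request can be sketched as follows. The function names are hypothetical; the path format (level:value segments joined by "/") is taken from the curl example above, and the level order is assumed to follow the scope hierarchy.

```typescript
type ScopeLevel = "workspace" | "app" | "workflow" | "agent" | "toolset";
type BudgetScope = Partial<Record<ScopeLevel, string>>;

// Assumed ordering of scope levels beneath the tenant.
const SCOPE_ORDER: ScopeLevel[] = ["workspace", "app", "workflow", "agent", "toolset"];

// budgetScope wins when both are set; a bare budgetId maps to { app: budgetId }.
function resolveBudgetScope(cfg: { budgetScope?: BudgetScope; budgetId?: string }): BudgetScope {
  if (cfg.budgetScope) return cfg.budgetScope;
  if (cfg.budgetId) return { app: cfg.budgetId };
  return {};
}

// Build a scope string such as "tenant:my-org/workspace:team-a/app:research-agent".
function scopePath(tenant: string, scope: BudgetScope): string {
  const parts: string[] = [`tenant:${tenant}`];
  for (const level of SCOPE_ORDER) {
    const value = scope[level];
    if (value) parts.push(`${level}:${value}`);
  }
  return parts.join("/");
}

console.log(scopePath("my-org", resolveBudgetScope({ budgetId: "research-agent" })));
// tenant:my-org/app:research-agent
```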

When to use budgetScope:

  • Multiple agents under the same tenant that need isolated budgets
  • Per-project or per-team spend tracking
  • Budgets with intermediate scope levels (workspace, workflow, etc.)
  • Preventing one agent from consuming the entire tenant budget

When to skip it:

  • Single agent setup — tenant-level budget is sufficient
  • You want all agents to share a single budget pool

Cycles scope hierarchy and what the plugin supports:

The Cycles protocol supports a full scope hierarchy: tenant → workspace → app → workflow → agent → toolset. The plugin supports all levels via budgetScope:

| Cycles scope | Plugin config | Used for |
|---|---|---|
| tenant | tenant (required) | Top-level budget boundary |
| workspace, app, workflow, agent, toolset | budgetScope (optional) | Budget isolation at any scope level |
| dimensions.user | userId | Per-user spend tracking within a scope |
| dimensions.session | sessionId | Per-session spend tracking within a scope |

Budget Pools (Team Visibility)

| Field | Type | Default | Description |
|---|---|---|---|
| parentBudgetId | string | | Parent budget scope. When set, the team/pool balance is included in prompt hints |

parentBudgetId is a read-only visibility feature. When set, the plugin fetches the parent scope's balance and includes it in the prompt hint (e.g., "Team pool: 50000000 remaining"). It does not enforce the parent budget — enforcement happens at the scoped level via budgetScope.

Model Cost Reconciliation (v0.5.0)

Field Type Default Description
modelCostEstimator function Callback `(ctx: { model, estimatedCost, turnIndex }) => number

Observability (v0.5.0)

| Field | Type | Default | Description |
|---|---|---|---|
| metricsEmitter | object | | Object with gauge/counter/histogram methods and optional flush() for observability pipeline integration. flush() is called at agent_end to ensure buffered metrics are sent. |
| aggressiveCacheInvalidation | boolean | true | Proactively refetch budget snapshot after every commit/release for fresher data |
| otlpMetricsEndpoint | string | | OTLP HTTP endpoint for auto metrics export (e.g. http://localhost:4318/v1/metrics) |
| otlpMetricsHeaders | object | | Custom HTTP headers for OTLP requests |

Resilience (v0.6.0)

| Field | Type | Default | Description |
|---|---|---|---|
| heartbeatIntervalMs | number | 30000 | Interval for auto-extending long-running tool reservations (ms). Set 0 to disable. |
| retryableStatusCodes | number[] | [429, 503, 504] | HTTP status codes that trigger automatic retry with exponential backoff |
| transientRetryMaxAttempts | number | 2 | Max retry attempts for transient Cycles server errors |
| transientRetryBaseDelayMs | number | 500 | Base delay for exponential backoff on retries (ms) |
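
To make the retry settings concrete, the sketch below computes a backoff schedule from them. The doubling formula (base delay times 2 to the power of the retry index) is an assumption about what "exponential backoff" means here; the plugin's exact schedule, and whether it adds jitter, is not documented in this section. All names are hypothetical.

```typescript
interface TransientRetryConfig {
  retryableStatusCodes: number[];
  transientRetryMaxAttempts: number;
  transientRetryBaseDelayMs: number;
}

// Defaults copied from the table above.
const DEFAULT_TRANSIENT_RETRY: TransientRetryConfig = {
  retryableStatusCodes: [429, 503, 504],
  transientRetryMaxAttempts: 2,
  transientRetryBaseDelayMs: 500,
};

// True when a response status should trigger a retry.
function isRetryableStatus(
  status: number,
  cfg: TransientRetryConfig = DEFAULT_TRANSIENT_RETRY,
): boolean {
  return cfg.retryableStatusCodes.indexOf(status) !== -1;
}

// Assumed schedule: the delay doubles on each retry (base, 2 * base, 4 * base, ...).
function retryDelaysMs(cfg: TransientRetryConfig = DEFAULT_TRANSIENT_RETRY): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < cfg.transientRetryMaxAttempts; attempt++) {
    delays.push(cfg.transientRetryBaseDelayMs * 2 ** attempt);
  }
  return delays;
}

console.log(retryDelaysMs()); // [ 500, 1000 ]
console.log(isRetryableStatus(503)); // true
console.log(isRetryableStatus(401)); // false
```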

Anomaly Detection (v0.6.0)

| Field | Type | Default | Description |
|---|---|---|---|
| burnRateWindowMs | number | 60000 | Time window for burn rate anomaly detection (ms) |
| burnRateAlertThreshold | number | 3.0 | Alert when current window burn rate exceeds this multiple of the previous window |
| onBurnRateAnomaly | function | | Callback (event: BurnRateAnomalyEvent) => void on burn rate spike |
| exhaustionWarningThresholdMs | number | 120000 | Warn when estimated time-to-exhaustion drops below this (ms) |
| onExhaustionForecast | function | | Callback (event: ExhaustionForecastEvent) => void on exhaustion forecast |
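
One way to read these settings is as a comparison of spend between two adjacent time windows, plus a linear projection of remaining budget. The sketch below follows that reading. The sample shape, the function names, and the linear forecast are assumptions for illustration and may differ from the plugin's internals.

```typescript
interface SpendSample {
  timestampMs: number;
  amount: number; // units spent at this moment
}

// Total spend with startMs <= timestampMs < endMs.
function spendInWindow(samples: SpendSample[], startMs: number, endMs: number): number {
  let total = 0;
  for (const sample of samples) {
    if (sample.timestampMs >= startMs && sample.timestampMs < endMs) total += sample.amount;
  }
  return total;
}

// Flag an anomaly when the current window spends more than `threshold` times the previous one.
function isBurnRateAnomaly(
  samples: SpendSample[],
  nowMs: number,
  windowMs: number = 60_000,
  threshold: number = 3.0,
): boolean {
  const current = spendInWindow(samples, nowMs - windowMs, nowMs);
  const previous = spendInWindow(samples, nowMs - 2 * windowMs, nowMs - windowMs);
  if (previous <= 0) return false; // no baseline to compare against
  return current > previous * threshold;
}

// Linear projection: how long the remaining budget lasts at the current window's rate.
function estimateTimeToExhaustionMs(
  remaining: number,
  spentInCurrentWindow: number,
  windowMs: number,
): number {
  if (spentInCurrentWindow <= 0) return Infinity;
  return (remaining / spentInCurrentWindow) * windowMs;
}
```

For example, spending 100,000 units in the previous minute and 400,000 in the current one exceeds the default 3.0 multiple, so the check returns true. With 800,000 units remaining, the projection gives 120,000 ms until exhaustion.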

Debugging (v0.6.0)

| Field | Type | Default | Description |
|---|---|---|---|
| enableEventLog | boolean | false | Record every reserve/commit/deny/block decision in sessionSummary.eventLog |

Function-Type Config — Not Available in OpenClaw

The config reference includes several function-type parameters (costEstimator, modelCostEstimator, onBudgetTransition, onSessionEnd, onBurnRateAnomaly, onExhaustionForecast, metricsEmitter). These cannot be used with OpenClaw. OpenClaw plugins are configured via JSON only — there is no mechanism to pass JavaScript functions.

Use these JSON-configurable alternatives instead:

| Instead of... | Use... | How it works |
|---|---|---|
| costEstimator | toolBaseCosts | Fixed cost per tool. Tune estimates using session summary data. |
| modelCostEstimator | modelBaseCosts | Fixed cost per model. |
| onBudgetTransition | budgetTransitionWebhookUrl | Sends HTTP POST with level change event to your endpoint. |
| onSessionEnd | analyticsWebhookUrl | Sends HTTP POST with full session summary to your endpoint. |
| onBurnRateAnomaly | otlpMetricsEndpoint | Emits cycles.budget.burn_rate_anomaly counter to your OTLP backend. |
| onExhaustionForecast | otlpMetricsEndpoint | Emits cycles.budget.exhaustion_forecast_ms gauge to your OTLP backend. |
| metricsEmitter | otlpMetricsEndpoint | Auto-creates an OTLP emitter; no custom code needed. |

Tuning cost estimates without a callback:

  1. Start with rough values in toolBaseCosts / modelBaseCosts (or use defaults)
  2. Set enableEventLog: true in your config
  3. Run a few agent sessions
  4. Check the session summary in the logs — it shows per-tool and per-model cost breakdowns, plus an unconfiguredTools list of tools using the default estimate
  5. Adjust your cost values based on actual usage patterns and re-run

This iterative approach is more practical than writing a cost estimator function, since the estimates only need to be "close enough" — Cycles reservations lock the estimated amount and commits charge the actual.

Why do the function params exist? The plugin is also published as an npm package. The function API is available for developers who import the plugin as a library in custom agent frameworks or test harnesses — not for standard OpenClaw JSON config.

How It Works

Budget Levels

| Level | Condition | What Happens |
|---|---|---|
| healthy | remaining > lowBudgetThreshold | Pass through; no intervention |
| low | exhaustedThreshold < remaining <= lowBudgetThreshold | Apply low-budget strategies, inject warnings |
| exhausted | remaining <= exhaustedThreshold | Block execution (failClosed=true) or warn + track locally (failClosed=false) |
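
The three conditions translate directly into a classification function. This is a minimal sketch with a hypothetical function name; the defaults and the strict ordering check come from the Budget Thresholds section.

```typescript
type BudgetLevel = "healthy" | "low" | "exhausted";

// Classify the remaining budget using the conditions in the table above.
function classifyBudgetLevel(
  remaining: number,
  lowBudgetThreshold: number = 10_000_000,
  exhaustedThreshold: number = 0,
): BudgetLevel {
  if (exhaustedThreshold >= lowBudgetThreshold) {
    throw new RangeError("exhaustedThreshold must be strictly less than lowBudgetThreshold");
  }
  if (remaining <= exhaustedThreshold) return "exhausted";
  if (remaining <= lowBudgetThreshold) return "low";
  return "healthy";
}

console.log(classifyBudgetLevel(10_000_001)); // healthy
console.log(classifyBudgetLevel(10_000_000)); // low
console.log(classifyBudgetLevel(0)); // exhausted
```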

Hook: before_model_resolve

Fetches budget state and reserves budget for the model call. The reservation is held open and committed later (in before_prompt_build or at agent_end), allowing the optional modelCostEstimator callback to reconcile estimated vs actual costs. When budget is low:

  • Applies model fallbacks (including chained fallbacks like opus → [sonnet, haiku])
  • Enforces limit_remaining_calls if configured
  • Attaches budget status metadata to ctx.metadata["openclaw-budget-guard-status"]

When budget is exhausted and failClosed=true, the plugin blocks the model call by overriding the model name to __cycles_budget_exhausted__, which causes the LLM provider to reject the request. The user sees an error such as "Unknown model: openai/__cycles_budget_exhausted__". This is intentional: OpenClaw's before_model_resolve hook does not support { block: true } the way before_tool_call does (feature request), so this workaround is the only way to prevent model execution when budget runs out.

Hook: before_prompt_build

Commits any pending model reservation from the previous turn (with modelCostEstimator reconciliation if configured). When injectPromptBudgetHint is enabled, injects a system context hint with:

  • Current remaining balance and percentage
  • Budget level warnings
  • Forecast projections (estimated remaining tool/model calls based on average costs)
  • Team pool balance (when parentBudgetId is configured)
  • Token limit guidance (when reduce_max_tokens strategy is active)

Example hint:

Budget: 5000000 USD_MICROCENTS remaining. Budget is low — prefer cheaper models and avoid expensive tools. 50% of budget remaining. Est. ~10 tool calls and ~5 model calls remaining at current rate. Team pool: 50000000 remaining.

Hook: before_tool_call

  1. Checks tool permissions against allowlist/blocklist
  2. Applies disable_expensive_tools and limit_remaining_calls strategies
  3. Creates a Cycles reservation with configured TTL, overage policy, and currency
  4. On denial, optionally retries (when retryOnDeny=true)
  5. Blocks or allows based on the reservation decision

Hook: after_tool_call

Commits the reservation with actual cost. Uses the costEstimator callback if configured, otherwise uses the original estimate. Tracks per-tool cost breakdowns for the session summary.
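
The reserve → commit → release flow behind before_tool_call and after_tool_call can be illustrated with a small in-memory ledger. This is not the Cycles client API: the class, its method names, and the denial rule (estimate larger than the unreserved balance) are assumptions that only mirror the flow described here.

```typescript
class InMemoryBudget {
  private remaining: number;
  private reserved: Record<string, number> = Object.create(null);
  private nextId = 1;

  constructor(startingBudget: number) {
    this.remaining = startingBudget;
  }

  // Balance not locked by open reservations.
  available(): number {
    let locked = 0;
    for (const id of Object.keys(this.reserved)) locked += this.reserved[id];
    return this.remaining - locked;
  }

  // Lock the estimated amount; returns a reservation id, or undefined when denied.
  reserve(estimate: number): string | undefined {
    if (estimate > this.available()) return undefined;
    const id = `res-${this.nextId++}`;
    this.reserved[id] = estimate;
    return id;
  }

  // Charge the actual cost (or the estimate when no actual is given) and unlock.
  commit(id: string, actual?: number): void {
    const estimate = this.reserved[id];
    if (estimate === undefined) throw new Error(`unknown reservation: ${id}`);
    delete this.reserved[id];
    this.remaining -= actual === undefined ? estimate : actual;
  }

  // Unlock without charging, e.g. for orphaned reservations at agent_end.
  release(id: string): void {
    delete this.reserved[id];
  }
}
```

In this sketch, reserving 500,000 units against a 1,000,000-unit budget leaves 500,000 available, and committing with an actual cost of 300,000 returns the unused 200,000 to the pool.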

Hook: agent_end

  1. Releases orphaned reservations (defensive cleanup)
  2. Fetches final budget state
  3. Builds session summary with cost breakdown, forecasts, and timing
  4. Calls onSessionEnd callback and fires analytics webhook if configured
  5. Attaches summary to ctx.metadata["openclaw-budget-guard"]

Chained Model Fallbacks

Model fallbacks support both single values and ordered chains:

{
  "modelFallbacks": {
    "anthropic/claude-opus-4-20250514": ["anthropic/claude-sonnet-4-20250514", "anthropic/claude-haiku-4-5-20251001"],
    "openai/gpt-4o": "openai/gpt-4o-mini"
  }
}

When budget is low, the plugin tries each candidate in order and selects the first one whose cost fits within the remaining budget.
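
The selection rule can be sketched as a loop over the candidates. The function name is hypothetical, and "fits" is assumed to mean that the candidate's estimated cost is at most the remaining budget; models missing from modelBaseCosts fall back to defaultModelCost.

```typescript
type ModelFallbacks = Record<string, string | string[]>;

// Pick the first fallback whose estimated cost fits; otherwise keep the requested model.
function selectModelWhenLow(
  requested: string,
  remaining: number,
  modelFallbacks: ModelFallbacks,
  modelBaseCosts: Record<string, number>,
  defaultModelCost: number = 500_000,
): string {
  const entry = modelFallbacks[requested];
  if (entry === undefined) return requested; // no fallback configured
  const candidates = typeof entry === "string" ? [entry] : entry;
  for (const candidate of candidates) {
    const listed = modelBaseCosts[candidate];
    const cost = listed === undefined ? defaultModelCost : listed;
    if (cost <= remaining) return candidate;
  }
  return requested; // no candidate fits, so the original model is used
}

const exampleFallbacks: ModelFallbacks = {
  "anthropic/claude-opus-4-20250514": [
    "anthropic/claude-sonnet-4-20250514",
    "anthropic/claude-haiku-4-5-20251001",
  ],
};
const exampleModelCosts: Record<string, number> = {
  "anthropic/claude-opus-4-20250514": 1_500_000,
  "anthropic/claude-sonnet-4-20250514": 300_000,
  "anthropic/claude-haiku-4-5-20251001": 100_000,
};
console.log(
  selectModelWhenLow("anthropic/claude-opus-4-20250514", 200_000, exampleFallbacks, exampleModelCosts),
); // anthropic/claude-haiku-4-5-20251001
```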

Tool Allowlists and Blocklists

Control which tools can be called using glob-style patterns:

{
  "toolAllowlist": ["web_search", "code_*"],
  "toolBlocklist": ["dangerous_*"]
}

  • Blocklist takes precedence over allowlist
  • Supports exact names and * wildcards (prefix: code_*, suffix: *_tool, all: *)
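
The matching rules can be sketched with a small wildcard matcher. The names are hypothetical and the plugin's own matcher may differ in edge cases; the sketch assumes * matches any run of characters, the blocklist is checked first, and a configured allowlist denies everything it does not match.

```typescript
// Convert a pattern with * wildcards into an anchored regular expression.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&")) // escape regex metacharacters
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// True when the tool name matches at least one pattern.
function matchesAny(tool: string, patterns: string[]): boolean {
  return patterns.some((pattern) => patternToRegExp(pattern).test(tool));
}

// Blocklist takes precedence; with an allowlist set, only matching tools pass.
function isToolPermitted(tool: string, allowlist?: string[], blocklist?: string[]): boolean {
  if (blocklist && matchesAny(tool, blocklist)) return false;
  if (allowlist) return matchesAny(tool, allowlist);
  return true;
}

const exampleAllow = ["web_search", "code_*"];
const exampleBlock = ["dangerous_*"];
console.log(isToolPermitted("code_execution", exampleAllow, exampleBlock)); // true
console.log(isToolPermitted("dangerous_delete", exampleAllow, exampleBlock)); // false
console.log(isToolPermitted("send_email", exampleAllow, exampleBlock)); // false
```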

Tool Call Limits

Cap the number of times a specific tool can be invoked per session. Useful for consequential actions like sending emails or triggering deployments:

{
  "toolCallLimits": {
    "send_email": 10,
    "deploy": 3
  }
}

Once a tool reaches its limit, further calls are blocked with a descriptive reason. Tools without a limit are unrestricted. Limits reset on each new agent session.
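
A per-session counter of this kind might look like the sketch below. `createToolCallCounter` is a hypothetical helper and the denial message is illustrative; creating a new counter per session is what resets the limits.

```typescript
type LimitResult = { allowed: boolean; reason?: string };

// Creates a fresh counter; call this once per agent session.
function createToolCallCounter(limits: Record<string, number>) {
  const counts: Record<string, number> = {};
  const own = (table: Record<string, number>, key: string): number | undefined =>
    Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined;

  return function tryCall(toolName: string): LimitResult {
    const limit = own(limits, toolName);
    const used = own(counts, toolName) ?? 0;
    // Tools without a configured limit are unrestricted.
    if (limit !== undefined && used >= limit) {
      return {
        allowed: false,
        reason: `Tool "${toolName}" reached its session limit of ${limit} calls`,
      };
    }
    counts[toolName] = used + 1;
    return { allowed: true };
  };
}
```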

Budget Transition Alerts

Configure callbacks or webhooks to be notified when budget level changes:

{
  "budgetTransitionWebhookUrl": "https://hooks.example.com/budget-alert"
}

Or programmatically:

{
  onBudgetTransition: (event) => {
    console.log(`Budget changed: ${event.previousLevel} -> ${event.currentLevel}`);
  }
}
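
To make the notion of a level change concrete, here is a rough sketch of how transitions could be detected. The level names ("healthy", "low", "exhausted"), the `remaining <= 0` exhaustion rule, and the `remaining` event field are assumptions for illustration; only `previousLevel` and `currentLevel` appear in the example above.

```typescript
// Assumed level names; the plugin's actual values may differ.
type BudgetLevel = "healthy" | "low" | "exhausted";

interface BudgetTransitionEvent {
  previousLevel: BudgetLevel;
  currentLevel: BudgetLevel;
  remaining: number; // assumed extra field
}

function classifyBudget(remaining: number, lowBudgetThreshold: number): BudgetLevel {
  if (remaining <= 0) return "exhausted"; // assumed exhaustion rule
  if (remaining <= lowBudgetThreshold) return "low";
  return "healthy";
}

// Fires the callback only when the level differs from the last observation.
function createTransitionTracker(
  lowBudgetThreshold: number,
  onBudgetTransition: (e: BudgetTransitionEvent) => void,
) {
  let lastLevel: BudgetLevel | undefined;
  return function update(remaining: number): BudgetLevel {
    const currentLevel = classifyBudget(remaining, lowBudgetThreshold);
    const previousLevel = lastLevel;
    if (previousLevel !== undefined && previousLevel !== currentLevel) {
      onBudgetTransition({ previousLevel, currentLevel, remaining });
    }
    lastLevel = currentLevel;
    return currentLevel;
  };
}
```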

Error Handling

The plugin exports two structured error types:

import { BudgetExhaustedError, ToolBudgetDeniedError } from "@runcycles/openclaw-budget-guard";
  • BudgetExhaustedError (code: "BUDGET_EXHAUSTED") — thrown when budget is exhausted and failClosed=true. Includes remaining, tenant, and budgetId properties. The error message includes an actionable hint to increase budget via the Cycles API.
  • ToolBudgetDeniedError (code: "TOOL_BUDGET_DENIED") — available as a structured error type for tool denials. Includes toolName property.
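
One way to handle these errors is to branch on the documented `code` values. The sketch below uses local stand-in types so it runs without the package installed; the property types (`remaining` as a number, `tenant` and `budgetId` as strings) and the `describeFailure` helper are assumptions.

```typescript
// Local stand-ins mirroring the documented properties; types are assumed.
interface BudgetExhaustedLike {
  code: "BUDGET_EXHAUSTED";
  remaining: number;
  tenant: string;
  budgetId: string;
}
interface ToolBudgetDeniedLike {
  code: "TOOL_BUDGET_DENIED";
  toolName: string;
}

function hasCode(err: unknown, code: string): boolean {
  return typeof err === "object" && err !== null && (err as { code?: unknown }).code === code;
}

function isBudgetExhausted(err: unknown): err is BudgetExhaustedLike {
  return hasCode(err, "BUDGET_EXHAUSTED");
}

function isToolBudgetDenied(err: unknown): err is ToolBudgetDeniedLike {
  return hasCode(err, "TOOL_BUDGET_DENIED");
}

// Turns a caught error into an operator-facing message.
function describeFailure(err: unknown): string {
  if (isBudgetExhausted(err)) {
    return `Budget ${err.budgetId} for tenant ${err.tenant} is exhausted (${err.remaining} remaining)`;
  }
  if (isToolBudgetDenied(err)) {
    return `Tool ${err.toolName} was denied by the budget guard`;
  }
  return "Unexpected error";
}
```

Checking `code` rather than using `instanceof` is a design choice for this sketch; it keeps working if two copies of the package end up in the dependency tree.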

failClosed — Block vs. Allow on Budget Denial

The failClosed setting (default: true) controls what happens when a model reservation is denied — either because the budget is exhausted or because the Cycles server rejects the reservation (e.g., the estimated cost exceeds remaining budget).

failClosed: true — The plugin blocks the model call. It returns a synthetic model override (__cycles_budget_exhausted__) that causes the LLM provider to reject the request. The agent stops. Use this in production when overspend is unacceptable.

failClosed: false — The plugin logs a warning and allows the model call to proceed. The estimated cost is tracked locally (session summary, cost breakdown, forecasting) even though no server-side reservation was committed. Use this for shadow/monitoring mode — you see what would have been blocked without disrupting the agent.

| Scenario | failClosed: true | failClosed: false |
|---|---|---|
| Budget exhausted (cached snapshot) | Block | Warn + allow |
| Server denies reservation (estimate > remaining) | Block | Warn + allow + track cost locally |
| Low-budget call limit reached (model) | Block | Warn + allow |
| Low-budget call limit reached (tool) | Always block | Always block |
| Expensive tool threshold exceeded | Always block | Always block |
| Tool reservation denied | Always block | Always block |

Note: All tool-level enforcement (reservation denials, call limits, expensive tool threshold) always blocks regardless of failClosed — tools have no fallback mechanism. failClosed only affects model-level decisions.
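
The table can be read as a small decision function. The scenario identifiers and `decideOnDenial` below are hypothetical labels for the table rows, not values used by the plugin.

```typescript
// Hypothetical identifiers for the six rows of the table.
type DenialScenario =
  | "budget_exhausted"
  | "reservation_denied"
  | "model_call_limit"
  | "tool_call_limit"
  | "expensive_tool_threshold"
  | "tool_reservation_denied";

type DenialOutcome = "block" | "warn_and_allow";

function decideOnDenial(scenario: DenialScenario, failClosed: boolean): DenialOutcome {
  switch (scenario) {
    // Tool-level enforcement always blocks, regardless of failClosed.
    case "tool_call_limit":
    case "expensive_tool_threshold":
    case "tool_reservation_denied":
      return "block";
    // Model-level decisions follow the failClosed setting.
    default:
      return failClosed ? "block" : "warn_and_allow";
  }
}
```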

Fail-Open Behavior (Network Errors)

Separately from failClosed, the plugin handles network/transient errors with a fail-open strategy:

  • If the Cycles server is unreachable, the plugin assumes healthy budget and allows execution
  • If a commit fails, execution continues (logged but non-blocking)

This is always fail-open regardless of failClosed — a transient network blip should not kill every agent. failClosed only controls behavior when the server confirms the budget is insufficient.
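
The distinction between a confirmed denial and a transient failure can be sketched like this. It is a simplified, synchronous illustration: `checkWithFailOpen` and `queryServer` are hypothetical names, and the real plugin performs the check asynchronously.

```typescript
type BudgetCheck = { allowed: boolean; source: "server" | "fail-open" };

// queryServer returns the server's decision, or throws on a network error.
function checkWithFailOpen(
  queryServer: () => boolean,
  onWarn: (message: string) => void,
): BudgetCheck {
  try {
    // A confirmed answer (allow or deny) is passed through unchanged.
    return { allowed: queryServer(), source: "server" };
  } catch (err) {
    // Transient failure: assume healthy budget and keep the agent running.
    onWarn("Cycles server unreachable; assuming healthy budget: " + String(err));
    return { allowed: true, source: "fail-open" };
  }
}
```

Note that a denial returned by the server is never turned into an allow here; only a thrown error triggers the fail-open path.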

Troubleshooting

"Skipping registration" warning during install

  • This is normal. OpenClaw loads the plugin during install before your config is written. The plugin detects the missing config, logs a warning, and skips registration. After you add your config and restart the gateway, the plugin will register normally.

Plugin not loading

  • Verify the plugin is enabled: openclaw plugins list
  • Check that openclaw.plugin.json is included in the installed package

"Unknown model: openai/__cycles_budget_exhausted__" or "Budget exhausted"

Your budget has run out. To resume:

  1. Fund the budget via the Cycles Admin API:

    curl -X POST "http://localhost:7979/v1/admin/budgets/fund?scope=tenant:my-org&unit=USD_MICROCENTS" \
      -H "X-Cycles-API-Key: your-admin-key" \
      -H "Content-Type: application/json" \
      -d '{"operation": "CREDIT", "amount": 50000000, "idempotency_key": "topup-001"}'

    This adds 50,000,000 units ($0.50) to the budget. Adjust the scope to match your tenant and budgetScope.

  2. Start a new agent session — the plugin fetches fresh budget state at the start of each session.

For details on budget management, see Budget Allocation and Management.
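
For the unit arithmetic, the figures in this README (50,000,000 units = $0.50) imply 100,000,000 USD_MICROCENTS per dollar. A pair of hypothetical conversion helpers, not part of the plugin, makes that explicit:

```typescript
// Derived from the README's own figures: 50,000,000 units = $0.50.
const MICROCENTS_PER_USD = 100000000;

function microcentsToUsd(units: number): number {
  return units / MICROCENTS_PER_USD;
}

function usdToMicrocents(usd: number): number {
  // Round to avoid fractional units from floating-point input.
  return Math.round(usd * MICROCENTS_PER_USD);
}
```

For instance, `microcentsToUsd(50000000)` gives 0.5, and `usdToMicrocents(2)` gives 200000000, the amount to pass when crediting a budget with $2.00.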

"cyclesBaseUrl is required" error

  • Set cyclesBaseUrl in your plugin config (use "${CYCLES_BASE_URL}" for env var interpolation)

Budget always shows "healthy"

  • Verify currency, tenant, and budgetScope match your Cycles setup
  • Set logLevel: "debug" to see raw balance responses

Tools not being blocked

  • Check toolBaseCosts includes your tool (default cost is 100,000 units)
  • Check failClosed is true (default)

Model not being downgraded

  • The exact model name must match a key in modelFallbacks
  • Check model costs in modelBaseCosts: a fallback is selected only if its estimated cost fits within the remaining budget

Production checklist

Before deploying to production:

  • API key stored as env var (CYCLES_API_KEY), not in config file
  • failClosed: true (default — blocks on exhausted budget)
  • dryRun: false (default — uses real Cycles server)
  • modelBaseCosts set for each model your agent uses
  • toolBaseCosts set for at least your top 5 tools by usage
  • toolCallLimits set for dangerous tools (send_email, deploy, etc.)
  • lowBudgetThreshold calibrated for your session duration (default 10M = $0.10)
  • Budget transition monitoring via onBudgetTransition callback or budgetTransitionWebhookUrl
  • Session analytics via onSessionEnd callback or analyticsWebhookUrl
  • Run one test session with logLevel: "debug" and enableEventLog: true to verify costs
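
Parts of this checklist can be checked mechanically. The sketch below is a hypothetical pre-deploy helper, not something the plugin ships; it covers only the items that are visible in a config object and uses only setting names mentioned in this README.

```typescript
// Subset of the plugin config covered by this check.
interface GuardConfigSubset {
  failClosed?: boolean;
  dryRun?: boolean;
  modelBaseCosts?: Record<string, number>;
  toolCallLimits?: Record<string, number>;
}

// Returns one warning per checklist item that the config does not satisfy.
function productionWarnings(cfg: GuardConfigSubset, modelsInUse: string[]): string[] {
  const warnings: string[] = [];
  if (cfg.failClosed === false) {
    warnings.push("failClosed is false: model calls are allowed on budget denial");
  }
  if (cfg.dryRun === true) {
    warnings.push("dryRun is true: no real Cycles server is used");
  }
  for (const model of modelsInUse) {
    if (!cfg.modelBaseCosts || cfg.modelBaseCosts[model] === undefined) {
      warnings.push(`modelBaseCosts has no entry for ${model}`);
    }
  }
  if (!cfg.toolCallLimits || Object.keys(cfg.toolCallLimits).length === 0) {
    warnings.push("toolCallLimits is empty: consequential tools are uncapped");
  }
  return warnings;
}
```

An empty result means the covered items pass; the remaining items (API key handling, monitoring callbacks, the debug test session) still need a manual check.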

Known Limitations

| Limitation | Impact | Workaround |
|---|---|---|
| Model cost is estimated by default. OpenClaw has no after_model_resolve hook, so model costs are based on modelBaseCosts estimates. The modelCostEstimator callback can reconcile costs if you have a proxy or gateway with token counts. | Cost tracking for models is approximate unless you provide a modelCostEstimator. The plugin will never overspend, though it may under-track slightly. | Use modelCostEstimator to reconcile costs, or buffer modelBaseCosts estimates 10–20% higher than expected. |
| ALLOW_WITH_CAPS decisions are not enforced. If the Cycles server returns caps (max_tokens, tool allowlist) alongside an ALLOW decision, the plugin stores them but does not apply them downstream. | Low risk: v0 Cycles servers rarely return caps. | Monitor Cycles protocol updates. |
| Per-user/session scoping uses custom dimensions. User and session IDs are passed as dimensions.user / dimensions.session in the reservation subject. v0 Cycles servers may ignore custom dimensions for balance filtering. | Per-user budget isolation depends on server support for dimensions. | Verify scoping works with your Cycles server version before relying on it in production. |
| Heartbeat requires client support. Reservation auto-extension (heartbeatIntervalMs) calls client.extendReservation(). If the Cycles client does not implement this method, heartbeats are silently skipped. | Long-running tools may still lose cost tracking if the client lacks extendReservation. | Use per-tool TTL overrides via toolReservationTtls as fallback. |
| Model blocking uses a provider-error workaround. OpenClaw's before_model_resolve hook does not support { block: true } (feature request). When budget is exhausted, the plugin overrides the model to __cycles_budget_exhausted__, causing the provider to reject the call. The user sees "Unknown model" instead of a clean budget error. | Model calls are effectively blocked, but the error message is a provider error rather than a budget message. Tool blocking via before_tool_call works cleanly with { block: true }. | Pending OpenClaw adding block support to before_model_resolve. |
| OpenClaw does not pass model name in hook events. The before_model_resolve event only contains { prompt }, with no model name (feature request). The plugin auto-detects the model from system config or falls back to defaultModelName. | Model-specific cost tracking requires defaultModelName to be set in plugin config. | Set defaultModelName to your agent's model (e.g. "openai/gpt-5-nano"). |

For project structure, architecture diagrams, and development workflow, see ARCHITECTURE.md.

License

Apache-2.0