GitHub - zerocool26/Quantum-Observability-Contract-Compilation-QOCC-

Python 3.11+ | License: Apache 2.0

QOCC is a vendor-agnostic, reproducible, trace-first layer that instruments quantum program workflows end-to-end and supports contract-defined correctness + cost optimization via closed-loop compilation/search.

Features

  • Observability — OpenTelemetry-style traces for quantum compilation, simulation, mitigation, decoding, and execution
  • Contracts — Machine-checkable semantic constraints + explicit cost objectives
  • Closed-loop optimization — Generate candidate pipelines, score cheaply, validate with simulation, choose best under contracts
  • Trace Bundles — Portable "repro packages" that can be rerun, compared, regression-tested, and shared
  • Nondeterminism detection — Compile multiple times and verify output stability
  • Content-addressed caching — Cache compilation results keyed by circuit hash + pipeline spec + backend version
  • Plugin system — Register custom adapters and evaluators via Python entry points
  • Hardware execution interface — Optional adapter execute() API with structured ExecutionResult metadata
  • Mitigation pipeline stage — Optional mitigation stage with trace spans and overhead telemetry
  • Bundle signing & provenance — Ed25519 signing/verification for trace-bundle authenticity

Quick Start

pip install -e ".[all]"

# Run a trace
qocc trace run --adapter qiskit --input examples/ghz.qasm \
    --pipeline examples/pipeline_examples/qiskit_default.json --out bundle.zip

# Compare bundles
qocc trace compare bundleA.zip bundleB.zip --report reports/

# Check contracts
qocc contract check --bundle bundle.zip --contracts examples/contracts_examples.json

# Check contracts from DSL file
qocc contract check --bundle bundle.zip --contracts examples/contracts.qocc

# Contract evaluation cache controls
qocc contract check --bundle bundle.zip --contracts examples/contracts.qocc --max-cache-age-days 7

# Compilation search
qocc compile search --adapter qiskit --input examples/ghz.qasm --topk 5 --out search.zip

# Compilation search with Pareto multi-objective selection
qocc compile search --adapter qiskit --input examples/ghz.qasm --mode pareto --out search.zip

# Random search strategy
qocc compile search --adapter qiskit --input examples/ghz.qasm --strategy random --out search.zip

# Bayesian adaptive search (UCB acquisition)
qocc compile search --adapter qiskit --input examples/ghz.qasm --strategy bayesian --out search.zip

# Bayesian search with historical transfer-learning prior (30-day half-life)
qocc compile search --adapter qiskit --input examples/ghz.qasm --strategy bayesian --prior-half-life 30 --out search.zip

# Evolutionary search (tournament + crossover + mutation)
qocc compile search --adapter qiskit --input examples/ghz.qasm --strategy evolutionary --out search.zip

# Batch search over multiple circuits from a manifest
qocc compile batch --manifest examples/batch_manifest.json --workers 4 --out batch.zip

# Noise-aware surrogate scoring (provider-agnostic noise model JSON)
qocc compile search --adapter qiskit --input examples/ghz.qasm --noise-model examples/noise_model.json --out search.zip

# Detect nondeterminism (compile 5 times)
qocc trace run --adapter qiskit --input examples/ghz.qasm --repeat 5 --out nd.zip

# Replay a bundle
qocc trace replay bundle.zip --out replayed.zip

# Generate interactive HTML report from an existing bundle
qocc trace html --bundle bundle.zip --out report.html

# Generate HTML report directly during trace run
qocc trace run --adapter qiskit --input examples/ghz.qasm --html

# Optional notebook visualization dependencies
pip install -e ".[jupyter]"

# Optional IBM Quantum Runtime hardware execution support
pip install -e ".[ibm]"

# Run trace and auto-ingest into regression DB
qocc trace run --adapter qiskit --input examples/ghz.qasm --db

# Watch pending hardware jobs in a bundle and update results in-place
qocc trace watch --bundle bundle.zip --poll-interval 5 --timeout 300

# Trigger contract checks automatically after hardware completion
qocc trace watch --bundle bundle.zip --on-complete "qocc contract check --bundle {bundle} --contracts examples/contracts_examples.json"

# Regression DB workflows
qocc db ingest bundle.zip
qocc db query --adapter qiskit --since 2026-01-01
qocc db tag bundle.zip --tag baseline

# CI template workflows
# copy examples/ci/qocc_baseline.yml, examples/ci/qocc_benchmark.yml,
# and examples/ci/qocc_pr_check.yml into .github/workflows/

# Bundle signing (requires: pip install -e ".[signing]")
qocc bundle sign --key ./ed25519_private.pem bundle.zip
qocc bundle verify --key ./ed25519_public.pem bundle.zip

# Notebook helpers (inside Python/Jupyter)
python -c "import qocc; print(qocc.show_bundle('bundle.zip'))"

# Bootstrap project defaults (contracts, pipeline, CI workflow, tool.qocc config)
qocc init --yes --adapter qiskit

Architecture

Input Circuit → Adapter (ingest/normalize) → Compilation → Metrics → Contract Eval → Bundle Export
                                                ↓
                                        Trace Emitter (spans/events)
                                                ↓
                     Content-Addressed Cache ←→ Nondeterminism Detection

Every stage emits structured spans with per-pass granularity. The resulting Trace Bundle contains everything needed to reproduce, compare, and debug.
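
To make the idea of nested, per-pass spans concrete, here is a minimal sketch of a span recorder. `ToyTracer` and its field names are hypothetical stand-ins chosen for illustration; they are not QOCC's actual classes or span schema.

```python
import time
from contextlib import contextmanager


class ToyTracer:
    """Hypothetical span recorder; not QOCC's real trace emitter."""

    def __init__(self):
        self.spans = []    # finished spans, in completion order
        self._stack = []   # names of currently open spans

    @contextmanager
    def span(self, name, **attributes):
        record = {
            "name": name,
            "parent": self._stack[-1] if self._stack else None,
            "attributes": attributes,
            "start_ns": time.perf_counter_ns(),
        }
        self._stack.append(name)
        try:
            yield record
        finally:
            self._stack.pop()
            record["end_ns"] = time.perf_counter_ns()
            self.spans.append(record)


tracer = ToyTracer()
with tracer.span("compile", adapter="qiskit"):
    for pass_name in ("layout", "routing", "optimization"):
        with tracer.span(f"pass/{pass_name}"):
            pass  # a real pass would transform the circuit here

for s in tracer.spans:
    print(s["name"], "parent:", s["parent"])
```

Child spans close before their parent, so the per-pass records appear first and the enclosing `compile` span last.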

Contract Types

| Type | Evaluator | Description |
|------|-----------|-------------|
| distribution | TVD / chi-square / G-test | Output distribution preserved within tolerance |
| observable | Hoeffding CI | Z-observable expectation preserved within ε |
| clifford | Stabilizer tableau | Exact Clifford equivalence (falls back to distribution for non-Clifford) |
| exact | Statevector fidelity | Exact statevector equivalence |
| cost | Resource budget | Depth, 2Q gates, total gates, duration, proxy error within limits |
| zne | Richardson extrapolation | Zero-noise extrapolated observable stays within tolerance of ideal |

Contract DSL

qocc contract check --contracts supports both JSON and .qocc DSL files.

contract depth_budget:
    type: cost
    assert: depth <= 50
    assert: two_qubit_gates <= 100

contract tvd_check:
    type: distribution
    tolerance: tvd <= 0.05
    confidence: 0.99
    shots: 4096 .. 65536

contract parametric_budget:
    type: cost
    assert: depth <= input_depth - 2
    assert: proxy_error_score <= 1 - error_budget

Parametric values are resolved at evaluation time from bundle metrics and contract fields (for example input_depth, compiled_depth, baseline_tvd, and symbolic references like error_budget).
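
One way such parametric assertions could be resolved is to parse each `assert:` expression and evaluate it against a dictionary of known values. The sketch below is an assumption about the mechanism, not QOCC's parser: `check_assertion` and the restricted evaluator are hypothetical, and only numbers, names, basic arithmetic, and a single comparison are supported.

```python
import ast
import operator

_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_CMP_OPS = {ast.LtE: operator.le, ast.Lt: operator.lt,
            ast.GtE: operator.ge, ast.Gt: operator.gt, ast.Eq: operator.eq}


def _eval(node, env):
    """Evaluate a tiny arithmetic subset; anything else is rejected."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]  # raises KeyError when a symbol is unresolved
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left, env), _eval(node.right, env))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, env)
    raise ValueError(f"unsupported expression: {ast.dump(node)}")


def check_assertion(text, env):
    """Return True when a single-comparison assertion holds under env."""
    tree = ast.parse(text, mode="eval").body
    if not (isinstance(tree, ast.Compare) and len(tree.ops) == 1
            and type(tree.ops[0]) in _CMP_OPS):
        raise ValueError("expected exactly one comparison")
    left = _eval(tree.left, env)
    right = _eval(tree.comparators[0], env)
    return _CMP_OPS[type(tree.ops[0])](left, right)


# Illustrative values; in QOCC these would come from bundle metrics and contract fields.
env = {"depth": 40, "input_depth": 45, "proxy_error_score": 0.9, "error_budget": 0.05}
print(check_assertion("depth <= input_depth - 2", env))
print(check_assertion("proxy_error_score <= 1 - error_budget", env))
```

Walking the syntax tree instead of calling `eval` keeps contract files from executing arbitrary code, which matters if contracts are shared between teams.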

Contract Composition

Composition is supported via JSON envelopes:

{
    "name": "combined",
    "op": "all_of",
    "contracts": [
        {"name": "depth", "type": "cost", "resource_budget": {"max_depth": 50}},
        {"name": "dist", "type": "distribution", "tolerances": {"tvd": 0.05}}
    ]
}

Supported ops: all_of, any_of, best_effort, with_fallback.

  • best_effort records inner contract results but does not fail overall.
  • with_fallback switches to fallback when primary returns a NotImplementedError-class failure.
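
The semantics of the four ops can be sketched as a small recursive evaluator. This is an illustration under assumptions: the `primary` and `fallback` keys used for `with_fallback` are hypothetical (the envelope fields for that op are not shown above), and `check_leaf` stands in for whatever evaluates a single contract.

```python
def evaluate(spec, check_leaf):
    """Recursively evaluate a composition envelope to a pass/fail boolean."""
    op = spec.get("op")
    if op is None:
        return check_leaf(spec)  # a plain contract, not an envelope
    if op == "with_fallback":
        # Assumed layout: {"op": "with_fallback", "primary": {...}, "fallback": {...}}
        try:
            return evaluate(spec["primary"], check_leaf)
        except NotImplementedError:
            return evaluate(spec["fallback"], check_leaf)
    results = [evaluate(child, check_leaf) for child in spec["contracts"]]
    if op == "all_of":
        return all(results)
    if op == "any_of":
        return any(results)
    if op == "best_effort":
        return True  # inner results were computed, but never fail the whole
    raise ValueError(f"unknown op: {op}")


# Toy leaf evaluator with fixed outcomes, for demonstration only.
OUTCOMES = {"depth": True, "dist": False}


def toy_leaf(spec):
    if spec["name"] == "unsupported":
        raise NotImplementedError("no evaluator for this contract")
    return OUTCOMES[spec["name"]]


children = [{"name": "depth"}, {"name": "dist"}]
for op in ("all_of", "any_of", "best_effort"):
    print(op, evaluate({"op": op, "contracts": children}, toy_leaf))
print("with_fallback", evaluate(
    {"op": "with_fallback", "primary": {"name": "unsupported"},
     "fallback": {"name": "depth"}}, toy_leaf))
```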

Contract Result Cache

Contract evaluation results are cached in the content-addressed cache to avoid re-running repeated checks. Cache key is derived from:

  • circuit_hash
  • contract_spec_hash
  • shots
  • seed

Use --max-cache-age-days to ignore stale cached contract results.

Hardware Counts in Contract Evaluation

When a bundle includes hardware execution payloads, check_contract() can consume real-device counts directly for sampling-style contract checks.

  • hardware.input_counts / hardware.baseline_counts are used as baseline counts.
  • hardware.counts (or hardware.result.counts) is used as compiled/output counts.
  • Hardware output counts take precedence over simulated compiled counts when both are present.
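
The precedence rule and a TVD comparison on raw counts can be sketched as follows. The field names `counts`, `result.counts`, `input_counts`, and `baseline_counts` come from the list above; the helper functions themselves are hypothetical and do not reproduce the internals of `check_contract()`.

```python
def total_variation_distance(counts_a, counts_b):
    """TVD between two count dictionaries, each normalised by its own total."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    keys = set(counts_a) | set(counts_b)
    return 0.5 * sum(
        abs(counts_a.get(k, 0) / total_a - counts_b.get(k, 0) / total_b)
        for k in keys
    )


def pick_output_counts(hardware, simulated_counts):
    """Prefer hardware output counts over simulated ones when both exist."""
    hw = hardware.get("counts") or hardware.get("result", {}).get("counts")
    return hw if hw else simulated_counts


def pick_baseline_counts(hardware):
    """Baseline counts from either of the two documented fields."""
    return hardware.get("input_counts") or hardware.get("baseline_counts")


# Illustrative payload; real bundles carry provider-specific metadata as well.
hardware = {
    "baseline_counts": {"000": 512, "111": 512},
    "result": {"counts": {"000": 480, "111": 500, "010": 44}},
}
simulated = {"000": 510, "111": 514}

output = pick_output_counts(hardware, simulated)
tvd = total_variation_distance(pick_baseline_counts(hardware), output)
print(f"TVD = {tvd:.4f}, passes tvd <= 0.05: {tvd <= 0.05}")
```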

IBM Runtime Adapter

QOCC includes an ibm adapter with runtime hardware execution support through qiskit-ibm-runtime:

  • Optional execute() path for real-device job submission and polling
  • Required hardware spans: job_submit, queue_wait, job_complete, result_fetch
  • Polling events emitted as job_polling
  • Metadata includes job_id, provider/backend information, basis gates, coupling-map hash, and raw runtime result payload

Hardware Job Watch

qocc trace watch monitors pending hardware jobs recorded in hardware/pending_jobs.json, polls provider APIs, and updates bundle artifacts in place:

  • Writes per-job results to hardware/<job_id>_result.json
  • Maintains aggregate hardware payload in hardware/hardware.json
  • Appends completion spans to trace.jsonl
  • Supports timeout and optional automation via --on-complete

Contract Example

[
  {"name": "tvd-check", "type": "distribution", "tolerances": {"tvd": 0.1},
   "confidence": {"level": 0.95}, "resource_budget": {"n_bootstrap": 1000}},
  {"name": "depth-budget", "type": "cost", "tolerances": {"max_depth": 50}},
  {"name": "g-test", "type": "distribution", "spec": {"test": "g_test"},
   "confidence": {"level": 0.99}}
]

ZNE Contract

{
    "name": "zne_expectation",
    "type": "zne",
    "spec": {"noise_scale_factors": [1.0, 1.5, 2.0, 2.5]},
    "tolerances": {"abs_error": 0.05}
}

ZNE results include per-noise-level expectations plus Richardson extrapolation coefficients in contract details.
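
For reference, Richardson extrapolation to zero noise is polynomial interpolation evaluated at scale factor 0, so the coefficients are the Lagrange basis values at 0. The sketch below computes them for the scale factors in the example contract; the function names are illustrative, and the toy expectation model is an assumption used only to show that the extrapolation recovers the zero-noise value.

```python
def richardson_coefficients(scale_factors):
    """Lagrange weights c_i = prod_{j != i} x_j / (x_j - x_i), evaluated at x = 0."""
    coeffs = []
    for i, xi in enumerate(scale_factors):
        c = 1.0
        for j, xj in enumerate(scale_factors):
            if j != i:
                c *= xj / (xj - xi)
        coeffs.append(c)
    return coeffs


def extrapolate_to_zero(scale_factors, expectations):
    """Weighted sum of the measured expectations using the Richardson weights."""
    coeffs = richardson_coefficients(scale_factors)
    return sum(c * e for c, e in zip(coeffs, expectations))


scales = [1.0, 1.5, 2.0, 2.5]
coeffs = richardson_coefficients(scales)
print([round(c, 6) for c in coeffs])  # approximately [10, -20, 15, -4]


# Toy noise model (an assumption): the expectation decays as a quadratic in the scale.
def noisy_expectation(scale):
    return 0.9 - 0.1 * scale + 0.02 * scale ** 2


measured = [noisy_expectation(s) for s in scales]
print(round(extrapolate_to_zero(scales, measured), 6))  # recovers the zero-noise value 0.9
```

The coefficients sum to 1 but are large in magnitude, so statistical noise in each measured expectation is amplified in the extrapolated value. That is worth keeping in mind when choosing `abs_error`.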

Mitigation Stage

You can add an optional mitigation stage directly in the pipeline spec.

{
    "adapter": "qiskit",
    "optimization_level": 2,
    "mitigation": {
        "method": "twirling",
        "params": {
            "shot_multiplier": 2.0,
            "runtime_multiplier": 1.25
        },
        "overhead_budget": {
            "max_runtime_multiplier": 2.0
        }
    }
}

Supported methods are adapter-configurable and include: twirling, pec, zne, and m3_readout.

When mitigation is enabled, QOCC emits a first-class mitigation span and records mitigation_shot_multiplier, mitigation_runtime_multiplier, and mitigation_overhead_factor in compiled candidate metrics.

Early Stopping (SPRT)

Set resource_budget.early_stopping: true together with min_shots and max_shots to enable iterative sampling that halts as soon as the pass/fail decision is resolved at the configured error rates. Evaluation uses a two-tier strategy: SPRT (Sequential Probability Ratio Test), which bounds the Type I/II error rates, with a CI-separation heuristic as a fallback.
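
As background, Wald's SPRT accumulates a log-likelihood ratio and stops when it crosses one of two thresholds derived from the target error rates. The sketch below applies it to plain Bernoulli outcomes. How QOCC maps a distribution contract onto a sequential test is not described here, so `sprt_bernoulli`, its `alpha` parameter, and the hypotheses `p0`/`p1` are assumptions for illustration; only `sprt_beta` appears in the contract example.

```python
import math


def sprt_bernoulli(samples, p0, p1, alpha=0.05, beta=0.1):
    """Wald SPRT for H0: p = p0 versus H1: p = p1 (with p1 > p0).

    Returns (decision, samples_used). alpha and beta are the target
    Type I and Type II error rates.
    """
    upper = math.log((1 - beta) / alpha)   # crossing it accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing it accepts H0
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept_h1", n
        if llr <= lower:
            return "accept_h0", n
    return "undecided", n  # the budget ran out before a decision


# Strong evidence resolves quickly; a max_shots-style cap bounds the rest.
print(sprt_bernoulli([1] * 100, p0=0.1, p1=0.3))
print(sprt_bernoulli([0] * 100, p0=0.1, p1=0.3))
print(sprt_bernoulli([1, 0, 0, 0], p0=0.1, p1=0.3))
```

Because the test can end "undecided" when the sample budget is exhausted, a fallback rule such as the CI-separation heuristic is a natural complement.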

{"name": "adaptive", "type": "distribution", "tolerances": {"tvd": 0.1},
 "resource_budget": {"early_stopping": true, "min_shots": 256, "max_shots": 8192,
                      "sprt_beta": 0.1}}

Compilation Search

The search_compile() API and qocc compile search CLI implement the full closed-loop pipeline from §3 of the spec:

  1. Generate candidates by varying optimization level, seeds, and parameters
  2. Compile each candidate (with per-candidate caching)
  3. Score cheaply with a surrogate cost model
  4. Validate top-k candidates via simulation
  5. Evaluate contracts on validated candidates
  6. Select the best candidate (single-best or Pareto frontier)
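
The six steps above can be sketched end to end with toy stand-ins. Everything here is hypothetical: `toy_compile`, `surrogate_score`, and `toy_validate` use made-up formulas in place of a real compiler, cost model, and simulator, and per-candidate caching is omitted for brevity.

```python
import itertools


def generate_candidates(levels, seeds):
    """Step 1: vary optimization level and seed."""
    return [{"optimization_level": lvl, "seed": s}
            for lvl, s in itertools.product(levels, seeds)]


def toy_compile(candidate):
    """Step 2: pretend compilation; higher levels give smaller circuits."""
    lvl, s = candidate["optimization_level"], candidate["seed"]
    return {"depth": 60 - 10 * lvl + s % 3, "two_qubit_gates": 40 - 5 * lvl + s % 2}


def surrogate_score(metrics):
    """Step 3: cheap cost proxy (lower is better)."""
    return metrics["depth"] + 2 * metrics["two_qubit_gates"]


def toy_validate(candidate):
    """Step 4: pretend simulation; aggressive levels drift more from the input."""
    return {"tvd": 0.02 * candidate["optimization_level"]}


def search(levels, seeds, top_k, max_depth, max_tvd):
    compiled = [(c, toy_compile(c)) for c in generate_candidates(levels, seeds)]
    shortlist = sorted(compiled, key=lambda cm: surrogate_score(cm[1]))[:top_k]
    for cand, metrics in shortlist:                  # cheapest first
        tvd = toy_validate(cand)["tvd"]
        if metrics["depth"] <= max_depth and tvd <= max_tvd:   # step 5: contracts
            return cand, metrics                     # step 6: best passing candidate
    return None


best = search(levels=range(4), seeds=range(3), top_k=5, max_depth=50, max_tvd=0.05)
print(best)
```

In this toy run the cheapest candidates (level 3) fail the TVD contract, so the search settles on a level 2 candidate. That is the point of the closed loop: the surrogate ranks, but contracts decide.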

Evolutionary Strategy

--strategy evolutionary runs a generation-based optimization loop over pipeline parameters using:

  • tournament parent selection
  • single-point crossover
  • Gaussian mutation in parameter-index space
  • elitism (carry-forward of top candidates)

One trace span is emitted per generation with attributes: generation, best_score, and population_diversity. Termination conditions include max generations, convergence by score standard deviation, or optional wall-clock budget.
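
A compact sketch of such a loop is shown below. It is not QOCC's implementation: the search space, the routing method names, the `toy_score` objective, and all hyperparameters are hypothetical. Genomes are tuples of indices into each parameter's choice list, which is what "parameter-index space" is taken to mean here.

```python
import random
import statistics

CHOICES = {  # hypothetical search space
    "optimization_level": [0, 1, 2, 3],
    "routing": ["basic", "sabre", "lookahead"],
    "seed": [0, 1, 2, 3, 4, 5, 6, 7],
}
KEYS = list(CHOICES)


def toy_score(genome):
    """Made-up objective (lower is better) with its optimum at indices (3, 1, 5)."""
    return sum((g - t) ** 2 for g, t in zip(genome, (3, 1, 5)))


def tournament(pop, rng, k=3):
    return min(rng.sample(pop, k), key=toy_score)


def crossover(a, b, rng):
    point = rng.randrange(1, len(a))   # single cut point
    return a[:point] + b[point:]


def mutate(genome, rng, sigma=1.0, rate=0.3):
    out = []
    for g, key in zip(genome, KEYS):
        if rng.random() < rate:
            g = round(g + rng.gauss(0.0, sigma))   # Gaussian step on the index
        out.append(min(max(g, 0), len(CHOICES[key]) - 1))   # clamp to valid range
    return tuple(out)


def evolve(generations=30, pop_size=12, elite=2, tol=1e-9, seed=0):
    rng = random.Random(seed)
    pop = [tuple(rng.randrange(len(CHOICES[k])) for k in KEYS) for _ in range(pop_size)]
    history = []                                   # best score per generation
    for _ in range(generations):
        pop.sort(key=toy_score)
        scores = [toy_score(g) for g in pop]
        history.append(scores[0])
        if statistics.pstdev(scores) < tol:        # converged: scores are all equal
            break
        children = [
            mutate(crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(pop_size - elite)
        ]
        pop = pop[:elite] + children               # elitism
    best = min(pop, key=toy_score)
    return {k: CHOICES[k][i] for k, i in zip(KEYS, best)}, toy_score(best), history


best_params, best_score, history = evolve()
print(best_params, best_score)
```

Because the top candidates are carried forward unchanged, the best score recorded per generation never gets worse.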

Bayesian Historical Prior

The Bayesian strategy persists scored observations to ~/.qocc/search_history.json and can use them to warm-start new searches that run on the same adapter/backend version.

  • Prior weighting uses exponential age decay: weight = exp(-days_old / half_life)
  • Half-life is configurable from CLI via --prior-half-life (days)
  • Trace span bayesian_optimizer records prior_loaded and prior_size
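
The decay formula above is easy to evaluate by hand. The sketch below implements it exactly as written and uses it for a weighted mean of past scores; `weighted_prior_mean` and the observation layout are hypothetical and shown only to illustrate how older observations lose influence.

```python
import math


def prior_weight(days_old, half_life):
    """Weight as stated above: exp(-days_old / half_life)."""
    return math.exp(-days_old / half_life)


def weighted_prior_mean(observations, half_life):
    """Age-weighted mean score; observations are (days_old, score) pairs."""
    weights = [prior_weight(days, half_life) for days, _ in observations]
    return sum(w * s for w, (_, s) in zip(weights, observations)) / sum(weights)


for days in (0, 30, 60):
    print(days, round(prior_weight(days, 30.0), 3))   # 1.0, 0.368, 0.135

history = [(1, 80.0), (30, 100.0), (120, 140.0)]   # illustrative (days_old, score)
print(round(weighted_prior_mean(history, 30.0), 2))
```

Note that with this formula an observation exactly `half_life` days old has weight 1/e (about 0.368), not 0.5. A strict half-life would use `0.5 ** (days_old / half_life)`, so the parameter behaves as an e-folding time when the formula is taken literally.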

Batch Search Mode

qocc compile batch runs search_compile() across a manifest of circuits and produces a batch bundle with:

  • per-circuit results in batch_results.json
  • cross-circuit summary table in cross_circuit_metrics.json
  • top-level batch trace span attributes: n_circuits, n_cache_hits, total_candidates_evaluated

CI Workflow Templates

Template GitHub Actions workflows are provided in examples/ci/:

  • qocc_baseline.yml — push/dispatch baseline trace + contract check + DB ingest
  • qocc_benchmark.yml — nightly/dispatch batch benchmark with summary table
  • qocc_pr_check.yml — PR/dispatch bundle diff and PR comment via gh

All templates include workflow_dispatch inputs for adapter, circuit_path, and contract_file.

Project Init Wizard

Use qocc init to bootstrap a repository with sensible defaults:

  • backend detection for qiskit/cirq/tket/stim
  • generated contracts/default_contracts.qocc
  • generated adapter-specific pipeline JSON under pipeline_examples/
  • generated .github/workflows/qocc_ci.yml
  • persisted defaults in pyproject.toml under [tool.qocc]

Examples:

qocc init --yes --adapter qiskit
qocc init --project-root ./my_quantum_project
qocc init --force --run-demo

Developer Documentation Site

QOCC now includes a MkDocs-compatible docs scaffold under docs/ with:

  • generated API reference (docs/api_reference.md)
  • tutorials for trace/contracts/regression debugging/custom adapters
  • architecture deep dives (trace model, bundle format, search pipeline)
  • contract and CLI reference pages

Regenerate API docs from docstrings:

python docs/generate_api_reference.py

Pareto Multi-Objective Selection

from qocc.api import search_compile

result = search_compile(
    adapter_name="qiskit",
    input_source="circuit.qasm",
    mode="pareto",  # multi-objective Pareto frontier
    contracts=[{"name": "tvd", "type": "distribution", "tolerances": {"tvd": 0.1}}],
)
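
For readers unfamiliar with Pareto selection, the following sketch shows how a frontier is computed when every objective is minimised. It is independent of `search_compile()`; the candidate names and the two objectives (depth, two-qubit gate count) are illustrative.

```python
def dominates(a, b):
    """a dominates b if it is no worse on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name for name, point in candidates.items()
        if not any(dominates(other, point)
                   for other_name, other in candidates.items() if other_name != name)
    ]


# (depth, two_qubit_gates) per candidate; lower is better for both.
candidates = {
    "A": (30, 20),
    "B": (25, 28),
    "C": (32, 22),   # dominated by A
    "D": (25, 30),   # dominated by B
}
front = pareto_front(candidates)
print(front)  # ['A', 'B']
```

Neither A nor B beats the other on both objectives, so both stay on the frontier and the final choice is left to the caller.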

OpenTelemetry Export

Export traces as OTLP-compatible JSON for ingestion by Jaeger, Grafana Tempo, Datadog, or any OpenTelemetry collector:

from qocc.trace.exporters import export_otlp_json, export_to_otel_sdk

# OTLP JSON file (works standalone)
export_otlp_json(spans, "traces.otlp.json", service_name="qocc")

# Bridge to OpenTelemetry Python SDK (when opentelemetry-sdk installed)
export_to_otel_sdk(spans)  # Spans appear in any configured OTel exporter

Caching

QOCC uses a content-addressed compilation cache keyed by SHA-256(circuit_hash || pipeline_dict || backend_version || extra), where extra can include the search seed and the noise-model provenance hash. Cache hits are recorded in cache_index.json inside the bundle for reproducibility auditing. A cache hit skips recompilation entirely: the cached result is deserialised and reused.

from qocc.core.cache import CompilationCache
cache = CompilationCache()
print(cache.stats())  # {"size": 42, "hits": 10, "misses": 5}
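
A content-addressed key of this shape can be derived by hashing a canonical serialisation of the inputs. The sketch below is an assumption about the approach, not QOCC's exact byte layout: `cache_key` is a hypothetical helper, and canonical JSON (sorted keys, fixed separators) stands in for whatever concatenation the library uses.

```python
import hashlib
import json


def cache_key(circuit_hash, pipeline_dict, backend_version, extra=None):
    """SHA-256 over a canonical JSON encoding of all key components."""
    payload = json.dumps(
        {
            "circuit_hash": circuit_hash,
            "pipeline": pipeline_dict,
            "backend_version": backend_version,
            "extra": extra or {},
        },
        sort_keys=True,            # dict ordering must not change the key
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


k1 = cache_key("abc123", {"adapter": "qiskit", "optimization_level": 2}, "1.0.0")
k2 = cache_key("abc123", {"optimization_level": 2, "adapter": "qiskit"}, "1.0.0")
k3 = cache_key("abc123", {"adapter": "qiskit", "optimization_level": 2}, "1.0.0",
               extra={"seed": 7})
print(k1 == k2, k1 == k3)
```

Sorting keys before hashing means two pipeline specs that differ only in key order map to the same entry, while any change in seed or backend version produces a different key.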

Plugin System

Register custom adapters and contract evaluators via Python entry points:

# pyproject.toml
[project.entry-points."qocc.adapters"]
my_backend = "my_package:MyAdapter"

[project.entry-points."qocc.evaluators"]
my_evaluator = "my_package:my_eval_function"

Or register programmatically:

from qocc.adapters.base import register_adapter
from qocc.contracts.registry import register_evaluator

register_adapter("my_backend", MyAdapter)
register_evaluator("my_eval", my_eval_function)

Nondeterminism Detection

Run a circuit through the compilation pipeline multiple times and detect stochastic variation:

from qocc.api import run_trace

result = run_trace("qiskit", "circuit.qasm", repeat=5)
if result.get("nondeterminism", {}).get("reproducible") is False:
    print("WARNING: compilation is nondeterministic!")
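
Conceptually, the check amounts to hashing each compiled output and counting distinct hashes. The sketch below illustrates that idea with toy compile functions; `stability_report` and every field other than `reproducible` are hypothetical, and real compiled circuits would need a canonical serialisation before hashing.

```python
import hashlib
import itertools


def stability_report(compile_fn, source, repeat=5):
    """Compile repeatedly and report whether every output hashes identically."""
    hashes = [
        hashlib.sha256(compile_fn(source).encode("utf-8")).hexdigest()
        for _ in range(repeat)
    ]
    unique = set(hashes)
    return {"reproducible": len(unique) == 1,
            "unique_hashes": len(unique),
            "runs": repeat}


def deterministic_compile(source):
    return source.upper()


_counter = itertools.count()


def unstable_compile(source):
    # Stand-in for an unseeded stochastic pass: the output differs on every call.
    return f"{source.upper()} // variant {next(_counter)}"


stable = stability_report(deterministic_compile, "h q[0]; cx q[0],q[1];")
unstable = stability_report(unstable_compile, "h q[0]; cx q[0],q[1];")
print(stable)
print(unstable)
```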

Bundle Replay

Replay a previously recorded bundle to verify reproducibility:

qocc trace replay bundle.zip --out replayed.zip
qocc trace compare bundle.zip replayed.zip --report diff/

Supported Backends

| Backend | Status |
|---------|--------|
| Qiskit | ✅ Full (per-stage spans, statevector sim) |
| Cirq | ✅ Full (per-pass spans, statevector sim) |
| pytket | ✅ Full (pass-sequence compile spans, deterministic JSON hash) |
| CUDA-Q | 🔜 Optional |
| Stim/PyMatching | ✅ QEC mode (DEM + decoder stats metadata) |

Development

pip install -e ".[dev]"
pytest                 # ~303 tests
ruff check .
mypy qocc/

License

Apache 2.0 — see LICENSE.