GitHub - joseprupi/quantraserver: Distributed QuantLib

Quantra

Quantra is a QuantLib-based pricing service built for parallel execution. It exposes pricing over gRPC with FlatBuffers messages, and through an HTTP/JSON gateway that is easier to integrate and ships with generated OpenAPI documentation.

Why This Exists

QuantLib is powerful, but it is not naturally suited to high-concurrency service workloads because important state such as Settings::instance().evaluationDate() is global to the process. Quantra works around that by running multiple isolated pricing workers and placing Envoy in front of them as a load balancer.
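The hazard described above can be illustrated with a toy model. This is a minimal sketch, not QuantLib code: the `Settings` and `Worker` classes below are hypothetical stand-ins for a process-wide singleton and for an isolated worker process, and the interleaving is written out by hand so the result is deterministic.

```python
class Settings:
    """Toy stand-in for a process-wide singleton (not QuantLib's real class)."""
    evaluation_date = None


def begin_request(date):
    # Step 1 of a pricing call: set the global evaluation date.
    Settings.evaluation_date = date


def finish_request():
    # Step 2: price using whatever date the global currently holds.
    return Settings.evaluation_date


# Two requests interleaved inside ONE process share the global.
begin_request("2024-01-02")          # request A starts
begin_request("2024-06-28")          # request B starts before A finishes
shared_result_a = finish_request()   # A is priced as of B's date


class Worker:
    """Toy stand-in for one isolated worker process with its own state."""

    def __init__(self):
        self.evaluation_date = None

    def begin_request(self, date):
        self.evaluation_date = date

    def finish_request(self):
        return self.evaluation_date


# One worker per in-flight request: nothing is shared between them.
worker_a, worker_b = Worker(), Worker()
worker_a.begin_request("2024-01-02")
worker_b.begin_request("2024-06-28")
isolated_result_a = worker_a.finish_request()

print(shared_result_a)    # 2024-06-28 (wrong date for request A)
print(isolated_result_a)  # 2024-01-02
```

Separate processes give each worker its own copy of the global state, which is the property the toy `Worker` objects imitate here.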

What You Get

  • A C++ pricing server built on QuantLib
  • A gRPC API using FlatBuffers messages
  • A JSON/HTTP gateway in jsonserver/
  • A C++ client in client/
  • A Python client package in quantra-python/

Supported Pricing Coverage

Representative supported request types include:

  • Fixed-rate bonds
  • Floating-rate bonds
  • Vanilla swaps
  • OIS swaps
  • Basis swaps
  • Zero-coupon inflation swaps
  • Year-on-year inflation swaps
  • FRAs
  • Caps and floors
  • Swaptions
  • CDS
  • Equity options

See examples/data/ for sample payloads.

Architecture

The main runtime model is a multi-process gRPC service fronted by Envoy:

JSON client -> json_server (:8080) -> Envoy (:50051) -> sync_server workers (:50055+)
gRPC client -----------------------> Envoy (:50051) -> sync_server workers (:50055+)
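The routing in the diagram can be modelled in a few lines. This sketch rests on two assumptions that the diagram does not confirm: that workers listen on consecutive ports starting at 50055, and that Envoy distributes requests round-robin. The helper names `worker_ports` and `route` are hypothetical and exist only for this illustration.

```python
from itertools import cycle

BASE_WORKER_PORT = 50055  # first worker port shown in the diagram


def worker_ports(n_workers, base=BASE_WORKER_PORT):
    # Assumption: workers take consecutive ports starting at `base`.
    return [base + i for i in range(n_workers)]


def route(requests, ports):
    # Assumption: round-robin assignment of requests to worker ports.
    return list(zip(requests, cycle(ports)))


ports = worker_ports(4)
for request_id, port in route(["r0", "r1", "r2", "r3", "r4", "r5"], ports):
    print(f"{request_id} -> worker on :{port}")
```

With four workers, the fifth request wraps around to the first worker again. The actual balancing policy depends on the Envoy configuration shipped with the project.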

Performance

Measured on an AMD Ryzen 9 3900X (12 cores / 24 threads), 62 GiB RAM, Debian 13, Linux 6.1. Both benchmarks are informational (not part of the test gate) and live in tests/bench/.

Parallel throughput

This benchmark prices the same request across N worker processes behind Envoy and compares the result with single-threaded pricing in QuantLib. Workload: one EUR multicurve swap (2 curves, 24 bootstrap helpers). Generated by tests/bench/run_throughput.sh.

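| Workers | Throughput (req/s) | Speedup vs 1 worker |
|---------|--------------------|---------------------|
| 1       | 8.5                | 1.0×                |
| 2       | 15.7               | 1.8×                |
| 4       | 31.0               | 3.6×                |
| 8       | 58.0               | 6.8×                |
| 12      | 75.4               | 8.9×                |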

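Single-threaded QuantLib reference: ~16 req/s (for comparison, one worker delivers 8.5 req/s). Scaling is close to linear through 8 workers (6.8×) and tapers to 8.9× at 12 workers, one per physical core.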
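The speedup column, and the parallel efficiency it implies, can be recomputed from the throughput figures. The sketch below uses only the numbers reported in the table; "efficiency" here means speedup divided by worker count, a definition added for this illustration.

```python
# Throughput in requests per second, keyed by worker count (from the table).
throughput = {1: 8.5, 2: 15.7, 4: 31.0, 8: 58.0, 12: 75.4}

baseline = throughput[1]
scaling = {}
for workers, rps in throughput.items():
    speedup = rps / baseline          # relative to one worker
    efficiency = speedup / workers    # 1.0 would be perfectly linear scaling
    scaling[workers] = (speedup, efficiency)
    print(f"{workers:>2} workers: {speedup:.1f}x speedup, {efficiency:.0%} efficiency")
```

On these figures, efficiency stays above 90% through 4 workers, is about 85% at 8, and about 74% at 12.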

Curve cache

Per-request latency with the curve cache off vs on (200 requests, mean). Generated by tests/bench/run_bench.sh. A cache hit reuses the bootstrapped curve and skips re-bootstrapping.

| Workload                    | No cache  | Cache   | Speedup |
|-----------------------------|-----------|---------|---------|
| Bond (1 curve, 8 helpers)   | 2.11 ms   | 1.10 ms | 1.9×    |
| Swap (2 curves, 24 helpers) | 117.12 ms | 2.02 ms | 57.9×   |

The gain scales with how much of the request is curve bootstrapping: large for a heavy multicurve with few instruments, small for a light single-curve request.
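As a back-of-the-envelope reading of the table, the share of uncached latency spent on bootstrapping can be estimated as 1 − cached / uncached. This assumes a cache hit removes bootstrapping entirely and leaves the rest of the request unchanged, with the cache lookup itself treated as free; the function name `bootstrap_share` is hypothetical.

```python
def bootstrap_share(no_cache_ms, cached_ms):
    # Assumption: cached latency is roughly "everything except bootstrapping".
    return 1 - cached_ms / no_cache_ms


# Mean latencies in milliseconds, taken from the table above.
latency_ms = {
    "Bond (1 curve, 8 helpers)": (2.11, 1.10),
    "Swap (2 curves, 24 helpers)": (117.12, 2.02),
}

for workload, (no_cache, cached) in latency_ms.items():
    share = bootstrap_share(no_cache, cached)
    print(f"{workload}: about {share:.0%} of uncached latency is bootstrapping")
```

Under that assumption, bootstrapping accounts for roughly half of the bond request and about 98% of the swap request, which is why the swap benefits so much more from the cache.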

Quick Start

Container Image

The published GHCR image starts both the JSON API and the gRPC/Envoy endpoint:

  • HTTP/JSON API: 8080
  • gRPC/Envoy endpoint: 50051

docker pull ghcr.io/joseprupi/quantra-server:0.1.1

docker run --rm \
  -p 8080:8080 \
  -p 50051:50051 \
  ghcr.io/joseprupi/quantra-server:0.1.1

Check the running service:

curl http://localhost:8080/health
curl http://localhost:8080/meta
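When scripting against the container, it can help to wait for the service before sending requests. The following is a minimal sketch using only the Python standard library. It assumes that an HTTP 200 from /health means the service is ready; the response body format is not documented here, so the body is ignored. The function name `wait_until_healthy` is hypothetical.

```python
import time
import urllib.request


def wait_until_healthy(base_url="http://localhost:8080", attempts=30, delay=1.0):
    """Poll GET /health until it answers 200, or give up after `attempts` tries."""
    url = base_url.rstrip("/") + "/health"
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2.0) as resp:
                if resp.status == 200:  # assumption: 200 means ready
                    return True
        except OSError:
            # Covers connection refused, timeouts, and HTTP error statuses
            # (urllib's URLError and HTTPError both derive from OSError).
            pass
        time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` right after `docker run` returns `True` once the JSON API is reachable, or `False` if it does not come up within the allotted attempts.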

Change the worker count with QUANTRA_WORKERS:

docker run --rm \
  -e QUANTRA_WORKERS=2 \
  -p 8080:8080 \
  -p 50051:50051 \
  ghcr.io/joseprupi/quantra-server:0.1.1

The public API reference is available at https://quantra.io/docs/api.

Local Build

See docs/build.md for environment setup details. Once dependencies are available:

./scripts/build.sh Release
./scripts/quantra start --workers 4 --foreground
./build/jsonserver/json_server localhost:50051 8080

You can then call the HTTP API with sample requests from examples/data/:

curl -X POST http://localhost:8080/price-fixed-rate-bond \
  -H "Content-Type: application/json" \
  -d @examples/data/fixed_rate_bond_request.json
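The same call can be made from Python without extra dependencies. This sketch uses only the standard library rather than the quantra-python client, whose API is not described here. It assumes the gateway replies with a JSON body; the helper names `build_pricing_request` and `post_pricing_request` are hypothetical.

```python
import json
import urllib.request


def build_pricing_request(base_url, endpoint, payload):
    """Build a POST request carrying `payload` as a JSON body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/" + endpoint.lstrip("/"),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_pricing_request(base_url, endpoint, payload, timeout=30.0):
    """Send the request and decode the reply (assumed to be JSON)."""
    request = build_pricing_request(base_url, endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

To mirror the curl command, load examples/data/fixed_rate_bond_request.json with `json.load` and pass the result to `post_pricing_request("http://localhost:8080", "price-fixed-rate-bond", payload)`.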

The generated OpenAPI files live in jsonserver/openapi/.

Development Workflow

Build

./scripts/build.sh regenerates schemas, recreates build/, and compiles the project.

./scripts/build.sh
./scripts/build.sh Release

Regenerate Schemas Only

If you are editing FlatBuffers schemas and want to regenerate artifacts without a full build:

./scripts/generate_schemas.sh

Run Tests

bash tests/run_all_tests.sh

The test suite exercises:

  • C++ pricing parity against QuantLib
  • C++ gRPC integration
  • JSON HTTP API scenarios
  • Python client scenarios

Repository Map

  • server/: gRPC pricing server
  • jsonserver/: HTTP/JSON gateway and generated OpenAPI docs
  • request/: request entrypoints and endpoint orchestration
  • parser/: parsing, domain conversion, pricing helpers, and builders
  • client/: C++ client library
  • quantra-python/: Python client package
  • flatbuffers/: schema sources plus generated C++, Python, and JSON artifacts
  • grpc/: gRPC service definitions and generated service bindings
  • examples/data/: example JSON requests
  • tests/: parity, integration, and client tests
  • scripts/: build, code generation, and runtime helpers
  • tools/quantra-manager/: packaged process-manager implementation
  • docs/: project documentation and reference notes

Documentation

  • docs/README.md: documentation index
  • docs/build.md: environment setup and build details
  • docs/scripts.md: build and schema tooling
  • docs/testing.md: test suite details
  • docs/process-manager.md: process-manager behavior and runtime model
  • docs/client.md: C++ client notes
  • docs/parser.md: parser/service/builder conventions
  • docs/versioning.md: versioning policy
  • CONTRIBUTING.md: contribution workflow

Requirements

The repository is currently documented and built against:

  • CMake 3.16+
  • GCC 12+ or Clang 14+
  • gRPC v1.60.0
  • FlatBuffers v24.12.23
  • QuantLib 1.41 in Docker builds
  • Envoy for worker load balancing

License

MIT / Apache 2.0