Quantra
Quantra is a QuantLib-based pricing service built for parallel execution. It exposes pricing functionality over gRPC with FlatBuffers and through an HTTP/JSON gateway for easier integration and generated OpenAPI documentation.
Why This Exists
QuantLib is powerful, but it is not naturally suited to high-concurrency service workloads because important state such as Settings::instance().evaluationDate() is global to the process. Quantra works around that by running multiple isolated pricing workers and placing Envoy in front of them as a load balancer.
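To make the shared-state problem concrete, here is a minimal Python sketch. It does not use QuantLib: `GlobalSettings`, `WorkerSettings`, and `discount_factor` are hypothetical stand-ins, and the interleaving of two requests is simulated by hand rather than with real threads. It only illustrates why a process-wide evaluation date is unsafe under concurrency and why giving each worker its own state avoids the problem.

```python
import datetime as dt


class GlobalSettings:
    """Toy stand-in for process-wide state (hypothetical, not QuantLib's API)."""
    evaluation_date = None  # shared by every request in the process


class WorkerSettings:
    """Toy stand-in for state owned by one isolated worker."""
    def __init__(self, evaluation_date):
        self.evaluation_date = evaluation_date


def discount_factor(maturity, settings, rate=0.05):
    """Discount from maturity back to the evaluation date held by `settings`."""
    years = (maturity - settings.evaluation_date).days / 365.0
    return (1.0 + rate) ** (-years)


maturity = dt.date(2026, 1, 1)

# Shared state: request A sets its date, request B overwrites it
# before A gets to price, so A silently prices as of B's date.
GlobalSettings.evaluation_date = dt.date(2025, 1, 1)  # request A
GlobalSettings.evaluation_date = dt.date(2024, 1, 1)  # request B interleaves
shared_result = discount_factor(maturity, GlobalSettings)

# Isolated state: each worker keeps its own date, so A is unaffected by B.
worker_a = WorkerSettings(dt.date(2025, 1, 1))
worker_b = WorkerSettings(dt.date(2024, 1, 1))
isolated_result = discount_factor(maturity, worker_a)

print(f"shared:   {shared_result:.6f}")
print(f"isolated: {isolated_result:.6f}")
```

In the shared case request A discounts over roughly two years instead of one, with no error raised, which is the kind of silent corruption that process isolation is meant to rule out.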
What You Get
- A C++ pricing server built on QuantLib
- A gRPC API using FlatBuffers messages
- A JSON/HTTP gateway in jsonserver/
- A C++ client in client/
- A Python client package in quantra-python/
Supported Pricing Coverage
Representative supported request types include:
- Fixed-rate bonds
- Floating-rate bonds
- Vanilla swaps
- OIS swaps
- Basis swaps
- Zero-coupon inflation swaps
- Year-on-year inflation swaps
- FRAs
- Caps and floors
- Swaptions
- CDS
- Equity options
See examples/data/ for sample payloads.
Architecture
The main runtime model is a multi-process gRPC service fronted by Envoy:
JSON client -> json_server (:8080) -> Envoy (:50051) -> sync_server workers (:50055+)
gRPC client -----------------------> Envoy (:50051) -> sync_server workers (:50055+)
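The fan-out in the diagram can be pictured with a small Python sketch. Two things here are assumptions, not documented behavior: that workers listen on consecutive ports starting at 50055 (the diagram only says `:50055+`), and that requests are spread round-robin. The helper names `worker_ports` and `round_robin` are hypothetical.

```python
from itertools import cycle


def worker_ports(count, base_port=50055):
    """Assumed layout: one worker per consecutive port from base_port."""
    return [base_port + i for i in range(count)]


def round_robin(ports):
    """Endless iterator that hands out ports in rotation (assumed policy)."""
    return cycle(ports)


ports = worker_ports(4)
picker = round_robin(ports)

# Six incoming requests are assigned to four workers in rotation.
assignments = [next(picker) for _ in range(6)]
print(assignments)
```

Because each worker is a separate process, two requests that land on different ports never share QuantLib's process-wide settings.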
Performance
Measured on an AMD Ryzen 9 3900X (12 cores / 24 threads), 62 GiB RAM, Debian 13, Linux 6.1. Both benchmarks are informational (not part of the test gate) and live in tests/bench/.
Parallel throughput
Pricing the same request across N worker processes behind Envoy, versus pricing it single-threaded with QuantLib. Workload: one EUR multicurve swap (2 curves, 24 bootstrap helpers). Generated by tests/bench/run_throughput.sh.
| Workers | Throughput (req/s) | Speedup vs 1 worker |
|---|---|---|
| 1 | 8.5 | 1.0× |
| 2 | 15.7 | 1.8× |
| 4 | 31.0 | 3.6× |
| 8 | 58.0 | 6.8× |
| 12 | 75.4 | 8.9× |
Single-threaded QuantLib reference: ~16 req/s. Throughput scales close to linearly through 8 workers (6.8×) and reaches 8.9× at 12 workers, one per physical core.
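The speedup column can be reproduced from the throughput figures, and dividing each speedup by the worker count gives a parallel-efficiency number. The sketch below uses only the values from the table above; the efficiency figure is a derived quantity, not a separately measured one.

```python
# Throughput in req/s per worker count, copied from the table above.
throughput = {1: 8.5, 2: 15.7, 4: 31.0, 8: 58.0, 12: 75.4}
baseline = throughput[1]

rows = []
for workers, rps in throughput.items():
    speedup = rps / baseline          # relative to the 1-worker run
    efficiency = speedup / workers    # 1.0 would be perfectly linear scaling
    rows.append((workers, round(speedup, 1), round(efficiency, 2)))

for workers, speedup, efficiency in rows:
    print(f"{workers:>2} workers: {speedup:.1f}x speedup, {efficiency:.0%} efficiency")
```

On these figures efficiency is about 85% at 8 workers and about 74% at 12.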
Curve cache
Per-request latency with the curve cache off vs on (200 requests, mean). Generated by tests/bench/run_bench.sh. A cache hit reuses the bootstrapped curve and skips re-bootstrapping.
| Workload | No cache | Cache | Speedup |
|---|---|---|---|
| Bond (1 curve, 8 helpers) | 2.11 ms | 1.10 ms | 1.9× |
| Swap (2 curves, 24 helpers) | 117.12 ms | 2.02 ms | 57.9× |
The gain scales with how much of the request is curve bootstrapping: large for a heavy multicurve with few instruments, small for a light single-curve request.
Quick Start
Container Image
The published GHCR image starts both the JSON API and the gRPC/Envoy endpoint:
- HTTP/JSON API: 8080
- gRPC/Envoy endpoint: 50051
docker pull ghcr.io/joseprupi/quantra-server:0.1.1
docker run --rm \
  -p 8080:8080 \
  -p 50051:50051 \
  ghcr.io/joseprupi/quantra-server:0.1.1
Check the running service:
curl http://localhost:8080/health
curl http://localhost:8080/meta
Change the worker count with QUANTRA_WORKERS:
docker run --rm \
  -e QUANTRA_WORKERS=2 \
  -p 8080:8080 \
  -p 50051:50051 \
  ghcr.io/joseprupi/quantra-server:0.1.1
The public API reference is available at https://quantra.io/docs/api.
Local Build
See docs/build.md for environment setup details. Once dependencies are available:
./scripts/build.sh Release
./scripts/quantra start --workers 4 --foreground
./build/jsonserver/json_server localhost:50051 8080
You can then call the HTTP API with sample requests from examples/data/:
curl -X POST http://localhost:8080/price-fixed-rate-bond \
  -H "Content-Type: application/json" \
  -d @examples/data/fixed_rate_bond_request.json
The generated OpenAPI files live in jsonserver/openapi/.
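The same call as the curl example can be made from Python with only the standard library. This is a sketch rather than the project's Python client (which lives in quantra-python/): `build_price_request`, `load_payload`, and `send` are hypothetical helpers, and `send` assumes the gateway replies with a JSON body. Nothing is sent until `send` is called, so the request can be inspected first.

```python
import json
import urllib.request


def load_payload(path):
    """Read a sample request such as examples/data/fixed_rate_bond_request.json."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def build_price_request(base_url, endpoint, payload):
    """Build a POST request with a JSON body; does not touch the network."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10.0):
    """Send the request and parse the reply, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder payload for illustration only; a real call would use
# load_payload("examples/data/fixed_rate_bond_request.json") instead.
request = build_price_request(
    "http://localhost:8080", "price-fixed-rate-bond", {"placeholder": True}
)
print(request.get_method(), request.full_url)
```

With the service running, `send(request)` performs the call; the shape of the response is described by the OpenAPI files rather than by this sketch.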
Development Workflow
Build
./scripts/build.sh regenerates schemas, recreates build/, and compiles the project.
./scripts/build.sh
./scripts/build.sh Release
Regenerate Schemas Only
If you are editing FlatBuffers schemas and want to regenerate artifacts without a full build:
./scripts/generate_schemas.sh
Run Tests
bash tests/run_all_tests.sh
The test suite exercises:
- C++ pricing parity against QuantLib
- C++ gRPC integration
- JSON HTTP API scenarios
- Python client scenarios
Repository Map
- server/: gRPC pricing server
- jsonserver/: HTTP/JSON gateway and generated OpenAPI docs
- request/: request entrypoints and endpoint orchestration
- parser/: parsing, domain conversion, pricing helpers, and builders
- client/: C++ client library
- quantra-python/: Python client package
- flatbuffers/: schema sources plus generated C++, Python, and JSON artifacts
- grpc/: gRPC service definitions and generated service bindings
- examples/data/: example JSON requests
- tests/: parity, integration, and client tests
- scripts/: build, code generation, and runtime helpers
- tools/quantra-manager/: packaged process-manager implementation
- docs/: project documentation and reference notes
Documentation
- docs/README.md: documentation index
- docs/build.md: environment setup and build details
- docs/scripts.md: build and schema tooling
- docs/testing.md: test suite details
- docs/process-manager.md: process-manager behavior and runtime model
- docs/client.md: C++ client notes
- docs/parser.md: parser/service/builder conventions
- docs/versioning.md: versioning policy
- CONTRIBUTING.md: contribution workflow
Requirements
The repository currently documents and builds around:
- CMake 3.16+
- GCC 12+ or Clang 14+
- gRPC v1.60.0
- FlatBuffers v24.12.23
- QuantLib 1.41 in Docker builds
- Envoy for worker load balancing
License
MIT / Apache 2.0