Drop-in API-compatible gRPC for Python
grpyc is a drop-in replacement for grpcio 1.80, built in Rust. Up to 8x throughput on GKE, 2.2x lower P50 latency, zero memory leaks. Runs everywhere from Raspberry Pi to IBM Z mainframes. Change one import.
Up to 8x
Faster than grpcio
814 tok/s
vLLM streaming throughput
grpyc delivers tokens faster than REST or grpcio in simulated vLLM inference with 50-token responses.
2.2x
Lower P50 latency
Lower transport latency means requests reach the GPU faster, improving batch fill rates and overall GPU utilization.
Better batching
Higher GPU efficiency
Faster request delivery fills GPU batches more efficiently. Less time waiting on Python means more tokens per second per dollar.
grpcio stack: Python API → Cython Wrappers → C Core Shim → gRPC C Core → OS / Network
grpyc stack: Python API (compatible surface) → PyO3 (Rust ↔ Python) → h2 + rustls (HTTP/2 + TLS) → Tokio (async I/O) → OS / Network
- Memory safe by default — Rust ownership eliminates entire vulnerability classes
- Minimal GIL contention — I/O and serialization in Rust, GIL released during network ops
- No C toolchain — prebuilt native wheels, no OpenSSL, installs in seconds
- No leaks from manual memory management: Rust frees memory deterministically when values go out of scope
Edge & IoT: Raspberry Pi · ARM64 · ARMv7
Cloud & servers: x86_64 · ARM64 · Alpine/musl
IBM Z mainframe: s390x · no prebuilt grpcio wheel
Supported platforms: Linux x86_64 · ARM64 / aarch64 · ARMv7 (Raspberry Pi) · 32-bit x86 · IBM Z (s390x) · Alpine / musl · macOS (Apple Silicon) · Windows 64 / 32-bit
Tokio Async Runtime
All network I/O runs on the Tokio runtime, outside Python's GIL. Async I/O no longer contends for the GIL, which removes the #1 source of tail latency in grpcio.
Memory Safe by Design
Rust's ownership model eliminates use-after-free, buffer overflows, and data races at compile time. No more memory leaks under load. Security audited.
xDS Service Mesh
Full xDS support — LDS, RDS, CDS, EDS — for proxyless gRPC. Connect directly to your control plane. No sidecar proxy overhead.
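As a rough sketch of what proxyless setup looks like in practice: standard gRPC xDS clients read a bootstrap file named by the `GRPC_XDS_BOOTSTRAP` environment variable and dial targets with the `xds:///` scheme. Whether grpyc reads the same bootstrap format is an assumption here, and the control-plane address, node ID, and service name below are hypothetical placeholders.

```python
import json
import os
import tempfile


def write_xds_bootstrap(server_uri, node_id, directory=None):
    """Write a minimal xDS bootstrap file and return its path.

    The layout follows the bootstrap format used by standard gRPC xDS
    clients; grpyc accepting the same file is an assumption.
    """
    bootstrap = {
        "xds_servers": [
            {
                "server_uri": server_uri,
                # "insecure" is for local experiments only.
                "channel_creds": [{"type": "insecure"}],
                "server_features": ["xds_v3"],
            }
        ],
        "node": {"id": node_id},
    }
    fd, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(bootstrap, fh, indent=2)
    return path


# Hypothetical control plane address and node ID.
bootstrap_path = write_xds_bootstrap("xds.example.internal:18000", "demo-node-1")
os.environ["GRPC_XDS_BOOTSTRAP"] = bootstrap_path

# A channel would then be created against an xds:/// target (hypothetical name).
target = "xds:///inventory.example.internal"
```

The client then resolves `target` through the control plane (LDS, RDS, CDS, EDS) instead of through a sidecar.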
TLS / mTLS via rustls
Modern TLS via rustls — no OpenSSL, no dependency conflicts. Mutual TLS for zero-trust architectures.
Intelligent Load Balancing
Round-robin, ring-hash, weighted round-robin, outlier detection — all built in. ORCA load reporting for advanced traffic management.
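To make the ring-hash policy concrete, here is a minimal consistent-hash ring in plain Python. It is an illustration of the general technique only, not grpyc's implementation; the class name, the replica count, and the backend addresses are all assumptions chosen for the example.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Toy ring-hash picker: each backend owns many points on a ring."""

    def __init__(self, backends, replicas=100):
        # Place `replicas` virtual nodes per backend to even out the load.
        self._ring = sorted(
            (_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, request_key):
        """Return the backend owning the first point at or after the key."""
        idx = bisect.bisect_left(self._points, _hash(request_key))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["a:50051", "b:50051", "c:50051"])
backend = ring.pick("user-42")  # the same key always maps to the same backend
```

The useful property is stability: removing one backend only remaps the keys that backend owned, so caches and session affinity on the remaining backends survive.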
All 4 Streaming Modes
Unary, server streaming, client streaming, bidirectional — all fully async through Tokio. Flow control and backpressure handled in Rust.
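The four modes differ only in whether the request side, the response side, or both are streams. The sketch below mirrors the handler shapes used by `grpc.aio` servicers (a coroutine for a single response, an async generator for a response stream) but deliberately imports no gRPC library, so the method names and string payloads are illustrative assumptions rather than a real service.

```python
import asyncio


async def unary(request, context=None):
    """One request in, one response out."""
    return request.upper()


async def server_stream(request, context=None):
    """One request in, a stream of responses out (e.g. token by token)."""
    for token in request.split():
        yield token


async def client_stream(request_iterator, context=None):
    """A stream of requests in, one response out."""
    parts = [part async for part in request_iterator]
    return " ".join(parts)


async def bidi(request_iterator, context=None):
    """Streams in both directions; here each request yields one response."""
    async for part in request_iterator:
        yield part[::-1]


async def _as_stream(items):
    """Helper standing in for the request stream a client would send."""
    for item in items:
        yield item


async def demo():
    return {
        "unary": await unary("hi"),
        "server": [t async for t in server_stream("a b c")],
        "client": await client_stream(_as_stream(["a", "b"])),
        "bidi": [t async for t in bidi(_as_stream(["ab", "cd"]))],
    }


result = asyncio.run(demo())
```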
Drop-in Compatible
Same grpc Python API. Change one import line. Your existing protobuf definitions, handlers, and interceptors work unchanged.
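One way to stage the switch is to prefer grpyc when it is installed and fall back to grpcio otherwise. The page does not state the import name, so `grpyc` below is an assumed module name, and `load_grpc_backend` is a hypothetical helper written for this sketch.

```python
import importlib
import importlib.util


def load_grpc_backend(candidates=("grpyc", "grpc")):
    """Return (name, module) for the first importable backend.

    "grpyc" is an assumed import name; "grpc" is the module that
    grpcio installs. Returns (None, None) if neither is available.
    """
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name, importlib.import_module(name)
    return None, None


# Bind the chosen backend to the usual name so the rest of the code
# keeps writing grpc.<something> unchanged.
backend_name, grpc = load_grpc_backend()
```

In a codebase that has fully migrated, the helper collapses to the single changed import line the page describes.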
Raspberry Pi to IBM Z
Prebuilt wheels for Linux x86_64, ARM64, ARMv7, and 32-bit x86, plus IBM Z (s390x), where standard grpcio ships no prebuilt wheel. macOS and Windows too. No build step.
AI & ML Inference
Ship faster inference
Model serving frameworks (vLLM, Triton, TensorFlow Serving) rely on gRPC between clients and inference workers. grpyc's Tokio runtime cuts the gRPC transport overhead that wastes expensive GPU time.
- Tighter latency → fewer retries → more effective GPU utilization
- Lower memory per connection → more room for model weights
- Streaming support for token-by-token generation
Google Cloud
Accelerate GCP API calls
Many Google Cloud Python client libraries, including Pub/Sub, Spanner, Firestore, and the BigQuery Storage API, use gRPC as their transport. grpyc replaces that transport layer with a Rust implementation, with no changes to your application code.
- Higher throughput for BigQuery Storage API reads
- Faster Pub/Sub publish and subscribe
- Predictable Spanner latency for real-time apps
Service Mesh at Scale
Scale your service mesh without limits
In a mesh with hundreds of Python gRPC services, one slow service can cascade into timeouts across the network. grpyc's predictable latency helps break that cascade.
- Proxyless gRPC via built-in xDS — eliminate sidecar overhead
- Tight P50–P99 spread keeps timeouts meaningful
- Built-in load balancing, health checking, outlier detection
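To check the P50 to P99 spread on your own service, a nearest-rank percentile over recorded latencies is enough. This is a generic measurement sketch; the function names and the sample latencies are made up for illustration and are not grpyc benchmark data.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: p is an integer from 1 to 100."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def latency_spread(samples_ms):
    """Return (p50, p99, p99 / p50) for a list of latencies in ms."""
    p50 = percentile(samples_ms, 50)
    p99 = percentile(samples_ms, 99)
    return p50, p99, p99 / p50


# Hypothetical latencies in milliseconds: mostly fast, one slow outlier.
samples_ms = [2.0] * 98 + [3.0, 40.0]
p50, p99, ratio = latency_spread(samples_ms)
```

A ratio close to 1 means a timeout set a little above P50 rarely fires on healthy requests; a large ratio forces timeouts so loose that they stop catching real failures.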
Chart: Async unary QPS (Java server), GKE c2-standard-8. Higher is better.
Chart: Unary latency (P50), GKE ping-pong. Lower is better.
Chart: Cross-node QPS (c=64), GKE e2-standard-4. Higher is better.
Chart: vLLM tokens/sec (c=8), simulated inference, 50 tokens.
GKE c2-standard-8 & e2-standard-4, Go client, Python 3.13 servers. Baseline: grpcio 1.80.0. Cross-node anti-affinity.
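The vLLM figure can be translated into per-request terms with a little arithmetic. The sketch below assumes that 814 tok/s is the aggregate across all 8 concurrent streams and that all 8 streams stay busy; the page does not say either way, so treat the derived numbers as an interpretation rather than a published result.

```python
# Figures quoted on this page.
tokens_per_second = 814      # vLLM streaming throughput
tokens_per_response = 50     # simulated response length
concurrency = 8              # c=8 in the benchmark label

# Assumption: the throughput is the total across all concurrent streams.
responses_per_second = tokens_per_second / tokens_per_response

# Little's law (in-flight = rate * time) gives the mean time per response.
mean_response_seconds = concurrency / responses_per_second

# Token rate seen by a single stream under the same assumption.
tokens_per_second_per_stream = tokens_per_second / concurrency

print(f"{responses_per_second:.2f} responses/s")
print(f"{mean_response_seconds:.3f} s per 50-token response")
print(f"{tokens_per_second_per_stream:.2f} tok/s per stream")
```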
Memory safe by design
Rust ownership rules out use-after-free, buffer overflows, and data races at compile time — the class of C/C++ vulnerabilities that haunts grpcio's C core is gone from grpyc's Rust gRPC stack.
Fuzzed continuously
libFuzzer runs around the clock across HTTP/2 framing, HPACK, the gRPC message and timeout codecs, xDS config parsing, and SPIFFE identity handling. Adversarial input is part of CI, not an afterthought.
We harden the foundation
Our fuzzing uncovered a subtle, adversarially-triggerable crash in h2 — the production-grade Rust HTTP/2 stack used across the ecosystem — reproducible from a 59-byte frame sequence. We reported it and fixed it.
Reviewed & audited
Every change is peer-reviewed before merge. The codebase is security-audited with zero unsafe code in the hot path, and cargo-audit + cargo-deny scan the supply chain on every release.
Rust gRPC Stack
47 Rust modules implementing the complete gRPC protocol. h2 for HTTP/2 framing, rustls for TLS, prost for protobuf. Native Rust — not a Cython wrapper around a C core.
Tokio-Powered Async
All network I/O runs on the Tokio runtime outside Python's GIL. PyO3 bridges Rust futures to asyncio seamlessly. Batched event delivery minimizes GIL acquisition overhead.
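The idea behind batched event delivery can be shown in pure Python: a background thread stands in for the native I/O runtime and hands events to the asyncio loop in groups, so the loop is woken once per batch rather than once per event. This is a conceptual analogy only; grpyc's actual bridge lives in Rust, and every name below is hypothetical.

```python
import asyncio
import threading


def io_thread(loop, deliver, events, batch_size):
    """Stand-in for a native I/O thread that hands events to asyncio."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            # One cross-thread wakeup delivers the whole batch.
            loop.call_soon_threadsafe(deliver, batch)
            batch = []
    if batch:
        loop.call_soon_threadsafe(deliver, batch)
    loop.call_soon_threadsafe(deliver, None)  # sentinel: no more events


async def run_batched(n_events=10, batch_size=4):
    """Return (events received in order, number of delivery wakeups)."""
    loop = asyncio.get_running_loop()
    received = []
    wakeups = 0
    done = asyncio.Event()

    def deliver(batch):
        nonlocal wakeups
        if batch is None:
            done.set()
            return
        wakeups += 1
        received.extend(batch)

    worker = threading.Thread(
        target=io_thread, args=(loop, deliver, range(n_events), batch_size)
    )
    worker.start()
    await done.wait()
    worker.join()
    return received, wakeups


received, wakeups = asyncio.run(run_batched(n_events=10, batch_size=4))
```

With a batch size of 1 the same ten events cost ten wakeups, which is the per-event overhead that batching is meant to avoid.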
Memory Safe by Design
Rust's ownership model prevents use-after-free, buffer overflows, and data races at compile time. Zero unsafe code in the hot path. Security audited with all findings addressed.
Runs everywhere your fleet does
Prebuilt wheels for Linux x86_64, ARM64, ARMv7, 32-bit x86, and IBM Z (s390x), plus macOS and Windows, covering five glibc tiers and Alpine/musl. No build step; installs in seconds.
Pro
Annual License
For teams running gRPC in production
- Full feature set
- Commercial production use
- Email support
- Migration guidance
Most Popular
Enterprise
Annual License
Production use with enterprise support
- Commercial production use
- Priority support with SLA
- Architecture reviews & optimization
- Dedicated Slack channel
- Migration assistance
OEM / SDK Vendor
Custom
Embed grpyc in your product
- Redistribution rights
- Co-engineering support
- Priority patches & builds
- Custom build configurations
Make your Python fleet up to 8x faster with one change.
Free for evaluation. Prove it on your workload, then talk to us.