grandimam/barq: Free-threaded HTTP library


⚠️ Experimental: This project is a proof-of-concept exploring free-threaded Python (PEP 703) for HTTP frameworks. Not production-ready.

A pure-Python HTTP framework built for free-threaded Python 3.13+. No async/await — just threads with true parallelism.

2-5x faster than FastAPI in the benchmarks below.

Requirements

  • Python 3.13+ with free-threading enabled (python3.13t)
  • uv package manager
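To confirm the interpreter really is a free-threaded build, you can check for `sys._is_gil_enabled()` (added in CPython 3.13); a minimal sketch:

```python
# Check whether this interpreter is a free-threaded (no-GIL) build.
# sys._is_gil_enabled() exists only on CPython 3.13+; on older
# versions getattr falls back to None.
import sys

gil_check = getattr(sys, "_is_gil_enabled", None)
if gil_check is None:
    print("Pre-3.13 interpreter: GIL always enabled")
elif gil_check():
    print("3.13+, but the GIL is enabled (regular build, or PYTHON_GIL=1)")
else:
    print("Free-threaded build with the GIL disabled")
```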

Installation

# Base install
uv add barq

# With the optional orjson extra for faster JSON serialization
uv add barq[fast]

Development Setup

git clone https://github.com/grandimam/barq.git
cd barq

# Install
uv sync

# Run
uv run python examples/basic.py

# Test
curl http://localhost:8000/
curl http://localhost:8000/items/1
curl -X POST http://localhost:8000/items -H "Content-Type: application/json" -d '{"name":"Widget","price":9.99}'

Running Benchmarks

# Install dev dependencies
uv sync --dev

# Run benchmark
uv run python benchmarks/run_benchmark.py 1000 10

Quick Start

from typing import Annotated
from pydantic import BaseModel
from barq import Barq, Depends

app = Barq()

class Item(BaseModel):
    name: str
    price: float

@app.get("/")
def index() -> dict:
    return {"message": "Hello, World!"}

@app.get("/items/{item_id}")
def get_item(item_id: int) -> dict:
    return {"id": item_id}

@app.post("/items")
def create_item(body: Item) -> Item:
    return body

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8000, workers=4)

Features

  • Pure Python: No C extensions, no Rust, no Cython
  • Free-threaded: True parallelism without the GIL (Python 3.13t)
  • Type-driven: Pydantic models auto-parsed from request body
  • Dependency injection: Depends() with request-scoped caching
  • HTTP Keep-alive: Connection reuse for high throughput
  • Radix tree router: matching cost scales with path length, not with the number of routes
  • orjson support: Optional 3-5x faster JSON serialization
  • Minimal: ~500 lines of code in 5 files
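The request-scoped caching that Depends() provides can be illustrated with a small sketch. The `resolve` helper and the per-request cache dict here are illustrative only, not barq's actual internals:

```python
# Sketch of request-scoped dependency caching: each incoming request
# gets a fresh cache dict, so a dependency is constructed at most once
# per request and shared by every handler parameter that asks for it.
from typing import Any, Callable

def resolve(dep: Callable[[], Any], cache: dict) -> Any:
    """Call `dep` the first time; later lookups in the same request hit the cache."""
    if dep not in cache:
        cache[dep] = dep()
    return cache[dep]

calls = 0
def get_db():
    global calls
    calls += 1
    return {"conn": "fake-db"}

request_cache: dict = {}              # one dict per incoming request
a = resolve(get_db, request_cache)
b = resolve(get_db, request_cache)
assert a is b and calls == 1          # constructed only once per request
```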

Benchmarks

System

Component   Value
CPU         Apple M2 Pro
Cores       12
Python      3.13.0 (free-threaded)
Platform    Darwin arm64

High Concurrency (2000 requests, 100 concurrent clients)

Scenario    Free-Threaded (16 threads)   FastAPI (async)   Difference
JSON        8,418 req/s                  4,509 req/s       Free-Threaded +87%
CPU Bound   1,425 req/s                  266 req/s         Free-Threaded +435%

Standard Load (1000 requests, 20 concurrent clients)

Scenario    Free-Threaded (4 threads)   FastAPI (async)   Difference
JSON        9,287 req/s                 4,377 req/s       Free-Threaded +112%
DB Query    8,284 req/s                 2,302 req/s       Free-Threaded +260%
CPU Bound   880 req/s                   264 req/s         Free-Threaded +233%

Thread Scaling (CPU-bound workload)

Workers   req/s   Scaling
4           608   1.0x
8         1,172   1.9x
16        1,297   2.1x
32        1,391   2.3x
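The Scaling column is just speedup relative to the 4-worker baseline; the diminishing returns past 8 workers suggest the workload becomes memory- or contention-bound rather than CPU-bound:

```python
# Recompute the Scaling column from the req/s figures above.
baseline = 608
results = {4: 608, 8: 1172, 16: 1297, 32: 1391}
for workers, rps in results.items():
    print(f"{workers:>2} workers: {rps / baseline:.1f}x")
# matches the Scaling column: 1.0x, 1.9x, 2.1x, 2.3x
```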

Analysis

  • I/O-bound (JSON, DB): 2-3.5x faster due to simpler threading model and shared memory
  • CPU-bound: up to 5x faster; free-threaded Python runs handler code in parallel across cores, while an async event loop executes Python on a single thread
  • Scales with cores: Adding threads directly improves CPU-bound throughput
  • Latency: Barq achieves lower p99 latency under load (no async task scheduling overhead)

Architecture

┌─────────────────────────────────────────────────────────┐
│                        Barq App                         │
│              (app.py: DI, validation, handlers)         │
├─────────────────────────────────────────────────────────┤
│                     Radix Router                        │
│              (router.py: O(1) route matching)           │
├─────────────────────────────────────────────────────────┤
│                    Request / Response                   │
│               (types.py: dataclasses)                   │
├─────────────────────────────────────────────────────────┤
│                      HTTP Parser                        │
│            (http.py: parse/write HTTP/1.1)              │
├─────────────────────────────────────────────────────────┤
│                   ThreadPoolExecutor                    │
│         (server.py: sockets, keep-alive, workers)       │
└─────────────────────────────────────────────────────────┘
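The parser layer's job can be sketched with a minimal HTTP/1.1 request-line and header parser. This is an illustrative simplification, not barq's actual http.py:

```python
# Minimal HTTP/1.1 parser: split head from body at the blank line,
# take method/path/version from the request line, then lowercase and
# strip each "Name: value" header. Real parsers also handle
# continuation lines, chunked bodies, and malformed input.
def parse_request(raw: bytes):
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, headers, body

req = b"GET /items/1 HTTP/1.1\r\nHost: localhost\r\nConnection: keep-alive\r\n\r\n"
method, path, headers, body = parse_request(req)
assert (method, path) == ("GET", "/items/1")
assert headers["connection"] == "keep-alive"   # drives keep-alive reuse
```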

Project Structure

src/barq/
├── __init__.py   # exports
├── app.py        # Barq, Depends, DI resolution
├── router.py     # RadixRouter, per-segment matching
├── types.py      # Request, Response, HTTPException
├── server.py     # Server, ThreadPool, keep-alive
└── http.py       # HTTPParser, write_response
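A route trie with {param} segments, in the spirit of router.py, can be sketched as follows. This is hypothetical code, not the actual RadixRouter:

```python
# Route trie: literal segments are dict keys, "*" marks a {param} slot.
# Lookup walks one node per path segment, so cost depends on path
# length, not on how many routes are registered.
class Router:
    def __init__(self):
        self.root = {}

    def add(self, path, handler):
        node = self.root
        for seg in path.strip("/").split("/"):
            if seg.startswith("{") and seg.endswith("}"):
                node = node.setdefault("*", {})
                node["_name"] = seg[1:-1]      # remember the param name
            else:
                node = node.setdefault(seg, {})
        node["_handler"] = handler

    def match(self, path):
        node, params = self.root, {}
        for seg in path.strip("/").split("/"):
            if seg in node:                    # literal match wins
                node = node[seg]
            elif "*" in node:                  # fall back to the param slot
                node = node["*"]
                params[node["_name"]] = seg
            else:
                return None, {}
        return node.get("_handler"), params

router = Router()
router.add("/items/{item_id}", "get_item")
handler, params = router.match("/items/42")
```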

Why Free-Threaded Python?

Traditional Python has the GIL (Global Interpreter Lock), which prevents true parallelism in threads. Web frameworks work around this using:

  • Async/await (FastAPI, Starlette): Cooperative multitasking
  • Multiprocessing (Gunicorn, uvicorn workers): separate processes, with IPC and memory-duplication overhead

Free-threaded Python (PEP 703) removes the GIL, enabling:

  • Simple synchronous code that runs in parallel
  • Shared memory between threads (no serialization)
  • Lower overhead than multiprocessing
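The upshot is that ordinary threaded code parallelizes. The sketch below uses only the standard library: on a 3.13t build the chunks run on separate cores, while on a GIL build the same code is correct but serialized:

```python
# CPU-bound work split across plain threads over shared data.
# No IPC, no serialization: each worker just reads its slice of the range.
from concurrent.futures import ThreadPoolExecutor

def count_primes(lo, hi):
    def is_prime(n):
        if n < 2:
            return False
        for d in range(2, int(n ** 0.5) + 1):
            if n % d == 0:
                return False
        return True
    return sum(1 for n in range(lo, hi) if is_prime(n))

chunks = [(i, i + 2500) for i in range(0, 10000, 2500)]
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(lambda c: count_primes(*c), chunks))
print(total)  # 1229 primes below 10,000, regardless of build
```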

Limitations

  • Experimental and not battle-tested
  • HTTP/1.1 only (no HTTP/2, no WebSocket)
  • No middleware system (yet)
  • C extensions with internal locks don't parallelize

License

MIT