Named Pipes as Agentic Tools
Low-latency IPC for persistent AI tool servers — LLM inference, TTS, STT, vector search, and more — all on one machine, no network stack required.
✨ Highlights
- Persistent servers — model weights and state stay loaded between calls; no per-request startup cost
- Kernel-speed IPC — named pipes route through kernel memory, not a network stack; lower latency than local HTTP
- Multi-client fanout — one server handles many concurrent clients; each gets its own downstream pipe
- Decorator API — register command handlers with a single @ch.handler("CMD") line
- cpipe CLI — send ad-hoc commands to any running server from the terminal, like curl for pipes
- Claude Code skill — an included skill teaches the assistant to discover and query live servers without leaving the session
- Ready-made servers — drop-in pipes for LLM chat, text-to-speech, and speech-to-text
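The decorator style mentioned in the highlights can be illustrated with a small, self-contained registry. This is only a sketch of the general pattern: `HandlerRegistry`, its `dispatch` method, and the `"cmd"` message field are hypothetical names chosen for illustration and are not taken from the library. See DOCS.md for the actual API.

```python
from typing import Any, Callable, Dict

Handler = Callable[[dict], Any]


class HandlerRegistry:
    """Hypothetical stand-in for a server object; not the library's class."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def handler(self, command: str) -> Callable[[Handler], Handler]:
        """Return a decorator that registers a function under `command`."""
        def register(func: Handler) -> Handler:
            self._handlers[command] = func
            return func  # hand the function back unchanged
        return register

    def dispatch(self, message: dict) -> Any:
        """Call the handler named by message["cmd"] (assumed field name)."""
        command = message.get("cmd")
        if command not in self._handlers:
            return {"error": f"unknown command: {command!r}"}
        return self._handlers[command](message)


ch = HandlerRegistry()


@ch.handler("ECHO")
def echo(message: dict) -> dict:
    # A trivial handler: send the text straight back.
    return {"result": message.get("text", "")}


reply = ch.dispatch({"cmd": "ECHO", "text": "hi"})
```

The appeal of this shape is that adding a command means adding one decorated function, with no central dispatch table to edit.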
Overview
This library uses named pipes as the transport layer for agentic tool servers — persistent background processes that expose capabilities such as LLM inference, text-to-speech, vector search, or browser automation to a Python orchestrator running on the same machine.
Because named pipes route data through kernel memory rather than a network stack, they offer lower latency than local HTTP and far less complexity than shared memory — a practical sweet spot for real-time applications like voice agents.
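To make the transport concrete, here is a minimal sketch of a single FIFO round trip using only the Python standard library on a POSIX system. It does not use this library at all; the helper name `fifo_round_trip` and the sample payload are illustrative assumptions.

```python
import os
import tempfile
import threading


def fifo_round_trip(payload: bytes) -> bytes:
    """Write `payload` into a fresh FIFO from one thread and read it in another."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo-fifo")
        os.mkfifo(path)  # creates the named pipe on disk (POSIX only)

        def writer() -> None:
            # Opening for write blocks until a reader opens the other end.
            with open(path, "wb") as w:
                w.write(payload)

        t = threading.Thread(target=writer)
        t.start()
        with open(path, "rb") as r:
            data = r.read()  # returns at EOF, once the writer closes
        t.join()
        return data


received = fifo_round_trip(b'{"cmd": "ping"}\n')
```

The bytes move through a kernel buffer between the two file descriptors; no socket, port, or HTTP framing is involved.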
The same servers can be driven directly from Claude Code. An included agent skill teaches the assistant how to discover running pipe servers with cpipe --list, inspect their capabilities, and send commands.
For a deeper look at the design decisions and API reference, see DOCS.md.
Installation
# Core library only
pip install -e .

# With LLM inference support
pip install -e ".[llm]"

# With TTS support (macOS: mlx-audio + sounddevice)
pip install -e ".[tts]"

# With STT support (sounddevice; Voxtral weights vendored)
pip install -e ".[stt]"
Requires Python 3.11+. See DOCS.md for platform-specific dependency details.
Quick start
1. Start a server (Terminal 1):
conda activate named-pipes
cpipe --serve chat   # LLM server on /tmp/tool-chat
2. Query it from the CLI (Terminal 2):
cpipe /tmp/tool-chat chat --data '{"messages": [{"role":"user","content":"Hello!"}]}'
3. Or write a client in Python:
import threading

from named_pipes.tool_client import ToolClient

done = threading.Event()


class _ChatClient(ToolClient):
    def on_message(self, msg):
        if msg.get("done") is True:
            done.set()  # final message: unblock the main thread
        else:
            # Streamed chunk: print it without a trailing newline.
            print(msg.get("result", ""), end="", flush=True)


with _ChatClient("chat") as ch:
    ch.send_command("chat", messages=[{"role": "user", "content": "Hello!"}])
    done.wait(timeout=30)  # give up after 30 seconds if no "done" arrives
Examples
Start order matters — server first, then client (server creates the FIFOs).
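Because the server creates the FIFOs, a client script that may be launched early can poll for the pipe before connecting. The sketch below uses only the standard library; `wait_for_fifo` is a hypothetical helper written for this example, not part of the library.

```python
import os
import stat
import tempfile
import time


def wait_for_fifo(path: str, timeout: float = 5.0, poll: float = 0.05) -> bool:
    """Return True once `path` exists and is a FIFO, False if `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if stat.S_ISFIFO(os.stat(path).st_mode):
                return True
        except FileNotFoundError:
            pass  # server has not created the pipe yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


# Demonstration in a scratch directory, standing in for the server's mkfifo.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "tool-demo")
    missing = wait_for_fifo(demo_path, timeout=0.1)  # nothing there yet
    os.mkfifo(demo_path)
    found = wait_for_fifo(demo_path, timeout=0.1)
```

In a real client this might be called as `wait_for_fifo("/tmp/tool-chat")` before opening the connection. Note that the check only confirms the pipe exists, not that a server is still attached to it.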
# LLM chat
cpipe --serve chat                    # Terminal 1
python src/examples/chat_client.py    # Terminal 2

# LLM → TTS pipeline (spoken output)
cpipe --serve chat                    # Terminal 1: LLM (/tmp/tool-chat)
cpipe --serve tts                     # Terminal 2: TTS (/tmp/tool-tts)
python src/examples/tts_client.py     # Terminal 3: pipeline client

# Speech-to-text
cpipe --serve stt                     # Terminal 1: STT (/tmp/tool-stt)
python src/examples/stt_client.py     # Terminal 2: subscriber
cpipe — CLI tool
cpipe /tmp/tool-chat chat --data '{"messages": [{"role":"user","content":"Hello"}]}'
cpipe --version   # show installed version
cpipe --list      # discover running ToolServer instances (tool-* pipes)
cpipe --pid       # same, plus PIDs that have each pipe open
cpipe --clear     # delete orphaned tool pipes
See DOCS.md for all options and the full protocol reference.
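As a rough idea of what discovery by naming convention can look like, the sketch below scans a directory for FIFOs whose names start with `tool-`. It is an assumption about the general approach, not a description of how `cpipe --list` is implemented; `list_tool_pipes` is a hypothetical function.

```python
import os
import stat
import tempfile


def list_tool_pipes(directory: str = "/tmp", prefix: str = "tool-") -> list[str]:
    """Return sorted paths in `directory` that are FIFOs named `prefix*`."""
    found = []
    for name in sorted(os.listdir(directory)):
        if not name.startswith(prefix):
            continue
        path = os.path.join(directory, name)
        try:
            if stat.S_ISFIFO(os.stat(path).st_mode):
                found.append(path)
        except OSError:
            continue  # entry vanished or is unreadable; skip it
    return found


# Demonstration in a scratch directory rather than the real /tmp.
with tempfile.TemporaryDirectory() as tmp:
    os.mkfifo(os.path.join(tmp, "tool-chat"))        # matches
    os.mkfifo(os.path.join(tmp, "other-pipe"))       # FIFO, wrong prefix
    with open(os.path.join(tmp, "tool-notes.txt"), "w") as f:
        f.write("not a pipe")                        # right prefix, not a FIFO
    names = [os.path.basename(p) for p in list_tool_pipes(tmp)]
```

A scan like this cannot tell a live server from an orphaned pipe left behind after a crash, which is presumably why the CLI offers `--pid` and `--clear` alongside `--list`.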
Claude Code skill
An included skill at .claude/skills/cpipe/SKILL.md teaches Claude Code how to use cpipe to discover, inspect, and interact with live servers — so the LLM can query a local inference server or trigger TTS playback without leaving the coding session.
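Any orchestrator that shells out to `cpipe`, whether an assistant or a plain Python script, needs to build the same command line shown in the CLI examples. The sketch below assembles that argument list with the JSON payload serialized safely. `build_cpipe_argv` is a hypothetical helper; only the `cpipe <pipe> <command> --data <json>` form comes from this document, and how the skill itself invokes the CLI is not specified here.

```python
import json
import shlex


def build_cpipe_argv(pipe_path: str, command: str, payload: dict) -> list[str]:
    """Build the argv for `cpipe <pipe> <command> --data <json>`."""
    return ["cpipe", pipe_path, command, "--data", json.dumps(payload)]


argv = build_cpipe_argv(
    "/tmp/tool-chat",
    "chat",
    {"messages": [{"role": "user", "content": "Hello"}]},
)

# A copy-pasteable shell line with correct quoting, useful for logging.
shell_line = shlex.join(argv)
```

Passing `argv` as a list to `subprocess.run(argv, capture_output=True, text=True)` avoids shell quoting problems with the JSON payload; this requires `cpipe` to be installed and a server to be running.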
Resources
- DOCS.md — architecture, API reference, protocol spec, and design rationale
- named-pipe-tools.md — ToolServer protocol specification
- src/examples/chat_client.py — LLM chat example
- src/examples/tts_client.py — TTS example
- src/examples/stt_client.py — STT example
