Pi provider extension for running antirez/ds4 as a local DeepSeek V4 Flash model. The goal here is to see how good the UX and behavior can be around local models.
The extension registers ds4/deepseek-v4-flash and
ds4/deepseek-v4-flash-q2-imatrix as models for /model, starts ds4-server
on demand, downloads/builds the runtime if needed, keeps a per-pi-process lease,
and stops the server via a bundled watchdog when no clients are left.
Requirements and Behavior
You will need a mac with at least 128GB of RAM. The default
ds4/deepseek-v4-flash model installs the 2-bit quantized model if you have
128GB of RAM and picks the 4-bit quantized model if you have 256GB or more.
Select ds4/deepseek-v4-flash-q2-imatrix to use the imatrix-tuned q2 model.
If you are signed into huggingface then your token is used for faster downloads. The server is compiled/started and models are downloaded automatically on first use.
Install
pi install https://github.com/mitsuhiko/pi-ds4
For local development from this checkout, pass the path to an existing ds4 server checkout:
./install-pi-extension-local.sh /path/to/antirez-ds4-checkout
If ~/.pi/ds4/support already exists and points elsewhere, use --force to
move it aside and install a symlink to the checkout you passed. Any existing
gguf/*.gguf model files (and resumable .gguf.part downloads) are preserved
into the new checkout first, using APFS clone-on-write copies on macOS when
available.
Then restart pi or run /reload.
Runtime layout
Runtime state is kept under ~/.pi/ds4:
support/— shallow checkout ofhttps://github.com/antirez/ds4(mainby default)kv/— on-disk KV cache for the default model choicekv-q2-imatrix/— on-disk KV cache for the q2-imatrix model choiceclients/— active pi process leasessettings.json— optional extension configuration overrideslog— build/download/server/watchdog log
The watchdog is bundled in this package (ds4-watchdog.sh), not expected to
exist in the ds4 runtime checkout.
Configuration
Environment overrides can also be placed in ~/.pi/ds4/settings.json. In the
JSON file, use the env var name (for example "DS4_READY_TIMEOUT_MS"), the
camel-case key without DS4_ (for example "readyTimeoutMs"), or the lower
snake-case key without DS4_ (for example "ready_timeout_ms"). Environment
variables win over the settings file.
DS4_PROTOCOL: Pi wire protocol. Supported values areopenai(default, OpenAI Chat Completions),openai-responses, andanthropic.DS4_SUPPORT_REPO: runtime repo URL (defaulthttps://github.com/antirez/ds4)DS4_SUPPORT_BRANCH: runtime branch (defaultmain)DS4_RUNTIME_DIR: use an existing ds4 checkout instead of~/.pi/ds4/supportDS4_MODEL_QUANT: forceq2,q2-imatrix, orq4for the default model choice (otherwise picked from system memory)DS4_READY_TIMEOUT_MS: server startup timeoutDS4_SERVER_BINARY: customds4-serverbinary pathDS4_WATCHDOG_SCRIPT: custom watchdog script pathDS4_API_KEY: provider API key/token sent by Pi (defaultdsv4-local)
See settings.example.json for a complete example with a JSON schema reference.
A minimal ~/.pi/ds4/settings.json can look like this:
{
"$schema": "https://raw.githubusercontent.com/mitsuhiko/pi-ds4/main/settings.schema.json",
"protocol": "openai-responses",
"modelQuant": "q2-imatrix",
"readyTimeoutMs": 900000
}Use /ds4 inside pi to show the live ds4 log.