Swival works with frontier models, but also strives to be as reliable as possible with smaller models,
including local ones. Designed from the ground up for tight context windows and limited resources.
Free, open source, and easy to set up.
Run evals with Calibra — Swival's companion for benchmarking and evaluation.
$ uv tool install swival
$ swival "Refactor the error handling in src/api.py"
Why Swival
Reliable with small models
Context management is one of its strengths: the agent keeps the context clean and focused, avoiding unnecessary bloat. Graduated compaction and persistent state mean it doesn't lose track of work, even under tight limits.
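The compaction idea can be sketched roughly like this (an illustration, not Swival's actual implementation): older turns are collapsed to short stubs first, and only dropped entirely if the conversation is still over budget, so recent work stays verbatim.

```python
# Illustrative sketch of graduated compaction -- NOT Swival's actual code.
# Older messages are collapsed to short stubs first; recent ones stay verbatim.

def compact(messages, budget, keep_recent=4):
    """Trim a conversation to roughly `budget` characters.

    messages: list of {"role": str, "content": str}, oldest first.
    Recent messages are kept whole; older ones degrade gracefully.
    """
    def size(msgs):
        return sum(len(m["content"]) for m in msgs)

    msgs = [dict(m) for m in messages]  # don't mutate the caller's list
    # Pass 1: truncate old messages to a one-line stub.
    for m in msgs[:-keep_recent]:
        if size(msgs) <= budget:
            break
        m["content"] = m["content"][:80] + " ...[compacted]"
    # Pass 2: drop the oldest stubs entirely if still over budget.
    while size(msgs) > budget and len(msgs) > keep_recent:
        msgs.pop(0)
    return msgs
```

The point of the two passes is that information loss is graduated: a stub still anchors what happened in an old turn, and only the very oldest stubs disappear.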
Your models, your way
Auto-discovers your LM Studio model, or point it at HuggingFace, OpenRouter, ChatGPT Plus/Pro, or any OpenAI-compatible server. You pick the model and the infrastructure.
Review loop and benchmarking
A configurable review loop with LLM-as-a-judge support. JSON reports capture timing, tool usage, and context events. Useful for comparing models, settings, skills, and MCP servers on real coding tasks.
Secrets stay on your machine
API keys and credentials in LLM messages are automatically encrypted before they leave your machine. The model never sees the real values. Decryption happens locally when the response comes back, so tools still work normally.
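The shape of this idea can be illustrated with a simple placeholder-substitution sketch. This is not Swival's actual scheme (which encrypts the values) and the key patterns below are assumptions; it only shows the redact-before-send, restore-after-receive flow:

```python
import re
import secrets

# Sketch of the redact-then-restore idea -- NOT Swival's actual scheme,
# which encrypts values; here we just swap in opaque placeholder tokens.

SECRET_RE = re.compile(r"\b(?:sk|hf)_[A-Za-z0-9_-]{8,}\b")  # assumed key shapes

def redact(text, vault):
    """Replace secret-looking substrings with opaque tokens; mapping stays local."""
    def sub(match):
        token = f"<secret:{secrets.token_hex(4)}>"
        vault[token] = match.group(0)
        return token
    return SECRET_RE.sub(sub, text)

def restore(text, vault):
    """Swap tokens back to real values after the model's reply returns."""
    for token, value in vault.items():
        text = text.replace(token, value)
    return text
```

Because the vault never leaves the process, the model only ever sees the opaque tokens, while tools that run locally get the real values back.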
Cross-session memory
The agent remembers things across sessions. Relevant past notes are retrieved via BM25 ranking, so context from earlier work carries forward without bloating the prompt. Use /learn to teach it on the spot.
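BM25 itself is a standard lexical ranking function. A minimal self-contained version (an illustration of the retrieval idea, not Swival's code) looks like this:

```python
import math
from collections import Counter

# Minimal BM25 ranking sketch -- illustrates the retrieval idea, not Swival's code.

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return docs sorted by BM25 score against the query (best first)."""
    tokenized = [d.lower().split() for d in docs]
    N = len(docs)
    avgdl = sum(len(toks) for toks in tokenized) / N
    # Document frequency: how many docs contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for doc, toks in zip(docs, tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            )
        scores.append((score, doc))
    return [doc for score, doc in sorted(scores, key=lambda s: -s[0])]
```

Only the top-ranked notes need to be injected into the prompt, which is why this kind of retrieval carries context forward without blowing the token budget.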
Pick up where you left off
When a session is interrupted — Ctrl+C, max turns, context overflow — Swival saves its state to disk and resumes automatically next time you run it in the same directory.
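The save-and-resume pattern can be sketched as follows. The file name and state shape here are hypothetical, not Swival's actual on-disk format:

```python
import json
from pathlib import Path

# Sketch of the save-and-resume idea -- an illustration, not Swival's format.
STATE_FILE = Path(".agent_state.json")  # hypothetical filename

def save_state(state):
    STATE_FILE.write_text(json.dumps(state))

def load_state():
    """Pick up a previous session in this directory, if one exists."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"messages": [], "turn": 0}

def run_agent(task):
    state = load_state()
    try:
        # ... agent loop would go here, appending to state["messages"] ...
        state["turn"] += 1
    except KeyboardInterrupt:
        # Ctrl+C: persist progress so the next invocation resumes here.
        save_state(state)
        raise
    save_state(state)
```

Keying the state to the working directory is what makes resumption automatic: running again in the same directory finds the file and continues.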
A2A server mode
Run swival --serve and your agent becomes an A2A endpoint that other agents can call over HTTP. Multi-turn context, streaming, and bearer auth are built in.
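Calling such an endpoint from another process can be sketched like this. The URL, port, and the JSON-RPC message/send payload shape are assumptions based on the A2A spec; Swival's exact fields may differ, so check its docs:

```python
import json
import urllib.request

# Rough sketch of calling an A2A endpoint over HTTP with bearer auth.
# The URL and the JSON-RPC "message/send" payload shape are assumptions
# based on the A2A spec -- Swival's exact fields may differ.

def build_request(url, token, text):
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
            }
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # bearer auth, per the feature above
        },
    )

# req = build_request("http://localhost:8000", "my-token", "Summarize src/api.py")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```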
Skills, MCP, and more
Extend the agent with SKILL.md-based skills, MCP servers, and A2A agents. Small and hackable: a few thousand lines of Python, no framework. Read the whole thing in an afternoon.
Quickstart
LM Studio
- Install LM Studio and load a model with tool-calling support. Recommended first model: qwen3-coder-next (great quality/speed tradeoff on local hardware). Start the server.
- Install Swival:
  uv tool install swival
- Run:
  swival "Refactor the error handling in src/api.py"
HuggingFace
- Create a token at huggingface.co/settings/tokens and export it:
  export HF_TOKEN=hf_...
- Install Swival:
  uv tool install swival
- Run with a tool-calling model (e.g. GLM-5):
  swival "Refactor the error handling in src/api.py" \
    --provider huggingface --model zai-org/GLM-5
OpenRouter
- Sign up at openrouter.ai and export your API key:
  export OPENROUTER_API_KEY=sk_or_...
- Install Swival:
  uv tool install swival
- Run with any model on the platform (e.g. GLM-5):
  swival "Refactor the error handling in src/api.py" \
    --provider openrouter --model z-ai/glm-5
Other providers
Swival also works with ChatGPT Plus/Pro (uses your existing subscription via OAuth), Google Gemini, and any OpenAI-compatible server (ollama, llama.cpp, vLLM, etc.). See the Providers page for setup details.
Interactive mode
For back-and-forth sessions, start the REPL with swival --repl. The agent keeps the full conversation in memory, so you can iterate on a task across multiple turns without repeating context.
swival --repl
Python library
Swival also works as a Python library, so you can embed an agent
loop directly in your own code. The simplest way is the
swival.run()
one-liner:
import swival

answer = swival.run(
    "What files handle authentication?",
    provider="openrouter",
    model="z-ai/glm-5",
)
For multi-turn conversations or finer control, use the Session class:
from swival import Session

session = Session(provider="lmstudio", yolo=True)
result = session.run("Add type hints to utils.py")
print(result.answer)

# Continue the conversation
follow_up = session.ask("Now add tests for those functions")
print(follow_up.answer)
Session accepts the same options you'd pass on the command line: provider, model, allowed commands, MCP servers, and everything else. Check the usage docs for the full list.