GitHub - zenml-io/kitaru: Open-source platform layer for AI agents in production

Kitaru

You build the agent. Kitaru runs everything around it.

Kitaru (来る, "to arrive") is the open-source platform layer for AI agents in production. Wrap an existing agent or write raw Python — Kitaru gives you checkpointed execution, human-in-the-loop waits, durable memory, and deployment on any cloud. Any framework. Any model.

Docs · Quick Start · Examples · Getting Started Guide · Roadmap · Community


Kitaru Dashboard

Your agent crashed at step 7. Kitaru replays from step 7 — not from scratch.

Add two decorators to your existing Python code and get crash recovery, human approval gates, durable memory, and a full dashboard. No rewrite. No graph DSL. No framework lock-in.

Why Kitaru?

Works with your agent SDK

Wrap an existing PydanticAI agent with KitaruAgent — no rewrite. For agents built on the OpenAI Agents SDK, Anthropic Agent SDK, or raw Python, use @flow and @checkpoint around your calls. Your model, your tools, your framework — Kitaru wraps them, not the other way around.

from kitaru import flow
from kitaru.adapters.pydantic_ai import KitaruAgent
from pydantic_ai import Agent

researcher = KitaruAgent(
    Agent("openai:gpt-5.4", system_prompt="You summarize research topics.")
)

@flow
def research_flow(topic: str) -> str:
    return researcher.run_sync(topic).output

Python-first, no graph DSL

Write normal Python. Use if, for, try/except — whatever your agent needs. Kitaru gives you two decorators (@flow and @checkpoint) and a handful of utility functions. That's it.

from kitaru import checkpoint, flow

# do_research() and generate_draft() stand in for your own agent logic.

@checkpoint
def research(topic: str) -> str:
    return do_research(topic)

@checkpoint
def write_draft(research: str) -> str:
    return generate_draft(research)

@flow
def writing_agent(topic: str) -> str:
    data = research(topic)
    return write_draft(data)

result = writing_agent.run("quantum computing").wait()

Durable execution and memory

Kitaru keeps agent state on disk and in infrastructure, not just in process memory. Checkpoints persist intermediate outputs so you can replay from failure, resume waiting runs, and inspect what happened. Durable memory adds scoped, versioned state for long-running agents across Python, CLI, client, and MCP surfaces.
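
To make "scoped, versioned state" concrete, here is a minimal plain-Python sketch. It does not use Kitaru's memory API: the VersionedMemory class, its set, get, and history methods, and the scope names are hypothetical, and the state lives in an in-process dict where a durable system would write to storage.

```python
from collections import defaultdict
from typing import Any, Optional


class VersionedMemory:
    """Toy scoped, versioned key-value store (illustrative, not Kitaru's API)."""

    def __init__(self) -> None:
        # (scope, key) -> list of values; list index i holds version i + 1.
        self._versions = defaultdict(list)

    def set(self, scope: str, key: str, value: Any) -> int:
        """Append a new version and return its 1-based version number."""
        versions = self._versions[(scope, key)]
        versions.append(value)
        return len(versions)

    def get(self, scope: str, key: str, version: Optional[int] = None) -> Any:
        """Return the latest value, or a specific version when requested."""
        versions = self._versions.get((scope, key))
        if not versions:
            raise KeyError(f"no value for {key!r} in scope {scope!r}")
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"version {version} does not exist for {key!r}")
        return versions[version - 1]

    def history(self, scope: str, key: str) -> list:
        """Return every stored version, oldest first."""
        return list(self._versions.get((scope, key), []))


memory = VersionedMemory()
memory.set("flow:research", "summary", "first draft")
memory.set("flow:research", "summary", "revised draft")
memory.set("flow:billing", "summary", "unrelated value")

latest = memory.get("flow:research", "summary")
original = memory.get("flow:research", "summary", version=1)
```

Writes never overwrite earlier values, so an agent can read the latest state while older versions stay available for inspection, and the same key in a different scope is kept separate.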

Deploy on your cloud

No workers, no message queues, no distributed systems PhD required. Kitaru runs locally with zero config, and scales to production with a single server backed by a SQL database. Deploy your agents to Kubernetes, Vertex AI, SageMaker, or AzureML using Kitaru's stack abstraction. Your registry, your deployer, your infrastructure.

Built-in UI

Every execution is observable from day one. See your agent runs, inspect checkpoint outputs, and approve human-in-the-loop wait steps, all from a visual dashboard that ships with the Kitaru server.

To start the server locally, run kitaru login after installing kitaru[local]. To connect to an existing remote server, run kitaru login <server>.
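
The pause-and-resume shape of a human-in-the-loop wait step can be sketched with a plain Python generator. This is an analogy only, not Kitaru's API: publish_flow and its messages are hypothetical, and the pause here lives in process memory, whereas a durable wait would persist it so the decision can arrive later.

```python
from typing import Generator


def publish_flow(topic: str) -> Generator[str, bool, str]:
    """Draft something, pause for a human decision, then finish."""
    draft = f"Draft about {topic}"
    # Execution stops here until a decision is sent back in.
    approved = yield f"Approve publishing {draft!r}?"
    if approved:
        return f"published: {draft}"
    return "rejected"


run = publish_flow("quantum computing")
question = next(run)  # run up to the wait point and surface the question

outcome = None
try:
    run.send(True)  # the human approves; the flow resumes after the yield
except StopIteration as finished:
    outcome = finished.value  # a generator's return value arrives this way
```

Sending False in place of True takes the rejection branch, which mirrors an approval gate that can either continue or stop a run.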

Quick Start

Install

pip install kitaru

Or with uv (recommended):

uv pip install kitaru

To wrap a PydanticAI agent, install the adapter extra:

uv pip install "kitaru[pydantic-ai]"

Optional: start a local Kitaru server

Flows run locally by default with the base install. If you also want the local dashboard and REST API, install the local extra and then run kitaru login with no arguments:

uv pip install "kitaru[local]"
kitaru login
kitaru status

Optional: connect to an existing remote Kitaru server

If you already have a deployed Kitaru server, connect to it explicitly:

kitaru login https://my-server.example.com
# add --project <PROJECT> or other remote-login flags if your setup requires them
kitaru status

Initialize your project

Write your first flow

# agent.py
from kitaru import checkpoint, flow

@checkpoint
def fetch_data(url: str) -> str:
    return "some data"

@checkpoint
def process_data(data: str) -> str:
    return data.upper()

@flow
def my_agent(url: str) -> str:
    data = fetch_data(url)
    return process_data(data)

result = my_agent.run("https://example.com").wait()
print(result)  # SOME DATA

Run it

python agent.py

Every checkpoint's output is persisted automatically. You can inspect what happened, replay from any checkpoint, or resume a waiting flow:

kitaru executions list
kitaru executions get <EXECUTION_ID>
kitaru executions logs <EXECUTION_ID>
kitaru executions replay <EXECUTION_ID> --from process_data
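
The idea behind replaying from a named checkpoint can be sketched in plain Python. This is a toy model, not Kitaru's implementation: run_steps, the store dict, and the replay_from parameter are hypothetical names, and a real system would persist outputs outside the process. Steps before the replay point reuse their stored outputs; the named step and everything after it execute again.

```python
from typing import Callable, Optional


def run_steps(
    steps: list,
    initial: str,
    store: dict,
    replay_from: Optional[str] = None,
) -> tuple:
    """Run (name, function) steps in order, reusing stored outputs where allowed.

    Returns the final value and the names of the steps that actually executed.
    """
    names = [name for name, _ in steps]
    if replay_from is not None and replay_from not in names:
        raise ValueError(f"unknown step: {replay_from!r}")

    value = initial
    executed = []
    reuse = True  # reuse stored outputs until the replay point or a gap
    for name, func in steps:
        if name == replay_from or name not in store:
            reuse = False  # every step from here on runs again
        if reuse:
            value = store[name]
        else:
            value = func(value)
            store[name] = value  # persist the output for later replays
            executed.append(name)
    return value, executed


def fetch_data(url: str) -> str:
    return "some data"


def process_data(data: str) -> str:
    return data.upper()


steps = [("fetch_data", fetch_data), ("process_data", process_data)]
store: dict = {}

first_result, first_executed = run_steps(steps, "https://example.com", store)
replay_result, replay_executed = run_steps(
    steps, "https://example.com", store, replay_from="process_data"
)
```

In this model a step that raised would leave no stored output, so the next run would resume at that step, which is the same mechanism that makes recovery after a crash possible.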

Learn more

Resource              | Description
----------------------|--------------------------------------------------------
Getting Started Guide | Full setup walkthrough with all examples
Documentation         | Complete reference and guides
PydanticAI adapter    | Wrap a PydanticAI agent with KitaruAgent
Memory guide          | Durable memory concepts, scopes, history, and compaction
Examples              | Runnable workflows for every feature
Stacks                | Deploy to Kubernetes, AWS, GCP, or Azure

Origins

Kitaru is built by the team behind ZenML, drawing on five years of production orchestration experience (JetBrains, Adeo, Brevo). The orchestration primitives (stacks, artifacts, lineage) have been rebuilt here specifically for autonomous agents.

Contributing

We welcome contributions! See CONTRIBUTING.md for development setup, code style, and how to submit changes. The default branch is develop — all PRs should target it.

Community and support

  • Discussions — ask questions, share ideas
  • Issues — report bugs, request features
  • Roadmap — see what's coming next
  • Docs — guides and reference

License

Apache 2.0