GitHub - danieljhkim/local-agent: An experimental, local-first AI agent runtime built to explore safety-first tooling, RAG, identity models, and local persistence.

Local Agent is a local-first AI agent playground, vibe-coded over a weekend as an exploration project.

The goal of this project is learning and experimentation: understanding agent runtimes, tool safety models, RAG pipelines, memory management, and lightweight local persistence.

NOTE: Expect rough edges. This project exists to learn, prototype, and throw ideas away - not to serve as a production-ready framework.

Features

Local-first: Runs entirely on your machine with your choice of LLM provider
Ollama integration: Built-in support for local LLMs (llama3.1, mistral, etc.)
RAG (Retrieval-Augmented Generation): Ingest documents and search knowledge base with semantic search
Tool-based architecture: Extensible tools for filesystem, RAG, and more
Safety by default: Deny-by-default permissions with explicit approval workflows
Multi-provider support: Works with Ollama (local), Anthropic Claude, and OpenAI
Sandboxed execution: Filesystem access restricted to configured workspaces
Full audit trail: Every tool call logged with redaction of sensitive data
Configurable policies: Define approval policies via YAML config
Persistent conversations: SQLite-based storage for threads, messages, and sessions
Thread management: Create, resume, list, and delete conversation threads
Session tracking: Full execution context tracking with automatic cleanup
Vector storage: Qdrant integration for document embeddings and semantic search

Identity & Memory

Identity

Agent identities define the system prompt and behavior of your agent. Local Agent ships with 3 built-in identities:

default: A general-purpose helpful assistant
nova (experimental): A minimal, adaptive identity that evolves through interaction by managing its own long-term memory (use it with caution!)
principle_engineer: A senior engineer focused on best practices, architecture, and code quality

Memory Model (Nova Identity)

Note: This feature is experimental and may lead to unpredictable behavior. Use with caution.

The nova identity features an experimental memory management system that allows the agent to read, write, and update its long-term memory.

The idea was to explore how an agent could evolve its own identity and knowledge over time through interactions.

Because it can modify its own memory, Nova is not suitable for sensitive tasks or production use cases.

Installation

Prerequisites

Python 3.12 or higher
Ollama installed (recommended for local models)
Qdrant via Docker (optional, for RAG features)
OR an API key for cloud providers (Anthropic or OpenAI)

Install from source

# Clone the repository
git clone https://github.com/danieljhkim/local-agent.git
cd local-agent

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in editable mode
pip install -e .

Quick Start

1. Initialize configuration

This creates a config file at ~/.config/local-agent/config.yaml.

2. Configure your LLM provider

Option A: Ollama (recommended - fully local)

Install and start Ollama:

# Install Ollama from https://ollama.ai
# Pull a model - llama3.1:70b if you can
ollama pull llama3.1:8b # Use larger models if hardware permits

# Start Ollama
ollama serve

Edit the config file:

providers:
  - name: ollama
    model: llama3.1:8b
    base_url: http://localhost:11434

default_provider: ollama

Option B: Cloud providers (Anthropic/OpenAI)

Edit the config file and add your API key:

providers:
  - name: anthropic
    model: claude-3-5-sonnet-20241022
    # api_key: sk-ant-...  # Or set ANTHROPIC_API_KEY env var

default_provider: anthropic

3. Configure workspace roots

Add directories the agent can access:

workspace:
  allowed_roots:
    - ~/repos
    - ~/documents
    - ~/projects

4. (Optional) Set up RAG with Qdrant

If you keep notes in one place, highly recommend using RAG so that your agent can be easily bootstrapped.

# Start Qdrant using Docker
docker-compose up -d

# Or manually:
docker run -p 6333:6333 qdrant/qdrant

# Ingest documents into the knowledge base
agent ingest ~/repos/my-project --pattern "**/*.py"

# Search the knowledge base
agent rag query "how does authentication work"

# Clear the collection
agent rag clear --force

5. Initialize the SQLite database (Threads)

This will enable you "threads", persisted conversations which you can resume later.

# Initialize SQLite database for persistent conversations
agent db init

6. Run the agent

# Interactive chat (ephemeral - not saved)
agent chat

# Create a new persistent thread
agent threads new --title "My project work"

# Resume an existing thread
agent threads resume <thread-id>

# List all threads
agent threads list

# Run a single task
agent run "search for TODO comments in my project"

# List available tools
agent tools

# View current config
agent config --show

# View database statistics
agent db info

7. Agent Identities

List available identities:

View the currently active identity:

Switch to a different identity:

View identity content:

Create a custom identity:

# Interactive creation
agent identity new my-custom-identity

# Import from a file
agent identity new coder --from-file ~/my-prompts/coding-assistant.txt

Delete a custom identity:

agent identity delete my-custom-identity

Note: Built-in identities (default and nova) cannot be deleted. Custom identities are stored in ~/.local/share/local-agent/identities/.

Identities persist across conversations, so you can switch between different agent personalities for different tasks.

Architecture

local-agent/
├── src/local_agent/
│   ├── cli.py              # CLI entry point with all commands
│   ├── runtime/            # Agent orchestration
│   │   └── agent.py        # AgentRuntime with agentic loop
│   ├── tools/              # Tool registry and implementations
│   │   ├── schema.py       # Tool schemas (Pydantic)
│   │   ├── registry.py     # Tool registry
│   │   ├── filesystem.py   # Filesystem tools (6 tools)
│   │   └── rag.py          # RAG tools (search)
│   ├── services/           # Business logic services
│   │   ├── embedding.py    # Embedding generation (Ollama)
│   │   └── ingestion.py    # Document ingestion pipeline
│   ├── policy/             # Permission and approval engine
│   │   ├── engine.py       # Policy evaluation
│   │   └── approval.py     # Approval workflow (CLI)
│   ├── config/             # Configuration
│   │   ├── schema.py       # Config schema
│   │   └── loader.py       # Config loading
│   ├── audit/              # Audit logging
│   │   └── logger.py       # Structured audit logs (JSONL)
│   ├── persistence/        # SQLite persistence
│   │   ├── db.py           # Database session management
│   │   ├── db_models.py    # SQLAlchemy models (Thread, Message, Session, MessageMeta)
│   │   └── database_init.py # Database initialization
│   ├── providers/          # LLM providers
│   │   ├── base.py         # Base provider interface
│   │   ├── factory.py      # Provider factory pattern
│   │   ├── anthropic.py    # Anthropic/Claude
│   │   ├── openai.py       # OpenAI
│   │   └── ollama.py       # Ollama (local models)
│   ├── connectors/         # External system connectors
│   │   ├── filesystem.py   # Sandboxed filesystem access
│   │   └── qdrant.py       # Qdrant vector database
│   └── web/                # FastAPI web service
│       ├── app.py          # FastAPI application
│       └── routes/         # API routes
└── config/
    └── example-config.yaml # Example configuration

Tool Risk Tiers

Tools are classified by risk level:

Tier 0 (Read-only): fs_read_file, fs_list_dir, fs_search, rag_search
- No approval required
- Safe operations with no side effects
Tier 1 (Drafting): fs_apply_patch
- No approval required
- Applies unified diff patches to files (safer than full rewrites)
- Automatically creates backup files
Tier 2 (Side-effectful): fs_write_file, fs_delete_file
- Requires explicit approval
- Modifies system state directly

Approval Policies

Define auto-approval rules in your config:

approval_policies:
  # Auto-approve all file reads
  - tool_pattern: "fs_read_file"
    auto_approve: true

  # Auto-approve writes only in sandbox
  - tool_pattern: "fs_write_file"
    auto_approve: true
    conditions:
      path:
        startswith: "/Users/me/sandbox"

  # Wildcard patterns work too
  - tool_pattern: "fs_*"
    auto_approve: true
    conditions:
      path:
        startswith: "/Users/me/safe-dir"

Audit Logs

Every tool call is logged to ~/.local/share/local-agent/logs/ as JSONL:

{
  "timestamp": "2024-01-10T10:30:00",
  "session_id": "abc123",
  "thread_id": "thread-456",
  "event_type": "tool_call",
  "tool_name": "fs_read_file",
  "risk_tier": "tier_0",
  "parameters": {"path": "/Users/me/repos/project/main.py"},
  "success": true,
  "result_metadata": {"lines": 150, "chars": 4523},
  "elapsed_ms": 12
}

Sensitive content is automatically redacted based on patterns in your config.

Persistent Conversations

Conversation threads are stored in SQLite at ~/.local/share/local-agent/state/local_agent.db:

Tables:

threads - Conversation threads with title and timestamps
messages - User and assistant messages
sessions - Execution sessions with turn/tool call counts
message_meta - LLM response metadata (latency, tokens, tool calls)

Thread Management:

# Create a new thread
agent threads new --title "Project debugging"

# List recent threads
agent threads list --limit 20

# Resume a thread (supports partial ID)
agent threads resume 581c2d72

# Delete a thread
agent threads delete 581c2d72 --force

Database Management:

# View database statistics
agent db info

# Initialize database
agent db init

# Reset database (DESTRUCTIVE)
agent db reset --force

RAG (Retrieval-Augmented Generation)

Ingest documents and search your knowledge base:

Ingestion:

# Ingest all Python files from a directory
agent ingest ~/repos/my-project --pattern "**/*.py"

# Ingest markdown documentation
agent ingest ~/docs --pattern "**/*.md"

# Ingest with custom chunk size
agent ingest ~/code --pattern "**/*.{py,js,ts}" --chunk-size 800

# Non-recursive ingestion
agent ingest ~/config --pattern "*.yaml" --no-recursive

Searching:

# Search the knowledge base
agent rag query "authentication implementation"

# Get more results
agent rag query "error handling" --limit 10

# View collection statistics
agent rag stats

# Clear the collection (DESTRUCTIVE)
agent rag clear --force

Using RAG in conversations:

The rag_search tool is automatically available to the agent during chat sessions:

agent chat

You: Search the codebase for how we handle user authentication
# Agent automatically uses rag_search tool to find relevant code

Configuration:

Configure RAG in your config file:

qdrant:
  url: http://localhost:6333
  collection_name: local_agent_docs
  vector_size: 768  # nomic-embed-text dimension

embedding:
  model: nomic-embed-text:latest
  base_url: http://localhost:11434  # Ollama
  batch_size: 32

rag:
  chunk_size: 512          # Tokens per chunk
  chunk_overlap: 128       # Overlap between chunks
  top_k: 5                 # Default results to return
  score_threshold: 0.0     # Minimum similarity score
  supported_extensions:    # File types to ingest
    - .txt
    - .md
    - .py
    - .js
    - .ts
    - .json
    - .yaml
    - .yml

Development

Install development dependencies

Run tests

Code formatting and linting

ruff check .
ruff format .
mypy src/

License

MIT License