Local Agent is a local-first AI agent playground, vibe-coded over a weekend as an exploration project.
The goal of this project is learning and experimentation: understanding agent runtimes, tool safety models, RAG pipelines, memory management, and lightweight local persistence.
NOTE: Expect rough edges. This project exists to learn, prototype, and throw ideas away - not to serve as a production-ready framework.
Features
- Local-first: Runs entirely on your machine with your choice of LLM provider
- Ollama integration: Built-in support for local LLMs (llama3.1, mistral, etc.)
- RAG (Retrieval-Augmented Generation): Ingest documents and search knowledge base with semantic search
- Tool-based architecture: Extensible tools for filesystem, RAG, and more
- Safety by default: Deny-by-default permissions with explicit approval workflows
- Multi-provider support: Works with Ollama (local), Anthropic Claude, and OpenAI
- Sandboxed execution: Filesystem access restricted to configured workspaces
- Full audit trail: Every tool call logged with redaction of sensitive data
- Configurable policies: Define approval policies via YAML config
- Persistent conversations: SQLite-based storage for threads, messages, and sessions
- Thread management: Create, resume, list, and delete conversation threads
- Session tracking: Full execution context tracking with automatic cleanup
- Vector storage: Qdrant integration for document embeddings and semantic search
Identity & Memory
Identity
Agent identities define the system prompt and behavior of your agent. Local Agent ships with 3 built-in identities:
- default: A general-purpose helpful assistant
- nova (experimental): A minimal, adaptive identity that evolves through interaction by managing its own long-term memory (use it with caution!)
- principle_engineer: A senior engineer focused on best practices, architecture, and code quality
Memory Model (Nova Identity)
Note: This feature is experimental and may lead to unpredictable behavior. Use with caution.
The nova identity features an experimental memory management system that allows the agent to read, write, and update its long-term memory.
The idea was to explore how an agent could evolve its own identity and knowledge over time through interactions.
Because it can modify its own memory, Nova is not suitable for sensitive tasks or production use cases.
Installation
Prerequisites
- Python 3.12 or higher
- Ollama installed (recommended for local models)
- Qdrant via Docker (optional, for RAG features)
- OR an API key for cloud providers (Anthropic or OpenAI)
Install from source
# Clone the repository git clone https://github.com/danieljhkim/local-agent.git cd local-agent # Create a virtual environment python -m venv .venv source .venv/bin/activate # On Windows: .venv\Scripts\activate # Install in editable mode pip install -e .
Quick Start
1. Initialize configuration
This creates a config file at ~/.config/local-agent/config.yaml.
2. Configure your LLM provider
Option A: Ollama (recommended - fully local)
Install and start Ollama:
# Install Ollama from https://ollama.ai # Pull a model - llama3.1:70b if you can ollama pull llama3.1:8b # Use larger models if hardware permits # Start Ollama ollama serve
Edit the config file:
providers: - name: ollama model: llama3.1:8b base_url: http://localhost:11434 default_provider: ollama
Option B: Cloud providers (Anthropic/OpenAI)
Edit the config file and add your API key:
providers: - name: anthropic model: claude-3-5-sonnet-20241022 # api_key: sk-ant-... # Or set ANTHROPIC_API_KEY env var default_provider: anthropic
3. Configure workspace roots
Add directories the agent can access:
workspace: allowed_roots: - ~/repos - ~/documents - ~/projects
4. (Optional) Set up RAG with Qdrant
If you keep notes in one place, highly recommend using RAG so that your agent can be easily bootstrapped.
# Start Qdrant using Docker docker-compose up -d # Or manually: docker run -p 6333:6333 qdrant/qdrant # Ingest documents into the knowledge base agent ingest ~/repos/my-project --pattern "**/*.py" # Search the knowledge base agent rag query "how does authentication work" # Clear the collection agent rag clear --force
5. Initialize the SQLite database (Threads)
This will enable you "threads", persisted conversations which you can resume later.
# Initialize SQLite database for persistent conversations
agent db init6. Run the agent
# Interactive chat (ephemeral - not saved) agent chat # Create a new persistent thread agent threads new --title "My project work" # Resume an existing thread agent threads resume <thread-id> # List all threads agent threads list # Run a single task agent run "search for TODO comments in my project" # List available tools agent tools # View current config agent config --show # View database statistics agent db info
7. Agent Identities
List available identities:
View the currently active identity:
Switch to a different identity:
View identity content:
Create a custom identity:
# Interactive creation agent identity new my-custom-identity # Import from a file agent identity new coder --from-file ~/my-prompts/coding-assistant.txt
Delete a custom identity:
agent identity delete my-custom-identity
Note: Built-in identities (
defaultandnova) cannot be deleted. Custom identities are stored in~/.local/share/local-agent/identities/.
Identities persist across conversations, so you can switch between different agent personalities for different tasks.
Architecture
local-agent/
├── src/local_agent/
│ ├── cli.py # CLI entry point with all commands
│ ├── runtime/ # Agent orchestration
│ │ └── agent.py # AgentRuntime with agentic loop
│ ├── tools/ # Tool registry and implementations
│ │ ├── schema.py # Tool schemas (Pydantic)
│ │ ├── registry.py # Tool registry
│ │ ├── filesystem.py # Filesystem tools (6 tools)
│ │ └── rag.py # RAG tools (search)
│ ├── services/ # Business logic services
│ │ ├── embedding.py # Embedding generation (Ollama)
│ │ └── ingestion.py # Document ingestion pipeline
│ ├── policy/ # Permission and approval engine
│ │ ├── engine.py # Policy evaluation
│ │ └── approval.py # Approval workflow (CLI)
│ ├── config/ # Configuration
│ │ ├── schema.py # Config schema
│ │ └── loader.py # Config loading
│ ├── audit/ # Audit logging
│ │ └── logger.py # Structured audit logs (JSONL)
│ ├── persistence/ # SQLite persistence
│ │ ├── db.py # Database session management
│ │ ├── db_models.py # SQLAlchemy models (Thread, Message, Session, MessageMeta)
│ │ └── database_init.py # Database initialization
│ ├── providers/ # LLM providers
│ │ ├── base.py # Base provider interface
│ │ ├── factory.py # Provider factory pattern
│ │ ├── anthropic.py # Anthropic/Claude
│ │ ├── openai.py # OpenAI
│ │ └── ollama.py # Ollama (local models)
│ ├── connectors/ # External system connectors
│ │ ├── filesystem.py # Sandboxed filesystem access
│ │ └── qdrant.py # Qdrant vector database
│ └── web/ # FastAPI web service
│ ├── app.py # FastAPI application
│ └── routes/ # API routes
└── config/
└── example-config.yaml # Example configuration
Tool Risk Tiers
Tools are classified by risk level:
-
Tier 0 (Read-only):
fs_read_file,fs_list_dir,fs_search,rag_search- No approval required
- Safe operations with no side effects
-
Tier 1 (Drafting):
fs_apply_patch- No approval required
- Applies unified diff patches to files (safer than full rewrites)
- Automatically creates backup files
-
Tier 2 (Side-effectful):
fs_write_file,fs_delete_file- Requires explicit approval
- Modifies system state directly
Approval Policies
Define auto-approval rules in your config:
approval_policies: # Auto-approve all file reads - tool_pattern: "fs_read_file" auto_approve: true # Auto-approve writes only in sandbox - tool_pattern: "fs_write_file" auto_approve: true conditions: path: startswith: "/Users/me/sandbox" # Wildcard patterns work too - tool_pattern: "fs_*" auto_approve: true conditions: path: startswith: "/Users/me/safe-dir"
Audit Logs
Every tool call is logged to ~/.local/share/local-agent/logs/ as JSONL:
{
"timestamp": "2024-01-10T10:30:00",
"session_id": "abc123",
"thread_id": "thread-456",
"event_type": "tool_call",
"tool_name": "fs_read_file",
"risk_tier": "tier_0",
"parameters": {"path": "/Users/me/repos/project/main.py"},
"success": true,
"result_metadata": {"lines": 150, "chars": 4523},
"elapsed_ms": 12
}Sensitive content is automatically redacted based on patterns in your config.
Persistent Conversations
Conversation threads are stored in SQLite at ~/.local/share/local-agent/state/local_agent.db:
Tables:
threads- Conversation threads with title and timestampsmessages- User and assistant messagessessions- Execution sessions with turn/tool call countsmessage_meta- LLM response metadata (latency, tokens, tool calls)
Thread Management:
# Create a new thread agent threads new --title "Project debugging" # List recent threads agent threads list --limit 20 # Resume a thread (supports partial ID) agent threads resume 581c2d72 # Delete a thread agent threads delete 581c2d72 --force
Database Management:
# View database statistics agent db info # Initialize database agent db init # Reset database (DESTRUCTIVE) agent db reset --force
RAG (Retrieval-Augmented Generation)
Ingest documents and search your knowledge base:
Ingestion:
# Ingest all Python files from a directory agent ingest ~/repos/my-project --pattern "**/*.py" # Ingest markdown documentation agent ingest ~/docs --pattern "**/*.md" # Ingest with custom chunk size agent ingest ~/code --pattern "**/*.{py,js,ts}" --chunk-size 800 # Non-recursive ingestion agent ingest ~/config --pattern "*.yaml" --no-recursive
Searching:
# Search the knowledge base agent rag query "authentication implementation" # Get more results agent rag query "error handling" --limit 10 # View collection statistics agent rag stats # Clear the collection (DESTRUCTIVE) agent rag clear --force
Using RAG in conversations:
The rag_search tool is automatically available to the agent during chat sessions:
agent chat You: Search the codebase for how we handle user authentication # Agent automatically uses rag_search tool to find relevant code
Configuration:
Configure RAG in your config file:
qdrant: url: http://localhost:6333 collection_name: local_agent_docs vector_size: 768 # nomic-embed-text dimension embedding: model: nomic-embed-text:latest base_url: http://localhost:11434 # Ollama batch_size: 32 rag: chunk_size: 512 # Tokens per chunk chunk_overlap: 128 # Overlap between chunks top_k: 5 # Default results to return score_threshold: 0.0 # Minimum similarity score supported_extensions: # File types to ingest - .txt - .md - .py - .js - .ts - .json - .yaml - .yml
Development
Install development dependencies
Run tests
Code formatting and linting
ruff check . ruff format . mypy src/
License
MIT License





