DAUT - Documentation Auto-Update Tool
AI-powered documentation generator that keeps your docs in sync with your code
DAUT scans your codebase, detects undocumented code, and automatically generates documentation using an LLM served through Ollama. It is designed for keeping API docs, class references, and function documentation up to date across Python, JavaScript, and TypeScript projects.
Features
- Universal Code Scanner - Detects functions, classes, API endpoints across Python, JS, TS
- AI Documentation Generation - Uses Ollama to generate human-readable docs
- Live Progress Tracking - Real-time progress bars and statistics
- Smart File Detection - Respects .gitignore, skips venv/node_modules automatically
- ChromaDB Integration - Semantic search and context-aware documentation
- Resume Support - Skip already-generated docs, continue where you left off
- MCP Server - Expose RAG capabilities to external agents (Claude, Cursor, etc.)
- Beautiful UI - Streamlit-based interface + powerful CLI
RAG Strategy (Under the Hood)
DAUT uses a structural indexing approach to ensure high-quality answers:
- Unified Knowledge Base: All files, regardless of their folder depth, are indexed into a single project-wide collection (e.g., `rag_enterprise_core_code`). This prevents context fragmentation and ensures the AI sees the "Big Picture".
- Full-Content Embedding: Unlike simple splitters that chop text into arbitrary chunks, DAUT indexes the full content of your documentation files. This preserves the complete context of tutorials and guides.
- Structure-Aware Code Indexing: Code is not just text. DAUT parses the AST (Abstract Syntax Tree) to treat Classes, Functions, and API Endpoints as distinct semantic entities.
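
To make the structure-aware idea concrete, here is a minimal sketch that uses Python's standard `ast` module to turn a source string into one record per class or function. The `extract_entities` helper and the record fields are hypothetical and chosen for illustration; DAUT's own scanner may collect different information.

```python
import ast

# Sample input; any Python source string works here.
SOURCE = '''
class SessionStore:
    def get(self, session_id):
        return None

def get_session(session_id: str):
    """Retrieve session history."""
    return None
'''

def extract_entities(source: str) -> list[dict]:
    """Return one record per class or function found in the source (hypothetical helper)."""
    entities = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue  # skip everything that is not a class or function definition
        entities.append({
            "kind": kind,
            "name": node.name,
            "line": node.lineno,
            "has_docstring": ast.get_docstring(node) is not None,
        })
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(entities, key=lambda e: e["line"])

entities = extract_entities(SOURCE)
for entity in entities:
    print(entity)
```

Each record can then be embedded and stored as its own unit, instead of as an arbitrary slice of the file.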
Quick Start
Installation
# Clone and setup
git clone <your-repo>
cd doc_updater_app
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
Generate Docs in 3 Steps
# Launch UI
streamlit run src/ui/main.py
- Select Project: Browse to your codebase
- Scan: Analyze code and find undocumented elements
- Generate: AI creates comprehensive docs
CLI Mode:
python -m src.docs_updater /path/to/project
Screenshots
- Scan Progress
- Analysis Dashboard
- AI Documentation Generation
- Generated Documentation Files
- Documentation Files Overview
Use Cases
- API Documentation - Auto-generate REST API endpoint docs
- Code Onboarding - Help new developers understand your codebase
- Documentation Audits - Find and fix documentation gaps
- Legacy Code - Document undocumented legacy systems
- Continuous Docs - Keep docs in sync with code changes
Best Practices
Ensure High-Quality RAG Results
To avoid "diluting" the AI's knowledge base with outdated information:
- Prioritize `auto_docs`: These files are generated directly from the current codebase and represent the "source of truth".
- Exclude Legacy Docs: If you have an old `docs/` folder with manual (potentially outdated) documentation, consider adding `docs/` to the Exclude Patterns in the Filter Management sidebar.
- Why? If the RAG system indexes both current code (via `auto_docs`) and outdated manuals (via `docs/`), it might retrieve conflicting information. By filtering out legacy docs, you ensure a "Pure Code-Truth" knowledge base.
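
The effect of an exclude pattern can be sketched as follows. This is an illustration only: the `is_excluded` helper is hypothetical, and it assumes glob-style patterns matched against each leading part of a relative path. The pattern syntax that DAUT's Filter Management sidebar actually accepts may differ.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if the path or any of its parent folders matches a pattern (hypothetical helper)."""
    parts = PurePosixPath(path).parts
    # Every leading prefix of the path: "docs", "docs/guide", "docs/guide/old.md"
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(
        fnmatch(prefix, pattern.rstrip("/"))
        for prefix in prefixes
        for pattern in patterns
    )

files = [
    "auto_docs/get_session.api.md",
    "docs/guide/old.md",
    "src/core/config.py",
]
kept = [f for f in files if not is_excluded(f, ["docs/"])]
print(kept)
```

With `docs/` excluded, only the generated `auto_docs` file and the source file remain candidates for indexing.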
Example Output
Input: Python function
def get_session(session_id: str):
    """Retrieve session history."""
    return db.query(session_id)
Generated Documentation:
## get_session
### Description
The `get_session` API endpoint retrieves the conversation history for a specific session. Requires permission to view session history.
### Parameters
| Name | Type | Default |
|------|------|---------|
| session_id | str | None |
### Return Value
Returns the session history including session ID and message list.
### Example
```bash
GET /sessions/12345
```
### Error Handling
Returns 500 on errors, 403 if permission denied.
Project structure:
doc_updater_app/
├── src/
│   ├── core/       # Config management, project analysis
│   ├── scanner/    # Code & documentation scanners
│   ├── matcher/    # Discrepancy detection
│   ├── llm/        # Ollama integration
│   ├── chroma/     # ChromaDB vector store
│   ├── updater/    # Documentation update engine
│   └── ui/         # Streamlit interface
├── requirements.txt
└── setup.py
## Configuration
**service_config.json:**
```json
{
  "ollama_host": "http://localhost:11434",
  "chroma_host": "localhost",
  "chroma_port": 8000,
  "ollama_timeout": 120
}
```
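
As a rough sketch of how such a file could be consumed, the snippet below loads `service_config.json` and falls back to the values shown above for any missing key. The `load_service_config` function is hypothetical and not part of DAUT's documented API; the real loader lives somewhere under `src/core/` and may behave differently.

```python
import json
from pathlib import Path

# Fallback values, copied from the sample service_config.json.
DEFAULTS = {
    "ollama_host": "http://localhost:11434",
    "chroma_host": "localhost",
    "chroma_port": 8000,
    "ollama_timeout": 120,
}

def load_service_config(path: str = "service_config.json") -> dict:
    """Merge the JSON file at `path` over DEFAULTS (hypothetical helper)."""
    config = dict(DEFAULTS)
    config_file = Path(path)
    if config_file.exists():
        config.update(json.loads(config_file.read_text(encoding="utf-8")))
    return config

config = load_service_config()
print(config["ollama_host"], config["chroma_port"])
```

Merging over defaults means a partial file, for example one that only overrides `ollama_host`, still yields a complete configuration.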
Requirements
- Python 3.9+
- Ollama (optional, for AI generation)
  # Install: https://ollama.ai
  ollama pull llama3
- ChromaDB (optional, for semantic search)
  pip install chromadb
  chroma run --path ./chromadb_data --port 8000
Supported Languages & Formats
Code:
- Python (`.py`)
- JavaScript/TypeScript (`.js`, `.ts`, `.tsx`, `.jsx`)
Documentation:
- Markdown (`.md`)
- reStructuredText (`.rst`)
- Plain text (`.txt`)
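Features in Detail
Smart Progress Tracking
Scanning: [45/1234] 3.6% - api_service.py
[1/150] Verarbeite: get_session (api_endpoint)
Gespeichert: get_session.api.md
[2/150] Verarbeite: delete_session (api_endpoint)
Übersprungen (existiert): delete_session.api.md
The status messages are printed in German: "Verarbeite" means "processing", "Gespeichert" means "saved", and "Übersprungen (existiert)" means "skipped (already exists)".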
Resume Support
Stop and restart anytime - already generated docs are automatically skipped!
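
The resume behavior can be pictured as a simple existence check before each generation step. The sketch below assumes one output file per element, as in the log sample; the `split_pending` helper is hypothetical and DAUT's actual skip logic may use additional criteria.

```python
import tempfile
from pathlib import Path

def split_pending(doc_names, output_dir):
    """Partition doc file names into (to_generate, skipped) by what already exists on disk."""
    to_generate, skipped = [], []
    for name in doc_names:
        if (Path(output_dir) / name).exists():
            skipped.append(name)       # already generated in an earlier run
        else:
            to_generate.append(name)   # still needs an LLM call
    return to_generate, skipped

with tempfile.TemporaryDirectory() as out:
    # Simulate an earlier run that already produced one file.
    (Path(out) / "delete_session.api.md").write_text("# delete_session\n", encoding="utf-8")
    todo, skipped = split_pending(["get_session.api.md", "delete_session.api.md"], out)
    print("generate:", todo)
    print("skip:", skipped)
```

Because the check only looks at the output directory, an interrupted run needs no extra state file to resume.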
Discrepancy Analysis
- Undocumented Code - Functions/classes without docs
- Outdated Documentation - Docs that don't match current code
- Mismatched Elements - Signature changes, parameter updates
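
One way to detect a signature mismatch is to compare the parameter names in the code with the names listed in the existing docs. The following is a minimal sketch under that assumption; `signature_params` and `compare_params` are hypothetical helpers, not DAUT's matcher API.

```python
import ast

def signature_params(source, func_name):
    """Return the positional/keyword parameter names of `func_name`, or None if not found."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == func_name:
            # node.args.args covers regular parameters only (not *args, **kwargs, keyword-only).
            return [arg.arg for arg in node.args.args]
    return None

def compare_params(code_params, documented_params):
    """Report parameters that exist on only one side."""
    return {
        "missing_in_docs": [p for p in code_params if p not in documented_params],
        "stale_in_docs": [p for p in documented_params if p not in code_params],
    }

# The code gained a `limit` parameter; the docs still mention a removed `user_id`.
code = "def get_session(session_id: str, limit: int = 50):\n    return None\n"
report = compare_params(signature_params(code, "get_session"), ["session_id", "user_id"])
print(report)
```

A non-empty report marks the element as mismatched, so its documentation can be queued for regeneration.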
MCP Server Integration
DAUT includes a Model Context Protocol (MCP) server, allowing you to connect external AI agents (like Claude Desktop, Cursor, or other LLMs) directly to your project's knowledge base.
Features
- Secure Access: Protected via API Key (Bearer Token).
- RAG Tools:
  - `query_rag(query)`: Semantic search in your code and documentation.
  - `read_documentation_file(path)`: Read full content of generated docs.
  - `list_documentation_files()`: List available documentation.
- Monitoring: Live connection tracking via the Web UI.
Usage
Manual Start:
# Start the server (Default port: 8001)
./start_mcp.sh
Auto-Start (Systemd): Run as a background service that survives reboots:
chmod +x install_service.sh
sudo ./install_service.sh
Security & Configuration
The server requires an API Key for all requests. You MUST configure this to secure your data.
Setting the API Key:
- Edit `start_mcp.sh` (for manual start) or `daut-mcp.service` (for systemd).
- Change the `MCP_API_KEY` variable:
  export MCP_API_KEY="your-secure-password-here"
- Restart the server.
Environment Variables:
| Variable | Default | Description |
|---|---|---|
| `MCP_PORT` | `8001` | Port for the MCP SSE endpoint |
| `MCP_HOST` | `0.0.0.0` | Bind address |
| `MCP_API_KEY` | `secret-token-123` | REQUIRED: Auth token for clients |
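
Since the default key is a publicly known placeholder and the default bind address listens on all interfaces, it can help to fail fast when the key was never changed. The sketch below is an assumption about how such a guard might look; `load_mcp_settings` is hypothetical and the shipped server may read these variables differently.

```python
import os

DEFAULT_KEY = "secret-token-123"  # placeholder default listed in the table

def load_mcp_settings(env=os.environ):
    """Read the MCP_* variables with their documented defaults (hypothetical helper)."""
    settings = {
        "host": env.get("MCP_HOST", "0.0.0.0"),
        "port": int(env.get("MCP_PORT", "8001")),
        "api_key": env.get("MCP_API_KEY", DEFAULT_KEY),
    }
    # Refuse to continue with the placeholder key.
    if settings["api_key"] == DEFAULT_KEY:
        raise ValueError("Set MCP_API_KEY to a private value before exposing the server")
    return settings

# A plain dict stands in for the process environment here.
settings = load_mcp_settings({"MCP_API_KEY": "your-secure-password-here", "MCP_PORT": "9001"})
print(settings)
```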
Connect a Client:
- URL: `http://<your-server-ip>:8001/mcp/sse`
- Auth: Header `Authorization: Bearer <your-key>`
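
For clients that are configured in code, the URL and header above translate into a request like the one sketched here with the standard library. Nothing is sent over the network; the snippet only builds the request object. The `build_sse_request` helper and the `Accept: text/event-stream` header are assumptions for illustration, and a real MCP client library would handle the protocol handshake on top of this.

```python
from urllib.request import Request

def build_sse_request(server: str, api_key: str, port: int = 8001) -> Request:
    """Build (but do not send) a request for the MCP SSE endpoint (hypothetical helper)."""
    url = f"http://{server}:{port}/mcp/sse"
    return Request(url, headers={
        "Authorization": f"Bearer {api_key}",
        "Accept": "text/event-stream",  # assumption: conventional header for SSE streams
    })

req = build_sse_request("127.0.0.1", "your-secure-password-here")
print(req.full_url)
print(req.get_header("Authorization"))
```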
Contributing
Contributions welcome! This project is under active development.
License
MIT License - see LICENSE file for details
Project Status
Current Version: 1.0.0 (Stable)
All core features implemented:
- Universal code scanning
- AI documentation generation
- Progress tracking and resume support
- ChromaDB integration
- Streamlit UI + CLI
Support
Found a bug or have a feature request? Open an issue!
Made with ❤️ for developers who love good documentation





