SQLite RAG
A hybrid search engine built on SQLite with SQLite AI and SQLite Vector extensions. SQLite RAG combines vector similarity search with full-text search (FTS5 extension) using Reciprocal Rank Fusion (RRF) for enhanced document retrieval.
Features
- Hybrid Search: Combines vector embeddings with full-text search for optimal results
- SQLite-based: Built on SQLite with AI and Vector extensions for reliability and performance
- Multi-format Text Support: Processes many file formats, including PDF, DOCX, Markdown, and code files
- Recursive Character Text Splitter: Token-aware text chunking with configurable overlap
- Interactive CLI: Command-line interface with interactive REPL mode
- Flexible Configuration: Customizable embedding models, search weights, and chunking parameters
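The chunking feature listed above can be illustrated with a minimal sketch. This is not SQLite RAG's actual splitter: the function names, the separator list, and the default sizes are assumptions made for illustration, and it counts characters rather than tokens to stay dependency-free.

```python
from typing import List, Sequence

# Hypothetical separator order: paragraphs, lines, words, then a hard cut.
SEPARATORS = ("\n\n", "\n", " ", "")


def split_recursive(text: str, max_len: int,
                    separators: Sequence[str] = SEPARATORS) -> List[str]:
    """Split text into pieces of at most max_len characters,
    trying coarse separators first and finer ones only when needed."""
    if len(text) <= max_len:
        return [text] if text else []
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: cut at a fixed width.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    pieces: List[str] = []
    for part in text.split(sep):
        pieces.extend(split_recursive(part, max_len, rest))
    return pieces


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Merge small pieces into chunks of at most chunk_size characters,
    repeating the last `overlap` characters of each chunk in the next one."""
    if not 0 <= overlap < chunk_size - 1:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size - 1")
    # Leave room for the carried-over tail plus one joining space.
    pieces = split_recursive(text, chunk_size - overlap - 1)
    chunks: List[str] = []
    current = ""
    for piece in pieces:
        candidate = f"{current} {piece}".strip() if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            tail = current[-overlap:] if overlap else ""
            current = f"{tail} {piece}".strip()
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("word " * 100, chunk_size=50, overlap=10):
        print(repr(chunk))
```

A token-aware splitter would measure length with the embedding model's tokenizer instead of `len()`; the recursive structure stays the same.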
Installation
Prerequisites
SQLite RAG requires SQLite with extension loading support.
If you encounter extension loading issues (e.g., 'sqlite3.Connection' object has no attribute 'enable_load_extension'), follow the setup guides for macOS or Windows.
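To check whether your Python build is affected before installing, a short script along these lines may help. It uses only the standard `sqlite3` module; the helper name `supports_extension_loading` is made up for this example.

```python
import sqlite3


def supports_extension_loading() -> bool:
    """Return True if this Python build's sqlite3 module can load extensions."""
    # The method is absent when Python was compiled without loadable-extension
    # support, which is what triggers the AttributeError quoted above.
    if not hasattr(sqlite3.Connection, "enable_load_extension"):
        return False
    conn = sqlite3.connect(":memory:")
    try:
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (AttributeError, sqlite3.Error):
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    print("Extension loading supported:", supports_extension_loading())
```

If the script reports that extension loading is not supported, the platform setup guides mentioned above are the place to start.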
Install SQLite RAG
    python3 -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install sqlite-rag
Quick Start
Download the Embedding Gemma model, the default model, from Hugging Face:
    sqlite-rag download-model unsloth/embeddinggemma-300m-GGUF embeddinggemma-300M-Q8_0.gguf
SQLite RAG comes preconfigured to work with the Embedding Gemma model. When you add a document or text, it automatically creates a new database (if one does not already exist) and uses default settings, so you can get started immediately without manual setup.
    # Initialize sqliterag.sqlite database and add documents
    sqlite-rag add-text "Artificial intelligence (AI) enables machines to learn from data"
    sqlite-rag add /path/to/documents --recursive

    # Search your documents
    sqlite-rag search "explain AI"

    # Interactive mode
    sqlite-rag
    > help
    > search "interactive search"
    > exit
For help run:
CLI Commands
Configuration
Settings are stored in the database and should be set before adding any documents.
    # View available configuration options
    sqlite-rag configure --help

    sqlite-rag configure --model-path ./mymodels/path

    # View current settings
    sqlite-rag settings
To use a different database filename, use the global --database option:
    # Single command with custom database
    sqlite-rag --database path/to/mydb.db add-text "Let's talk about AI."

    # Interactive mode with custom database
    sqlite-rag --database path/to/mydb.db
Model Management
You can experiment with other models from Hugging Face by downloading them with:
    # Download GGUF models from Hugging Face
    sqlite-rag download-model <model-repo> <filename>
Supported File Formats
SQLite RAG supports the following file formats:
- Text: .txt, .md, .mdx, .csv, .json, .xml, .yaml, .yml
- Documents: .pdf, .docx, .pptx, .xlsx
- Code: .c, .cpp, .css, .go, .h, .hpp, .html, .java, .js, .mjs, .kt, .php, .py, .rb, .rs, .swift, .ts, .tsx
- Web Frameworks: .svelte, .vue
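To preview which files in a directory fall under the formats above before adding them, a small helper like the following could be used. It is an illustrative sketch, not part of SQLite RAG: the names `SUPPORTED_EXTENSIONS`, `is_supported`, and `iter_supported_files` are hypothetical, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path
from typing import Iterator, Union

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = frozenset({
    ".txt", ".md", ".mdx", ".csv", ".json", ".xml", ".yaml", ".yml",
    ".pdf", ".docx", ".pptx", ".xlsx",
    ".c", ".cpp", ".css", ".go", ".h", ".hpp", ".html", ".java", ".js",
    ".mjs", ".kt", ".php", ".py", ".rb", ".rs", ".swift", ".ts", ".tsx",
    ".svelte", ".vue",
})


def is_supported(path: Union[str, Path]) -> bool:
    """Return True if the file's extension appears in the supported list."""
    # Assumption: extensions are compared case-insensitively.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def iter_supported_files(root: Union[str, Path]) -> Iterator[Path]:
    """Walk root recursively and yield files with a supported extension."""
    for candidate in sorted(Path(root).rglob("*")):
        if candidate.is_file() and is_supported(candidate):
            yield candidate


if __name__ == "__main__":
    for path in iter_supported_files("."):
        print(path)
```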
Development
Installation
For development, clone the repository and install with development dependencies:
    # Clone the repository
    git clone https://github.com/sqliteai/sqlite-rag.git
    cd sqlite-rag

    # Create virtual environment
    python3 -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate

    # Install in development mode
    pip install -e '.[dev]'
How It Works
- Document Processing: Files are processed and split into overlapping chunks
- Embedding Generation: Text chunks are converted to vector embeddings using AI models
- Dual Indexing: Content is indexed for both vector similarity and full-text search
- Hybrid Search: Queries are processed through both search methods
- Result Fusion: Results are combined using Reciprocal Rank Fusion for optimal relevance
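The fusion step can be sketched in a few lines of Python. This is a generic illustration of Reciprocal Rank Fusion, not SQLite RAG's internal code: the function name, the per-list weights, and the constant `k = 60` (a commonly used default for RRF) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    weights: Optional[Sequence[float]] = None,
    k: int = 60,
) -> List[Tuple[str, float]]:
    """Fuse ranked lists of document IDs into one list sorted by RRF score.

    Each document earns weight / (k + rank) from every list it appears in,
    where rank starts at 1 for the best hit in that list.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores: Dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest score first; ties are broken by document ID for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    vector_hits = ["doc_b", "doc_a", "doc_c"]  # hypothetical vector search results
    fts_hits = ["doc_a", "doc_d", "doc_b"]     # hypothetical full-text results
    for doc_id, score in reciprocal_rank_fusion([vector_hits, fts_hits]):
        print(f"{doc_id}: {score:.5f}")
```

Because RRF uses only ranks, it never has to compare a vector distance with a full-text relevance score directly. A document that ranks well in both lists, such as `doc_a` here, ends up ahead of one that appears in only a single list.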
