Semantic search for Obsidian vaults. Index your vault into vector embeddings, then search by meaning rather than keywords.
Using this with an AI agent (Claude Code, Cursor, etc.)? See SKILL.md for agent-facing guidance — score interpretation, workflows, and known gotchas.
Install
# As a standalone CLI (recommended)
uv tool install obsidian-semantic
# Or with pipx (also installs into an isolated environment)
pipx install obsidian-semantic
# With Gemini embedder support
uv tool install "obsidian-semantic[gemini]"
Then configure:
obsidian-semantic configure
Configuration is stored in ~/.config/obsidian-semantic/config.yaml. Supports Ollama (local), LM Studio (local), and Gemini embedders.
From source
git clone https://github.com/ravila4/obsidian-semantic-search
cd obsidian-semantic-search
uv sync
uv run obsidian-semantic configure
Usage
Index your vault
obsidian-semantic index          # incremental (new/modified files only)
obsidian-semantic index --full   # reindex everything
Search
obsidian-semantic search "dependency injection patterns"
obsidian-semantic search "python testing" --limit 5
obsidian-semantic search "docker" --folder "Programming/"
obsidian-semantic search "habits" --tag "review"
obsidian-semantic search "fisher" --score-min 0.6   # drop low-relevance hits
obsidian-semantic search "fisher" --per-file 0      # show every matching chunk
By default, results are deduped to one chunk per file. Pass --per-file N to allow up to N chunks per file (or 0 for unlimited).
--score-min thresholds need to account for dedup. The chunk that survives for the second-best file often scores ~0.05–0.10 lower than the duplicate chunks it displaced, so a threshold tuned against raw chunk scores can drop relevant notes. Calibrate against the post-dedup output instead. On ollama+nomic, rough absolute bands are: ≥0.65 strong title-level match, ≥0.5 topical, <0.4 likely noise. Other embedders (qwen3, gemini) sit on different scales.
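The interaction between dedup and --score-min can be illustrated with a small Python sketch. It is not the tool's implementation: the hit list, the scores, and the helper names `dedupe` and `apply_score_min` are all hypothetical, chosen only to show why a threshold should be tuned on deduped output.

```python
from collections import defaultdict

# Hypothetical (path, score) hits, already sorted by descending score.
hits = [
    ("Stats/Fisher Exact.md", 0.71),
    ("Stats/Fisher Exact.md", 0.68),
    ("Stats/Fisher Exact.md", 0.66),
    ("Stats/Contingency Tables.md", 0.58),
    ("Daily/2026-02-05.md", 0.37),
]


def dedupe(hits, per_file=1):
    """Keep at most per_file chunks per file; per_file=0 means unlimited."""
    kept, seen = [], defaultdict(int)
    for path, score in hits:
        if per_file == 0 or seen[path] < per_file:
            kept.append((path, score))
            seen[path] += 1
    return kept


def apply_score_min(hits, score_min):
    """Drop hits scoring below the threshold."""
    return [(path, score) for path, score in hits if score >= score_min]


deduped = dedupe(hits, per_file=1)

# A threshold under which the raw top three all pass (0.65) leaves one file...
strict = apply_score_min(deduped, 0.65)
# ...while one calibrated on the deduped list (0.5) keeps the second note.
calibrated = apply_score_min(deduped, 0.5)

print([path for path, _ in strict])
print([path for path, _ in calibrated])
```

In this made-up data the raw top three hits are all chunks of one note, so a threshold that looks safe against raw scores silently removes the second-best note once duplicates are gone.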
Find related notes
Find notes similar to a given note, useful for discovering connections, linking, or deduplication.
obsidian-semantic related "Programming/Python/Unit Testing.md"
obsidian-semantic related "Daily/2026-02-05.md" --limit 5
If the note isn't in the index, it's chunked and embedded on the fly.
Show a note
Print the full contents of a note straight to stdout. Accepts a vault-relative path or a bare filename (with or without .md); if the basename is unique, it's resolved automatically. Reads from disk, so it works on un-indexed files too (unlike search).
obsidian-semantic show "Fisher's Exact in Empiroar.md"
obsidian-semantic show "Programming/Python/Unit Testing.md"
obsidian-semantic show "Unit Testing.md#Setup#Installation"   # specific section
Append #Heading (or #Parent#Child for nested sections) to print just that section. Heading paths are matched against the breadcrumb suffix and are case-insensitive; ambiguous headings are listed with line numbers.
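As a rough illustration of breadcrumb-suffix matching, here is a minimal Python sketch. It assumes each section is represented as a (line number, breadcrumb) pair; the function names `parse_ref` and `match_headings` and the sample sections are hypothetical and do not come from the tool's code.

```python
def parse_ref(ref):
    """Split 'Note.md#Parent#Child' into ('Note.md', ['Parent', 'Child'])."""
    note, *headings = ref.split("#")
    return note, headings


def match_headings(sections, wanted):
    """Return sections whose breadcrumb ends with `wanted`, ignoring case.

    `sections` is a list of (line_number, breadcrumb) pairs, where breadcrumb
    is the list of headings from the top level down to that section.
    """
    target = [h.strip().lower() for h in wanted]
    matches = []
    for line_no, crumb in sections:
        tail = [h.lower() for h in crumb[-len(target):]]
        if target and tail == target:
            matches.append((line_no, crumb))
    return matches


# Hypothetical outline of a note.
sections = [
    (3, ["Setup"]),
    (5, ["Setup", "Installation"]),
    (20, ["Troubleshooting", "Installation"]),
]

note, wanted = parse_ref("Unit Testing.md#Setup#Installation")
unique = match_headings(sections, wanted)
ambiguous = match_headings(sections, ["installation"])

print(note, [line for line, _ in unique])
print([line for line, _ in ambiguous])
```

With this outline, `#Setup#Installation` resolves to a single section, while the bare `#installation` matches two, which is the case where listing candidates with line numbers helps the user disambiguate.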
Suggest missing links
Find semantically similar notes that aren't linked to each other -- surfaces missing wikilinks and potential duplicates.
obsidian-semantic suggest-links
obsidian-semantic suggest-links --threshold 0.85 --limit 10
obsidian-semantic suggest-links --exclude-same-folder "Daily Log"
Folders to exclude can also be set in config so you don't have to type them every time:
suggest_links:
  exclude_same_folder:
    - "Daily Log"
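Conceptually, link suggestion is a pairwise similarity scan with two filters. The Python sketch below shows one way that could work; it is an assumption-laden illustration, not the tool's code. In particular, it assumes one vector per note, it assumes --exclude-same-folder skips pairs where both notes sit in the same listed folder, and the `suggest_links` function, the vectors, and the link map are hypothetical.

```python
import math
from itertools import combinations
from pathlib import PurePosixPath


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def suggest_links(vectors, links, threshold=0.85, limit=10, exclude_same_folder=()):
    """Return (score, a, b) for similar note pairs that are not linked yet."""
    out = []
    for a, b in combinations(sorted(vectors), 2):
        if b in links.get(a, set()) or a in links.get(b, set()):
            continue  # already linked in at least one direction
        folder_a, folder_b = PurePosixPath(a).parent, PurePosixPath(b).parent
        if folder_a == folder_b and str(folder_a) in exclude_same_folder:
            continue  # assumed meaning of --exclude-same-folder
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            out.append((score, a, b))
    return sorted(out, reverse=True)[:limit]


# Hypothetical 2-d "embeddings" and existing wikilinks.
vectors = {
    "Daily Log/2026-02-04.md": [1.0, 0.0],
    "Daily Log/2026-02-05.md": [1.0, 0.05],
    "Python/Unit Testing.md": [0.0, 1.0],
    "Python/Pytest Fixtures.md": [0.1, 1.0],
    "Python/Mocking.md": [0.05, 1.0],
}
links = {"Python/Unit Testing.md": {"Python/Mocking.md"}}

result = suggest_links(vectors, links, exclude_same_folder=("Daily Log",))
for score, a, b in result:
    print(f"{score:.3f}  {a}  <->  {b}")
```

Excluding a folder such as "Daily Log" keeps templated notes, which resemble each other by construction, from crowding out the suggestions that point to real missing links.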
Status
Options
All commands accept --vault <path> to specify the vault. Alternatively, set OBSIDIAN_VAULT or configure a default with obsidian-semantic configure --vault <path>.
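The three ways of specifying a vault imply a lookup order. The Python sketch below assumes the usual precedence of flag over environment variable over config file; that ordering is an assumption, not something stated here, and `resolve_vault` is a hypothetical helper.

```python
import os
from pathlib import Path


def resolve_vault(cli_vault=None, env=None, config=None):
    """Pick the vault path: --vault flag, then OBSIDIAN_VAULT, then config.

    The precedence order is assumed for illustration.
    """
    env = os.environ if env is None else env
    config = config or {}
    candidate = cli_vault or env.get("OBSIDIAN_VAULT") or config.get("vault")
    if not candidate:
        raise ValueError("no vault: pass --vault, set OBSIDIAN_VAULT, or run configure")
    return Path(candidate).expanduser()


# The flag wins over the environment variable and the configured default.
print(resolve_vault("/tmp/work-vault",
                    env={"OBSIDIAN_VAULT": "/tmp/env-vault"},
                    config={"vault": "~/Documents/Obsidian-Notes"}))
```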
Embedding Backends
Configuration lives in ~/.config/obsidian-semantic/config.yaml. You can also place a .obsidian-semantic.yaml in your vault root to override per-vault.
After changing the embedder or model, reindex with obsidian-semantic index --full.
Ollama with Nomic (default)
Local embeddings with nomic-embed-text (768 dimensions). Uses search_query:/search_document: prefixes for asymmetric retrieval.
vault: ~/Documents/Obsidian-Notes
embedder:
  type: ollama
  model: nomic-embed-text
  dimension: 768
  query_prefix: "search_query: "
  document_prefix: "search_document: "
ollama pull nomic-embed-text
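To make the role of the two prefixes concrete, here is a minimal Python sketch of how a query and a document could be prepared before embedding. The `with_prefix` helper is hypothetical and only mirrors the query_prefix and document_prefix config keys; the real embedding call is omitted.

```python
def with_prefix(text, kind, embedder_cfg):
    """Prepend the configured prefix for a 'query' or 'document' input."""
    prefix = embedder_cfg.get(f"{kind}_prefix", "")
    return prefix + text


nomic = {
    "type": "ollama",
    "model": "nomic-embed-text",
    "query_prefix": "search_query: ",
    "document_prefix": "search_document: ",
}

# Queries and note chunks get different prefixes, so the model embeds them
# asymmetrically even when the text itself is similar.
query_input = with_prefix("dependency injection patterns", "query", nomic)
doc_input = with_prefix("Constructor injection passes dependencies in.", "document", nomic)

print(query_input)
print(doc_input)
```

A config that sets only query_prefix, as the Qwen3 examples do, would leave document text unchanged under this sketch.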
Ollama with Qwen3-embedding
Higher-quality embeddings with qwen3-embedding (4096 dimensions). Uses an instruction prefix for queries to improve retrieval.
vault: ~/Documents/Obsidian-Notes
embedder:
  type: ollama
  model: qwen3-embedding:8b
  dimension: 4096
  query_prefix: "Instruct: Given a search query, retrieve relevant notes\nQuery: "
ollama pull qwen3-embedding:8b
LM Studio
Local embeddings via LM Studio's OpenAI-compatible API (/v1/embeddings on port 1234). Start the server first:
lms server start
LM Studio with Nomic
vault: ~/Documents/Obsidian-Notes
embedder:
  type: lmstudio
  model: text-embedding-nomic-embed-text-v1.5
  dimension: 768
  query_prefix: "search_query: "
  document_prefix: "search_document: "
lms get -y nomic-ai/nomic-embed-text-v1.5
LM Studio with Qwen3-embedding
Higher-quality embeddings (4096 dimensions). Like the Ollama variant, uses an instruction prefix for queries to improve retrieval.
vault: ~/Documents/Obsidian-Notes
embedder:
  type: lmstudio
  model: text-embedding-qwen3-embedding-8b
  dimension: 4096
  query_prefix: "Instruct: Given a search query, retrieve relevant notes\nQuery: "
Gemini
Cloud embeddings via Google's gemini-embedding-001 (3072 dimensions). Handles query vs. document task types automatically -- no prefix config needed. Requires a GEMINI_API_KEY environment variable.
vault: ~/Documents/Obsidian-Notes
embedder:
  type: gemini
  model: gemini-embedding-001
  dimension: 3072
Advanced Options
Timeout Configuration
The embedder request timeout (default: 30 seconds) can be increased for large files or slower models:
embedder:
  timeout: 60.0  # seconds
If you see timeout errors during indexing, try increasing this value. Very large notes with extensive JSON or code blocks may need 60-120 seconds.
Automatic Indexing
Linux (systemd)
Create a service and timer in ~/.config/systemd/user/:
obsidian-semantic-index.service
[Unit]
Description=Index Obsidian vault for semantic search

[Service]
Type=oneshot
EnvironmentFile=%h/.config/obsidian-semantic/env
ExecStart=/home/youruser/.local/bin/obsidian-semantic index
obsidian-semantic-index.timer
[Unit]
Description=Run Obsidian semantic index hourly

[Timer]
OnCalendar=hourly
Persistent=true

[Install]
WantedBy=timers.target
The EnvironmentFile is optional — use it to store secrets like GEMINI_API_KEY outside of the main config.
Enable and start:
systemctl --user enable --now obsidian-semantic-index.timer
Multiple vaults
To index additional vaults, add more ExecStart lines to the service (they run sequentially):
[Service]
Type=oneshot
EnvironmentFile=%h/.config/obsidian-semantic/env
ExecStart=/home/youruser/.local/bin/obsidian-semantic index
ExecStart=/home/youruser/.local/bin/obsidian-semantic index --vault /path/to/second-vault
macOS (launchd)
A ready-to-edit plist + wrapper script lives in scripts/launchd/. The wrapper opportunistically starts the LM Studio server (lms server start) before each run, so the agent works whether or not you remembered to leave the server up.
Install once:
# Make obsidian-semantic available on PATH
uv tool install -e .
# Edit the absolute paths in the plist to match your home directory, then:
cp scripts/launchd/com.ravila.obsidian-semantic-index.plist ~/Library/LaunchAgents/
launchctl load -w ~/Library/LaunchAgents/com.ravila.obsidian-semantic-index.plist
Logs land at ~/Library/Logs/obsidian-semantic-index.log.
To unload or check status:
launchctl list | grep obsidian-semantic
launchctl unload ~/Library/LaunchAgents/com.ravila.obsidian-semantic-index.plist