GitHub - ngafar/llama-scan: Transcribe PDFs with local LLMs

llama-scan

A tool for converting PDFs to text files using Ollama.

Make sure Ollama is installed and running, then download and start the default model:

ollama run qwen2.5vl:latest

Install using pip:

pip install llama-scan

or uv:

uv tool install llama-scan

Basic usage:

llama-scan path/to/your/file.pdf

Options:

--output, -o: Output directory (default: "output")
--model, -m: Ollama model to use (default: "qwen2.5vl:latest")
--start, -s: Start page number (default: 0)
--end, -e: End page number (default: 0)
--custom-instructions, -c: Optional path to a text file containing additional instructions (default: None)
--server-url, -u: Ollama server URL (default: "http://localhost:11434")
--width, -w: Width of the resized images (0 to skip resizing; default: 0)
--keep-images, -k: Keep the intermediate image files (default: False)
--stdout, -s: Write merged output to stdout (default: False)
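
The options above can be combined in a single invocation. A minimal sketch, assuming `--width` is measured in pixels and using hypothetical names (`instructions.txt`, `transcripts`) for the instructions file and the output directory:

```shell
# Write extra instructions for the model to a plain-text file.
# The file name and its contents are illustrative, not required by llama-scan.
cat > instructions.txt <<'EOF'
Preserve tables as Markdown tables.
Keep footnotes at the end of each page.
EOF

# Guard so the snippet does nothing when llama-scan is not installed.
if command -v llama-scan >/dev/null 2>&1; then
  llama-scan document.pdf \
    --output transcripts \
    --custom-instructions instructions.txt \
    --width 1024
fi
```

The width of 1024 is only a placeholder value; leaving `--width` at its default of 0 skips resizing entirely.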

Process specific pages:

llama-scan document.pdf --start 1 --end 5

Use a different Ollama model:

llama-scan document.pdf --model qwen2.5vl:3b
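
A sketch of pointing llama-scan at an Ollama server on another machine and redirecting the merged text to a file. The host address and `document.txt` are hypothetical, and `--stdout` is assumed here to be a boolean switch that takes no value:

```shell
# Hypothetical address of a remote Ollama server; 11434 is the port
# used in the documented default URL.
OLLAMA_URL="http://192.168.1.50:11434"

# Guard so the snippet does nothing when llama-scan is not installed.
if command -v llama-scan >/dev/null 2>&1; then
  # The long form --stdout is used because -s is listed for both
  # --start and --stdout in the option list.
  llama-scan document.pdf --server-url "$OLLAMA_URL" --stdout > document.txt
fi
```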