Local-first AI finance categorizer. Your bank data never leaves your machine.
Upload any bank statement CSV and NumbyAI automatically detects the format, categorizes every transaction using a local LLM (Ollama), and surfaces patterns across months. No cloud. No subscriptions. No data sharing.
## Demo
NumbyAI Demo: Watch on YouTube →
### Upload & AI Categorization
Drop a bank statement CSV and watch NumbyAI auto-detect columns and categorize every transaction using a local LLM.
### Dashboard
Track spending by category, monitor budgets, and analyze cash flow trends — all from a single view.
### Rule Advisor
The Rule Advisor analyzes your categorization patterns and suggests reusable rules, making future uploads instant.
## What it does
- Drop any CSV — the heuristic parser detects metadata rows, column layout, date format, currency, and number format automatically, without you mapping anything. Falls back to the LLM when confidence is low.
- Rule engine runs first — saved patterns (regex, bank-specific, amount filters) categorize known transactions instantly, without touching the LLM.
- LLM handles the rest — remaining transactions are batched and sent to Ollama in parallel workers. Confident results are committed; ambiguous ones go to the review queue.
- Review queue — flag-and-resolve UI with bulk select, conflict detection, and one-click rule creation from any pattern.
- Rule analysis — analyzes all your transactions for inconsistencies and suggests new rules to clean up historical data.
- Dashboard — category breakdowns, month-over-month trends, cash flow, budget vs actual.
## Statement Parser — Technical Overview
Most tools require you to manually map columns. NumbyAI's heuristic engine handles the messy reality of real-world bank exports:
### What gets auto-detected
| Signal | How |
|---|---|
| Metadata preamble rows | Scans from top, finds first row with both a date and a numeric value — everything above is skipped |
| Column roles | Scores each column independently (date density, numeric density, text length, emptiness) |
| Inflow/Outflow split | Detects adjacent complementary numeric columns (one empty when the other isn't) — common in UK/EU exports |
| Date format | Pattern-matches against 9 formats: YYYY-MM-DD, DD/MM/YYYY, MM/DD/YYYY, DD.MM.YYYY, DD Mon YYYY, short-year variants |
| Number format | Distinguishes EU (1.234,56) from US (1,234.56) by counting comma/dot separator signals |
| Currency | Detects from symbols ($€£¥₹₽₩) and ISO codes (USD EUR GBP PLN CHF etc.) in headers and data cells |
| Balance column | Identifies monotonically-signed numeric columns near the amount column |
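The per-column scoring can be illustrated with a toy version of the date-density signal. The real scorer combines several signals (numeric density, text length, emptiness); the names below are assumptions for illustration.

```python
import re

# Toy date matcher covering two of the supported layouts.
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$|^\d{2}[/.]\d{2}[/.]\d{4}$")

def date_density(values: list[str]) -> float:
    """Fraction of non-empty cells in a column that look like dates."""
    cells = [v.strip() for v in values if v.strip()]
    if not cells:
        return 0.0
    return sum(bool(DATE_RE.match(c)) for c in cells) / len(cells)

# Score every column independently and pick the densest one as the date column.
columns = {
    "A": ["2024-01-03", "2024-01-05", "2024-01-09"],
    "B": ["TESCO", "SHELL", "AMAZON"],
    "C": ["-12.40", "-54.10", "-8.99"],
}
best = max(columns, key=lambda k: date_density(columns[k]))  # "A"
```

Scoring each column independently is what lets the parser cope with arbitrary column orders without a mapping step.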
### Supported bank formats (tested)
| Bank / Format | Country | Notes |
|---|---|---|
| Chase | 🇺🇸 US | 7-column with Post Date and Category |
| Bank of America | 🇺🇸 US | 4-column, running balance |
| Wells Fargo | 🇺🇸 US | No header row |
| Barclays | 🇬🇧 UK | Inflow/outflow split columns |
| HSBC | 🇬🇧 UK | Metadata preamble, separate debit/credit |
| ING | 🇳🇱 NL | Semicolon-delimited, EU number format |
| Sparkasse | 🇩🇪 DE | Semicolon-delimited, EU decimals, multi-row metadata |
| UBS | 🇨🇭 CH | CHF currency detection, preamble rows |
| BNP Paribas | 🇫🇷 FR | Semicolon-delimited, signed amounts |
| NAB | 🇦🇺 AU | AUD, debit/credit columns |
| Santander | 🇬🇧 UK / 🇪🇸 ES | Multiple regional formats |
| Revolut | 🌍 Multi | Multi-currency exports |
| Tab-delimited | Any | Auto-detected |
| Pipe-delimited | Any | Auto-detected |
| Generic w/ metadata | Any | Account info header rows auto-skipped |
When heuristic confidence is low, the LLM is called with a structured prompt and the first 15 rows to fill the gaps. Heuristic results always win when confident.
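The comma/dot counting used to tell EU from US number formats (see the table above) can be sketched as a toy function; the real detector weighs more signals than this.

```python
def detect_number_format(samples: list[str]) -> str:
    """Guess 'EU' (1.234,56) vs 'US' (1,234.56) from decimal-separator position.

    A comma followed by exactly two trailing digits is an EU decimal signal;
    a dot in the same position is a US decimal signal. Majority wins.
    """
    eu = us = 0
    for s in samples:
        s = s.strip()
        if len(s) > 3 and s[-3] == ",":
            eu += 1
        elif len(s) > 3 and s[-3] == ".":
            us += 1
    return "EU" if eu > us else "US"

detect_number_format(["1.234,56", "-99,10", "12,00"])  # "EU"
detect_number_format(["1,234.56", "-99.10", "12.00"])  # "US"
```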
## Categorization Pipeline

```
Upload CSV
     │
     ▼
Statement Analyzer
 ├─ Heuristic engine ──── high confidence ──▶ column mapping resolved
 └─ LLM fallback ──── low confidence ──▶ LLM fills gaps
     │
     ▼
Rule Engine ◀─── saved preferences (regex patterns, bank filters)
 ├─ Match found ──▶ category applied instantly
 └─ No match ──▶ LLM batch queue
     │
     ▼
Ollama (parallel workers, configurable batch size)
 ├─ Confident result ──▶ category committed
 └─ Uncertain ──▶ MANUAL_REVIEW flag
     │
     ▼
Review Queue
 ├─ Bulk select + assign category
 ├─ Per-transaction rule creation
 └─ Conflict resolution (AI vs reviewer)
     │
     ▼
Dashboard
```
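The parallel Ollama stage can be sketched with a thread pool. This is a simplified illustration: `categorize_batch` is a stand-in for the real Ollama call, and the function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 20    # CATEGORIZATION_BATCH_SIZE
MAX_WORKERS = 2    # CATEGORIZATION_MAX_WORKERS

def categorize_batch(batch: list[str]) -> list[str]:
    # Placeholder for the Ollama call; a real implementation would send
    # the batch to the model and parse one category per transaction.
    return ["Other"] * len(batch)

def categorize_all(descriptions: list[str]) -> list[str]:
    """Split transactions into fixed-size batches and run them in parallel."""
    batches = [descriptions[i:i + BATCH_SIZE]
               for i in range(0, len(descriptions), BATCH_SIZE)]
    results: list[str] = []
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        # pool.map preserves batch order, so results line up with inputs.
        for categorized in pool.map(categorize_batch, batches):
            results.extend(categorized)
    return results

categorize_all([f"txn {i}" for i in range(45)])  # 45 results from 3 batches
```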
## Features
- Zero-config format detection — works on statements with metadata headers, blank rows, split debit/credit columns, EU/US number formats, and 9 date format variants
- Multi-currency — detects and stores transaction currency; dashboard handles mixed-currency months
- Parallel LLM batching — configurable worker count and batch size; processes large statements fast
- Rule analysis — post-hoc analysis finds categorization conflicts and suggests new rules across historical data
- Bulk review UI — checkbox select-all, bulk categorize, inline conflict resolution
- Budget tracking — set monthly budgets per category, visualized against actuals
- Multi-bank — each upload is tagged to a bank; rules can be bank-specific or global
- Auth optional — runs in single-user mode with no auth required; plug in Auth0 for multi-user
- SQLite (dev) / PostgreSQL (prod) — swap via `DATABASE_URL`
- Privacy first — no telemetry, no external API calls, runs entirely on your machine
## Architecture

```
┌──────────────────────────────────────────────────────┐
│                  Browser (:8000)                     │
│  ┌────────────────┐    ┌─────────────────────────┐   │
│  │ Upload Wizard  │    │ Dashboard               │   │
│  │ (SimpleUpload) │    │ Charts · Budgets ·      │   │
│  │ Auto-detection │    │ Review · Trends         │   │
│  └────────────────┘    └─────────────────────────┘   │
└──────────────────┬───────────────────────────────────┘
                   │ REST + SSE (streaming)
┌──────────────────▼───────────────────────────────────┐
│              FastAPI Server (:8000)                   │
│ ┌─────────────┐ ┌──────────┐ ┌────────────────────┐  │
│ │ Statement   │ │ Rule     │ │ Ollama LLM         │  │
│ │ Analyzer    │ │ Engine   │ │ (parallel batches) │  │
│ └─────────────┘ └──────────┘ └────────────────────┘  │
│ ┌─────────────────────────────────────────────────┐  │
│ │      SQLite (dev) / PostgreSQL (prod)           │  │
│ └─────────────────────────────────────────────────┘  │
└──────────────────┬───────────────────────────────────┘
                   │
┌──────────────────▼───────────────────────────────────┐
│                 Ollama (:11434)                       │
│         Local LLM — default: qwen3.5:9b               │
└──────────────────────────────────────────────────────┘
```
## Categories
Income · Housing & Utilities · Food & Groceries · Transportation · Insurance · Healthcare · Shopping · Entertainment · Travel · Debt Payments · Internal Transfers · Investments · Other
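In code, a fixed category set like this is typically kept as a closed list that constrains both the rule engine and the LLM prompt. A sketch of the idea; the constant and `validate` are assumptions, not NumbyAI's actual definitions.

```python
CATEGORIES = [
    "Income", "Housing & Utilities", "Food & Groceries", "Transportation",
    "Insurance", "Healthcare", "Shopping", "Entertainment", "Travel",
    "Debt Payments", "Internal Transfers", "Investments", "Other",
]

def validate(category: str) -> str:
    # Any LLM answer outside the closed set is treated as uncertain and
    # routed to the review queue rather than committed.
    return category if category in CATEGORIES else "MANUAL_REVIEW"
```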
## Prerequisites
| Dependency | Version | Notes |
|---|---|---|
| Python | 3.11+ | Backend runtime |
| Node.js | 18+ | Frontend build |
| Ollama | Latest | Local LLM inference |
## Quick Start
Works on Windows, macOS, and Linux. The only prerequisites are Python 3.11+, Node.js 18+, and Ollama.
```bash
# 1. Clone
git clone https://github.com/RoXsaita/NumbyAI-Public.git
cd NumbyAI-Public

# 2. Install Ollama and pull the default model
python run.py setup-ollama

# 3. Copy env file
cp server/.env.example server/.env      # macOS / Linux
copy server\.env.example server\.env    # Windows (cmd)

# 4. Start everything (venv, deps, migrations, frontend build, server)
python run.py start
```
App runs at http://localhost:8000.
Upload the included `sample_bank_export.csv` — it's a realistic two-month export with metadata header rows, recurring merchants, and edge cases designed to exercise the parser.
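The metadata-preamble handling that the sample exercises works roughly like this: scan from the top and keep everything from the first row containing both a date-like and a numeric-like cell. A stdlib-only sketch; the real analyzer also recovers the header row and uses far broader date/number patterns.

```python
import csv
import io
import re

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}|\d{2}/\d{2}/\d{4}")
NUM_RE = re.compile(r"-?\d+[.,]\d{2}")

def skip_preamble(raw: str) -> list[list[str]]:
    """Drop metadata rows above the first row that has both a date-like
    and a numeric-like cell. (Header recovery is handled separately.)"""
    rows = list(csv.reader(io.StringIO(raw)))
    for i, row in enumerate(rows):
        if (any(DATE_RE.search(c) for c in row)
                and any(NUM_RE.fullmatch(c.strip()) for c in row)):
            return rows[i:]
    return rows

raw = """Account:,12345678
Statement period:,Jan 2024
Date,Description,Amount
2024-01-03,TESCO STORES,-12.40
"""
skip_preamble(raw)  # keeps only the 2024-01-03 transaction row
```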
## Project Structure

```
NumbyAI-Public/
├── server/
│   ├── app/
│   │   ├── main.py                      # API routes
│   │   ├── config.py                    # Pydantic settings
│   │   ├── database.py                  # SQLAlchemy models
│   │   ├── services/
│   │   │   ├── statement_analyzer.py    # Heuristic format detection + LLM fallback
│   │   │   ├── categorization_rules.py
│   │   │   ├── llm_service.py           # Ollama client + batching
│   │   │   └── ollama_service.py
│   │   └── tools/
│   │       └── statement_parser.py      # CSV/XLSX → transaction rows
│   ├── tests/
│   │   ├── fixtures/                    # Real-world format CSVs (Chase, Barclays, ING, ...)
│   │   └── test_statement_analyzer.py
│   ├── alembic/                         # DB migrations
│   └── Dockerfile
├── web/
│   └── src/
│       ├── components/SimpleUpload.tsx  # Upload wizard
│       ├── widgets/dashboard.tsx        # Main dashboard
│       └── lib/api-client.ts
├── sample_bank_export.csv               # Two-month test statement with metadata preamble
├── run.py                               # Cross-platform CLI (Windows / macOS / Linux)
└── Makefile                             # macOS / Linux shortcut (optional)
```
## Configuration
All config via environment variables. See server/.env.example.
| Variable | Description | Default |
|---|---|---|
| `DATABASE_URL` | DB connection string | `sqlite:///./finance_recon.db` |
| `SECRET_KEY` | JWT signing key | `dev-only-not-for-production` |
| `OLLAMA_URL` | Ollama server URL | `http://localhost:11434` |
| `OLLAMA_MODEL` | Model for categorization | `qwen3.5:9b` |
| `CATEGORIZATION_BATCH_SIZE` | Transactions per LLM batch | `20` |
| `CATEGORIZATION_MAX_WORKERS` | Parallel batch workers | `2` |
| `AUTH0_DOMAIN` | Auth0 domain (optional) | Disabled |
## Development
All commands work on Windows, macOS, and Linux via run.py:
```bash
python run.py start         # Stop → migrate → build → start
python run.py stop          # Kill the server
python run.py logs          # Tail backend logs
python run.py check         # ruff + mypy + pytest
python run.py setup-ollama  # Install/verify Ollama + pull model
python run.py test-e2e      # End-to-end categorization (requires Ollama)
python run.py clear-db      # Delete the SQLite database
```
### macOS / Linux shortcut (Makefile)
If you have `make` installed, the Makefile still works:
```bash
make restart       # Stop → migrate → build → start
make stop          # Kill all services
make logs          # Tail backend logs
make check-python  # ruff + mypy + pytest
make test-e2e      # End-to-end categorization (requires Ollama)
```
### Run tests

```bash
cd server
pytest tests --cov=app --cov-report=term-missing
```

### Frontend

```bash
cd web && npm install && npm run build

# Dev mode with mock data (no backend needed):
npm run build:dev
```
No separate dev server — FastAPI serves the built frontend as static files.
## Platform Notes
### Windows

- Use `python` instead of `python3` (the Windows Python installer registers `python`).
- Ollama: install from ollama.com/download/windows or `winget install Ollama.Ollama`.
- The Makefile requires GNU Make (e.g. via Git Bash or WSL) — use `run.py` instead.
### Linux

- Ollama: `curl -fsSL https://ollama.com/install.sh | sh`
- Everything else works out of the box.
### macOS

- Ollama: `brew install ollama`
- Both `run.py` and `make` work.
## Deployment
### Docker Compose (quickest)
```bash
# 1. Build the frontend first
cd web && npm install && npm run build && cd ..

# 2. Start the app + Ollama
docker-compose up

# 3. Pull the model inside the Ollama container (first run only)
docker-compose exec ollama ollama pull qwen3.5:9b
```
App runs at http://localhost:8000. Data is persisted in Docker volumes (sqlite_data, ollama_data).
### Production (PostgreSQL)

A Dockerfile and railway.toml are included. For production:

- Set `DATABASE_URL` to a PostgreSQL connection string
- Set `SECRET_KEY` to a secure random value
- Set `ENVIRONMENT=production`
- Point `OLLAMA_URL` at your Ollama instance


