Natural-language analytics for BigQuery, powered by Google’s ADK and a streaming Next.js experience.
Selecta ships two tightly-integrated pieces:
- Backend (
backend/) – a Google ADK agent that parses English questions, generates GoogleSQL, executes BigQuery jobs, and streams structured results (rows, charts, summaries, meta). - Frontend (
frontend/) – a Shadcn-flavoured Next.js 14 client that renders the conversation, reasoning trace, data visualisation, SQL, and a Meta tab with execution telemetry.
Together they deliver a “Gemini + BigQuery” workflow that feels immediate: ask a question, watch the reasoning stream, see the Vega-Lite chart materialise, and browse job details without reloading.
Highlights
- Structured answers – every response arrives with
Summary,Results, andBusiness Insightssections plus a generated Vega-Lite spec. - Meta-aware streaming – the UI separates reasoning from the final answer and exposes execution metrics (job id, latency, dataset provenance) in a dedicated tab.
- Drop-in backend – runs on top of the official ADK HTTP server; no custom FastAPI layer or bespoke protocol to maintain.
- Modern frontend – Next.js App Router, TypeScript, Zustand, Shadcn UI, and Tailwind 4 powering a responsive glassmorphic layout.
Architecture at a glance
[User] → [Next.js Frontend (SSE client)]
│
▼
[Google ADK HTTP Server]
│
▼
[BigQuery + Gemini]
- The frontend posts a question to
POST /run_sse. - The ADK agent streams intermediate reasoning followed by a structured payload:
- rows + columns + row count
- Vega-Lite chart suggestion
- markdown summary/results/insights
- BigQuery job metadata (
executionMs,jobId, dataset info)
- The frontend normalises those increments into the chat bubble, a hidden “View reasoning” accordion, and the Chart / Table / SQL / Meta tabs.
See backend/api-contract.md for the exact event schema.
Getting started
Prerequisites
- Python 3.12+
uvfor backend dependency management- Node.js 22+ and npm for the frontend
- A Google Cloud project with BigQuery access (the repo ships with the public
thelook_ecommercedataset configured) - A Gemini API key (or Vertex AI credentials) for ADK
1. Boot the backend
cd backend uv venv source .venv/bin/activate uv pip install -e . # Configure credentials export GOOGLE_API_KEY="YOUR_GEMINI_KEY" # or set Vertex AI env vars export SELECTA_DATASET_CONFIG="$(pwd)/selecta/datasets/thelook.yaml" # Run the ADK server uv run adk api_server app --allow_origins "*" --port 8080
2. Start the frontend
cd frontend
npm install
npm run devVisit http://localhost:3000 and ask a question like “What are the top products by revenue in the last 150 days?”.
Deploy to Cloud Run
Use the helper script to build the container and roll out the ADK + Next.js stack:
./deploy-cloud-run.sh \ --project PROJECT_ID \ --region us-central1 \ --service selecta \ --service-account PROJECT_ID-compute@developer.gserviceaccount.com \ --secret GOOGLE_API_KEY=selecta-gemini-key:latest \ --allow-unauthenticated
- The script mirrors the verified manual command. The ADK backend listens on
8081, the Next.js frontend on8080, and the default dataset config comes from/opt/venv/lib/python3.12/site-packages/selecta/datasets/thelook.yamlinside the image. - Ensure the chosen service account has
roles/secretmanager.secretAccessoron the Gemini key secret (or pass--env GOOGLE_API_KEY=...if you prefer a literal env var). - To swap datasets, include
--env SELECTA_DATASET_CONFIG=/app/path/to/custom.yaml.
Once deployed, the service URL printed by gcloud (or the script) becomes the base for the frontend; no additional proxy configuration is required.
Project layout
backend/ # Google ADK agent, dataset config, tooling
frontend/ # Next.js + Shadcn UI client
roadmap.md # Upcoming workstreams
- Backend docs –
backend/README.md(setup, configuration, payload reference) - API contract –
backend/api-contract.md(endpoints + streaming schema) - Frontend docs –
frontend/README.md(app structure, dev commands)
Result metadata
Every streamed result includes:
| Field | Description |
|---|---|
summary, resultsMarkdown, businessInsights |
Markdown sections rendered in the chat bubble. |
rows, columns, rowCount |
Lightweight preview of the BigQuery result set. |
chart |
Vega-Lite spec used by the Chart tab. |
executionMs, jobId |
BigQuery latency + job identifier (shown in the Meta tab). |
dataset |
Active dataset descriptor (project, billing project, location, table allowlist). |
Use these fields to build your own client, persist history, or trigger follow-up analyses.
Contributing
We welcome issues and pull requests! A few guidelines:
- Keep docs and contracts up to date (
README.md,backend/README.md,backend/api-contract.md,frontend/README.md). - When changing backend payloads, update the frontend store/hooks and Meta tab.
- Run
npm run lint(frontend) anduv pip check/python -m compileall(backend) before opening a PR.
See roadmap.md for active workstreams and ideas.
License
Apache 2.0 – see LICENSE.
