JQ-By-Example
AI-Powered JQ Filter Synthesis Tool
JQ-By-Example automatically generates jq filter expressions from input/output JSON examples using LLM-powered synthesis with iterative refinement.
Overview
JQ-By-Example solves a common developer problem: you know what JSON transformation you want, but writing the correct jq filter is tricky. Simply provide example input/output pairs, and JQ-By-Example will synthesize the filter for you.
Key Features:
- LLM-Powered Generation - Uses OpenAI, Anthropic, or compatible APIs to generate filter candidates
- Iterative Refinement - Automatically improves filters based on algorithmic feedback
- Verified Correctness - Executes filters against a real jq binary to verify outputs
- Detailed Diagnostics - Classifies errors (syntax, shape, missing keys, order) with partial scoring
- Safe Execution - Sandboxed jq execution with timeouts and output limits
- Production-Ready - Comprehensive edge-case handling, security auditing, structured logging
Installation
Prerequisites
- Python 3.10 or higher
- jq binary installed and available in PATH:
```shell
# macOS
brew install jq

# Ubuntu/Debian
sudo apt-get install jq

# Windows (with Chocolatey)
choco install jq
```
Install JQ-By-Example
```shell
git clone https://github.com/nulone/jq-by-example.git
cd jq-by-example
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .
```
Quick Start
Interactive Mode
Synthesize a filter from a single input/output example:
```shell
jq-by-example \
  --input '{"user": {"name": "Alice", "age": 30}}' \
  --output '"Alice"' \
  --desc "Extract the user's name"
```
Output:
```text
============================================================
[1/1] Solving: interactive
  Description: Extract the user's name
  Examples: 1
  Max iterations: 10
============================================================
✓ Task: interactive
  Filter: .user.name
  Score: 1.000
  Iterations: 1
  Time: 2.34s
============================================================
OVERALL SUMMARY
============================================================
Tasks: 1/1 passed (100.0%)
Total time: 2.34s
Average time per task: 2.34s
============================================================
```
Batch Mode
Run predefined tasks from a file:
```shell
# Run a specific task
jq-by-example --task nested-field

# Run all tasks
jq-by-example --task all

# With verbose output (shows iteration details)
jq-by-example --task all --verbose
```
CLI Options
```text
usage: jq-by-example [-h] [-t TASK] [--tasks-file TASKS_FILE] [--max-iters MAX_ITERS]
                     [--baseline] [-i INPUT] [-o OUTPUT] [-d DESC]
                     [--provider {openai,anthropic}] [--model MODEL] [--base-url BASE_URL]
                     [-v] [--debug]

AI-Powered JQ Filter Synthesis Tool

options:
  -h, --help            Show this help message and exit

Task Selection:
  -t TASK, --task TASK  Task ID to run, or 'all' to run all tasks
  --tasks-file TASKS_FILE
                        Path to tasks JSON file (default: data/tasks.json)

Iteration Control:
  --max-iters MAX_ITERS
                        Maximum iterations per task (default: 10)
  --baseline            Single-shot mode (max_iterations=1, no refinement)

Interactive Mode:
  -i INPUT, --input INPUT
                        Input JSON for interactive mode
  -o OUTPUT, --output OUTPUT
                        Expected output JSON for interactive mode
  -d DESC, --desc DESC  Task description for interactive mode

LLM Provider:
  --provider {openai,anthropic}
                        LLM provider type (default: from LLM_PROVIDER env or 'openai')
  --model MODEL         Model identifier (default: from LLM_MODEL env or provider default)
  --base-url BASE_URL   Base URL for OpenAI-compatible providers (default: from LLM_BASE_URL env)

Output Control:
  -v, --verbose         Enable verbose output (shows iteration details)
  --debug               Enable debug logging (shows detailed internal state)
```
Usage Examples
```shell
# Interactive mode - simple field extraction
jq-by-example -i '{"x": 42}' -o '42' -d 'Extract x'

# Interactive mode - array filtering
jq-by-example -i '[1,2,3,4,5]' -o '[2,4]' -d 'Keep only even numbers'

# Interactive mode - nested object access
jq-by-example \
  -i '{"data": {"users": [{"name": "Alice"}]}}' \
  -o '["Alice"]' \
  -d 'Extract all user names'

# Batch mode - run specific task
jq-by-example --task nested-field

# Batch mode - all tasks with verbose output
jq-by-example --task all --verbose

# Single-shot mode (no refinement) for baseline comparison
jq-by-example --task nested-field --baseline

# Custom tasks file
jq-by-example --task my-task --tasks-file my-tasks.json

# Debug mode for troubleshooting
jq-by-example --task nested-field --debug

# Limit iterations
jq-by-example --task filter-active --max-iters 5

# Use Anthropic provider
jq-by-example --provider anthropic --task nested-field

# Use specific model
jq-by-example --model gpt-4o-mini --task nested-field

# Use OpenRouter
jq-by-example --base-url https://openrouter.ai/api/v1 --model anthropic/claude-3.5-sonnet --task nested-field

# Use local Ollama
jq-by-example --base-url http://localhost:11434/v1 --model llama3 --task nested-field
```
How It Works
JQ-By-Example uses a deterministic oracle approach:
1. Generation: An LLM (GPT-4, Claude, or a compatible model) generates candidate jq filters based on your examples and description
2. Verification: Each filter is executed against the real jq binary with your input examples
3. Scoring: A deterministic algorithm compares actual vs. expected outputs, computing similarity scores (0.0 to 1.0)
4. Feedback: The algorithm classifies errors (syntax, shape, missing/extra elements, order) and generates actionable feedback
5. Refinement: The LLM receives the feedback and generates an improved filter
6. Iteration: Steps 2-5 repeat until a perfect match is found or limits are reached
This hybrid approach combines LLM creativity with deterministic verification, ensuring correctness while leveraging AI for filter synthesis.
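The loop above can be sketched in a few lines of Python. Note that `generator` and `reviewer` here are stand-in callables used for illustration, not the project's actual interfaces:

```python
def synthesize(task, generator, reviewer, max_iters=10):
    """Generate-verify-refine loop: keep the best-scoring filter seen so far."""
    best_filter, best_score, history = None, 0.0, []
    for _ in range(max_iters):
        candidate = generator(task, history)         # step 1: LLM proposes a filter
        score, feedback = reviewer(task, candidate)  # steps 2-4: run jq, score, diagnose
        history.append((candidate, score, feedback))
        if score > best_score:
            best_filter, best_score = candidate, score
        if best_score == 1.0:                        # perfect match: stop early
            break
    return best_filter, best_score, history
```

Because the reviewer is deterministic, a perfect score is a hard guarantee that the filter reproduced every expected output, regardless of what the LLM claims.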
Architecture
JQ-By-Example follows a modular architecture with clear separation of concerns:
```text
┌─────────┐
│   CLI   │  Entry point, argument parsing, output formatting
└────┬────┘
     │
     ▼
┌──────────────┐
│ Orchestrator │  Manages synthesis loop, tracks progress
└──┬─────────┬─┘
   │         │
   ▼         ▼
┌─────────┐  ┌──────────┐
│Generator│  │ Reviewer │  Filter evaluation & scoring
│  (LLM)  │  └────┬─────┘
└─────────┘       │
                  ▼
             ┌──────────┐
             │ Executor │  Sandboxed jq execution
             └──────────┘
```
Components
1. CLI (src/cli.py)
- Parses command-line arguments
- Loads tasks from JSON files
- Formats and displays results with progress indicators
- Tracks timing and generates summaries
2. Orchestrator (src/orchestrator.py)
- Manages the iterative refinement loop
- Coordinates between Generator and Reviewer
- Implements anti-stuck protocols:
- Duplicate filter detection (normalized)
- Stagnation detection (no improvement for N iterations)
- Max iteration limit
- Tracks best solution and complete history
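The anti-stuck checks might be implemented along these lines. This is an illustrative sketch; the class name and thresholds are assumptions, not the project's API:

```python
class AntiStuck:
    """Tracks duplicate filters and score stagnation across iterations."""

    def __init__(self, stagnation_limit=3):
        self.seen = set()
        self.best_score = 0.0
        self.stale_iters = 0
        self.stagnation_limit = stagnation_limit

    def is_duplicate(self, jq_filter):
        # Normalize whitespace so ".a  |  .b" and ".a | .b" count as the same filter
        key = " ".join(jq_filter.split())
        if key in self.seen:
            return True
        self.seen.add(key)
        return False

    def record(self, score):
        """Return True once no improvement has been seen for N iterations."""
        if score > self.best_score:
            self.best_score, self.stale_iters = score, 0
        else:
            self.stale_iters += 1
        return self.stale_iters >= self.stagnation_limit
```

Either signal (duplicate candidate or a stalled best score) tells the orchestrator to stop early rather than burn API calls on a loop that is going nowhere.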
3. Generator (src/generator.py)
- Interfaces with LLM providers (OpenAI, Anthropic, or compatible APIs)
- Builds prompts with task description, examples, and feedback history
- Extracts clean filter code from LLM responses
- Implements retry logic with exponential backoff
- Includes security features (API key never logged, input truncation)
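Retry with exponential backoff typically looks like the following. This is a generic sketch under the assumption that transient failures surface as `ConnectionError`; the project's exact retry code may differ:

```python
import random
import time

def with_retries(call, attempts=3, base_delay=1.0):
    """Call a flaky function, sleeping base_delay * 2**n (plus jitter) between tries."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

The small random jitter keeps many concurrent clients from retrying in lockstep against an already struggling endpoint.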
4. Reviewer (src/reviewer.py)
- Evaluates generated filters against examples
- Computes similarity scores using:
- Jaccard similarity for lists
- Key/value matching for objects
- Exact matching for scalars
- Classifies errors by priority (SYNTAX → SHAPE → MISSING_EXTRA → ORDER)
- Generates actionable feedback for refinement
5. Executor (src/executor.py)
- Safely executes jq binary in subprocess
- Enforces resource limits (timeout, output size)
- Prevents shell injection (uses argument list, not shell)
- Handles jq errors and timeouts gracefully
6. Domain (src/domain.py)
- Defines core data structures (Task, Example, Attempt, Solution)
- Uses frozen dataclasses for immutability
- Type-safe with full type hints
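A sketch of what such frozen dataclasses might look like. The field names here are inferred from the task file format shown later in this README, not copied from `src/domain.py`:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Example:
    input: object
    expected_output: object

@dataclass(frozen=True)
class Task:
    id: str
    description: str
    examples: tuple[Example, ...]  # tuple, not list, so the field stays immutable

@dataclass(frozen=True)
class Attempt:
    filter: str
    score: float
    feedback: str
```

`frozen=True` makes any assignment after construction raise `FrozenInstanceError`, which is what lets the orchestrator hand these objects around without defensive copying.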
Data Flow
1. User provides a task (JSON examples + description) via the CLI
2. CLI loads and validates the task, initializes components
3. Orchestrator starts the synthesis loop:
   - Iteration 1: calls the Generator with the task only
   - Generator queries the LLM API for a filter candidate
   - Reviewer evaluates the filter using the Executor
   - Executor runs the jq binary with the filter on each example
   - Reviewer computes scores and generates feedback
   - Iteration 2+: the Generator receives the history and feedback
   - The loop continues until a perfect match or limits are reached
4. Orchestrator returns a Solution with the best filter, score, and history
5. CLI displays formatted results with timing information
Error Classification
The reviewer classifies errors by priority (highest to lowest):
| Error Type | Description | Example | Score |
|---|---|---|---|
| `SYNTAX` | Invalid jq filter syntax | `invalid[[[` | 0.0 |
| `SHAPE` | Wrong output type | Expected `[]`, got `{}` | 0.0 |
| `MISSING_EXTRA` | Missing or extra elements/keys | Expected `[1,2,3]`, got `[1,2]` | 0.67 (Jaccard) |
| `ORDER` | Correct elements, wrong order | Expected `[1,2,3]`, got `[3,2,1]` | 0.8 |
| `NONE` | Perfect match | - | 1.0 |
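One way the priority order might translate into code. This is a simplified sketch; the project's actual classifier may be more nuanced:

```python
from enum import Enum

class ErrorKind(Enum):
    SYNTAX = "syntax"
    SHAPE = "shape"
    MISSING_EXTRA = "missing_extra"
    ORDER = "order"
    NONE = "none"

def classify(expected, actual, jq_failed=False):
    """Check error classes from highest to lowest priority."""
    if jq_failed:                            # jq itself rejected the filter
        return ErrorKind.SYNTAX
    if type(expected) is not type(actual):   # e.g. expected a list, got a dict
        return ErrorKind.SHAPE
    if expected == actual:
        return ErrorKind.NONE
    if isinstance(expected, list) and sorted(map(repr, expected)) == sorted(map(repr, actual)):
        return ErrorKind.ORDER               # same elements, different order
    return ErrorKind.MISSING_EXTRA
```

Checking the classes in priority order matters: a filter that fails to parse should be reported as a syntax error, not as "missing elements", so the LLM gets the most actionable feedback first.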
Scoring Algorithm
- Lists: Jaccard similarity = `|intersection| / |union|`
  - Special case: correct elements, wrong order = 0.8
- Dicts: `(key_similarity + value_match_ratio) / 2`
- Scalars: binary (1.0 for exact match, 0.0 for mismatch)
- Multiple examples: arithmetic mean of scores
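A sketch of these scoring rules in Python. Elements are canonicalized with `json.dumps` so unhashable values (dicts, lists) can go into sets; this is illustrative, not the project's exact code:

```python
import json

def _canon(values):
    """Canonicalize JSON values into hashable strings for set operations."""
    return {json.dumps(v, sort_keys=True) for v in values}

def score_list(expected, actual):
    e, a = _canon(expected), _canon(actual)
    if not e and not a:
        return 1.0
    if e == a and expected != actual:
        return 0.8                        # right elements, wrong order
    return len(e & a) / len(e | a)        # Jaccard similarity

def score_dict(expected, actual):
    keys_e, keys_a = set(expected), set(actual)
    if not keys_e and not keys_a:
        return 1.0
    key_sim = len(keys_e & keys_a) / len(keys_e | keys_a)
    matches = sum(expected[k] == actual[k] for k in keys_e & keys_a)
    value_ratio = matches / max(len(keys_e), 1)
    return (key_sim + value_ratio) / 2
```

For example, `score_list([1,2,3], [1,2])` gives 2/3 ≈ 0.67, matching the `MISSING_EXTRA` row in the table above, while `score_list([1,2,3], [3,2,1])` triggers the order special case and returns 0.8.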
Supported jq Patterns
JQ-By-Example works well with these common jq operations:
- Field extraction: `.foo`, `.user.name`, `.data.items[0]`
- Array operations: `.[]`, `.[0]`, `.[1:3]`, `.[-1]`
- Filtering: `select(.active == true)`, `select(.age > 18)`
- Mapping: `map(.name)`, `[.[] | .id]`
- Array construction: `[.items[].name]`
- Object construction: `{name: .user.name, email: .user.email}`
- Conditionals: `if .status == "active" then .name else null end`
- Null handling: `select(. != null)`, `.field // "default"`
- String operations: string interpolation, concatenation
- Arithmetic: addition, subtraction, comparison operators
- Type checking: `type`, `length`
Known Limitations
JQ-By-Example may struggle with these advanced jq features:
- Aggregations: `group_by()`, `reduce`, `min_by()`, `max_by()`
- Complex recursion: `recurse()`, `walk()`
- Variable bindings: complex `as $var` patterns
- Custom functions: `def` statements (blocked for security)
- Advanced array operations: `combinations()`, `transpose()`
- Path manipulation: `getpath()`, `setpath()`, `delpaths()`
- Format strings: `@csv`, `@json`, `@base64`
For these cases, you may need to write the filter manually or break down the task into simpler steps.
Model recommendations
| Task complexity | Recommended model | Speed |
|---|---|---|
| Simple filters (extract, select) | GPT-4o-mini, Claude Haiku | Fast |
| Medium (grouping, aggregation, recursion) | Claude Sonnet, GPT-4o | Fast |
| Complex algorithms (graph traversal, sorting) | DeepSeek R1 | Slow (minutes) |
Note: DeepSeek R1 solved topological sort and Dijkstra's shortest path in jq. Most users won't need this; standard models handle 95%+ of real-world tasks.
Supported Providers
| Provider | Status | Note |
|---|---|---|
| OpenAI | Stable | Default provider |
| Anthropic | Beta | Different API format |
| OpenRouter | Tested | OpenAI-compatible |
| Ollama | Alpha | Local only, requires setup |
Note: OpenAI is the default and most-tested provider. The others should work, but please report any issues you find.
Provider Setup
OpenAI (Default)
```shell
export OPENAI_API_KEY='sk-...'

# Optional: specify model (default: gpt-4o)
export LLM_MODEL='gpt-4o'
```
Anthropic
```shell
export LLM_PROVIDER='anthropic'
export ANTHROPIC_API_KEY='sk-ant-...'

# Optional: specify model (default: claude-sonnet-4-20250514)
export LLM_MODEL='claude-sonnet-4-20250514'
```
OpenRouter
```shell
export LLM_BASE_URL='https://openrouter.ai/api/v1'
export OPENAI_API_KEY="$OPENROUTER_API_KEY"
export LLM_MODEL='anthropic/claude-3.5-sonnet'
```
Note: Set the `OPENROUTER_API_KEY` environment variable to your OpenRouter API key before running.
Local (Ollama)
```shell
export LLM_BASE_URL='http://localhost:11434/v1'
export LLM_MODEL='llama3'
export OPENAI_API_KEY='dummy'  # Ollama doesn't require a real key
```
Together AI / Groq
```shell
# Together AI
export LLM_BASE_URL='https://api.together.xyz/v1'
export OPENAI_API_KEY='...'

# Groq
export LLM_BASE_URL='https://api.groq.com/openai/v1'
export OPENAI_API_KEY='gsk_...'
```
Task File Format
Tasks are defined in JSON format:
```json
{
  "tasks": [
    {
      "id": "nested-field",
      "description": "Extract the user's name from a nested object structure",
      "examples": [
        {
          "input": {"user": {"name": "Alice", "age": 30}},
          "expected_output": "Alice"
        },
        {
          "input": {"user": {"name": "Bob", "email": "bob@example.com"}},
          "expected_output": "Bob"
        }
      ]
    }
  ]
}
```
Guidelines for Good Tasks
- Provide 3+ examples for better generalization
- Include edge cases: empty arrays, null values, missing fields
- Be specific in descriptions: "Extract user names" vs "Transform data"
- Use diverse inputs: different structures help the LLM understand the pattern
- Test edge cases: null, empty arrays/objects, deeply nested (3+ levels), special characters in keys
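A minimal loader that enforces the structural parts of the format above can make the expected shape concrete. The `load_tasks` helper here is hypothetical, shown only for illustration:

```python
import json

def load_tasks(path):
    """Load a tasks file and validate the structure shown above."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    tasks = data.get("tasks")
    if not isinstance(tasks, list) or not tasks:
        raise ValueError("'tasks' must be a non-empty list")
    for task in tasks:
        for key in ("id", "description", "examples"):
            if key not in task:
                raise ValueError(f"task missing required key: {key!r}")
        for example in task["examples"]:
            if "input" not in example or "expected_output" not in example:
                raise ValueError(f"bad example in task {task['id']}")
    return tasks
```

Validating up front means a typo like `expected_ouput` fails loudly at load time instead of silently scoring every candidate filter as wrong.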
Built-in Tasks
The `data/tasks.json` file includes these example tasks:
| Task ID | Description | Difficulty | Expected Filter |
|---|---|---|---|
| `nested-field` | Extract `.user.name` | Easy | `.user.name` |
| `filter-active` | Filter where `active == true` | Medium | `[.[] \| select(.active == true)]` |
| `extract-emails` | Extract emails, skip null/missing | Medium | `[.[].email \| select(. != null)]` |
Troubleshooting
"jq binary not found"
Problem: JQ-By-Example can't locate the jq executable.
Solution: Ensure jq is installed and in your PATH:
```shell
# Check if jq is installed
which jq

# macOS
brew install jq

# Ubuntu/Debian
sudo apt-get install jq

# Verify installation
jq --version
```
"API key required"
Problem: Missing API key environment variable.
Solution: Set the appropriate API key for your provider:
```shell
# For OpenAI
export OPENAI_API_KEY='sk-...'

# For Anthropic
export ANTHROPIC_API_KEY='sk-ant-...'

# Or use the generic variable
export LLM_API_KEY='...'

# Permanent (add to ~/.bashrc or ~/.zshrc)
echo 'export OPENAI_API_KEY="sk-..."' >> ~/.bashrc
source ~/.bashrc
```
"API request failed: DNS resolution failed"
Problem: DNS resolution failed for the API endpoint.
Solution:
- Check your internet connection
- Verify the API endpoint is correct:

  ```shell
  # For OpenAI
  curl -I https://api.openai.com/v1/chat/completions

  # For Anthropic
  curl -I https://api.anthropic.com/v1/messages
  ```

- If using a custom endpoint, check `LLM_BASE_URL`:

  ```shell
  export LLM_BASE_URL='https://api.openai.com/v1'
  ```
"API request timed out"
Problem: The API request exceeded its 60-second timeout, usually due to connection issues or server-side problems.
Solution:
- Check your internet connection
- Try again (transient network issues)
- Check your provider's service status
- Reduce task complexity (fewer examples, simpler description)
"Connection failed after 3 attempts"
Problem: Multiple retry attempts failed.
Solution:
- Verify the API endpoint is reachable:

  ```shell
  # For OpenAI
  curl https://api.openai.com/v1/chat/completions

  # For a custom endpoint
  curl "$LLM_BASE_URL/chat/completions"
  ```

- Check your firewall/proxy settings
- Try the `--debug` flag to see detailed error messages
Filter works in jq but not in JQ-By-Example
Problem: Your filter works when you run it manually with jq, but fails in JQ-By-Example.
Cause: JQ-By-Example runs jq with the `-M` (monochrome) and `-c` (compact output) flags.
Solution: Ensure your expected output matches compact JSON format:
```text
# Wrong: pretty-printed JSON
{
  "name": "Alice"
}

# Correct: compact JSON
{"name":"Alice"}
```
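If you generate expected outputs from Python, `json.dumps` with explicit separators produces the same compact form as jq's `-c` flag:

```python
import json

data = {"name": "Alice"}
pretty = json.dumps(data, indent=2)                # multi-line, pretty-printed
compact = json.dumps(data, separators=(",", ":"))  # compact, matches jq -c
print(compact)  # {"name":"Alice"}
```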
Low success rate or poor quality filters
Problem: Filters don't match expected outputs, or require many iterations.
Solution:
- Improve task description: Be specific about what transformation you want
- Add more examples: 3+ examples help the LLM generalize better
- Include edge cases: Empty arrays, null values, missing keys
- Simplify the task: Break complex transformations into smaller tasks
- Use verbose mode: run with `--verbose` to see iteration details and understand failures
Debug mode for troubleshooting
Enable debug logging to see detailed internal state:
```shell
jq-by-example --task my-task --debug
```
Debug mode shows:
- Full API request/response details (with truncation for security)
- Detailed scoring calculations
- Duplicate filter detection
- Stagnation counter progression
Security
JQ-By-Example implements production-ready security measures:
API Key Protection
- API keys are never logged (even in debug mode)
- Stored securely in environment variables
- Transmitted only via HTTPS headers
Input Sanitization
- Large inputs are truncated in logs (max 100 characters)
- Prevents accidental exposure of sensitive data in log files
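Log truncation of this kind is a small helper; the function name here is illustrative, not necessarily what `src/security.py` uses:

```python
def truncate_for_log(value, limit=100):
    """Shorten a value for logging, noting how much was cut."""
    text = str(value)
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [{len(text) - limit} chars truncated]"
```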
Shell Injection Prevention
- jq filters are passed as subprocess arguments (not via a shell)
- No use of `shell=True` in subprocess calls
- Filters are never interpolated into shell commands
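The safe execution pattern looks roughly like this: the filter travels as a single argv element, so shell metacharacters in it are inert. A sketch only; the real executor adds more guards:

```python
import json
import subprocess

def run_filter(jq_filter, input_value, timeout=1.0, max_output=1_000_000):
    """Run jq on one input without a shell; raise on errors or oversized output."""
    proc = subprocess.run(
        ["jq", "-M", "-c", jq_filter],  # argument list: no shell=True anywhere
        input=json.dumps(input_value),
        capture_output=True,
        text=True,
        timeout=timeout,                # raises subprocess.TimeoutExpired on overrun
    )
    if proc.returncode != 0:
        raise RuntimeError(f"jq error: {proc.stderr.strip()}")
    if len(proc.stdout) > max_output:
        raise RuntimeError("jq output exceeded size limit")
    return proc.stdout.strip()
```

Even a malicious candidate filter like `"; rm -rf /"` is just a string argument to jq here: jq reports a parse error and nothing else runs.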
Resource Limits
- Timeout: 1 second per filter execution
- Max output: 1 MB per execution
- Prevents denial-of-service attacks and resource exhaustion
Edge Case Handling
Comprehensive test coverage for:
- Null input/output
- Empty arrays and objects
- Deeply nested structures (3+ levels)
- Special characters in keys (spaces, unicode, @, -)
- Large arrays (100+ items)
- Type mismatches and conversions
Development
Setup Development Environment
```shell
git clone https://github.com/nulone/jq-by-example.git
cd jq-by-example
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
```
Running Tests
```shell
# Run unit tests (no API key required)
pytest -m "not e2e"

# Run all tests including E2E (requires API key)
export OPENAI_API_KEY='your-key-here'
# or
export ANTHROPIC_API_KEY='your-key-here'
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run specific test file
pytest tests/test_generator.py -v
```
Code Quality
```shell
# Type checking
mypy src

# Linting
ruff check src tests

# Formatting
ruff format src tests

# Run all checks (recommended before commit)
ruff check src tests && \
  ruff format --check src tests && \
  mypy src && \
  pytest -m "not e2e"
```
Project Structure
```text
jq-by-example/
├── src/
│   ├── cli.py              # CLI entry point
│   ├── orchestrator.py     # Synthesis loop coordinator
│   ├── generator.py        # LLM-based filter generation
│   ├── providers.py        # LLM provider abstractions (OpenAI, Anthropic)
│   ├── reviewer.py         # Filter evaluation & scoring
│   ├── executor.py         # Safe jq execution
│   ├── domain.py           # Core data structures
│   └── security.py         # Security utilities (log truncation)
├── tests/
│   ├── test_cli.py
│   ├── test_orchestrator.py
│   ├── test_generator.py
│   ├── test_reviewer.py
│   ├── test_executor.py
│   ├── test_domain.py
│   ├── test_edge_cases.py  # Production-ready edge cases
│   └── test_e2e.py         # End-to-end tests (require API key)
├── data/
│   └── tasks.json          # Example task definitions
├── pyproject.toml          # Project configuration
└── README.md               # This file
```
Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/my-feature`
3. Make your changes with tests
4. Ensure all checks pass:

   ```shell
   ruff check src tests
   ruff format --check src tests
   mypy src
   pytest -m "not e2e"
   ```

5. Commit with clear messages: `git commit -m "Add feature X"`
6. Push to your fork: `git push origin feature/my-feature`
7. Open a Pull Request
Code Style
- Type hints required for all public functions
- Docstrings required for all public functions and classes (Google style)
- 100 character line limit
- Follow existing patterns in codebase
- Add tests for all new features
- Security-first mindset (never log sensitive data)
License
MIT License - see LICENSE for details.
Acknowledgments
- jq - The excellent JSON processor by Stephen Dolan
- OpenAI - GPT models and API
- Anthropic - Claude models and API
JQ-By-Example - Because life's too short to debug jq filters manually.
