AI sovereignty for the JVM
The JVM powers global finance, big data, and mission-critical infrastructure. It deserves an AI stack built to the same standard.
Experimental · Evolving
Multi-backend tensor engine. Panama, C, CUDA, HIP, Metal, OpenCL, and Mojo. Write once, accelerate everywhere.
Pure Java read/write for llama.cpp's GGUF model format. No native bindings required.
Pure Java read/write for HuggingFace's Safetensors format. Memory-mapped for large models.
Efficient pure Java BPE tokenizers, TikToken-compatible and customizable, for popular LLMs.
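To make the tokenizer idea concrete, here is a minimal sketch of the rank-based merge loop behind TikToken-style BPE. It is not this project's API: the class name `BpeSketch`, the `encode` method, and the toy rank table are all hypothetical, and for readability it works on characters, whereas real TikToken vocabularies operate on bytes after a regex pre-split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the library's tokenizer API.
final class BpeSketch {

    /**
     * Splits the text into single code points, then repeatedly merges the
     * adjacent pair whose concatenation has the lowest rank in the table.
     * Stops when no adjacent pair appears in the table.
     */
    static List<String> encode(String text, Map<String, Integer> ranks) {
        List<String> parts = new ArrayList<>();
        text.codePoints().forEach(cp -> parts.add(new String(Character.toChars(cp))));

        while (parts.size() > 1) {
            int bestIndex = -1;
            int bestRank = Integer.MAX_VALUE;
            for (int i = 0; i < parts.size() - 1; i++) {
                Integer rank = ranks.get(parts.get(i) + parts.get(i + 1));
                if (rank != null && rank < bestRank) {
                    bestRank = rank;
                    bestIndex = i;
                }
            }
            if (bestIndex < 0) {
                break; // no mergeable pair left
            }
            // Replace the pair at bestIndex with its merged form.
            parts.set(bestIndex, parts.get(bestIndex) + parts.get(bestIndex + 1));
            parts.remove(bestIndex + 1);
        }
        return parts;
    }

    public static void main(String[] args) {
        // Toy rank table (assumed values): lower rank means merge earlier.
        Map<String, Integer> ranks = Map.of("he", 0, "ll", 1, "hell", 2, "lo", 3);
        System.out.println(encode("hello", ranks));
    }
}
```

A production tokenizer would map each final piece to its integer token id and avoid rescanning the whole list on every merge; the quadratic loop here is kept for clarity.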
Pure Java Core
Inference runs on any JVM out of the box. Backends for CUDA, Metal, and others give you hardware acceleration when you need it.
On-Device LLM Inference
Run large language models locally with quantization and efficient memory management.
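As a rough illustration of why quantization shrinks the memory footprint, the sketch below stores each float weight as one signed byte plus a shared scale. This is generic symmetric int8 quantization, not necessarily the scheme this project uses; the class `Int8QuantSketch` and its method names are hypothetical.

```java
// Hypothetical illustration of symmetric int8 quantization; not the project's code.
final class Int8QuantSketch {

    /** Chooses a scale so the largest magnitude maps to 127 (1.0 if all values are zero). */
    static float scaleFor(float[] values) {
        float maxAbs = 0f;
        for (float v : values) {
            maxAbs = Math.max(maxAbs, Math.abs(v));
        }
        return maxAbs == 0f ? 1f : maxAbs / 127f;
    }

    /** Rounds each value to the nearest multiple of the scale and clamps to [-127, 127]. */
    static byte[] quantize(float[] values, float scale) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            int q = Math.round(values[i] / scale);
            out[i] = (byte) Math.max(-127, Math.min(127, q));
        }
        return out;
    }

    /** Recovers approximate float values from the quantized bytes. */
    static float[] dequantize(byte[] quantized, float scale) {
        float[] out = new float[quantized.length];
        for (int i = 0; i < quantized.length; i++) {
            out[i] = quantized[i] * scale;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] weights = {-1.0f, 0.0f, 0.25f, 1.0f};
        float scale = scaleFor(weights);
        float[] restored = dequantize(quantize(weights, scale), scale);
        for (int i = 0; i < weights.length; i++) {
            System.out.println(weights[i] + " -> " + restored[i]);
        }
    }
}
```

Each weight takes 1 byte instead of 4, at the cost of a rounding error of roughly half the scale per value. Block-wise formats apply the same idea with one scale per small group of weights to keep that error low.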
Write Once, Accelerate Everywhere
A single Tensor API across Panama, C, CUDA, HIP, Metal, OpenCL, and Mojo. Switch backends with one line.
Vector Embeddings
Fast vector operations for RAG pipelines and semantic search.
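The core operation behind semantic search is a similarity score between embedding vectors. Below is a minimal plain-Java sketch of cosine similarity and a brute-force nearest-neighbor lookup; `CosineSketch`, `cosine`, and `nearest` are hypothetical names, not this project's API, and an accelerated backend would presumably replace the scalar loops.

```java
// Hypothetical illustration of embedding similarity search; not the project's API.
final class CosineSketch {

    /** Cosine similarity of two equal-length vectors; returns 0 if either has zero norm. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Index of the corpus vector most similar to the query, or -1 for an empty corpus. */
    static int nearest(float[] query, float[][] corpus) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < corpus.length; i++) {
            double score = cosine(query, corpus[i]);
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        float[][] corpus = {{1f, 0f}, {0f, 1f}, {0.7f, 0.7f}};
        float[] query = {0.9f, 0.1f};
        System.out.println("nearest index: " + nearest(query, corpus));
    }
}
```

In a RAG pipeline the corpus rows would be document-chunk embeddings and the query would be the embedded user question; large corpora typically swap the linear scan for an approximate index.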
GraalVM Native Image
First-class support for Native Image. Small footprint, fast startup.
JVM-Native Architecture
No Python interop, no ONNX bridges. An AI stack built from first principles for the JVM.
Connect
Get involved
Quixotic AI is building in the open. Give it a spin, share feedback, contribute.