Scale Python across 1,000 CPUs or GPUs in 1 second.
Burla is a high-performance parallel processing library with an extremely fast developer experience. Scale batch processing, vector embeddings, and inference, or build pipelines with dynamic hardware.
Burla is open-source and designed to run in your private cloud. It only has one function:
```python
from burla import remote_parallel_map

my_inputs = list(range(1000))

def my_function(x):
    print(f"[#{x}] running on separate computer")

remote_parallel_map(my_function, my_inputs)
```
This runs my_function on 1,000 VMs in under one second:
The cleanest way to build scalable data pipelines.
Zero special syntax. Change containers, hardware, or cluster size automatically mid-workload.
Burla scales up to 10,000 CPUs in a single function call, supports GPUs, and runs in any Docker container. Pipelines built with Burla are simpler, more maintainable, faster, and more fun to develop.
```python
remote_parallel_map(process, [...], image="osgeo/gdal:latest")
remote_parallel_map(aggregate, [...], func_cpu=64)
remote_parallel_map(predict, [...], func_gpu="A100")
```
This creates a pipeline like:
Monitor progress in the dashboard
Cancel bad runs, filter logs to watch individual inputs, or monitor output files in the UI.
How it works
Develop like your laptop has 1,000 CPUs or GPUs. Remote development, local feel.
```python
return_values = remote_parallel_map(my_function, my_inputs)
```
When functions are run with remote_parallel_map:
- Anything they print appears locally (and inside the dashboard).
- Any exceptions are thrown locally.
- Any packages or local modules are (very quickly) cloned on remote machines.
- Code starts running in under one second, even with millions of inputs or thousands of machines.
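The semantics above can be illustrated with a rough local analogy. This is not Burla, just the standard library standing in for `remote_parallel_map` to show the call shape: prints surface to the caller, exceptions propagate, and results come back as a list.

```python
from concurrent.futures import ThreadPoolExecutor

def my_function(x):
    print(f"[#{x}] running")  # with Burla, this would stream back from a remote VM
    return x * x

def local_parallel_map(func, inputs):
    # Stand-in for remote_parallel_map: same call shape, but runs locally.
    # Any exception raised inside func propagates to this caller.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(func, inputs))

return_values = local_parallel_map(my_function, range(4))
print(return_values)  # [0, 1, 4, 9]
```

With Burla, only the function name changes: the same code runs across remote machines instead of local threads.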
Features
- 📦 Automatic Package Sync
  Burla automatically (and very quickly) clones your Python packages onto every remote machine where code is executed.
- 🐋 Custom Containers
  Easily run code in any Docker container, public or private: just paste an image URI in the settings, then hit start.
- 📂 Network Filesystem
  Need to get big data into or out of the cluster? Burla automatically mounts a cloud storage bucket to a folder in every container.
- ⚙️ Variable Hardware Per-Function
  The `func_cpu` and `func_ram` args make it possible to assign big hardware to some functions, and less to others.
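A minimal sketch of assigning big hardware to one stage of a pipeline. The stage function and the `func_ram` value here are illustrative assumptions (check the docs for exact units), and the guarded import lets the sketch run as plain Python even without Burla installed:

```python
# Sketch only: the stage function and argument values are illustrative.
try:
    from burla import remote_parallel_map
except ImportError:
    remote_parallel_map = None  # Burla not installed; the call shape below still applies

def aggregate(chunk):
    # Hypothetical CPU-heavy stage: reduce one chunk of numbers.
    return sum(chunk)

if remote_parallel_map is not None:
    # Big hardware for the heavy stage; other stages can keep the defaults.
    totals = remote_parallel_map(aggregate, [[1, 2], [3, 4]], func_cpu=64, func_ram=256)
```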
Self-Host Burla Today
Burla can be self-hosted with just one command. For setup details, see the Getting Started guide.
Try Our Quickstart (1,000 CPUs)
- Sign in using your Google or Microsoft account.
- Run the quickstart in this Google Colab notebook. It takes less than 2 minutes.
Examples
- Process 2.4TB of Parquet files in 76s with 10,000 CPUs
- Hyperparameter tune XGBoost using 1,000 CPUs
- Genome alignments using 1,300 CPUs
Learn more at docs.burla.dev.
Questions? Schedule a call, or email jake@burla.dev.


