RunMat fuses back-to-back ops into fewer GPU steps and keeps arrays on device. MATLAB syntax. No kernel code, no rewrites.
Syntax you already know.
Write in MATLAB, and RunMat runs your computation automatically across CPU and GPUs for maximum speed. No CUDA, no kernel code.
x = 0:0.001:4*pi; % 0 to 4π in steps of 0.001
y = sin(x) .* exp(-x/10); % regular MATLAB language math
Why it's Fast: GPU Fusion & Residency
RunMat fuses sequential operations into fewer computational steps and keeps arrays on device between steps ("residency"). That means less memory traffic and fewer GPU program launches, so your scripts finish sooner.
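As a rough illustration of the kind of script this describes, here is a short chain of elementwise operations followed by a reduction, written in plain MATLAB. It is a sketch only: the variable names are made up, and which of these steps RunMat actually fuses or keeps on the device is an assumption, not something stated here.

```matlab
% Illustrative chain of back-to-back elementwise operations on one array.
% Whether RunMat fuses these exact steps is an assumption of this sketch.
x = 0:0.001:4*pi;             % sample points from 0 to 4*pi
a = sin(x);                   % elementwise step 1
b = a .* exp(-x/10);          % elementwise step 2, reuses x and a
c = tanh(b) + cos(x);         % elementwise step 3, reuses x and b
r = sum(c .^ 2) / numel(c);   % reduction: mean of squares, a single scalar
```

Only the scalar r is needed at the end, so the intermediates a, b, and c are the kind of arrays that residency would let stay on the device instead of being copied back between steps.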
Real Workloads, Reproducible Results
Benchmarked on an Apple M2 Max, 32GB. Times are wall-clock milliseconds for each configuration.
4K image pipeline: per-image mean/std, normalization, gain/bias, gamma, and MSE.
Monte Carlo: geometric Brownian motion with terminal PnL and risk stats.
Elementwise math: long chain of sin, exp, cos, and tanh operations on big 1D arrays.
Each number is the median of 3 runs. Full scripts live in the benchmarks folder.
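For a sense of what the first two workloads look like in code, here is a minimal MATLAB sketch. It is not the benchmark script: the image size, gain, bias, gamma, and market parameters are hypothetical values chosen for illustration, and it assumes the builtins used (rand, randn, mean, std, sort) are available.

```matlab
% --- Image pipeline sketch (hypothetical 4K grayscale frame) ---
img = rand(2160, 3840);              % stand-in for one 4K image, values in [0, 1)
m   = mean(img(:));                  % per-image mean
s   = std(img(:));                   % per-image standard deviation
z   = (img - m) ./ s;                % normalization
g   = 1.2 .* z + 0.1;                % gain and bias (hypothetical values)
out = max(g, 0) .^ (1 / 2.2);        % clamp at zero, then gamma (hypothetical 2.2)
mse = mean((out(:) - img(:)) .^ 2);  % mean squared error against the input

% --- Monte Carlo sketch: terminal value under geometric Brownian motion ---
n     = 1e6;                         % number of simulated paths (hypothetical)
S0    = 100;                         % initial price (hypothetical)
mu    = 0.05;                        % annual drift (hypothetical)
sigma = 0.2;                         % annual volatility (hypothetical)
T     = 1;                           % horizon in years
Z     = randn(n, 1);                 % standard normal draws
ST    = S0 .* exp((mu - 0.5 * sigma^2) * T + sigma * sqrt(T) .* Z);
pnl   = ST - S0;                     % terminal profit and loss per path
meanPnl = mean(pnl);                 % expected PnL estimate
stdPnl  = std(pnl);                  % PnL dispersion
sorted  = sort(pnl);                 % ascending, worst outcomes first
var95   = -sorted(max(1, floor(0.05 * n)));  % loss at the 5th percentile
```

Both halves are mostly elementwise math with a few reductions at the end, which is the shape of computation the fusion and residency description above is aimed at.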
Why Use RunMat?
RunMat is built for math rather than general-purpose programming: fusion, residency, and its VM opcodes are all designed to execute numerical code fast.
Single binary with consistent performance on macOS/Windows/Linux and headless servers. Use Mac Metal GPUs, NVIDIA/AMD GPUs, or ARM Vulkan GPUs.
Same code, everywhere: static binaries with consistent performance on macOS/Windows/Linux and headless servers. GPU portability via Metal, DirectX 12, and Vulkan—no CUDA lock-in. Great for laptops, clusters, and CI.
Free and Open Source
Copy and paste the command for your platform to get started with RunMat.
macOS/Linux:
curl -fsSL https://runmat.org/install.sh | sh
Windows (PowerShell):
iwr https://runmat.org/install.ps1 | iex