Select your GPU
or
VRAM (GB)
Minimum context window
Minimum tokens/sec
Required features
System RAM (GB, optional)
Enter your system RAM to enable offloading. Models can use system memory to extend context windows or run larger models at reduced speed.
Select a GPU or enter your VRAM to see which models you can run.