What Models? — Pick the right model for your GPU in seconds
Select your GPU, or enter VRAM (GB)
Minimum context window: Any, 2K, 4K, 8K, 16K, 32K, 64K, 128K, 200K
Minimum tokens/sec: Any, 5 tok/s, 10 tok/s, 20 tok/s, 30 tok/s, 50 tok/s, 100 tok/s
Required features: Any
System RAM in GB (optional): Enter your system RAM to enable offloading. Models can use system memory to extend context windows or run larger models at reduced speed.
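To illustrate how VRAM and system RAM could combine in this kind of check, here is a minimal sketch. It is not this tool's actual logic. The `ModelSpec` fields, the `placement` function, and the sizing rule (weights plus KV cache, ignoring runtime overhead) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    """Hypothetical description of a model, for illustration only."""
    name: str
    params_billion: float    # parameter count, in billions
    bytes_per_param: float   # e.g. 2.0 for FP16, roughly 0.5 for 4-bit weights
    kv_bytes_per_token: int  # KV-cache bytes per context token (assumed value)


def required_gb(model: ModelSpec, context_tokens: int) -> float:
    """Rough memory need in decimal GB: weights plus KV cache.

    Assumption: runtime overhead (activations, buffers) is ignored.
    """
    weights = model.params_billion * 1e9 * model.bytes_per_param
    kv_cache = model.kv_bytes_per_token * context_tokens
    return (weights + kv_cache) / 1e9


def placement(model: ModelSpec, context_tokens: int,
              vram_gb: float, ram_gb: float = 0.0) -> str:
    """Classify a model as 'gpu', 'offload', or 'too_large'."""
    need = required_gb(model, context_tokens)
    if need <= vram_gb:
        return "gpu"        # fits entirely in VRAM
    if need <= vram_gb + ram_gb:
        return "offload"    # spills into system RAM, at reduced speed
    return "too_large"


# Example: a hypothetical 7B FP16 model with an 8K context window.
example = ModelSpec("example-7b", 7.0, 2.0, 131072)
print(placement(example, 8192, vram_gb=24))             # fits on the GPU
print(placement(example, 8192, vram_gb=12))             # no RAM given
print(placement(example, 8192, vram_gb=12, ram_gb=32))  # RAM enables offload
```

Leaving system RAM at zero reproduces the GPU-only case, which is why the field can stay optional in a sketch like this.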
Results
Select a GPU or enter your VRAM to see which models you can run.