Show HN: TurboPrefill – Multi-GPU prefill acceleration for llama.cpp

trykhlieb | 1 pts | 0 comments

TurboPrefill is an attempt to make layer-split multi-GPU configurations spend less time waiting and more time computing during prefill.

