TurboPrefill is an attempt to make layer-split multi-GPU configurations spend less time waiting and more time computing during prefill.