Behnam on X: "Inanely Fast Local AI: 775 token per second! 🤯 I was able to run the new DiffusionGemma (full BF16 model) by @googlegemma on vLLM (fork by Red Hat) on Nvidia RTX 6000 Pro. It's blazing fast at short contexts, but gets slow very quickly. At 100k, TTFT is 22s!<br>■Leave a comment https://t.co/48SWaVTDK4" / X<br>Post
Log inSign up
Post
Behnam
@OrganicGPT
Inanely Fast Local AI: 775 token per second! 🤯 I was able to run the new DiffusionGemma (full BF16 model) by @googlegemma on vLLM (fork by Red Hat) on Nvidia RTX 6000 Pro. It's blazing fast at short contexts, but gets slow very quickly. At 100k, TTFT is 22s!<br>■Leave a comment setup and command to run the model.
span:not(:empty)~span:not(:empty)]:before:content-['·'] [&>span:not(:empty)~span:not(:empty)]:before:px-1 [&>span:not(:empty)~span:not(:empty)]:before:shrink-0">1:33 AM · Jun 11, 20261.9KViews
:host{display:inline-block;direction:ltr;white-space:nowrap;line-height:var(--number-flow-char-height, 1em) !important}span{display:inline-block}:host([data-will-change]) span{will-change:transform}.number,.digit{padding:round(nearest, calc(var(--number-flow-mask-height, 0.25em) / 2), 1px) 0}.symbol{white-space:pre}2number-flow-react > span{font-kerning:none;display:inline-block;line-height:var(--number-flow-char-height, 1em) !important;padding:calc(round(nearest, calc(var(--number-flow-mask-height, 0.25em) / 2), 1px) * 2) 0}2
:host{display:inline-block;direction:ltr;white-space:nowrap;line-height:var(--number-flow-char-height, 1em) !important}span{display:inline-block}:host([data-will-change]) span{will-change:transform}.number,.digit{padding:round(nearest, calc(var(--number-flow-mask-height, 0.25em) / 2), 1px) 0}.symbol{white-space:pre}19number-flow-react > span{font-kerning:none;display:inline-block;line-height:var(--number-flow-char-height, 1em) !important;padding:calc(round(nearest, calc(var(--number-flow-mask-height, 0.25em) / 2), 1px) * 2) 0}19<br>:host{display:inline-block;direction:ltr;white-space:nowrap;line-height:var(--number-flow-char-height, 1em) !important}span{display:inline-block}:host([data-will-change]) span{will-change:transform}.number,.digit{padding:round(nearest, calc(var(--number-flow-mask-height, 0.25em) / 2), 1px) 0}.symbol{white-space:pre}9number-flow-react > span{font-kerning:none;display:inline-block;line-height:var(--number-flow-char-height, 1em) !important;padding:calc(round(nearest, calc(var(--number-flow-mask-height, 0.25em) / 2), 1px) * 2) 0}9
Read 2 replies
*]:shrink-0">New to X?<br>Sign up now to get your own personalized timeline!<br>Sign up with GoogleSign up with AppleCreate account<br>By signing up, you agree to the Terms of Service and Privacy Policy, including Cookie Use.
Relevant people<br>Behnam@OrganicGPTFollow
Trending now
Don't miss what's happening<br>People on X are the first to know.
Log inSign up