Florian Brand, Prime Intellect research engineer, adopts Gemma 4 E4B 6-bit quantized as his primary local Mac LLM
/AI · 7h ago
- The new setup replaces his nine-month daily Qwen deployment.
Original post
Florian Brand @xeophon · #1117 in /AI
Gemma 4 E4B 6bit is now the local model of my choice and loaded 24/7 on my Mac (using @lmstudio), replacing Qwen3, 3.5 4B after ~9 months of usage<br>What an insane model, congrats @GoogleDeepMind 🤠
4:19 AM · Jun 7, 2026 · 28.7K Views
Sentiment
Many users praise Gemma 4 as the preferred local AI model on Mac for its strong speed and quality, such as 50 tokens per second on 16 GB hardware and GPT-4o-like results.
Positive: 92.9%
Negative: 7.1%
15 comments with sentiment.
Cluster Engagement
Views: 31.3K
Comments: 26
Reposts: 11
Bookmarks: 204
Posts from X
👩‍💻 Paige Bailey@DynamicWebPaige
💎 @googlegemma
Florian Brand@xeophon
Gemma 4 E4B 6bit is now the local model of my choice and loaded 24/7 on my Mac (using @lmstudio), replacing Qwen3, 3.5 4B after ~9 months of usage<br>What an insane model, congrats @GoogleDeepMind 🤠
4h|Views 3.5K|Likes 26|Bookmarks 6
Retweets: 2
Florian Brand@xeophon
Gemma 4 E4B 6bit is now the local model of my choice and loaded 24/7 on my Mac (using @lmstudio), replacing Qwen3, 3.5 4B after ~9 months of usage<br>What an insane model, congrats @GoogleDeepMind 🤠
7h|Views 28.7K|Likes 384|Bookmarks 208
Replies: 2
Lotto@LottoLabs
@xeophon @yacineMTB @lmstudio @GoogleDeepMind Wouldn’t qwen 9b be nicer?
3h|Views 487|Likes 8
Igor Kotenkov@stalkermustang
@xeophon @lmstudio @GoogleDeepMind what are ur usecases? "rewrite", "summarize", "translate," or something bigger in scope and harder by nature?
6h|Views 274|Likes 2
🧟@RaghavKoch19380
@xeophon @lmstudio @GoogleDeepMind Wouldn't the 4Bit QAT be better than a 6Bit PTQ
5h|Views 889
Florian Brand@xeophon
@RaghavKoch19380 @lmstudio @GoogleDeepMind The QAT are GGUF only afaik
5h|Views 821
Florian Brand@xeophon
@ignis_code @lmstudio @GoogleDeepMind M4 Max + 64 GB, model uses 7 GB
1h|Views 63|Likes 2
🧟@RaghavKoch19380
@xeophon @lmstudio @GoogleDeepMind There are compressed tensor versions or something available for vLLM etc i think. check their huggingface QAT folder.
4h|Views 200|Likes 1
Clemens Schartmüller@ClemensScharti
@xeophon @lmstudio @GoogleDeepMind what are you using it for?
4h|Views 624
Florian Brand@xeophon
@0xgeorge @yacineMTB @lmstudio @GoogleDeepMind License
1h|Views 170|Likes 1
Vu@vu_zip
@xeophon @wambosec @lmstudio @GoogleDeepMind 64 gb and you use Gemma4 e4b ??? Bro at least use gemma4 12b
3h|Views 276
IGNIS@ignis_code
@xeophon @lmstudio @GoogleDeepMind How much VRAM are you using?
1h|Views 68|Likes 1
Aaryan Kakad@aaryan_kakad
@xeophon @lmstudio @GoogleDeepMind yes, even i have one model always loaded on my system for assistance while building stuff or solving any problems.<br>i think people who can use small 4-9B models to build stuff can actually be called coders.
6h|Views 57|Likes 1
George I@0xgeorge
@xeophon @yacineMTB @lmstudio @GoogleDeepMind Why not LFM 2.5 at 8bit for just an extra gb?
1h|Views 181
wambo.@wambosec
@xeophon @lmstudio @GoogleDeepMind mac specs?
7h|Views 167
Jeremy Nguyen ✍🏼 🚢@JeremyNguyenPhD
@xeophon @lmstudio @GoogleDeepMind Are you using it for the privacy considerations, Xeo?
6h|Views 122
Dan Greller@dgreller
@xeophon @lmstudio @GoogleDeepMind What context window are you using?
6h|Views 116
Lazarz@Laz4rz
@xeophon @lmstudio @GoogleDeepMind Why?
5h|Views 88
Florian Brand@xeophon
@wambosec @lmstudio @GoogleDeepMind M4 Max + 64 GB RAM
6h|Views 155|Likes 3
Florian Brand@xeophon
@ClemensScharti @lmstudio @GoogleDeepMind https://florianbrand.com/posts/local-llms
3h|Views 756|Likes 1