Gemini 3.5 Flash Looks Good For How Fast It Is
Don't Worry About the Vase
Zvi Mowshowitz<br>May 22, 2026
Google once again has a model worth at least some consideration. Gemini 3.5 Flash is likely the best model out there at its particular speed point, as long as you don’t mind that it is a Gemini model. So for cases where speed kills, this can be a reasonable choice. Otherwise, I don’t see signs you would want to use it over Opus 4.7 or GPT-5.5.<br>Google also had some other offerings for I/O Day, which this post will also cover.
Introducing Google Gemini 3.5 ‘Flash’
Google introduced Gemini 3.5 Flash, which for now seems to be their universal model until 3.5 Pro comes along. It is live in the usual places. It is a hybrid: it has the speed of Flash, but the cost is at least halfway to models like Opus and GPT-5.5.<br>Gemini 3.5 Pro is confirmed for next month.<br>They are focused on 3.5 Flash as a daily driver for agentic tasks. It has the advantage of being faster and cheaper than Claude Opus 4.7 or GPT-5.5, if it can do the job. It is not as cheap as previous Flash models, though.
As always, this is presented as Google’s strongest model yet for all the things.<br>Jeff Dean: 1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action. We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows.<br>It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models.
Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale.
Here is their benchmark presentation:
Koray Kavukcuoglu: When coupled with the updated Antigravity harness, 3.5 Flash becomes a powerful engine for deploying collaborative subagents to tackle problems at scale for the most demanding use cases. Under supervision, it can reliably execute multi-step workflows and coding tasks while sustaining frontier performance.
There are some big improvements here, including GDPval where Gemini previously struggled. If those scores were representative of what this baby can do, and it’s a Flash model, then that would be quite the accomplishment.<br>The knowledge cutoff is January 2025, continuing Gemini’s pattern of not believing what year it is, which is bizarrely obsolete and a serious problem for many use cases.<br>It is not a true ‘flash’ model, given it costs substantially more than 3 Flash.<br>Pliny is there with the standard jailbreak.<br>The biggest hope is that this fills a niche of ‘good enough for agent work while being faster and cheaper.’<br>Conrad Barski: For those of us who are building our life around AI workflows (either because we like to do that, or just feel it is necessary for sheer survival in the near future) 3.5flash is a big step up:
I have dozens of personal utilities that don’t need SOTA intelligence, but are now much faster all of a sudden, at the same intelligence level: And since most of my utilities only need to do a modest number of llm calls to be useful, the increased cost of 3.5flash is not a factor.
The model can compete with codex5.5 “low effort”, but it is just so very very fast, far out of distribution compared other models. I assume openai will release a competitor soon, since cerebras is pretty optimal for this “medium IQ, high speed” use case.
Other People’s Benchmarks
A lot of benchmarks don’t have results, but of my usual suspects here is what we have.<br>The overall scores indicate only okay performance when adjusting for cost and price, and Gemini models tend to relatively overperform on benchmarks. One notices that Flash 3.5 does a lot worse on other people’s benchmarks than the ones Google lists.<br>It is catastrophically bad on You’re Absolutely Right, a sycophancy benchmark.<br>It did quite poorly on CursorBench.<br>It did not impress on WeirdML, only a small improvement on 3 Flash and far behind 3 Pro and 3.1 Pro.<br>It took the top spot on KnowsAboutBenBench, by the Ben in question.<br>It takes third place in Vals.ai on real world tasks.<br>It comes in at 9th in the Arena, slightly behind Gemini 3.1 Pro and 3 Pro.<br>It comes in at 55.3 on the AA Intelligence index, behind 57.2 for 3.1 Pro, 57.3 for Opus and 60.2 for GPT-5.5, while not being cheaper to run than 3.1 Pro on their test suite.
Reactions
Some people do like it.<br>davidad: It’s by far my favorite model at its price point, and also by far my favorite model at its speed. If by “back in the game”, you mean the game of having the best overall model, then obviously no not yet. But that’s hardly the only game.<br>Srivatsan Sampath: It has the benefits of Flash with less hallucinations? Really good spatial awareness (not as much of a token Hog for...