Zepto’s LLM Search Works Great. Unless You’re a New User. | by Abhishek jain | Jun, 2026 | Medium
Zepto’s LLM Search Works Great. Unless You’re a New User.
Abhishek jain
8 min read · Just now
Originally published on geekabhi.com. This is a cross-post on Medium for distribution.

This week Zepto published a 15-minute engineering deep-dive called “Building Search for a 10-Minute World.” It is genuinely good architecture writing. Llama 3 8B for query correction, a custom bi-encoder for semantic retrieval, Mixture of Experts ranking, a four-stream exploit/explore recall design. The kind of post that gets bookmarked.

Then I opened zepto.com without logging in and typed five queries. Three of them returned a blank page.

Then I logged in and typed the same five queries. All five worked perfectly.

This post is the teardown: what fails for guests, why their own architecture blog says this should be impossible, and why the broken path is the one that matters most for their business.

The experiment

All queries were run at the same dark store (Arjun Nagar, Delhi), on 11th June, with products in stock. Each query was run twice: once logged out, once logged in.

| Query                | Logged out | Logged in |
|----------------------|------------|-----------|
| maggi noodles        | Perfect    | Perfect   |
| maggis noodles       | 0 results  | Works     |
| maggi's noodles      | 0 results  | Works     |
| mother dairy dudh    | Works      | Works     |
| dudh of mother dairy | 0 results  | Works     |
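
The table is small enough to encode directly. As a minimal sketch (the names `RESULTS` and `guest_only_failures` are mine, not Zepto's; the outcomes are copied from the table, with "Perfect" and "Works" both counted as "returned results"), this isolates the queries whose recall depends on login state:

```python
# Outcomes copied from the results table.
# True means the query returned products; False means 0 results.
RESULTS = {
    "maggi noodles": {"logged_out": True, "logged_in": True},
    "maggis noodles": {"logged_out": False, "logged_in": True},
    "maggi's noodles": {"logged_out": False, "logged_in": True},
    "mother dairy dudh": {"logged_out": True, "logged_in": True},
    "dudh of mother dairy": {"logged_out": False, "logged_in": True},
}


def guest_only_failures(results):
    """Return queries that work when logged in but return nothing for guests."""
    return [
        query
        for query, outcome in results.items()
        if outcome["logged_in"] and not outcome["logged_out"]
    ]


failures = guest_only_failures(RESULTS)
print(failures)
print(f"{len(failures)} of {len(RESULTS)} queries fail only for guests")
```
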
Look at the failure pattern. One missing apostrophe. One extra apostrophe. One English stopword. For a guest, each of these kills a query for a brand sitting in stock at the store. Log in, and the exact same characters return a full page of products.

A side note that still makes me laugh: on the guest path, maggis alone works fine (a popular enough typo to be handled), but maggis noodles returns nothing. Adding a correct, clarifying word makes the results strictly worse.

For reference, I ran the same queries on Swiggy Instamart at the same location, also logged out. Every single one returned correct results. Whatever Zepto’s guest path is missing, their competitor does not require an account to provide it.

And one more detail. Zepto’s zero-results page for maggi's noodles still served me an ad. For Yippee, a competing noodle brand. The ads engine understood and monetized the exact query the search engine claimed does not exist, on the same page, in the same request.

Their own blog says this should be impossible

This is the genuinely interesting part.

Zepto’s architecture post is explicit about the separation of concerns in their search stack. The retrieval service, search-platform, is described as a pure function that “knows nothing about users.” No delivery address, no order history, no experiment cohort. It receives a query, a city identifier, and eligible delivery hub IDs, and returns candidates. Personalization, session data, and experiment assignment all live downstream, in the orchestration layer, where candidates get ranked and assembled into a page.

Take that description at face value and walk through my results. Same query string. Same city. Same store, therefore the same hub IDs. By their stated design, retrieval inputs are identical for my guest session and my logged-in session, so the candidate set must be identical too. Personalization can reorder candidates.
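
That contract can be sketched in a few lines. This is a toy model of the design the blog describes, not Zepto's code: `retrieve`, `rank_for_user`, `search`, the in-memory `CATALOG`, and the substring matching are all hypothetical stand-ins. The only point is the shape: retrieval takes no user argument, and the ranking step permutes whatever retrieval returned.

```python
from typing import Optional

# Hypothetical toy catalog: hub ID -> products in stock at that hub.
CATALOG = {
    "hub-1": ["Maggi Masala Noodles", "Yippee Noodles", "Mother Dairy Milk"],
}


def retrieve(query: str, city_id: str, hub_ids: list[str]) -> list[str]:
    """Pure recall: depends only on query, city, and hubs. No user input.

    city_id mirrors the described signature; this toy version ignores it.
    """
    terms = query.lower().split()
    return [
        product
        for hub in hub_ids
        for product in CATALOG.get(hub, [])
        if all(term in product.lower() for term in terms)
    ]


def rank_for_user(candidates: list[str], user_id: Optional[str]) -> list[str]:
    """Downstream personalization: may reorder candidates, never adds any."""
    if user_id is None:  # guest: keep retrieval order
        return list(candidates)
    return sorted(candidates)  # stand-in for a personalized ordering


def search(query, city_id, hub_ids, user_id=None):
    return rank_for_user(retrieve(query, city_id, hub_ids), user_id)


guest = search("noodles", "delhi", ["hub-1"])
member = search("noodles", "delhi", ["hub-1"], user_id="u42")
# Same inputs to retrieve, so the same candidate set, whatever the order.
assert set(guest) == set(member)
# An empty recall stays empty for everyone.
assert search("maggis noodles", "delhi", ["hub-1"], user_id="u42") == []
```

In this model login state never reaches `retrieve`, so it has no way to change which products come back.
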
Reordering cannot create candidates out of a recall that returned zero.

Yet logging in took three queries from a blank page to a full one. Recall itself changed with login status, which is exactly what the published architecture says cannot happen.

I can think of three explanations, and none of them flatter the blog:

1. Guest traffic routes through a different pipeline entirely. Possibly a legacy or stripped-down search path that skips query understanding, fuzzy handling, and semantic recall. If so, the architecture in the blog describes the logged-in experience only, and a large share of real-world traffic never touches the system they wrote 15 minutes about.

2. Experiment or feature gating by login status. The richer recall stack (live correction, semantic fallback, relaxed matching) may be enabled only for authenticated cohorts. Defensible as an experiment strategy, but it means the blog’s claim that retrieval knows nothing about users is not true in practice: the user’s auth state determines which retrieval stack runs.

3. Guest sessions fall back to a default configuration that happens to have the strictest possible matching and no recovery behavior. Less a decision than an accident of defaults, which is its own kind of finding.

I do not know which it is, and I would genuinely welcome a correction from...