Cache Stampede Prevention: Distributed Locking, Pub/Sub, and Request Coalescing


Engineering At Scale


From distributed locks to request coalescing: techniques for coordinating cache rebuilds and preventing cascading failures.

Animesh Gaitonde Jun 08, 2026

Distributed caching is a double-edged sword. On one hand, it gives you excellent performance at scale and protects your storage layer. On the other hand, when a cached value expires, it can bring down the system through a domino effect known as a cache stampede.

In this article, we will look at what a cache stampede is and why it matters when building scalable and reliable systems. We will cover ways to prevent it, along with the pros, cons, and alternatives for each. By the end, you will understand how to build reliable systems that use a distributed cache. With that, let’s go over the basics of cache stampede.


What is a Cache Stampede?

Caches let a system store the result of a computation and reuse it for a fixed interval. This avoids heavy recomputation and improves the system’s efficiency and performance. For example, social media websites cache a popular post for 24 to 48 hours. The caching layer then serves all requests for that post, so the expensive database query is not executed repeatedly.

However, it is not practical to keep a post in the cache indefinitely. If the value in the database changes, the cache would return stale data to users. To solve this, the caching layer supports an expiry time for each cached key. Once the key expires, the service has to fetch the value from the database/storage layer again.
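The read path described above can be sketched as follows. This is a minimal illustration, not a production client: `TTLCache` is a hypothetical in-memory stand-in for a distributed cache, and `get_post`, `load_from_db`, and the 24-hour default are assumptions chosen to match the example.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for a distributed cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The key has expired: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (value, self._clock() + ttl_seconds)


def get_post(
    cache: TTLCache,
    post_id: int,
    load_from_db: Callable[[int], Any],
    ttl_seconds: float = 24 * 3600,
) -> Any:
    """Return the cached post, or fetch it from the database on a miss."""
    key = f"post:{post_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    # Cache miss (never cached, or expired): run the expensive query
    # and store the result with a fresh expiry.
    value = load_from_db(post_id)
    cache.set(key, value, ttl_seconds)
    return value
```

Every call within the expiry window is served from the cache; the first call after expiry goes back to the database.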

[Figure: How expired keys are fetched and updated in the cache]

The process is straightforward and works seamlessly, until it starts failing at scale. Imagine that a cache key expires and 1,000 services simultaneously try to fetch it. Here’s what would happen:

None of the services would find the cached value.

Each service would execute a database query to compute the value.

With 1,000 services, the database would execute 1,000 identical queries, a 1,000x increase over the single query that is actually needed.

This would slow down query execution, eventually leading to timeouts and failures.

If the database is used by other services, it would impact their functioning as well.

The system could eventually collapse, resulting in downtime.
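The sequence above can be reproduced on a small scale. The sketch below is an illustration under stated assumptions: threads stand in for services, a plain dict stands in for the shared cache, and `simulate_stampede` and `slow_db_query` are hypothetical names. A barrier forces the worst case in which every client sees the miss before any client repopulates the cache.

```python
import threading
import time


def simulate_stampede(num_clients: int, query_seconds: float = 0.05) -> int:
    """Return how many database queries run when all clients miss at once."""
    cache = {}                      # stand-in for the shared cache; the key has expired
    db_queries = 0
    counter_lock = threading.Lock()  # protects the counter only, not the cache
    all_missed = threading.Barrier(num_clients)

    def slow_db_query() -> str:
        nonlocal db_queries
        with counter_lock:
            db_queries += 1
        time.sleep(query_seconds)   # pretend the query is expensive
        return "post-body"

    def client() -> None:
        value = cache.get("post:42")
        # Wait until every client has read the cache, so that all of
        # them observe the miss before anyone writes a fresh value.
        all_missed.wait()
        if value is None:
            cache["post:42"] = slow_db_query()

    threads = [threading.Thread(target=client) for _ in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return db_queries


print(simulate_stampede(20))  # prints 20: one query per client
```

Nothing coordinates the clients, so the number of database queries grows linearly with the number of clients that miss.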

This phenomenon is known as a cache stampede in distributed systems. The diagram below illustrates it.

[Figure: Cache stampede illustrated]

In 2010, Facebook experienced approximately 2.5 hours of downtime due to a similar issue.

If each service has 10 threads fetching the cached value, the database load is amplified by 10,000x (1,000 services x 10 threads). The problem gets worse as the system scales. Now that you understand the concept, let’s see how we can prevent it.

Cache Stampede Prevention

We will consider the example from the previous section, where 1,000 services try to fetch the value simultaneously. Ideally, only one service should query the database while the others wait for the result.

A cache stampede is the outcome of a lack of coordination between multiple services (caching clients) that try to fetch the same data at the same time. The key to preventing it is a coordination mechanism among those services. As we learned in the previous article, distributed locking is one approach to solving this problem. Let’s examine how distributed locking can prevent it.

How Can Distributed Locking Help?

The concept is simple:

Before recomputing an expired cache value, acquire a distributed lock.

Update the cache once the computation is successful.

Release the distributed lock for others to acquire.

Once other services find the cached value, they can return it without acquiring the lock.
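The steps above can be sketched as follows. This is not the article's implementation; it is a minimal sketch under stated assumptions. `InMemoryLockStore` is a hypothetical stand-in for a shared lock service (in a real deployment, Redis `SET key token NX PX ttl` is one common primitive), a plain dict stands in for the cache, and `get_with_lock`, `lock_ttl`, `wait_timeout`, and `poll_interval` are illustrative names.

```python
import threading
import time
import uuid
from typing import Any, Callable, Dict, Tuple


class InMemoryLockStore:
    """Hypothetical stand-in for a distributed lock service."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: Dict[str, Tuple[str, float]] = {}

    def try_acquire(self, name: str, token: str, ttl_seconds: float) -> bool:
        now = time.monotonic()
        with self._guard:
            holder = self._locks.get(name)
            if holder is not None and holder[1] > now:
                return False  # held by another client and not yet expired
            self._locks[name] = (token, now + ttl_seconds)
            return True

    def release(self, name: str, token: str) -> None:
        with self._guard:
            holder = self._locks.get(name)
            # Only the owner may release, so a client whose lock expired
            # cannot delete a lock that someone else has since acquired.
            if holder is not None and holder[0] == token:
                del self._locks[name]


def get_with_lock(
    cache: Dict[str, Any],
    locks: InMemoryLockStore,
    key: str,
    recompute: Callable[[], Any],
    lock_ttl: float = 5.0,
    wait_timeout: float = 2.0,
    poll_interval: float = 0.01,
) -> Any:
    """Return the cached value, letting one lock holder rebuild it on a miss."""
    deadline = time.monotonic() + wait_timeout
    token = uuid.uuid4().hex  # identifies this caller as the lock owner
    while True:
        value = cache.get(key)
        if value is not None:
            return value  # cache hit: no lock needed
        if locks.try_acquire(f"lock:{key}", token, lock_ttl):
            try:
                # Re-check: another client may have refilled the cache
                # between our miss and our lock acquisition.
                value = cache.get(key)
                if value is None:
                    value = recompute()
                    cache[key] = value
                return value
            finally:
                locks.release(f"lock:{key}", token)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"gave up waiting for {key!r} to be rebuilt")
        time.sleep(poll_interval)  # wait for the lock holder to finish
```

The lock expiry (`lock_ttl`) is there so that a crashed holder does not block everyone forever; it should comfortably exceed the time the recomputation takes, otherwise a second client can acquire the lock while the first is still working.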

The diagram below illustrates how distributed locking allows a single service to recompute the value from the database.

[Figure: Distributed locking]

Distributed locks guarantee mutual exclusion among different services. The service that acquires the lock performs the heavy database operation while the others wait for it to complete. The following code shows how the services use a distributed lock to prevent a cache stampede.

[Figure: Distributed lock Python implementation]

The lock implementation requires the following two parameters:

timeout_deadline (Line 6) - The application must acquire a lock...
