A Little Explanation of Little's Law | rugu<br>Written on 2026-06-06<br>I recently read Concurrency in Go by Katherine Cox-Buday. In the “Queuing” section, there was a discussion of how we can use Little’s Law to predict our pipeline’s throughput, given sufficient sampling.<br>After finishing that part, I honestly wondered why I had not come across this simple idea before. As I understand it, it can be applied in almost any situation where a queue is involved: not just message queues, but even physical ones. So I thought I’d write an intuitive explanation to help it stick and to share the idea.<br>Some Intuition for Little’s Law<br>Suppose that you are the owner of a coffee shop. Your coffee shop is located in a busy area, so it is usually full of customers. Unfortunately (or fortunately, depending on how you look at it), at certain hours of the day, the demand for coffee increases so much that you can no longer serve customers as fast as they join the line. As the line gets longer and longer, some people who would otherwise join decide not to get in the queue. So you lose potential customers that you would otherwise have kept.<br>Here is a simple animation that demonstrates this intuition: customers arrive, wait in line, get served, or leave when the line is already full.
Look at how many customers you’re losing just because you can’t keep up with the customer arrival rate! Being a smart coffee shop owner, you pause for a moment and think: maybe during rush hours, I can increase my throughput to match the rate at which people are joining the line.<br>After thinking a bit, you realize that there are essentially three important parameters that you should be looking at:<br>The longest line that new customers are still willing to join, so that you don’t lose any customers. Let’s call this L.<br>The amount of time it takes for a customer to get their coffee after they have joined the queue. In other words, how long each customer waits until their order is completed. Let’s call this W.<br>The rate at which customers join the line, that is, how many customers arrive per minute. Let’s call this λ.<br>Now, there is a relation between these parameters. If you don’t want to lose any customers, L simply needs to be greater than or equal to Wλ. Roughly speaking, during the W minutes one customer spends in the shop, about Wλ others arrive behind them. Otherwise, we have more customers than we can handle: the queue grows, and some customers decide not to join.<br>Since we cannot control the customer arrival rate, the best we can do is reduce the time customers spend in the system, W. For this, we can try adding another coffee machine and hiring another employee.<br>With the same arrival rate, adding a second service line gives customers another path through the system. Individual orders do not become faster, but the average customer spends less time waiting for service to begin.
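The relation between L, W, and λ can be sketched in a few lines of Python. The function name and all of the numbers here (2 customers per minute, a line of at most 10 people, a 6-minute stay) are made up for illustration, and the 3.5-minute figure for a second service line is an assumed measurement, not something the relation predicts.

```python
def line_is_long_enough(max_line_length, arrival_rate, time_in_system):
    """Return True when L >= W * lambda, i.e. newcomers are not turned away on average."""
    return max_line_length >= time_in_system * arrival_rate


# Hypothetical rush-hour numbers, chosen only for illustration.
L = 10        # longest line that newcomers are still willing to join
lam = 2.0     # customers arriving per minute
W_one = 6.0   # minutes a customer spends in the shop with one service line
W_two = 3.5   # assumed minutes in the shop after adding a second service line

print(line_is_long_enough(L, lam, W_one))  # 10 >= 12.0 does not hold: False
print(line_is_long_enough(L, lam, W_two))  # 10 >= 7.0 holds: True
```

With the arrival rate fixed, the only way the check flips from False to True in this sketch is by lowering W, which is what the second service line is meant to do.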
Since we’ve reduced the average waiting time for each customer simply by adding a new line, we’re now in a much better position than before and can avoid losing customers unnecessarily.<br>Now that we have examined Little’s Law through an example, it is a good time to talk a bit more about the generalized version of the law itself.<br>Little’s Law<br>If we generalize this relation in a more abstract way, we arrive at the equation L = λW, where:<br>L is the average number of items in the system,<br>λ is the average arrival rate of items, and<br>W is the average time an item spends in the system.<br>Here is a small simulator that shows how changing L, λ, or W affects the flow of items through the system.
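For readers without the interactive version, a minimal Python sketch can check the equation numerically. It assumes a single first-come-first-served server with exponentially distributed arrival gaps and service times; the function name, parameters, and rates are hypothetical. It measures L, λ, and W independently over one run and compares L with λW.

```python
import random


def measure_queue(num_items, arrival_rate, service_rate, seed=0):
    """Simulate a single-server FIFO queue and return measured (L, lambda, W)."""
    rng = random.Random(seed)
    now = 0.0
    server_free_at = 0.0
    arrivals, departures = [], []
    for _ in range(num_items):
        now += rng.expovariate(arrival_rate)        # next arrival time
        start = max(now, server_free_at)            # wait if the server is busy
        server_free_at = start + rng.expovariate(service_rate)
        arrivals.append(now)
        departures.append(server_free_at)

    horizon = departures[-1]  # FIFO with one server: the last departure ends the run

    # L: time-average number of items in the system, from a sweep over all events.
    events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
    area, in_system, prev = 0.0, 0, 0.0
    for t, delta in events:
        area += in_system * (t - prev)
        in_system += delta
        prev = t
    avg_in_system = area / horizon

    # lambda: items per unit time; W: average time each item spent in the system.
    measured_rate = num_items / horizon
    avg_time = sum(d - a for a, d in zip(arrivals, departures)) / num_items
    return avg_in_system, measured_rate, avg_time


L, lam, W = measure_queue(num_items=20_000, arrival_rate=0.8, service_rate=1.0)
print(f"L = {L:.4f}, lambda * W = {lam * W:.4f}")
```

Because the run starts and ends with an empty system, both sides reduce to the total time spent by all items divided by the length of the run, so the two printed values should agree up to floating-point rounding.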
Now, you might object that, in the first example, the number of users willing to join (L) did not have to equal Wλ; it only had to be at least as large. The difference is that here we have defined L, λ, and W in such a way that the formula cannot logically be broken. Katherine Cox-Buday, in Concurrency in Go, briefly notes an important point:<br>The equation only applies to so-called stable systems. In a pipeline, a stable system is one in which the rate that work enters the pipeline, or ingress, is equal to the rate in which it exits the system, or egress. If the rate of ingress exceeds the rate of egress, your system is unstable and has entered a death-spiral. If the rate of ingress is less than the rate of egress, you still have an unstable system, but all that’s happening is that your resources aren’t being utilized completely.
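A rough way to picture the two unstable cases in the quote is a deterministic model that ignores randomness and starts from an empty queue. This is only a sketch; the function name and the rates are hypothetical.

```python
def backlog_after(seconds, ingress_rate, egress_rate):
    """Items still waiting after `seconds`, starting from an empty queue."""
    # When egress keeps up with ingress, nothing accumulates, so clamp at zero.
    return max(0.0, (ingress_rate - egress_rate) * seconds)


# Ingress above egress: the backlog keeps growing with time.
print(backlog_after(60.0, 50.0, 40.0))   # 600.0
print(backlog_after(600.0, 50.0, 40.0))  # 6000.0

# Ingress below egress: no backlog, but the spare capacity sits idle.
print(backlog_after(60.0, 40.0, 50.0))   # 0.0
```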
Now, we can still use this “law” as a target-setting tool even when our systems are unstable: simply treat our systems as if they were stable, and determine the values needed to make our system work properly.<br>Suppose we have an API receiving 50 requests per second. Say we want to keep the average response time under 200 ms so that users don’t get frustrated. We can estimate the number of concurrent...