Engineering Productivity Case Study: 2.5x in 30 Days | Poggle
Nick Morgan, Co-founder · 22 June 2026 · 8 min read
We just closed an amazing pilot, and the numbers are worth sharing. In a 30-day window, a 10-person engineering team increased delivery throughput by 2.5x while cutting median cycle time by more than 60 percent, from 72 hours to 27. This engineering productivity case study breaks down what moved, by how much, and why.
The team was not rebuilt and no one was replaced. The same people, working on the same product, shipped substantially more by tightening a handful of measurable habits.
Key takeaways
A 10-person team raised delivery throughput 2.5x in a single 30-day pilot.
Median cycle time fell from 72 hours to 27 hours, the change that unlocked everything else.
The gains came from smaller changes, faster reviews, and more reliable CI, not from working longer hours.
2.5x · Delivery throughput · Merged changes per week versus the 30 days before the pilot.
27h · Median cycle time · Down from 72 hours at the start of the pilot.
8.0 · Deploys per week · Up from 2.1 deploys per week before the pilot.
£162k · Productivity recovered · Estimated yearly value of the time given back to a 10-person team.
How the pilot was set up
The team was 10 engineers working on a single product with an existing CI pipeline and a regular review process. Nothing about the setup was unusual, which is the point: these are conditions most teams will recognise.
We measured a 30-day baseline first, then ran a 30-day pilot. Across both windows we tracked the same four numbers: delivery throughput, median cycle time, CI reliability, and deployment frequency.
We chose these four deliberately. They are observable from version control and CI without any subjective input, and together they describe both speed and stability. A team can game any one of them in isolation, but it is very hard to move all four in the same direction without genuinely improving how work flows.
No new headcount was added and no overtime was encouraged. The only thing that changed was how visible these metrics were and how deliberately the team acted on them.
Delivery throughput rose 2.5x
Delivery throughput, measured as merged changes that reached production each week, increased by 2.5x over the baseline. This is the headline result, but on its own it is easy to misread.
Throughput can be inflated by splitting work into more pieces without delivering more value. That is not what happened here. The team shipped more complete units of work, not more fragments, which is why the other metrics moved in step.
The increase was steady rather than a single spike. By the second week the trend was already clear, and it held through the end of the pilot.
That stability matters more than the peak. A throughput number that swings wildly week to week usually reflects luck or batching, not a healthier system. A sustained 2.5x tells us the underlying workflow changed, which is exactly what we want a pilot to prove.
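For readers who want to watch the same trend on their own team, here is a minimal Python sketch of a weekly throughput count. It assumes you can export the date each change was merged; the function name and the sample dates are hypothetical and are not taken from the pilot's tooling or data.

```python
from collections import Counter
from datetime import date


def weekly_throughput(merge_dates):
    """Count merged changes per ISO (year, week), in chronological order."""
    counts = Counter()
    for merged_on in merge_dates:
        iso = merged_on.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return [counts[week] for week in sorted(counts)]


# Hypothetical merge dates: one baseline week followed by one pilot week.
merges = [
    date(2026, 5, 4), date(2026, 5, 5),                       # baseline week
    date(2026, 5, 11), date(2026, 5, 12), date(2026, 5, 13),
    date(2026, 5, 14), date(2026, 5, 15),                     # pilot week
]

baseline_week, pilot_week = weekly_throughput(merges)
print(pilot_week / baseline_week)  # 2.5
```

Looking at the full list of weekly counts, rather than a single ratio, is what makes it possible to tell a steady rise from a one-week spike.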
Median cycle time fell from 72 to 27 hours
Median cycle time, the elapsed time from first commit to merge, dropped from 72 hours to 27 hours. This was the change that unlocked the rest.
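As a rough illustration of the definition above, the sketch below computes median cycle time from pairs of first-commit and merge timestamps, then the relative reduction between two medians. The function names and sample timestamps are hypothetical; only the 72-hour and 27-hour figures come from the pilot.

```python
from datetime import datetime
from statistics import median


def median_cycle_time_hours(changes):
    """Median hours from first commit to merge for (first_commit, merged_at) pairs."""
    durations = [
        (merged_at - first_commit).total_seconds() / 3600
        for first_commit, merged_at in changes
    ]
    return median(durations)


def reduction(before, after):
    """Fractional drop from `before` to `after`."""
    return 1 - after / before


# Hypothetical changes, all started at the same time for readability.
start = datetime(2026, 5, 1, 9, 0)
sample = [
    (start, datetime(2026, 5, 2, 9, 0)),   # 24 hours
    (start, datetime(2026, 5, 2, 12, 0)),  # 27 hours
    (start, datetime(2026, 5, 4, 9, 0)),   # 72 hours
]

print(median_cycle_time_hours(sample))  # 27.0
print(round(reduction(72, 27), 3))      # 0.625
```

Using the median rather than the mean keeps one long-running change from dominating the number.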
As we have written before in why cycle time is the metric that matters most, cycle time is the clearest leading indicator of delivery health. When it falls, feedback loops tighten and most other metrics tend to follow.
The biggest lever was pull request size. Smaller changes got reviewed faster, merged sooner, and created fewer conflicts. Our guide to reducing PR cycle time covers the same levers the team leaned on.
The fastest lever: If you want to move cycle time quickly, shrink pull requests first. Changes under 200 lines get reviewed far faster than large ones, and the effect compounds across every other metric.
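One way to act on the 200-line guideline is a simple size check. The sketch below is illustrative only: it assumes "lines" means additions plus deletions, and the dictionary fields and sample pull requests are hypothetical rather than any particular platform's API.

```python
MAX_LINES = 200  # threshold from the "under 200 lines" guidance above


def oversized_prs(prs, max_lines=MAX_LINES):
    """Return titles of pull requests whose changed lines reach the threshold."""
    return [
        pr["title"]
        for pr in prs
        if pr["additions"] + pr["deletions"] >= max_lines
    ]


# Hypothetical pull requests.
prs = [
    {"title": "Add retry to uploader", "additions": 80, "deletions": 15},
    {"title": "Rewrite billing module", "additions": 640, "deletions": 410},
]

print(oversized_prs(prs))  # ['Rewrite billing module']
```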
CI reliability climbed from 61 to 80 percent
CI reliability, the share of pipeline runs that passed without a flaky or spurious failure, rose from 61 percent to 80 percent. Low reliability is a quiet tax on throughput.
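The definition above can be read in more than one way, so the following is a sketch under a stated assumption: a run counts as reliable only if it passed and was not affected by a flaky failure (for example, it did not need a re-run). The `passed` and `flaky` fields and the sample runs are hypothetical.

```python
def ci_reliability(runs):
    """Share of runs that passed with no flaky failure along the way."""
    if not runs:
        raise ValueError("no pipeline runs to measure")
    clean = sum(1 for run in runs if run["passed"] and not run["flaky"])
    return clean / len(runs)


# Hypothetical pipeline runs.
runs = (
    [{"passed": True, "flaky": False}] * 8   # clean passes
    + [{"passed": True, "flaky": True}]      # passed only after a re-run
    + [{"passed": False, "flaky": True}]     # spurious failure
)

print(ci_reliability(runs))  # 0.8
```

Under this reading, legitimate failures also lower the number, so a team that wants to isolate flakiness alone may prefer to exclude them from the denominator.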
When a pipeline fails for reasons unrelated to the change, engineers re-run jobs, lose context, and learn to distrust the signal. Each of those costs time that never shows up in a single metric but drags on the whole system.
The team treated flaky failures as defects rather than noise. Fixing the most common offenders meant a green build came to mean something again, and people stopped babysitting their pull requests through CI.
Reliability also changed behaviour upstream. Once engineers trusted the pipeline, they were willing to merge more often and in smaller increments, because a passing build was a real signal rather than a coin toss. A trustworthy pipeline is a prerequisite for everything else, not a nice-to-have.
Deployment frequency nearly quadrupled
Deployment frequency went from 2.1 deploys per week to 8.0. This is a direct consequence of the changes above rather than a separate initiative.
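The "nearly quadrupled" figure follows directly from the two numbers above. A short check, with a hypothetical helper name:

```python
def fold_change(before, after):
    """How many times larger `after` is than `before`."""
    return after / before


# Deploys per week before and during the pilot, as reported above.
print(round(fold_change(2.1, 8.0), 2))  # 3.81
```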
Smaller changes and a reliable pipeline make deploying...