Lines-of-code considered helpful (or we should stop taking sick people’s temperatures)


Samuel Spencer


Wherein I realise pointy-headed bosses were right the whole time, and KLOC is actually a really good metric for measuring teams.

Samuel Spencer Jun 20, 2026

Lines-of-code is back in the news as a metric of organisational success. This resurrection has come in the form of organisations declaring the percentage of their codebase now written using AI.

Naturally, programmers have been quick to point out that this is just lines-of-code by another name as a way of measuring engineering teams, and that it is just as dumb as it always was. Except it’s not. It’s actually a pretty powerful and underused metric, and, shock, we are using it all the time, especially now that AI has shown up.

So a bit of background about me: I was (and some might agree, still am) a software engineer. I am also a startup founder and CEO, and because of this I committed the most treasonous crime known to programmers - I did an MBA. Mostly because, if I was going to sit across from other MBAs, I needed to know what they knew. And it turns out they do know some important things that we need to know.

Now, back to lines of code…

Lines-of-code was first introduced as a metric as early as the 1950s. Driven by managerial attempts to understand what programmers were doing, lines-of-code seemed like an easy way to track and measure programmer productivity. Since then, every few years lines-of-code re-enters the managerial mindset. As the graphic below shows, we’ve seen this many times before.

Source: Everyone on LinkedIn. Honestly, there are so many unattributed versions of this now.

Depending on who you ask, software engineering is either artistic, academic, philosophical, or scientific. A craft, not just a job. And software engineers, being a savvy, unruly bunch, have decided that trying to reduce a craft to a number on a dashboard is sacrilege.

But the rise of AI has flipped all of that on its head. When a new library or tool appears on GitHub and is posted to Hacker News, people immediately look at the number of lines shipped, the number of commits, and the age of the repo. But this is the same thing we have decried the managerial class for, for so long! Too many lines of code, too quickly, and we call it slop (probably because it is). So why are we so comfortable using LOC to measure output, but refuse to be measured by it ourselves?

Segue: anyone who goes to hospital knows that every patient has their temperature taken. Sometimes every 15 minutes. We even do it at home. If a person’s temperature is normal, we do nothing. Too cold, give them a blanket. But if someone has a fever, the solution isn’t to dunk them in ice to bring their temperature back to normal. The most common cause of a fever is infection, and the most common treatment is Panadol (aka Tylenol). If that doesn’t work, doctors will draw blood and run tests to look for other causes.

Even though taking a temperature doesn’t tell you what is wrong, it is a hugely powerful and simple diagnostic tool. If someone’s temperature rises, and continues to rise, that alone gives doctors and nurses a clear indication of their health. A raised temperature also isn’t the root problem, it’s an indicator of something worse. We don’t just treat the temperature, we treat the cause and the temperature resets itself.

So why is this important for code? Because business metrics come in two forms - targets and diagnostic measures. Targets are things like white blood cell count: hard to measure, but things that we want to happen. Diagnostic measures, like temperature, are used to determine when things are wrong. Lines of code should never be a target. It’s dumb, for all of the reasons we already know. But LOC can be a powerful diagnostic measure at a team and individual level, which is why we so naturally use it now against AI.

For example, take the following three teams:

Three teams, average lines of code per team member per month. And yes, ChatGPT helped with this.

It would be foolish to say that “2000 lines of code is the ideal target”, but what we can see very quickly is:

Team A has had a massive increase in productivity.

Team B has remained relatively consistent over time.

Team C has had a drop off in output.
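To make the diagnostic-versus-target distinction concrete, here is a rough Python sketch of the three-team reading above. The monthly figures are invented for illustration (the original chart’s data isn’t reproduced here), and the `diagnose` helper, the three-month baseline, and the 25% tolerance are all assumptions of mine, not an established method. The point is only that each team is compared against its own history, never against a fixed number like 2000.

```python
from statistics import mean

# Hypothetical monthly averages (lines of code per team member).
# These numbers are made up for illustration only.
monthly_loc = {
    "Team A": [1800, 1900, 2100, 3200, 4500, 6000],
    "Team B": [2000, 2100, 1950, 2050, 2000, 2100],
    "Team C": [2200, 2100, 2000, 1400, 1000, 700],
}


def diagnose(series, baseline_months=3, tolerance=0.25):
    """Compare a team's recent output with its own earlier baseline.

    Returns a (status, relative_change) pair. The tolerance is an
    arbitrary assumption: changes within +/-25% count as "steady".
    """
    baseline = mean(series[:baseline_months])
    recent = mean(series[baseline_months:])
    change = (recent - baseline) / baseline
    if change > tolerance:
        return "rising", change
    if change < -tolerance:
        return "falling", change
    return "steady", change


for team, series in monthly_loc.items():
    status, change = diagnose(series)
    # A flag is a prompt to go and ask questions, not a verdict.
    print(f"{team}: {status} ({change:+.0%} vs. own baseline)")
```

Like a thermometer, this doesn’t say what is wrong with a team, or whether anything is wrong at all. It only says where to look next.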

From a team lead’s perspective, a few questions follow from this, for example, “Is Team C working on substantially harder tasks?” More importantly, in a work environment, there is an expectation of peer review. So when we criticise AI-generated libraries, we are looking at them the way we look at Team A - “given the increase in lines-of-code, is the same amount of time being dedicated to peer review?” This can also scale...
