Arcade Blog · Part 4 of 5
Volume Is Not Velocity
GitClear analyzed 211 million changed lines of code across repositories owned by Google, Microsoft, Meta, and anonymized enterprise clients. The dataset spans January 2020 through December 2024. Five years. The full before-and-after of AI coding tool adoption.
Here's what they found.
Refactoring dropped from 25% of changed lines in 2021 to less than 10% in 2024. Code duplication — copy/paste clones — rose from 8.3% to 12.3%. 2024 was the first year in GitClear's dataset history that duplicated code exceeded refactoring activity.
And the frequency of code blocks — five or more lines — that duplicate adjacent code? Up eightfold compared to two years prior.
We are writing more code than ever. And the code is getting worse.
I want to be precise about what "worse" means here, because it's easy to dismiss.
This isn't about bugs. It's not about whether the code runs. AI-assisted code generally runs fine. It passes tests. It ships. The dashboards stay green.
The problem is structural. Duplication up. Refactoring down. That's a codebase that's growing without being maintained. It's the difference between a city that builds new roads and a city that also fixes the old ones. Both cities get bigger. Only one of them stays navigable.
GitClear's framing is pointed: AI output "resembles that of a developer unfamiliar with the projects they are altering." The AI doesn't know the codebase's history. It doesn't know that the utility function it's about to rewrite already exists three directories over. It doesn't know that the pattern it's generating was deliberately refactored out six months ago because it caused a memory leak under load.
So it duplicates. Confidently. Fluently. With perfect syntax and no understanding of the context it's operating in.
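To make the failure mode concrete, here's a minimal sketch of the kind of clone GitClear's data is counting. Everything here is hypothetical — the file paths, the function names, the scenario — but the shape is the common one: a helper that already exists somewhere in the tree, and a near-identical twin an assistant generates because it never saw the original.

```python
# utils/strings.py -- existing helper, written years ago (hypothetical)
def normalize_email(raw: str) -> str:
    """Lowercase and trim an email address before storing it."""
    return raw.strip().lower()


# handlers/signup.py -- AI-generated helper: same logic, new name (hypothetical)
def clean_email_address(email: str) -> str:
    """Sanitize user-supplied email input."""
    return email.lower().strip()


# Behaviorally identical today. Structurally, the codebase now has two
# "canonical" versions that will drift apart the next time either is edited.
assert normalize_email("  Ada@Example.COM ") == clean_email_address("  Ada@Example.COM ")
```

Neither function is wrong. Tests pass on both. That's exactly why this kind of debt is invisible to dashboards and visible only to the next person who has to change email handling in two places.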
Refactoring is the part that concerns me most.
A codebase that isn't being refactored is a codebase that's accruing debt at every level. Not the dramatic kind — not the "we built it wrong and have to start over" kind. The quiet kind. The kind where there are now four implementations of the same logic instead of one. Where the naming conventions have drifted. Where new developers can't tell which version of a function is canonical because there are three, and all of them work, and none of them reference each other.
Refactoring is the immune system of a codebase. It's the process by which a living system recognizes redundancy and eliminates it. When refactoring drops from a quarter of all changes to less than a tenth, the immune system is suppressed. The organism keeps growing. It just stops being coherent.
Why did refactoring drop?
Two reasons, and they compound.
First, AI tools are better at generating new code than at understanding existing code. Ask an AI to write a new function and it will produce something plausible. Ask it to refactor an existing module and it needs to understand the dependencies, the callers, the tests, the deployment context, the reason the code is shaped the way it is. That requires a kind of holistic awareness that current tools don't have. So the tools generate. They don't consolidate.
Second, refactoring doesn't get celebrated. Nobody gets a Slack emoji for reducing the line count. Nobody's sprint review includes "I deleted 400 lines of duplicate logic and the system is now easier to reason about." The incentives point toward features shipped, PRs merged, tickets closed. AI tools amplify the work that gets measured. The work that doesn't get measured — the maintenance, the consolidation, the careful reduction — falls further behind.
The tools and the incentives are aligned in the same direction. More output. Less coherence.
Here's the craft argument.
Productive and generative are not the same thing. A developer who writes 500 lines a day and a developer who writes 50 lines a day and deletes 200 are not comparable on output metrics. But the second developer is often the one keeping the system alive.
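A toy calculation makes the gap visible. The daily figures are the hypothetical ones from the paragraph above; the only point is that a metric counting lines written ranks the two developers opposite to how the codebase experiences them.

```python
def lines_written(days):
    """Volume metric: total lines added, ignoring deletions."""
    return sum(added for added, _ in days)

def net_growth(days):
    """What the codebase actually experiences: lines added minus lines deleted."""
    return sum(added - deleted for added, deleted in days)

# One work week each, as (lines added, lines deleted) per day.
dev_a = [(500, 0)] * 5    # writes 500 lines a day, deletes nothing
dev_b = [(50, 200)] * 5   # writes 50, deletes 200 lines of duplication

print(lines_written(dev_a), lines_written(dev_b))  # 2500 vs 250: A "wins" on output
print(net_growth(dev_a), net_growth(dev_b))        # +2500 vs -750: B shrinks the system
```

By the volume metric, developer A is ten times more productive. By net growth, developer B spent the week making the system smaller and easier to hold in your head. Only one of those numbers shows up in a sprint report.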
Volume is the easiest thing to measure and the worst proxy for progress. More code is not better code. More features are not better products. More PRs are not more velocity. Velocity implies direction. Volume is just mass.
When an organization reports that AI tools increased their output by 20%, the right question is: output of what? More features? More duplication? More code that works today and creates a debugging nightmare in six months? The GitClear data suggests the answer is "all of the above," and the ratio is shifting toward the bad kind.
I'm not anti-tool. I use AI tools every day. They're in my workflow at this point in a way that would be hard to reverse. But I also refactor what they give me. I read the output. I check it against the codebase I already know. I delete the duplication. I rename the variables. I do the work that the tool can't do because the tool doesn't know what the codebase is supposed to look like — only what it looks like right now.
That extra step is the difference between using AI as a generator and using it as a collaborator. The generator produces volume. The collaborator produces velocity. The generator makes the codebase bigger. The collaborator makes it better.
The GitClear numbers tell us that most teams are using the generator. The refactoring drop is the proof. The duplication rise is the symptom. And the long-term cost is a codebase that nobody understands because nobody was maintaining it — they were just adding to it.
Volume is not velocity. It never was. The tools just made it easier to confuse the two.