
93% of Developers Use AI. Productivity Moved 10%. What Happened?

A landmark study found AI tools made experienced developers 19% slower - while they believed they were 20% faster. The 39-point perception gap reveals something important about how we adopt new tools and measure their value.

Developer Productivity · AI Tools · Engineering · Research

Somewhere around mid-2025, a quiet consensus formed across the software industry: AI coding assistants were the biggest productivity unlock since the invention of the IDE. By early 2026, the numbers seemed to back it up. A survey of 121,000 developers across 450 companies found that 92.6% use an AI coding assistant at least monthly. Claude Code went from 4% developer adoption to 63% in nine months. GitHub Copilot crossed 15 million users. The adoption curve was nearly vertical.

Then the data started coming in. Not the vibes, not the demos, not the "I built this app in 20 minutes" posts. The actual, controlled, randomized data. And it told a very different story.

The Study That Broke the Narrative

In mid-2025, METR - an AI safety research organization - published what remains the most rigorous study on AI coding productivity to date. They recruited 16 experienced open-source developers working on mature repositories averaging 22,000+ stars and over a million lines of code. Each developer completed real tasks - bug fixes, features, refactors - with AI access randomly assigned per task. The total dataset covered 246 tasks.

The result: developers using AI tools took 19% longer to complete their work.

That alone would be noteworthy. But the finding that should keep every engineering leader up at night is this: the same developers estimated that AI had sped them up by 20%. Before the study, they predicted a 24% speedup. After experiencing the measured slowdown, they still believed they'd been faster.

That's a 39-point gap between perception and reality. Developers didn't just fail to notice they were slower. They were convinced of the opposite.
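
To see how stark that gap is, here is the arithmetic as a minimal Python sketch. The percentages come from the study; the ten-hour baseline is a made-up number so the ratios have something to act on:

```python
# Figures reported by the METR study; baseline_hours is an illustrative
# assumption, not a number from the paper.
baseline_hours = 10.0
measured_slowdown = 0.19   # with AI, tasks took 19% longer
perceived_speedup = 0.20   # developers believed AI made them 20% faster

actual_hours = baseline_hours * (1 + measured_slowdown)    # 11.9 hours
believed_hours = baseline_hours * (1 - perceived_speedup)  # 8.0 hours,
# reading "20% faster" loosely as 20% less time

# The perception gap: +20 perceived vs. -19 measured = 39 percentage points.
gap_points = (perceived_speedup + measured_slowdown) * 100
print(f"measured: {actual_hours:.1f}h, believed: {believed_hours:.1f}h, "
      f"gap: {gap_points:.0f} points")
```

On a ten-hour task, that's the difference between the 11.9 hours a developer actually spent and the 8 hours they believed they spent.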

The Dopamine Problem

The METR result isn't an indictment of AI tools. It's an indictment of how we measure productivity, and how easily our brains are fooled by speed at the wrong level of abstraction.

When you prompt an AI assistant and watch 40 lines of code materialize instantly, something happens in your brain. You've just skipped the most tedious part of programming - the typing, the syntax lookup, the boilerplate. That feels fast. It delivers the same dopamine hit you'd get from actually finishing the task. Multiple researchers have described this as the "illusion of velocity" - the subjective experience of speed without the objective outcome.

The problem is that those 40 lines of code need to be read, understood, verified, and integrated. And in a mature codebase with years of accumulated architectural decisions, context, and edge cases, that verification work often takes longer than writing the code from scratch would have. The developer who writes code manually builds understanding as they type. The developer who accepts AI-generated code has to reverse-engineer that understanding after the fact.

Scoped Tasks vs. Real Work

The perception gap becomes clearer when you look at what AI tools actually accelerate. Controlled experiments consistently show 30-55% speedups on scoped programming tasks - writing a function, generating tests, producing boilerplate. These are real, reproducible gains. If your job consists entirely of isolated, well-defined coding tasks, AI tools genuinely make you faster.

But almost nobody's job looks like that. Real software engineering involves reading existing code, understanding system architecture, navigating tradeoffs between competing requirements, coordinating with other humans, and making judgment calls that no benchmark captures. The coding part - the part AI accelerates - often represents a fraction of the total effort.

This is why the organizational numbers are so underwhelming. Across multiple large-scale studies, the consensus estimate for actual productivity improvement at the team level sits around 10%. One analysis of 10,000+ developers found that teams with high AI adoption merged 98% more pull requests - but review time increased 91%, and DORA delivery metrics remained flat. They produced more code. They didn't ship more value.
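
A back-of-the-envelope model makes the output-versus-outcome distinction concrete. The two percentage changes below come from that analysis; the baseline values are illustrative assumptions so the ratios have something to divide:

```python
# Pre-AI baselines (assumptions for illustration, not from the study).
prs_merged = 100         # PRs merged per month
review_hours = 200       # total review hours per month
features_shipped = 10    # delivery outcome per month

# Changes reported in the 10,000+ developer analysis.
prs_after = prs_merged * 1.98        # 98% more PRs merged
review_after = review_hours * 1.91   # 91% more review time
shipped_after = features_shipped     # DORA delivery metrics flat

print(f"output per review hour:  {prs_merged / review_hours:.2f} -> "
      f"{prs_after / review_after:.2f}")
print(f"outcome per review hour: {features_shipped / review_hours:.3f} -> "
      f"{shipped_after / review_after:.3f}")
```

Output per review hour barely moves, while outcome per review hour drops by nearly half. The team is running harder to deliver the same value.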

The Quality Tax

There's another cost hiding in the productivity numbers that most organizations aren't tracking. Independent code analysis has found that AI-assisted development can increase issue counts by roughly 1.7x and drive up security findings when proper governance isn't in place.

A study of 67,000 developers revealed a sharp bifurcation: some companies saw customer-facing incidents cut in half after adopting AI tools, while others saw incidents double. The difference wasn't the tools. It was the organizational maturity around review, testing, and deployment processes. AI acted as a force multiplier in both directions - amplifying whatever was already there, whether that was engineering discipline or chaos.

This is the part that doesn't show up in any productivity dashboard. If your team adopts AI coding tools and your defect rate quietly climbs 70%, you haven't gained 10% productivity. You've probably lost it. But you won't know that from measuring lines of code or PRs merged.
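
The erosion is easy to sketch. Suppose rework on defects already consumes a quarter of team capacity - an assumed figure for illustration; only the 10% gain and the 1.7x defect rate come from the numbers above:

```python
# Assumed baseline: 100 hours/week, 25 of them already lost to rework.
capacity = 100.0
rework = 25.0
output = capacity - rework   # 75 units of forward progress per week

# With AI: raw coding throughput +10%, but defect load grows 1.7x.
rework_ai = rework * 1.7                   # 42.5 hours now spent fixing issues
output_ai = (capacity - rework_ai) * 1.10  # fewer hours, each 10% more productive

print(f"before: {output:.1f} units/week, after: {output_ai:.2f} units/week "
      f"({(output_ai / output - 1) * 100:+.0f}%)")
# before: 75.0, after: 63.25 (-16%)
```

The exact figure depends on the assumed rework share, but the direction doesn't: a 70% quality regression swamps a single-digit throughput gain.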

What This Means for the Industry

The AI productivity illusion has real consequences beyond individual teams. Companies are making headcount decisions based on perceived rather than measured productivity gains. Hiring of new graduates at the top 15 U.S. tech companies has dropped 55% since 2019, according to SignalFire. Computer engineering graduates now face 7.5% unemployment - nearly double the national average. CS enrollment across the UC system has declined for two consecutive years.

Some of these shifts are justified. Some are premature reactions to capabilities that haven't materialized at the organizational level. The risk is that companies cut too deep based on demo-quality expectations, then discover that the developers they let go were carrying context and judgment that no AI tool can replicate.

Meanwhile, METR's attempt to run a follow-up study in late 2025 hit a revealing obstacle: 30-50% of developers refused to submit tasks where AI wouldn't be allowed. The tool had become so embedded in their workflow that working without it felt unbearable - even though the data suggested it was making them slower. That's not productivity. That's dependency.

Measuring What Matters

None of this means AI coding tools are useless. The 10% organizational productivity gain is real and meaningful at scale. Developers save an average of 3.6 hours per week on routine tasks. Daily AI users merge roughly 60% more PRs. For greenfield projects and well-scoped work, the acceleration is genuine.

The problem is the gap between what we measure and what we assume. Developer surveys consistently report 20-30% perceived speedups. Controlled task-level benchmarks show 30-55% gains. But organizational delivery metrics show 10%. Somewhere between the benchmark and the boardroom, 80% of the expected value disappears.
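
The 80% figure is simple division, shown here so the claim is auditable. Taking the optimistic end of the benchmark range as the expectation:

```python
# Gains reported at each level of measurement (cited above).
benchmark_gain = 0.50   # task-level benchmarks: 30-55%; optimistic case
org_gain = 0.10         # organizational delivery metrics

retained = org_gain / benchmark_gain
print(f"value retained from benchmark to boardroom: {retained:.0%}")  # 20%
print(f"value lost along the way: {1 - retained:.0%}")                # 80%
```

Use the midpoint of the benchmark range instead and roughly three-quarters of the expected value still vanishes.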

That vanishing value isn't a mystery. It's hiding in code review queues that grew 91%. In defect rates that nobody tracks at the AI-tool level. In context-switching costs when developers spend more time reading generated code than they would have spent writing it. In the architectural debt that accumulates when it's easier to generate new code than to understand existing code.

The companies getting real value from AI coding tools aren't the ones with the highest adoption rates. They're the ones that invested equally in the unglamorous work of adjusting their review processes, updating their quality gates, and training their teams to use AI as a complement to understanding rather than a replacement for it. According to EU research, each additional percentage point spent on workforce training added 5.9 percentage points to AI productivity gains.

The tool isn't the bottleneck. The measurement is. And until engineering organizations get serious about measuring outcomes rather than output, the illusion will keep writing checks that reality can't cash.
