The AI Productivity Paradox: Why the AI Multiplier is Less Than 2x


I'm a huge fan of LLMs for coding. Claude Code Opus 4.5 was a genuine step function improvement in capability.

What I'm not a fan of is bombastic claims of productivity gains that have not materialized. If AI were a 100x boost — a real bona fide 100x boost — then projects that previously took 5 years (260 weeks) would now take just 2.6 weeks. This is objectively not happening.

I believe two things are true.

  1. Programmers are observing genuine 30x multipliers
  2. Claude Code's net project multiplier is less than 2x

I call these two claims the AI Productivity Paradox. They sound contradictory. They are not.

I think the best approximation for the AI Productivity Paradox is Amdahl's Law.

What is Amdahl's Law? Stated plainly, the total speedup from optimizing a single part of a system is limited by the fraction of time that part takes up.

For example, an extremely common optimization technique is parallelization. If you can parallelize only 50% of your program then your absolute maximum possible speedup is just 2x.
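The arithmetic behind that 2x ceiling is worth making concrete. Here is a minimal sketch of Amdahl's Law as a formula (the function name is mine, not from any particular library):

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work is sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# Parallelize 50% of the program across effectively infinite cores:
# the untouched serial half caps the total speedup at 2x.
print(amdahl_speedup(0.5, 1e9))  # ≈ 2.0
```

No matter how large `s` gets, the serial fraction `1 - p` is a hard floor on total runtime.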

Some problems are "embarrassingly parallel" and can be scaled to millions of nodes. However, a large fraction of project development is serial — even if some components are embarrassingly parallel.

Let us now consider Amdahl's Law in the context of AI.

Software project lifecycle models vary. A reasonable but simplified model is rapidly prototype → build foundation → develop features → polish → launch.

Claude Code is a genuine 30x multiplier in the prototype phase. It can crank out 0→1 prototypes at blistering pace. Even better, it enables senior engineers to be productive in areas adjacent to their core expertise.

However, Claude Code is objectively NOT a 30x multiplier for shipping novel projects. Claude has no taste and requires significant direction to build non-slop system architectures. It also requires substantial human review, serial user testing, etc.

The real question is NOT how much can AI accelerate individual contributors. It's how much can AI accelerate a complex multi-phase project built by a cross-disciplinary team.

Rant: If I hear one more VP wax poetic about how Claude made them 30x more productive as demonstrated by their weekend hobby project, I'm going to crash out. Your weekend project that is 95% glue between open source libraries is not evidence that a diverse team can ship a complex 3-year project in 5 weeks. Claude Opus has been out for over 5 months and claims of a net 30x speedup are verifiably bullshit.

This week Anthropic released their model card for Mythos. One of their interesting observations is that surveyed technical staff reported a geometric mean productivity lift of 4x. However Anthropic observed research progress uplift of less than 2x. Worse still, they estimate that achieving 2x research progress would require an order of magnitude more researcher uplift.

To rephrase:

  1. Anthropic researchers self-reported ~4x multiplier
  2. Anthropic measured net research progress to be less than 2x
  3. Anthropic estimated 2x net progress would require an additional 10x more researcher uplift (i.e., around 40x)
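These three numbers are roughly what Amdahl's Law predicts. As an illustration — the serial fraction below is my own back-of-envelope assumption, chosen so the arithmetic lines up, not a figure from Anthropic:

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Net speedup when a fraction p of total time is accelerated by factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# Hypothetical: ~49% of research time is serial (training wall clock,
# human review), leaving 51% that benefits from researcher uplift.
accelerable = 0.51
print(amdahl_speedup(accelerable, 4))   # 4x self-reported uplift → ~1.6x net
print(amdahl_speedup(accelerable, 40))  # ~40x uplift barely reaches ~2x net
```

With roughly half the work serial, a 4x individual uplift nets under 2x, and getting to 2x requires an order of magnitude more uplift — matching the shape of Anthropic's numbers.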

This affirms my biases! 😊

Anthropic's key serial bottlenecks appear to be raw wall clock compute time to train models followed by human review of results. Interestingly, numerous claims of autonomous AI-research improvements were ultimately determined to be the product of human direction.

The end result is that Anthropic, with the world's best researchers, the best models, and billions in compute, is seeing less than 2x net project acceleration. That's an important data point.

To be clear, my claim that AI is a less than 2x net multiplier is only true for today. This post is largely a pushback against people overstating what AI can do right now, this second. I am not attempting to predict future capabilities.

Amdahl's Law gives the right intuition: serial bottlenecks dominate. Model improvements will increase both productivity multipliers and the fraction of a project that can be accelerated. Anthropic's framing is elasticity and diminishing returns rather than a hard ceiling. In either case the practical implication is the same: current models have a net project multiplier of less than 2x.

After writing this post, I'm beginning to think the industry is spending too much time asking "how many times faster does AI make you individually" and not enough asking "what percent of the project is stuck at 1x". This is especially true when you ignore solo weekend hobby projects and focus on real projects with a team of cross-disciplinary contributors.

I am a sucker for concrete examples and interactive widgets. Here's a demonstration that I think is representative of the types of projects I've worked on throughout my career, which has been about half gamedev and half VR research.

An extremely naive plan assumes linear progress. Halfway there, halfway done.

We've all seen this plan. It never works.

Although every project is different, I think it's fair to say that most projects go through ebbs and flows. You have periods of slowness followed by bursts of productivity. I think this plot has truthiness to it.

It starts off with a highly efficient prototype phase, followed by a slower period of foundation building, then a burst of feature content. Finally, anyone who has shipped knows that shipping is a long and grueling grind.

Hofstadter's Law states: "It always takes longer than you expect, even when you take into account Hofstadter's Law." Therefore your initial 3-year plan becomes a 5-year plan.

I'm deathly afraid of the nitpicks I'm sure every reader has. Blizzard games were famous for spending YEARS in the polish phase. Modern AAA games genuinely take upwards of 7 years for a new IP. Small indie games may well average 3 years.

VC funded SaaS MVPs follow a different cadence. Research labs are further distinct. Every project is a unique and beautiful snowflake. The shared truthiness is that all projects take longer than you think and all have to grind their way out of the trough of disillusionment.

How might Claude realistically accelerate this timeline? I think it's plausible to say something like:

Prototype: 30x. Scaffolding, boilerplate, code generation, etc. Claude is genuinely transformative for 0→1 work.

Foundation: 2x. Claude has no taste and requires humans to craft well-designed architectures.

Feature: 3x. Once patterns are established AI is great at following conventions and cranking out features.

Polish: 0.8x. One of the consequences of Claude Slop is it eventually slows you down. Heaven help you the first time you're forced to maintain someone else's vibecoded slop after they leave the team.

My multipliers may well be wrong. That's why I've built an interactive plot you can play with.

This is all a very coarse approximation. I think it's a useful lens. YMMV.
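For readers who can't play with the widget, the same arithmetic can be sketched in a few lines. The per-phase multipliers are the ones above; the phase durations are hypothetical numbers I've picked for a 5-year (60-month) project:

```python
# Hypothetical phase breakdown: (baseline_months, ai_multiplier).
# Multipliers are from the text; durations are illustrative guesses.
phases = {
    "prototype":  (6,  30.0),
    "foundation": (12,  2.0),
    "feature":    (24,  3.0),
    "polish":     (18,  0.8),
}

baseline = sum(months for months, _ in phases.values())
accelerated = sum(months / mult for months, mult in phases.values())
print(f"net project multiplier: {baseline / accelerated:.2f}x")  # ~1.6x
```

Note that the 30x prototype phase shaves off almost six months, yet the sub-1x polish phase quietly gives most of that back — which is exactly the Amdahl's Law dynamic.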

At the start I made two contradictory observations that I named the AI Productivity Paradox:

  1. Programmers are observing genuine 30x multipliers
  2. Claude Code's net project multiplier is less than 2x

The first observation is widely reported across the internet. The second is aligned with Anthropic's own internal measurements.

This paradox is resolved by Amdahl's Law. It explains why a 30x improvement in prototyping speed results in a small total project speedup. Thus both observations are true.

The industry has been so focused on asking "how much faster are individuals going" that we lost sight of the bigger picture. The right question may be "how much faster can you ship?" Today, for many projects, the answer is less than twice as fast.

Thanks for reading.