The measurement problem in software engineering
maestroai.substack.com
Like most things, every measure fails at a single point in time and over short time horizons, but the law of large numbers and long-run trends can be useful. That is, compare yourself (on almost any measure) to yourself in your most productive period.
We reviewed our metrics for period-on-period comparisons at 1, 2, 3, and 4 years, and the numbers are surprisingly consistent for each person and across similarly productive engineers.
Like in the article, if you can apply a semantic score across years of data, it gives you a pretty good idea.
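For anyone who wants to try the compare-yourself-to-your-best-period idea, here is a minimal sketch, not the commenter's actual tooling. The function names, the choice of merged PRs as the metric, and all the numbers are made up for illustration; any per-period measure could be substituted.

```python
from statistics import mean


def ratio_to_best_period(values_by_period):
    """Express each period's metric as a fraction of the same person's best period."""
    best = max(values_by_period.values())
    if best <= 0:
        raise ValueError("need at least one period with a positive value")
    return {period: value / best for period, value in values_by_period.items()}


def trailing_mean(values, window):
    """Smooth a series with a trailing window so single-period noise matters less."""
    return [
        mean(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


# Hypothetical yearly totals for one engineer (illustrative numbers only).
merged_prs = {"2021": 120, "2022": 150, "2023": 135, "2024": 90}

relative = ratio_to_best_period(merged_prs)
smoothed = trailing_mean(list(merged_prs.values()), window=2)

for period, share in relative.items():
    print(f"{period}: {share:.0%} of best period")
```

The ratio keeps the comparison within one person, and the trailing mean is one simple way to look at trends instead of single points.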
Curious what you use for a semantic score across years of data?
Periodic objectives x customer results, then GPT-5 scores pull requests, etc. against them, roughly aligned to that period. That is, it scores higher if the code is used by customers and produces value for them.
tl;dr: For fifty years, pretty much every major approach to measuring developer productivity has eventually been disavowed by its own creators. Lines of code, story points, velocity. Tom DeMarco wrote "you can't control what you can't measure" in 1982 and publicly retracted it in 2009. AI has made the problem worse and, weirdly, might also be the way out.