Data availability
The data are available on Zenodo (https://zenodo.org/records/17792605 (ref. 41)) and OSF (https://osf.io/8wsqx/). See OSF for our preanalysis plan.
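For readers who script their workflows, below is a minimal sketch, assuming Zenodo's public REST API, of how to list the files in the deposit programmatically; the JSON field names are assumptions that should be verified against the live response.

```python
# Minimal sketch: query Zenodo's public records API for the deposit above.
# The record ID (17792605) is from the Data availability link; the JSON
# field names ("metadata", "files", "key", "links") are assumptions to
# verify against the actual API response.
import requests

resp = requests.get("https://zenodo.org/api/records/17792605", timeout=30)
resp.raise_for_status()
record = resp.json()

print(record["metadata"]["title"])   # deposit title
for f in record.get("files", []):    # one entry per archived file
    print(f.get("key"), f.get("links", {}).get("self"))
```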
Code availability
Change history
23 April 2026
A Correction to this paper has been published: https://doi.org/10.1038/s41586-026-10503-w
References
Vazire, S. Quality uncertainty erodes trust in science. Collabra Psychol. 3, 1 (2017).
Donoho, D. L., Maleki, A., Rahman, I. U., Shahram, M. & Stodden, V. Reproducible research in computational harmonic analysis. Comput. Sci. Eng. 11, 8–18 (2008).
King, G. Replication, replication. PS Polit. Sci. Polit. 28, 444–452 (1995).
Goodman, S. N., Fanelli, D. & Ioannidis, J. P. What does research reproducibility mean? Sci. Transl. Med. 8, 341ps12 (2016).
Marcoci, A. et al. Predicting the replicability of social and behavioural science claims in COVID-19 preprints. Nat. Hum. Behav. 9, 287–304 (2025).
Miłkowski, M., Hensel, W. M. & Hohol, M. Replicability or reproducibility? On the replication crisis in computational neuroscience and sharing only relevant detail. J. Comput. Neurosci. 45, 163–172 (2018).
Moonesinghe, R., Khoury, M. J. & Janssens, A. C. J. W. Most published research findings are false—but a little replication goes a long way. PLoS Med. 4, e28 (2007).
National Academies of Sciences, Engineering, and Medicine. Reproducibility and Replicability in Science (National Academies Press, 2019).
Peterson, D. & Panofsky, A. Self-correction in science: the diagnostic and integrative motives for replication. Soc. Stud. Sci. 51, 583–605 (2021).
Pérignon, C., Gadouche, K., Hurlin, C., Silberman, R. & Debonnel, E. Certify reproducibility with confidential data. Science 365, 127–128 (2019).
Brandon, A. & List, J. A. Markets for replication. Proc. Natl Acad. Sci. USA 112, 15267–15268 (2015).
Freese, J. & Peterson, D. Replication in social science. Annu. Rev. Sociol. 43, 147–165 (2017).
Gertler, P., Galiani, S. & Romero, M. How to make replication the norm. Nature 554, 417–419 (2018).
Maniadis, Z. & Tufano, F. The research reproducibility crisis and economics of science. Econ. J. 127, F200–F208 (2017).
Munafò, M. R. et al. A manifesto for reproducible science. Nat. Hum. Behav. 1, 0021 (2017).
Nosek, B. A. et al. Replicability, robustness, and reproducibility in psychological science. Annu. Rev. Psychol. 73, 719–748 (2022).
Askarov, Z., Doucouliagos, A., Doucouliagos, H. & Stanley, T. The significance of data-sharing policy. J. Eur. Econ. Assoc. 21, 1191–1226 (2023).
Brodeur, A., Cook, N. & Neisser, C. P-hacking, data type and data-sharing policy. Econ. J. 134, 985–1018 (2024).
Chang, A. C. & Li, P. Is economics research replicable? Sixty published papers from thirteen journals say ‘often not’. Crit. Finance Rev. 11, 185–206 (2022).
Christensen, G. & Miguel, E. Transparency, reproducibility, and the credibility of economics research. J. Econ. Lit. 56, 920–980 (2018).
Dafoe, A. Science deserves better: the imperative to share complete replication files. PS Polit. Sci. Polit. 47, 60–66 (2014).
McCullough, B., McGeary, K. A. & Harrison, T. D. Do economics journal archives promote replicable research? Can. J. Econ. 41, 1406–1420 (2008).
Pérignon, C. et al. Computational reproducibility in finance: evidence from 1,000 tests. Rev. Financ. Stud. 37, 3558–3593 (2024).
Camerer, C. F. et al. Evaluating replicability of laboratory experiments in economics. Science 351, 1433–1436 (2016).
Camerer, C. F. et al. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nat. Hum. Behav. 2, 637–644 (2018).
Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349, aac4716 (2015).
Dreber, A. & Johannesson, M. A framework for evaluating reproducibility and replicability in economics. Econ. Inq. 63, 338–356 (2025).
Brodeur, A., Dreber, A., Hoces de la Guardia, F. & Miguel, E. Replication games: how to make reproducibility research more systematic. Nature 621, 684–686 (2023).
Simonsohn, U., Simmons, J. P. & Nelson, L. D. Specification curve analysis. Nat. Hum. Behav. 4, 1208–1214 (2020).
Brodeur, A., Lé, M., Sangnier, M. & Zylberberg, Y. Star Wars: the empirics strike back. Am. Econ. J. Appl. Econ. 8, 1–32 (2016).
Brodeur, A., Cook, N. & Heyes, A. Methods matter: p-hacking and publication bias in causal analysis in economics. Am. Econ. Rev. 110, 3634–3660 (2020).
Botvinik-Nezer, R. et al. Variability in the analysis of a single neuroimaging dataset by many teams. Nature 582, 84–88 (2020).
Breznau, N. et al. Observing many researchers using the same data and hypothesis reveals a hidden universe of uncertainty. Proc. Natl Acad. Sci. USA 119, e2203150119 (2022).
Huntington-Klein, N. et al. The influence of hidden researcher decisions in applied microeconomics. Econ. Inq. 59, 944–960 (2021).
Menkveld, A. J. et al. Nonstandard errors. J. Finance 79, 2339–2390 (2024).
Silberzahn, R. et al. Many analysts, one data set: making transparent how variations in analytic choices affect results. Adv. Methods Pract. Psychol. Sci. 1, 337–356 (2018).
Fišar, M. et al. Reproducibility in Management Science. Manag. Sci. 70, 1343–1356 (2024).
Ankel-Peters, J., Fiala, N. & Neubauer, F. Do economists replicate? J. Econ. Behav. Organ. 212, 219–232 (2023).
Vilhuber, L., Turitto, J. & Welch, K. Report by the AEA Data Editor. AEA Pap. Proc. 110, 764–765 (2020).
Wood, B. D., Müller, R. & Brown, A. N. Push button replication: is impact evaluation evidence for international development verifiable? PLoS ONE 13, e0209416 (2018).
Brodeur, A. Replication package for “Computational reproducibility and robustness of empirical economics and political science research between 2022 and 2023” [Data set]. Zenodo https://doi.org/10.5281/zenodo.17792605 (2025).
Acknowledgements
We acknowledge support from Coefficient Giving and the Social Sciences and Humanities Research Council. Any views expressed herein are the authors’ personal opinions and not those of the Ontario Public Service. The work by J.D.G. was not undertaken under the auspices of the Ontario Public Service as part of his employment responsibilities. The views expressed in this paper are those of the authors; no responsibility for them should be attributed to the Bank of Canada. The findings, interpretations, and conclusions expressed in this work are entirely those of the authors and do not necessarily reflect the views of the World Bank or its Board of Directors. The Center for Crisis Early Warning (Kompetenzzentrum Krisenfrüherkennung) is funded by the German Federal Ministry of Defence and the German Federal Foreign Office. The views and opinions expressed in this article are those of the author(s) and do not necessarily reflect the official policy or position of any agency of the German government. The views expressed in this paper are those of the authors and do not necessarily reflect the position of the Banco de España or the Eurosystem. All remaining errors are the authors’ responsibility.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature thanks Colin Camerer who co-reviewed with Anastasia Buyalskaya; T. D. Stanley; and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Ten-point computational reproducibility score.
Each team assigned a reproducibility score, on a scale of one to ten, to the paper it reproduced. See the Supplementary Materials for a description of each score level. Level 10 (L10) means that all necessary materials are available and produce results consistent with those presented in the paper, whereas level 5 (L5) means that analytic data sets and analysis code are available and produce the same results as presented in the paper.
Extended Data Fig. 2 Reasons for selecting a paper (select all that apply).
Data were collected via a survey of our reproducers after they completed their reports. This figure illustrates the responses to the question: ‘For what reasons did you select your specific paper to reproduce and/or replicate from the list of papers provided?’
Extended Data Fig. 3 Percentage of papers with a replication folder.
The total sample is 1,150 papers: 120 papers per year from 2019 to 2023 and 110 papers per year from 2014 to 2018. Each journal contributes 10 papers per year except American Economic Review: Insights, which only formally became a journal in 2019 (and is omitted in earlier years). The journals sampled correspond to those used in the manuscript’s main analysis: three from political science and nine from economics. The political science journals are: American Journal of Political Science, American Political Science Review, and Journal of Politics. The economics journals are: American Economic Review, Quarterly Journal of Economics, Review of Economic Studies, Journal of Political Economy, American Economic Journal: Macroeconomics, American Economic Journal: Applied Economics, American Economic Journal: Economic Policy, American Economic Review: Insights, and Economic Journal.
Extended Data Fig. 4 Percentage of papers with a replication folder by discipline.
Panel (a) is for papers published in economics journals, whereas panel (b) is for papers published in political science journals. The total sample is the same as in Extended Data Fig. 3 (1,150 papers), of which 850 are in the economics sample and 300 in the political science sample.
Extended Data Fig. 5 Percentage of replication folders containing the required contents, conditional on the paper being required to have a replication folder.
Each subfigure shows the proportion of replication folders that affirmatively (‘Yes’) contained the item displayed as the title. The ‘Not Yes’ category in the legend corresponds to replication folders that did not (‘No’) or contained only ‘Some’ of the required contents. Each sample covers only the observations for which the category is applicable (that is, not all replication packages require the same contents).
Extended Data Fig. 6 Percentage of replication folders containing the required contents, conditional on the paper being required to have a replication folder.
Each subfigure shows the proportion of replication folders that affirmatively (‘Yes’) contained the item displayed as the title. The ‘Not Yes’ category in the legend corresponds to replication folders that did not (‘No’) or contained only ‘Some’ of the required contents. Each sample covers only the observations for which the category is applicable (that is, not all replication packages require the same contents).
Extended Data Fig. 7 Reasons unable to conduct robustness checks.
This figure illustrates the share of teams that were unable to perform robustness checks (top left), replications (top right), key variable recodes (bottom right) or extensions (bottom left) for the various reasons represented by the different coloured bars.
Extended Data Fig. 8 Distributions of t-statistics for original studies and re-analyses.
The top panels display histograms of test statistics for t ∈ [0, 5], with bins of width 0.1. The top left panel includes all original studies in our data set; the top right panel includes all re-analysis estimates in our data set. Vertical reference lines are displayed at conventional two-tailed significance levels. We superimpose an Epanechnikov kernel density curve (with renormalization at the 0 boundary). The bottom panels display histograms of P values ∈ [0.0025, 0.1500], with bins of width 0.0025, for original studies and re-analyses, respectively.
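To make the construction concrete, here is a minimal Python sketch (not the authors' code) of one such panel; the input file, column layout, and bandwidth h = 0.2 are illustrative assumptions.

```python
# Minimal sketch of a top panel of Extended Data Fig. 8: a density
# histogram of t-statistics on [0, 5] with 0.1-wide bins, significance
# cutoffs, and an Epanechnikov KDE renormalized at the 0 boundary.
# "t_stats.csv" (one t-statistic per line) and h = 0.2 are assumptions.
import numpy as np
import matplotlib.pyplot as plt

def epanechnikov_kde(grid, data, h):
    """Epanechnikov kernel density, renormalized at the 0 boundary."""
    u = (grid[:, None] - data[None, :]) / h
    k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    f = k.sum(axis=1) / (data.size * h)
    # Divide by the kernel mass lying in [0, inf) so the curve is not
    # biased downwards near the boundary at 0.
    a = np.clip(grid / h, 0.0, 1.0)
    mass = 0.5 + 0.75 * a - 0.25 * a**3   # Epanechnikov CDF from -1 to a
    return f / mass

t = np.abs(np.loadtxt("t_stats.csv"))     # hypothetical input file
t = t[t <= 5.0]
grid = np.linspace(0.0, 5.0, 501)

plt.hist(t, bins=np.arange(0.0, 5.1, 0.1), density=True, alpha=0.5)
plt.plot(grid, epanechnikov_kde(grid, t, h=0.2))
for cutoff in (1.645, 1.960, 2.576):      # 10%, 5%, 1% two-tailed levels
    plt.axvline(cutoff, linestyle="--", linewidth=0.8)
plt.xlabel("t-statistic")
plt.ylabel("Density")
plt.show()
```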
Extended Data Fig. 9 Distributions of t-statistics and P-values by field.
We restrict the sample to articles published in the indicated field journals. The top panels display histograms of test statistics for t ∈ [0, 5], with bins of width 0.1. Vertical reference lines are displayed at conventional two-tailed significance levels. We superimpose an Epanechnikov kernel density curve (with renormalization at the 0 boundary). The bottom panels display histograms of P values ∈ [0.0025, 0.1500], with bins of width 0.0025.
Extended Data Fig. 10 Relative reproduced effect size.
This figure illustrates the ratio of re-analysis estimates to original estimates. The standardized effect sizes are normalized so that 1 equals the original effect size; 48% of relative effect sizes are equal to or greater than 1. A positive value indicates that the re-analysis estimate is in the same direction as in the original study; a negative value indicates that it is in the opposite direction. Outliers (3%) are excluded for visibility.
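For concreteness, a minimal sketch (not the authors' code) of the plotted quantity; the estimates below are made-up numbers.

```python
# Minimal sketch of the relative reproduced effect size: the ratio of
# each re-analysis estimate to its original estimate. Values are
# hypothetical; 1.0 means the original effect is reproduced exactly.
import numpy as np

original = np.array([0.40, 0.12, 0.50, -0.25])
reanalysis = np.array([0.42, -0.10, 0.55, -0.20])

relative = reanalysis / original
# Same sign as the original -> positive ratio; opposite sign -> negative.
share_ge_one = np.mean(relative >= 1.0)   # cf. the 48% reported above
print(relative, share_ge_one)
```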
Supplementary information
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Brodeur, A., Mikola, D., Cook, N. et al. Reproducibility and robustness of economics and political science research. Nature 652, 151–156 (2026). https://doi.org/10.1038/s41586-026-10251-x
DOI: https://doi.org/10.1038/s41586-026-10251-x