Fraud and the false optimism of AI for science

This is Jessica. “Scientific doomerism” seems to be everywhere lately, from a presidential statement that promises to restore “gold standard science” from the top down because scientists have botched things, to journals being inundated with AI-produced papers, to sleuths like Reese Richardson documenting the scale of organized scientific fraud through paper mills and collusion. In his post on this last example, Andrew wrote something that caught my attention: 

And, yes, typing some prompts into a chatbot and producing a paper is fraud, in the same way that publishing textbook excerpts as if it were new research is fraud, or copying from wikipedia as if it were new research is fraud, etc etc. It doesn’t require fake data and it doesn’t require some cackling Snidely Whiplash attitude. It can be some schlub sitting at a computer terminal who wants to get his contract extended or get admitted to a Ph.D. program or whose adviser is pressuring him to get some publications . . . But it’s fraudulent publication, not the same as bad research (which is actually research, it just happens to be useless because bad measurement and kangaroo).

There are clearly some differences between passing off work containing fake evidence you purchased from paper mills and passing off work that you contracted an AI to do for you. In one case you are paying money for someone to pass something fake off as real with your name on it; in the other you might be using actual data and think you’re just saving time, especially if you’re reviewing what the AI does. So should we really consider both to be fraud?

It struck me how sharply Andrew’s perspective contrasts with the current direction of discussions among ML and other researchers interested in AI for science. There, it’s seen as inevitable, and not necessarily morally problematic, that the future of science will have humans largely playing the role of curators, prompting and selecting among results produced by LLM agents that do the bulk of the work. It’s worth considering exactly what kinds of ethical lines this crosses.

Let’s imagine that I give an LLM an initial high-level research question related to a topic on which I am knowledgeable. It churns on the idea and ultimately designs an experiment it’s happy with. I review the plan before prompting it to continue, maybe tweaking it slightly, like changing a condition or suggesting an additional robustness check. It then gathers data on my behalf (e.g., running an online experiment or downloading existing datasets), conducts the analysis, and presents me with the results. I review these and then give it permission to write up a paper. I read the final paper to make sure I know what it’s saying before I submit, and maybe I change a few things I don’t agree with. I add my name and also credit the AI. In other words, there is a light human touch throughout, but much of what is presented as my work comes from the model.

From an “optimistic” AI-for-science perspective, the strongest argument is probably to cast it as part of the scientist’s job to make the most of current technology. If we think AI might help us be more productive, then we should explore how much time it can save us, just as it was a good move for statistics to embrace the computational revolution that made previously intractable models commonplace. Proponents of AI for science argue that it is irresponsible not to use AI given its current capabilities, just as it could be construed as irresponsible for a brilliant researcher to refuse to use calculators when using them would let them contribute more useful advances to the field. Of course, this assumes that we won’t be sacrificing anything vital in the process.

The “pessimistic” view of AI-generated science as fraud holds that we are sacrificing something vital in the process. But what is it exactly? If you believe that “the devil is in the details” (or “God is in every leaf of every tree,” depending on whose side you want to be on), then whenever you outsource decisions you would otherwise make yourself, you have potentially compromised the work from the perspective of your own expert judgment. So putting your name on it betrays what you know to be true of good science. Of course, you could check everything down to the lowest level and intervene whenever the agent tries to do something you don’t agree with. Then the AI is really just a means of computation: even if you use it for brainstorming what research questions to ask, you could view it as a way of extending your limited resources without sacrificing your own scientific judgment. This requires that you are knowledgeable enough to assess everything it does. Assuming you are, it seems hard to argue that this is fraud. Though admittedly, a lot rests on how carefully you check things over.

Part of the concern may be that AI makes it tempting to extend your methods or claims beyond what you know well. Without the option of using it, you would have had to do the research yourself, and presumably gain understanding in the process, in order to apply that method. Relatedly, I suspect most people would agree that in a training context, like taking courses in grad school, turning in work that was largely driven by the AI is a form of fraud, because it holds you back from gaining the understanding yourself.

Another version of the fraud argument focuses on misattribution of ideas. Maybe this is what Andrew had in mind when he wrote “publishing textbook excerpts as if it were new research is fraud, or copying from wikipedia as if it were new research is fraud.” If the AI produced significant parts of the contribution, like the specific hypothesis, the choice of methods, and the framing of the contributions, then those aren’t your novel creations, and adding yourself as author is misattribution. But this is tricky, because human scientists also recombine existing ideas, methods, and frames constantly, and we often call the resulting combinations original contributions. So the question isn’t whether ideas are derivative per se but when the degree of derivation crosses into infringement. We have copyright law for some creative domains, and we can try to formulate when AI outputs are permissible in light of it (e.g., Annie Liang has some recent work on this). But to say AI-assisted papers are fraudulent on such grounds, we’d need to work out what the scientific analog of substantial similarity is. This is hard because science explicitly values building on prior work. Still, we can agree on some things, like that you shouldn’t publish something too close to others’ work without citing them.

A final angle on why it’s fraud might be that it misrepresents what science is more broadly. If you think the reasons to do science are fundamentally human, that as scientists we are concerned with producing understanding for ourselves just as much as with improving things in the world, then you could argue that for science to be meaningful we have to be the ones coming up with the ideas and shaping them as we go. From this perspective, automated science isn’t inherently wrong; it’s just missing the point. AI-for-science arguments often completely overlook the “people production” role of science. In the extreme, they envision AI finding solutions to lots of real-world problems and intervening to control outcomes in the world without us understanding how any of it works. In reality, the personal side, including the search for personal fulfillment through science, is a big part of why smart people who could make a lot more money in applied roles choose research careers instead. And it’s a big part of how we evaluate scientists. How many of your Ph.D. students have gone on to competitive research positions? What does the trajectory of topics you’ve worked on say about your research taste?

Pushback against this argument might point out that by saying science is entirely a matter of human careers, we contradict claims we as scientists like to make about being dedicated to advancing the state of the art in our field or producing value for the world. Would it still be science as we know it if we started acknowledging that it’s really about personal fulfillment for scientists? But I think this is a bit of a false dichotomy. The public value of science depends on there being humans who find the work meaningful enough to do it well: pushing back on sloppy results, exercising their taste to shape the direction of their field, training students worth training, etc. Careless AI use can threaten this by flooding the system with outputs that crowd out careful work, making the people who are intrinsically motivated to do quality work less likely to stick around. It also implicitly reframes science as nothing more than a pipeline for results.

My view is that AI use can go either way, depending on how you approach it. What best determines whether it’s fraud or not is the attitude you bring. AI can make your research less fraud-prone if you’re the kind of person who is already very picky about what you send out to the world. But it can help you fool yourself and others if you let competitiveness and an obsession with metrics drive how you use it.

As a final comment, there’s some irony in using terms like “optimism” to talk about this. I described the pro-automated-science argument above as “optimistic” because I think that’s how many in this camp see themselves: as fundamentally optimistic about the future of science and our ability to improve it by using AI. But the underlying motivation, figuring out how to produce papers with as little human oversight as possible, is also often deeply pessimistic. A common narrative is the “review death spiral”: AI production stresses the review system, which increases the noisiness of paper acceptance decisions, which further incentivizes submitting sloppy AI-produced papers. The answer is presumed to be putting more AI in place on both sides. The idea that scientists have agency and could continue to shape the meaningfulness of what gets produced starts to seem out of the question.

Increasingly, a lot of the most enthusiastic pro-AI discourse (including for science) strikes me as nihilism masquerading as optimism. We have people who perceive themselves as huge optimists who will reshape science or society for the better, while simultaneously lacking the imagination to see beyond their own technological determinism. It reminds me a bit of the “optimism” associated with some open science and science reform positions, whose proponents also suggest that we just need the right technology to fix the problems (though in that case it’s heuristics like replication or preregistration). It’s a fundamentally non-agentic view of human scientific endeavor.