At most scientific publications, papers co-authored by artificial intelligence (AI) are not welcome. At a new open platform called aiXiv, they are embraced. The platform goes all in on AI: It accepts both AI- and human-authored work, uses built-in AI reviewers for baseline quality screening, and guides authors through revisions based on the chatbots’ feedback. “AI-generated knowledge shouldn’t be treated differently,” says Guowei Huang, one of aiXiv’s creators and a Ph.D. candidate specializing in AI and business at the University of Manchester. “We should only care about quality—not who produced it.”
The platform is still at an early stage; after a mid-November update, it hosts just a few dozen papers and early-stage proposals. But many researchers say it promises a welcome reprieve for the overloaded human peer-review system, which has been forced to shoulder the ongoing surge of papers driven by both legitimate and banned use of AI.
“It’s extremely important that the automated science community take responsibility for how they are going to evaluate their own research,” says Thomas Dietterich, an emeritus professor of computer science at Oregon State University.
However, Dietterich and others caution that aiXiv and other AI-friendly experiments will inevitably have to contend with long-standing challenges in science such as fraudulent research and superficial peer review. “These models continue to become better and better mimics of what scientific research looks like, but not necessarily better and better scientists,” he says. “How will they ensure that the research is real?”
Right now, there’s little agreement among science publishers about how to handle AI, which is boosting the output of papers in many fields. Many journals (including Science) still ban AI-generated manuscripts outright. Others permit AI-assisted writing but require disclosure.
Preprint servers are also feeling the crush. Citing an uptick of suspect papers, arXiv announced in October it would no longer host computer science review and position papers unless they had already undergone peer review. “Automated, AI-written documents really undermine the arXiv model,” says Dietterich, who chairs the server’s computer-science section.
Others are turning to the tools themselves. Facing a surge of papers likely aided by AI, openRxiv, the nonprofit that runs bioRxiv and medRxiv, said last month it would add an AI review tool to rapidly generate feedback on its preprints.
But these servers, like most conferences and journals, still bar naming AI systems as authors—a stance that inadvertently pushes researchers to use AI without saying so. Huang calls that lack of transparency “totally unacceptable.”
That’s one of the reasons he joined forces with collaborators from institutions including the University of Toronto (UToronto), the University of Oxford, and Tsinghua University to create aiXiv. The platform, which also has partnerships with tech firms such as Singapore-based Bohrium and China-based DP Technology, bills itself as the first “structured peer-review and iterative publishing infrastructure for the AI-era of scalable scientific output.”
After a submission comes in, five “agents”—large language models tuned to autonomously complete tasks—assess its novelty, technical soundness, and potential impact. The system includes defenses against foul play: For example, it can detect if authors try to smuggle hidden instructions into manuscripts to elicit favorable reviews. If three of the five agents recommend acceptance, the work is posted. Based on the agents’ feedback, authors can then revise and resubmit for another review cycle. Preliminary tests, the authors claim in an August preprint, suggest these iterative loops improve the quality of AI-generated papers. AiXiv’s infrastructure can support thousands of submissions, Huang adds, and it typically generates reviews in 1 or 2 minutes as compared with months or years for conventional peer review.
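The acceptance rule described above (post the work once at least three of the five agent reviewers recommend it; otherwise revise using their feedback and resubmit) can be sketched in a few lines of Python. This is purely an illustrative assumption of how such a loop might be structured, not aiXiv’s actual code; all names and thresholds here are hypothetical.

```python
# Illustrative sketch (not aiXiv's implementation) of the review loop
# described in the article: five agent reviewers each cast an
# accept/reject vote, and a submission is posted once at least three
# recommend acceptance; otherwise the author revises and resubmits.

def agent_votes(manuscript, agents):
    """Collect a True/False recommendation from each review agent."""
    return [agent(manuscript) for agent in agents]

def review_cycle(manuscript, agents, revise, max_rounds=5):
    """Iterate review rounds: post on 3-of-5 acceptance, else revise."""
    for round_num in range(1, max_rounds + 1):
        votes = agent_votes(manuscript, agents)
        if sum(votes) >= 3:          # three of five agents accept
            return "posted", round_num
        manuscript = revise(manuscript, votes)  # revise per feedback
    return "not posted", max_rounds

# Toy usage: agents that "accept" any manuscript longer than a
# threshold, and a reviser that lengthens the draft each round.
agents = [lambda m, t=t: len(m) > t for t in (10, 20, 30, 40, 50)]
revise = lambda m, votes: m + " more detail"
status, rounds = review_cycle("short draft", agents, revise)
```

The toy agents and reviser stand in for the large language models the platform actually uses; only the 3-of-5 threshold and the revise-and-resubmit loop come from the article.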
So far, the site has a mix of submissions, ranging from unorthodox physics theories to a study using “thermal comfort tests” to assess AI’s perceptual ability. According to Pengsong Zhang, an AI researcher at UToronto, the site isn’t formally “accepting” papers right now. Rather, all submissions are being posted as the platform experiments with refinements to its AI reviewers.
Early experiments from elsewhere suggest how difficult this problem may be. Agents4Science, an October conference that was the first to cast AI as both first authors and reviewers, showed mixed results: Reviewer systems were good at catching numerical inconsistencies or mismatched references, but often offered overly rosy views of a paper’s novelty or impact. And compared with a tightly structured conference, a preprint server brings “a lot more heterogeneity and slop that can be harder to catch,” says Agents4Science organizer James Zou, a biomedical data scientist and AI researcher at Stanford University.
People will be closely watching to see whether aiXiv can avoid those pitfalls. “They must be vigilant to ensure aiXiv does not become a dumping ground,” adds Sebastian Porsdam Mann, a bioethicist at the University of Copenhagen. “If the platform becomes associated with low-quality volume over scientific rigor, it will delegitimize the entire field of AI-led inquiry and throw any good AI research under the bus.”
AiXiv’s founders remain optimistic. Zhang says the team has collected yet-to-be-published data comparing aiXiv’s agent reviews with human-written reviews of 30 papers for a large robotics conference held in October. According to Zhang, the AI reviewers largely aligned with human scores and, in some cases, provided more detailed feedback.
If nothing else, “a lot of the issues raised about AI review and AI science are true also of the current system,” Porsdam Mann says. “AI just forces the issues.” Carnegie Mellon University computer scientist Nihar Shah agrees, saying aiXiv’s arrival is simply accelerating conversations about how to better “track provenance and certify in a verifiable manner that people [or AI] have actually done the experiments.”
As Shah braces for the “almost inevitable outcome of AI doing a lot of research,” he says the questions aiXiv raises aren’t going anywhere. “Whether this platform flies or not, they have already started a wonderful discussion. That already is quite commendable.”