AI Takes a Leading Role in Scientific Research at Groundbreaking Conference
In an unprecedented scientific experiment, researchers from diverse fields gathered at the virtual Agents4Science 2025 conference on October 22 to explore a bold question: How effective can artificial intelligence be as a scientific collaborator? Unlike traditional academic conferences, this innovative event specifically welcomed submissions where AI did most of the heavy lifting—from generating hypotheses to analyzing data and even conducting initial peer reviews.
The conference, co-organized by Stanford University computer scientist James Zou, represented what he called “an interesting paradigm shift” in how scientists approach research. “People are starting to explore using AI as a co-scientist,” Zou explained. The event served as a controlled experiment to evaluate AI’s capabilities in scientific contexts, with all materials made publicly available for further study. This approach stands in stark contrast to most scientific journals and conferences, which typically ban AI co-authorship and prohibit peer reviewers from using AI tools—policies intended to prevent hallucinations and other AI-related issues that might compromise scientific integrity.
The conference accepted 48 of 314 submissions, spanning economics, biology, engineering, and other disciplines. Each accepted paper had to document in detail how humans and AI collaborated throughout the research and writing process. One notable example came from Min Min Fong, an economist at the University of California, Berkeley, whose team partnered with AI to analyze car-towing data from San Francisco. Their study found that waiving high towing fees significantly helped low-income individuals retain their vehicles. Fong acknowledged AI’s computational strengths but stressed the need for human oversight: “You have to be really careful when working with AI,” she noted, citing an instance in which the AI repeatedly referenced an incorrect implementation date for San Francisco’s fee-waiving policy, an error she discovered only by consulting the original source materials.
The AI-assisted research drew mixed reviews from evaluators such as Risa Wechsler, a computational astrophysicist at Stanford. While she found the papers technically sound, she described them as “neither interesting nor important,” suggesting that current AI systems still lack the ability to “design robust scientific questions.” Wechsler cautioned that AI’s technical proficiency might sometimes “mask poor scientific judgment,” highlighting the continuing need for human expertise in formulating meaningful research questions and interpreting results within broader scientific contexts.
Despite these limitations, some participants found promising applications for AI in the scientific process. Silvia Terragni, a machine learning engineer at Upwork in San Francisco, reported that after she gave ChatGPT background information about her company’s challenges, the model proposed several paper ideas, one of which was ultimately named among the conference’s top three papers. Her team’s winning submission examined AI reasoning in job marketplaces, leading Terragni to conclude that AI “can actually come up with novel ideas” when properly guided.
The Agents4Science conference represents an important milestone in understanding both the potential and limitations of AI in scientific research. While current AI systems excel at computational tasks and can occasionally generate innovative concepts, they still require significant human guidance, fact-checking, and scientific judgment. As Fong succinctly put it, “The core scientific work still remains human-driven.” Nevertheless, the conference demonstrated that when humans and AI collaborate effectively, they can produce meaningful research that might point toward a future where artificial intelligence becomes an increasingly valuable partner in scientific discovery.