Pushback on use of hidden prompts to snare AI peer reviews | The Transmitter: Neuroscience News and Perspectives
Secret messages: Invisible instructions embedded in a document direct an LLM to use telltale phrases in a peer-review report.
Photograph by Richard Drury
Scientists decry conference’s use of hidden prompts to snare AI peer reviews
The invisible messages, which instruct large language models to use telltale phrases in a peer-review report, are effective in catching artificial-intelligence misuse but also erode trust, some say.
By
Dalmeet Singh Chawla
1 July 2026 | 4 min read
Dalmeet Singh Chawla
Contributing writer
Tags:
Publishing,
Artificial intelligence
Organizers of a prominent neuroscience conference are facing pushback on social media after adding hidden prompts to submitted papers to catch peer reviewers who use generative artificial intelligence (AI) to referee them.
The 40th Annual Conference on Neural Information Processing Systems (NeurIPS)—which is slated to take place in Sydney, Australia, in December 2026—bans peer reviewers from uploading papers they referee to AI chatbots, as the practice breaches confidentiality. Reviewers can still use AI chatbots for background research purposes, according to the policy outlined in the conference’s handbook.
To enforce the policy and catch illicit AI use in peer review, the event’s organizers have included deliberately concealed instructions for large language models (LLMs) in papers sent out for peer review.
The instructions tell an LLM to use telltale phrases—such as “This work addresses the central challenge” and “The claims of the paper”—in a peer-review report. Some researchers have already been caught trying to sneak secret messages into their papers in a bid to game AI tools into giving them favorable referee reports. Many publishers ban the use of AI in peer review.
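To make the detection side of this mechanism concrete, here is a minimal Python sketch of how a submitted review could be screened for planted phrases. It is an illustration only, not the NeurIPS or ICML organizers' method, which neither has described in detail. The function names and the matching rule (case-insensitive, whitespace-collapsed substring search) are assumptions; the two phrases are the examples quoted above.

```python
# Hypothetical sketch: flag review text that contains planted "telltale" phrases.
# The phrase list uses the two examples quoted in the article; the real list and
# the organizers' actual matching logic are not public.
TELLTALE_PHRASES = (
    "This work addresses the central challenge",
    "The claims of the paper",
)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not hide a match."""
    return " ".join(text.lower().split())


def find_telltale_phrases(review: str, phrases=TELLTALE_PHRASES) -> list[str]:
    """Return the planted phrases that appear in the review, in listed order."""
    haystack = normalize(review)
    return [phrase for phrase in phrases if normalize(phrase) in haystack]


if __name__ == "__main__":
    sample = "This work addresses the\ncentral challenge of sparse coding."
    print(find_telltale_phrases(sample))
```

A match here is only a signal for follow-up, not proof of misuse: a short phrase such as “The claims of the paper” could plausibly appear in a review written without any AI tool.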
Multiple researchers refereeing papers for NeurIPS have taken to social media to express their concerns about the indirect prompt injections inserted into papers.
“Designing a trap that presumes bad faith corrodes the relationship the whole system depends on,” Sören Auer, a computer scientist at Leibniz University Hannover, wrote on LinkedIn. “You do not build a healthy reviewing culture by treating your reviewers as suspects.”
But others see merits in the approach. A similar prompt-injection effort has caught hundreds of reviewers misusing LLMs in submissions for next week’s 43rd International Conference on Machine Learning (ICML 2026) in Seoul, South Korea, according to Nihar Shah, a computer scientist at Carnegie Mellon University and scientific integrity chair of that conference.
In a statement to The Transmitter, the NeurIPS organizing committee says it can’t discuss injected prompts in detail “without eroding the effectiveness of this intervention.”
Auer told The Transmitter he was assigned eight NeurIPS papers to review. He says he sometimes converts PDF files into Microsoft Word documents when carrying out peer review, which renders some prompts visible.
Auer says he initially rejected the first paper he was reviewing because he thought the prompts had been inserted by the study’s authors. But he removed the flag after discovering hidden prompts in a second paper and seeing researchers discussing this issue on a Reddit thread.
It’s possible that more papers are being rejected because referees don’t know that prompts were inserted by conference organizers, he says. “I personally think it’s not good to prohibit the use of AI,” Auer adds. “We should rather, of course, have a discussion on how to use it.”
The NeurIPS committee has been replying directly to any reviewer who has noticed the hidden prompts, telling them not to penalize individual papers, according to the statement.
Like Auer, Sara Atito, an AI researcher at the University of Surrey, told The Transmitter she spotted the same prompt in all four papers she reviewed for NeurIPS. She says she also found it in the version of her own paper that NeurIPS organizers created before sending the paper out for peer review.
Atito calls hidden prompts a “poor mechanism,” arguing that they may filter out some problematic submissions but won’t solve the bigger problems with peer review. “We put too...