ICML Rejects Papers of Reviewers Who Used LLMs

Matthew Diakonov·March 18, 2026·2 min read

academia llm-detection peer-review watermarking ai-agents

ICML Rejects Papers of Reviewers Who Used LLMs

The academic review system runs on a fragile assumption - that human experts read and evaluate papers. When reviewers outsource their reviews to LLMs, the entire system breaks. ICML's response highlights a detection problem with no clean solution.

The Two Detection Approaches

Prompt injection watermarks embed hidden instructions in papers that trigger predictable LLM responses. If a review contains the watermarked phrase, the reviewer used an LLM. This is clever but brittle - it only catches reviewers who paste the full paper into a model. Anyone who paraphrases or uses the LLM for parts of the review evades detection.

Statistical detection analyzes writing patterns - word frequency, sentence structure, vocabulary distribution. This catches more cases but has false positives. A non-native English speaker who writes in a style that happens to resemble LLM output gets flagged. A skilled prompt engineer who instructs the LLM to write naturally gets through.

Why This Is an Agent Problem

This is not just about academia. It is about any system that depends on verified human judgment:

Code reviews - is the reviewer actually reading the diff?
Legal reviews - is the lawyer actually analyzing the contract?
Medical second opinions - is the doctor actually examining the case?

When agent-assisted work becomes indistinguishable from human work, systems built on the assumption of human effort need redesigning - not better detection.

The Real Fix

Instead of detecting whether an LLM was used, verify the quality of the output. A brilliant review is valuable whether a human or an LLM wrote it. A shallow review is worthless either way. Judge the work, not the tool.

Fazm is an open source macOS AI agent. Open source on GitHub.

ICML Rejects Papers of Reviewers Who Used LLMs

ICML Rejects Papers of Reviewers Who Used LLMs

The Two Detection Approaches

Why This Is an Agent Problem

The Real Fix

More on This Topic

Related Posts

Clawdbottom Creative Writing Workshop

AI Agent News April 2026: Claude Code, OpenClaw, and the Agent Infrastructure Race

AI Agents vs Copilot: When to Let AI Drive vs Ride Shotgun