AI content detectors are most confident on clean machine output, but that is rarely what anyone submits. Once a human edits even a fifth of the words, accuracy drops sharply and the resulting score is an estimate, not a verdict. Understanding why that happens changes how much weight you should give any detector result.
Why Editing Breaks the Signal
Detectors do not read for meaning. They read for pattern. Most measure perplexity, a score of how predictable each word is given the words around it. Machine text tends to be smooth and predictable. Human text wanders and surprises itself.
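To make perplexity concrete, here is a minimal sketch using a toy bigram model with add-one smoothing. This is an illustration under loud assumptions: real detectors score text with large neural language models, not word-pair counts, and the names `train_bigram` and `perplexity` are hypothetical, not any detector's API. The only point is that word sequences the model has seen before score lower than sequences that surprise it.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count single words and adjacent word pairs in a reference corpus."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams, set(tokens)


def perplexity(tokens, model):
    """Perplexity of a token list under an add-one smoothed bigram model.

    Lower values mean each word was easier to predict from the word before it.
    """
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to score")
    unigrams, bigrams, vocab = model
    v = len(vocab) + 1  # one extra slot so unseen words get nonzero probability
    log_prob = 0.0
    pairs = list(zip(tokens, tokens[1:]))
    for prev, word in pairs:
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(pairs))


# Toy reference corpus (an assumption; stands in for the model's training data).
model = train_bigram("the cat sat on the mat the cat sat on the rug".split())

predictable = "the cat sat on the mat".split()  # follows the corpus patterns
surprising = "the rug sat on the cat".split()   # same words, unfamiliar order

print(perplexity(predictable, model))
print(perplexity(surprising, model))
```

The second sentence uses only words the model knows, yet it scores a higher perplexity because the pair "rug sat" never appeared in the corpus. A detector built on this kind of score is reacting to word order and word choice, not to what the sentence says.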
Editing disrupts that predictability. When a person swaps a word, splits a sentence, or reorders a paragraph, they scuff the surface the detector was scanning. The underlying structure may still be machine-shaped, but the telltale smoothness is gone.
Edited drafts are the hard case because the detector was taught to sort two clean piles. Edited AI text is a third pile it never learned.
What the Benchmarks Show
The accuracy numbers tell a consistent story. Raw AI output gets flagged at high rates. Edited or paraphrased AI text drops detector accuracy to roughly 60 to 85 percent.
A study of five commercial detectors, testing AI-generated replicas of nearly 6,000 research papers, reported false negative rates from 0.3% all the way to 99.6%. The same family of tools, the same kind of text, and results that scatter from nearly perfect to nearly useless.
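For readers unfamiliar with the metric, a false negative here is AI-written text that the detector passes as human. The sketch below shows how the rate is computed. The function name `false_negative_rate` is hypothetical, and the counts are invented round numbers chosen only to reproduce the two extremes quoted above. They are not the study's data.

```python
def false_negative_rate(missed_ai, caught_ai):
    """Share of AI-written samples that the detector labeled as human."""
    total = missed_ai + caught_ai
    if total == 0:
        raise ValueError("need at least one AI-written sample")
    return missed_ai / total


# Invented counts for illustration: 1,000 AI-written samples per detector.
best_case = false_negative_rate(missed_ai=3, caught_ai=997)
worst_case = false_negative_rate(missed_ai=996, caught_ai=4)

print(f"{best_case:.1%}")
print(f"{worst_case:.1%}")
```

At the low end the detector misses 3 samples in 1,000. At the high end it catches only 4. Both results would be reported under the same label, "AI detector".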
Before trusting any detector verdict, check whether it publishes accuracy figures for edited text, not just raw AI output. A detector that only proves itself on clean AI is answering a question almost no one is really asking.