A history essay written in formal, period-appropriate prose can read to an AI detector like machine output. Students at UC Davis and other universities have reported exactly this pattern: careful academic writing, especially on historical topics, getting flagged at high confidence by tools the institution treats as authoritative. The accusation feels surreal because the writing style that triggered it is the same style the assignment asked for.
The accusation pattern
The typical sequence is consistent across the cases that have surfaced publicly. A student submits a humanities or history paper. The instructor runs it through Turnitin's AI writing indicator or a third-party tool such as GPTZero or Originality.ai. The score comes back high. An academic integrity referral follows, often with the detector score as the primary or only piece of evidence.
What is striking is how often the flagged writing shares the same characteristics: formal register, longer sentences, restrained vocabulary, period-appropriate phrasing when discussing primary sources, and minimal contractions or colloquialisms. These are the markers of competent academic prose. They are also, by the statistical metrics detectors rely on, the markers of low perplexity, meaning text that a language model finds highly predictable.
Why archaic phrasing trips detectors
AI detectors estimate how predictable each word in a passage is to a language model, given the words before it. That predictability is summarized as perplexity: text built from common, expected phrasing scores low, and low perplexity is what most detectors flag as AI-likely. A student writing about the French Revolution who uses phrases like "the prevailing sentiment among the Third Estate" or "in the aftermath of the Estates-General" is producing exactly the kind of measured, formulaic phrasing that drives perplexity down.
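To make the perplexity idea concrete, here is a minimal sketch using a toy bigram model with add-one smoothing. It is an illustration only: real detectors rely on large neural language models and proprietary scoring, and the tiny corpus, the function names, and the two sample phrases below are all assumptions invented for this example.

```python
import math
from collections import Counter

def train_bigram(tokens):
    """Count single words and adjacent word pairs in a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def perplexity(tokens, unigrams, bigrams):
    """Perplexity of `tokens` under an add-one-smoothed bigram model."""
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Smoothed estimate of P(word | prev); unseen pairs get a small share.
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    n = len(tokens) - 1  # number of predicted words
    return math.exp(-log_prob / n)

# Hypothetical stand-in for the text a language model was trained on.
corpus = (
    "in the aftermath of the estates general the prevailing sentiment "
    "among the third estate was one of resentment in the aftermath of "
    "the war the prevailing sentiment was relief"
).split()
unigrams, bigrams = train_bigram(corpus)

# A stock phrase the model has seen versus the same kind of words scrambled.
familiar = "in the aftermath of the war".split()
unusual = "relief war among estates one sentiment".split()

familiar_ppl = perplexity(familiar, unigrams, bigrams)
unusual_ppl = perplexity(unusual, unigrams, bigrams)
print(f"familiar phrasing: {familiar_ppl:.1f}")
print(f"unusual phrasing:  {unusual_ppl:.1f}")
```

In this toy setup the stock phrase scores lower perplexity than the scrambled one, simply because the model has seen its word sequences before. The same effect, at far larger scale, is why conventional academic phrasing can look machine-like to a detector.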
Quoting or paraphrasing primary sources compounds the problem. Widely digitized historical documents are very likely part of the training data of major language models. When a student paraphrases an 18th- or 19th-century source, the resulting prose can statistically resemble the kind of text those same models now produce. A detector that only scores the text cannot tell the difference between a careful student paraphrase and AI-generated content built from the same source material.
What the research says about writing style
The 2023 Stanford study by Liang et al., published in the journal Patterns, demonstrated that detectors flag non-native English writing at sharply higher rates than native-speaker writing, largely because non-native writers tend to use simpler, more predictable structures. The same statistical logic applies to formal academic prose. The Weber-Wulff et al. study (2023), published in the International Journal for Educational Integrity, tested fourteen detection tools and concluded that none was reliable enough for institutional decision-making.
Neither study singled out history essays, but the mechanism is the same: any writing style that favors clarity, formality, and restrained vocabulary produces text the detectors associate with AI. For a deeper look at the bias against formal and non-native writing, see our summary of the Turnitin research on non-native English speakers.
Evidence that matters in a history-essay case
The strongest defenses in essay-flagged cases combine process evidence with a direct critique of what the detector actually measured. Useful documentation includes:
- Version history from Google Docs or Microsoft Word, showing the essay built up over hours or days with edits, deletions, and revisions
- Research notes, library checkouts, database access logs, and bookmarked sources that map to the citations in the paper
- Earlier writing samples in your own voice on similar formal topics, to establish that the flagged style is consistent with your prior work
- The specific detector name, version, and score, plus any threshold the institution claims is meaningful
- Comparison runs of the same essay through other detection tools, which frequently disagree
What to do if your essay was flagged
Move quickly on process documentation before any of it disappears. Export your document's version history, save your browser bookmarks and library access records, and write a timeline of how you researched and drafted the paper while it is fresh. Request, in writing, the detector score, the tool name, and any policy that defines what threshold the institution treats as significant.
Read your school's academic integrity policy carefully. Most policies require evidence of a violation, not merely a probabilistic flag. The procedural rights FAQ covers what you are typically entitled to request before a hearing. If you are drafting a written response, NotBot generates a personalized defense package that addresses your specific detector, documents your writing process, and cites the relevant research on detection limitations. If you have already had a hearing and the outcome went against you, the appeal package is built around the procedural and evidentiary grounds appeals tend to turn on.
If the potential consequences include suspension, expulsion, or visa impact, consult an education law attorney in your jurisdiction before your hearing.