Course → Module 1: What Makes Slop, Slop
Session 5 of 10

Identifying AI-generated content is not about gut feeling. It is a diagnostic skill. Like a mechanic listening to an engine or a doctor reading a blood panel, detection relies on identifying specific, nameable markers. This session covers the first eight.

Marker 1: The "Comprehensive Guide" Opening

AI articles disproportionately open with framing that promises comprehensiveness. "In this comprehensive guide, we'll explore..." or "This definitive guide covers everything you need to know about..." The pattern signals that the AI has been prompted for a long-form article and is dutifully announcing its scope before delivering content.

Human writers rarely announce that their article is comprehensive. They just write the article and let the reader judge its scope.
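Markers like this one can be screened for mechanically. A minimal sketch in Python (the phrase list is illustrative, not an exhaustive catalog of guide-opening phrases):

```python
import re

# Illustrative opening phrases; extend this list for real use.
GUIDE_OPENINGS = [
    r"in this comprehensive guide",
    r"this definitive guide covers",
    r"everything you need to know about",
]

def has_guide_opening(text: str, window: int = 300) -> bool:
    """Check whether the first `window` characters contain a
    'comprehensive guide' style opening (Marker 1)."""
    head = text[:window].lower()
    return any(re.search(pattern, head) for pattern in GUIDE_OPENINGS)
```

Restricting the scan to the opening window matters: the phrase is only a marker when it frames the whole article, not when it appears in a quoted example.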

Marker 2: Tricolon Abuse

A tricolon is a rhetorical device using three parallel elements: "efficient, effective, and engaging." AI uses tricolons constantly because they appear in training data at high frequency (speeches, marketing copy, inspirational content) and because they are structurally simple to generate. The problem is not the tricolon itself. The problem is density. When every other sentence deploys one, the text sounds like a graduation speech.
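Because the problem is density rather than presence, tricolons are worth counting rather than flagging. A rough regex sketch (it catches the common "X, Y, and Z" shape, not every tricolon):

```python
import re

# Three comma-separated single words joined by "and"/"or" before the last,
# e.g. "efficient, effective, and engaging". Multi-word items are missed.
TRICOLON = re.compile(r"\b\w+,\s+\w+,\s+(?:and|or)\s+\w+\b", re.IGNORECASE)

def tricolon_count(text: str) -> int:
    """Count tricolon-shaped phrases (Marker 2)."""
    return len(TRICOLON.findall(text))
```

One hit per thousand words is rhetoric; a hit every other sentence is the pattern this marker describes.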

Marker 3: The False Bridge

"But here's the thing..." "Here's what most people miss..." "The truth is..." These transitions promise a reveal. A contrarian insight, a hidden truth, a perspective shift. In AI text, the sentence after the bridge is almost always a restatement of common knowledge, not an actual insight. The bridge is a formatting pattern borrowed from persuasive writing, deployed without the substance that makes it work.

Marker 4: Premature Summarization

"In summary" or "To sum up" appearing in paragraph 3 of a 10-paragraph article. AI models track output length imperfectly and sometimes signal closure long before the content is actually finished. The result is a text that appears to wrap up, then continues for another 700 words, often repeating what was just "summarized."

Each forensic marker is a pattern the model learned from training data and deploys without the context that makes it effective. The marker is a shadow of a writing technique, used without understanding.

Marker 5: The Hollow Metaphor

AI uses metaphors frequently. "Think of your database as a library." "Your code is like a recipe." These metaphors sound explanatory but often break down immediately under scrutiny. A library has a card catalog, a Dewey Decimal system, a librarian, physical shelves. If the metaphor does not map to these details, it is not illuminating the concept. It is decorating it.

The hollow metaphor is distinct from a useful metaphor. A useful metaphor extends across multiple levels of comparison. A hollow metaphor makes one surface-level connection and abandons the mapping.

Marker 6: Over-Attribution

"According to experts..." "Research suggests..." "Studies have shown..." These attributions sound authoritative. They cite no specific expert, no specific study, no specific research. This pattern exists because RLHF rewards responses that appear well-sourced, but the model cannot actually cite sources (it does not have access to a bibliography). The result is attribution theater: the form of citation without the substance.

Marker 7: The Enthusiasm Spike

In otherwise flat, measured prose, a sudden exclamation mark appears. "This approach can truly transform your results!" The spike is jarring because it breaks the established tone. It occurs because the model shifts between different tonal registers learned from different training data: the measured register of informational content and the enthusiastic register of marketing copy.

Marker 8: Synonym Cycling

AI avoids using the same word twice in close proximity. This is a learned pattern from writing advice ("vary your vocabulary"). In practice, it produces sentences like: "The methodology was effective. This approach proved successful. The technique demonstrated its value." Three sentences saying exactly the same thing with different synonyms. The cycling creates an illusion of depth while adding zero new information.
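Synonym cycling can be approximated by looking for three-sentence windows where each sentence draws a different member from the same synonym set. A sketch, assuming hand-picked synonym sets (real detection would need a thesaurus or embedding similarity):

```python
# Illustrative synonym sets; these are assumptions, not a standard lexicon.
CYCLING_SETS = [
    {"method", "methodology", "approach", "technique"},
    {"effective", "successful", "valuable"},
]

def synonym_cycling_score(sentences: list[str]) -> int:
    """Count 3-sentence windows in which some synonym set contributes
    a distinct member to every sentence (Marker 8)."""
    score = 0
    for i in range(len(sentences) - 2):
        window = [set(s.lower().replace(".", "").split())
                  for s in sentences[i:i + 3]]
        for syn in CYCLING_SETS:
            members = [syn & words for words in window]
            # Every sentence uses the set, and at least 3 distinct members appear.
            if all(members) and len(set().union(*members)) >= 3:
                score += 1
                break
    return score
```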

Marker Reference Table

| # | Marker | Pattern | Why AI Does It |
|---|--------|---------|----------------|
| 1 | Comprehensive Guide Opening | "In this comprehensive guide..." | Prompt-following for long-form requests |
| 2 | Tricolon Abuse | "Efficient, effective, engaging" | High-frequency pattern in training data |
| 3 | False Bridge | "But here's the thing..." | Borrowed persuasive structure without substance |
| 4 | Premature Summarization | "In summary" at paragraph 3/10 | Imprecise output length tracking |
| 5 | Hollow Metaphor | "Think of it like a library..." | Surface-level pattern matching on explanatory text |
| 6 | Over-Attribution | "Studies show..." (no citation) | RLHF rewards authoritative-sounding claims |
| 7 | Enthusiasm Spike | "This is truly amazing!" | Tonal register mixing from diverse training data |
| 8 | Synonym Cycling | Method/approach/technique in 3 sentences | Vocabulary variation learned from writing advice |
```mermaid
graph LR
    A["Read AI text"] --> B["Scan for markers 1-8"]
    B --> C{"Count instances"}
    C -->|"0-5 per 1000 words"| D["Likely human-written<br/>or well-edited AI"]
    C -->|"6-15 per 1000 words"| E["Moderate AI markers<br/>Light editing applied"]
    C -->|"15+ per 1000 words"| F["Unedited AI output<br/>Standard slop specimen"]
```
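The density thresholds in the decision flow reduce to a small classifier. A sketch (the boundary case of exactly 15 per 1,000 is assigned to the middle band here, which is one reasonable reading of the ranges):

```python
def classify_density(marker_count: int, word_count: int) -> str:
    """Apply the marker-density thresholds (markers per 1,000 words)."""
    per_1000 = marker_count / word_count * 1000
    if per_1000 <= 5:
        return "likely human-written or well-edited AI"
    if per_1000 <= 15:
        return "moderate AI markers; light editing applied"
    return "unedited AI output; standard slop specimen"
```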

These eight markers are not individually conclusive. Any human writer might use a tricolon or a bridge phrase occasionally. The diagnostic power comes from density. One tricolon in a thousand words is rhetorical skill. Five tricolons in a thousand words is a model following a pattern. The next session covers markers 9 through 15.

Assignment

  1. Take a 2,000-word AI-generated article on any topic.
  2. Using markers 1 through 8, annotate the text. Mark every instance with a label (M1, M2, M3, etc.).
  3. Count the total markers found. Calculate the marker density (markers per 1,000 words).
  4. Create a table: Marker Number | Instance Found | Location (paragraph #) | Notes.
  5. If you find more than 30 instances across all 8 markers in 2,000 words (more than 15 per 1,000), you have a standard specimen of slop.