How to Detect AI-Written Text: Language Markers

How we know this

We built our own AI detector — here is what we learned

When AI texts flooded the SERPs, we had to check client content by hand. Paid services (Copyleaks, Originality.ai) helped but never explained the "why". Once we understood their logic, we realized they analyze text with… AI itself. So we built our own free AI detector that doesn't just give a verdict — it shows where and how the AI left its traces.

This article distills what we learned from thousands of checked texts. From our observations across client projects:

85% of AI texts contain the template marker phrases listed below (SEOquick observations)

90% of AI texts have perfectly even grammar and punctuation; humans don't write like that (SEOquick observations)

~25% of sites that came to us with traffic drops had lost that traffic after mass-publishing raw AI content (SEOquick project stats)

Important: Google doesn't punish AI as such — it punishes useless content created to manipulate rankings. Checking text for AI is a check for "rawness", not a witch hunt.

Google's official position is in its AI content guidance: quality matters, not the production method. But raw machine text is recognizable — by algorithms and by readers. Here is exactly how.

Level 1 · language

GPT language markers: the phrases that give the machine away

GPT builds text from statistically frequent constructions. It sounds human, but over a paragraph the recognizable clichés emerge. The most common openers:

In today's world
It's no secret that
It is worth noting that
It is important to understand
One of the key aspects is
One should consider
Thus
This article explores

The second group is the bureaucratic glue that AI uses to start sentences: given that, within the framework of, in the context of, based on, with regard to, in accordance with. The third group is empty generalizations, which GPT uses to fill a paragraph when it has no facts:

There are many ways to
Each case requires an individual approach
There is no definitive answer
Several factors must be considered
This is especially relevant in

If such constructions appear back-to-back and there are no concrete facts, numbers, or examples — you are almost certainly looking at raw AI. How to add facts properly — see our copywriter brief guide.
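The phrase hunt described above can be sketched in a few lines of Python. This is a minimal illustration, not the Unmiss detector's actual logic: the phrase lists are copied from this article, while the names `count_markers`, `normalize`, `OPENERS`, `GLUE`, and `FILLER` are made up for the example.

```python
import re

# Phrase lists copied from the article; extend them for your own niche.
OPENERS = [
    "in today's world", "it's no secret that", "it is worth noting that",
    "it is important to understand", "one of the key aspects is",
    "one should consider", "thus", "this article explores",
]
GLUE = [
    "given that", "within the framework of", "in the context of",
    "based on", "with regard to", "in accordance with",
]
FILLER = [
    "there are many ways to", "each case requires an individual approach",
    "there is no definitive answer", "several factors must be considered",
    "this is especially relevant in",
]


def normalize(text: str) -> str:
    """Lowercase the text and map the curly apostrophe to a straight one."""
    return text.lower().replace("\u2019", "'")


def count_markers(text: str, phrases: list[str]) -> dict[str, int]:
    """Return {phrase: hits} for every phrase found as whole words."""
    haystack = normalize(text)
    hits = {}
    for phrase in phrases:
        # \b keeps "thus" from matching inside words like "enthusiasm".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        found = len(re.findall(pattern, haystack))
        if found:
            hits[phrase] = found
    return hits


sample = (
    "In today's world, it is worth noting that there are many ways to "
    "improve a site. Thus, several factors must be considered."
)
for name, group in (("openers", OPENERS), ("glue", GLUE), ("filler", FILLER)):
    print(name, count_markers(sample, group))
```

A hit count on its own proves nothing; it only becomes a signal when the matches cluster together and the surrounding text has no concrete facts.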

GPT vs human: the comparison table

Trait	GPT	Human
Structure	Perfectly logical: thesis → argument → conclusion	Can be loose, improvised
Tone	Polite, academic, judgment-free	Emotional, personal, with humor
Transitions	Explicit connectors: "nevertheless", "thus"	Often intuitive, unmarked
Mistakes	None	Present — sometimes deliberate
Paragraphs	Same length, symmetrical	Uneven: from one line to a wall of text
Arguments	Always "by the textbook", no digressions	Sometimes illogical yet convincing

Level 2 · syntax

Machine logic: the rule of three and perfect symmetry

Even if you ban GPT's pet phrases, you can't fool the logic. Humans mess up: odd constructions, missing commas (my editor will confirm). GPT doesn't. Hence three stable patterns:

The rule of three. "Useful, structured, and fact-based" — GPT adores splitting ideas into threes: three adjectives, three bullets, three blocks under every heading (intro → explanation → conclusion).
Structural symmetry. Same-length paragraphs; each opens with a lead-in and closes with a bridge to the next. We asked GPT itself why — it answered: "I build text like a well-structured article, by the textbook."
Excessive politeness. Instead of "this doesn't work" — "some users may find this approach insufficiently effective under certain conditions". Bluntness, humor, and doubt are human; neutral diplomacy in every sentence is machine.

The visual "rhythm" of text: GPT produces same-length blocks with identical connectors; humans produce a ragged, living paragraph pattern.
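Paragraph symmetry and the rule of three can both be roughly measured. The sketch below is an assumption-laden illustration: it treats blank lines as paragraph breaks, uses the coefficient of variation of paragraph word counts as a stand-in for "rhythm", and catches only simple one-word triples with a regex. The function names are hypothetical, and the article gives no numeric cut-off, so none is hard-coded.

```python
import re
import statistics


def paragraph_lengths(text: str) -> list[int]:
    """Word count of each non-empty paragraph (blank-line separated)."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [len(block.split()) for block in blocks if block.strip()]


def length_variation(lengths: list[int]) -> float:
    """Coefficient of variation (stdev / mean). Lower means more uniform."""
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


# Approximation: matches "word, word, and word" or "word, word and word".
TRIPLE = re.compile(r"\b\w+, \w+,? and \w+\b")


def count_triples(text: str) -> int:
    """Count simple three-item enumerations in the text."""
    return len(TRIPLE.findall(text))


even = "a b c d\n\ne f g h\n\ni j k l"
ragged = "one\n\n" + " ".join(["word"] * 30)
print(length_variation(paragraph_lengths(even)))
print(length_variation(paragraph_lengths(ragged)))
print(count_triples("Useful, structured, and fact-based"))
```

Compare the variation figure against texts you know were written by people in the same genre; an absolute number means little without that baseline.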

Level 3 · code

Special-character evidence: invisible to the eye, visible in code

This is the most reliable part of our system. Humans rarely type these characters by hand, while GPT inserts them constantly. Open the text in HTML mode and look for:

— (&mdash;)

The em dash. Humans use 1–2 per text. GPT drops up to 19 per page.

“ ” (&ldquo; &rdquo;)

"Typographic" curly quotes. Almost never seen in real web copy: humans type plain "straight" quotes.

→ (&rarr;)

Arrow glyphs. A human draws an arrow the rustic way: hyphen + greater-than (->).

&nbsp; (0xa0)

The non-breaking space. Authors type regular spaces and don't bother.

’ (&rsquo;)

The "proper" apostrophe instead of the human '. Machine typography.

… (&hellip;)

The ellipsis character. Humans type three dots in a row...

&thinsp; (0x2009)

The thin space. Most authors don't even know it exists.

© ® (&copy; &reg;)

A human writes (c) or (R); these symbols aren't on the keyboard.
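Counting these characters is easy to script. The sketch below is one possible approach, not the detector's real code: it decodes HTML entities first with the standard-library `html.unescape`, so `&mdash;` in the source and a literal em dash are counted together. The table `SUSPECT_CHARS` and the function name `count_special_chars` are invented for the example.

```python
import html
from collections import Counter

# Code points for the characters listed above, keyed to a readable label.
SUSPECT_CHARS = {
    "\u2014": "em dash",
    "\u201c": "left curly quote",
    "\u201d": "right curly quote",
    "\u2192": "right arrow",
    "\u00a0": "non-breaking space",
    "\u2019": "curly apostrophe",
    "\u2026": "ellipsis",
    "\u2009": "thin space",
    "\u00a9": "copyright sign",
    "\u00ae": "registered sign",
}


def count_special_chars(raw_html: str) -> dict[str, int]:
    """Count suspect characters, whether written literally or as entities."""
    text = html.unescape(raw_html)  # turns &mdash;, &nbsp; etc. into characters
    counts = Counter(ch for ch in text if ch in SUSPECT_CHARS)
    return {SUSPECT_CHARS[ch]: n for ch, n in counts.items()}


print(count_special_chars("Fast &mdash; and clean&hellip; it\u2019s&nbsp;done"))
print(count_special_chars('plain "text" -> ok...'))
```

Keep in mind that some editors and CMS plugins convert straight quotes and dashes automatically, so check how the text was published before reading too much into a single character.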

Evidence in the markup

Perfectly closed tags. Every <p>, <li>, <div> closed to standard — without a single human slip.
Mechanical lists. <ul><li><p>Text</p></li></ul> instead of a simple <li>Text</li>.
<hr /> with the closing slash, plus horizontal divider lines between sections: GPT's signature, as good as a confession to Google.
data-start / data-end attributes in headings and lists — technical markup no human ever adds.
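Three of the markup traits above can be counted with Python's built-in `html.parser`. This is a hedged sketch under simple assumptions: the class name `MarkupAudit` and its counters are hypothetical, it checks only `data-start` / `data-end` attributes, a `<p>` nested directly in an `<li>`, and `<hr />` written with the closing slash, and it does not try to judge whether every tag is "perfectly closed".

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked as open.
VOID_TAGS = {"hr", "br", "img", "input", "meta", "link"}


class MarkupAudit(HTMLParser):
    """Counts a few machine-markup traits in an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.data_attrs = 0      # tags carrying data-start / data-end
        self.p_inside_li = 0     # <p> opened directly inside an <li>
        self.self_closed_hr = 0  # <hr /> written with the closing slash
        self._stack: list[str] = []

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if names & {"data-start", "data-end"}:
            self.data_attrs += 1
        if tag == "p" and self._stack and self._stack[-1] == "li":
            self.p_inside_li += 1
        if tag not in VOID_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerate sloppy markup.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass

    def handle_startendtag(self, tag, attrs):
        # Called only for XHTML-style tags such as <hr />.
        if tag == "hr":
            self.self_closed_hr += 1


audit = MarkupAudit()
audit.feed(
    '<h2 data-start="10" data-end="42">Title</h2>'
    "<ul><li><p>Text</p></li><li>Plain</li></ul><hr />"
)
print(audit.data_attrs, audit.p_inside_li, audit.self_closed_hr)
```

Run it on the raw HTML saved from the CMS or the page source, not on text copied from the rendered page, since copying strips the markup.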

Automation

Detection tools: where to start

Manual marker analysis is the most accurate but slow. At scale, the combo "detector + spot manual checks" works best:

Unmiss AI Detector (free)

Our tool: paste the text → get not just a verdict but a breakdown of where and how the AI left traces. Built on the experience behind this article.

Copyleaks

One of the most accurate commercial detectors, supports many languages. Good for checking contractors at scale.

Originality.ai

The western market standard: AI detection + plagiarism in one report. Paid, optimized for English.

GPTZero

The popular academic detector: scores text "perplexity" and "burstiness". Free tier available.

Fair warning: every detector makes mistakes. Well-edited AI text passes, while dry human officialese gets "caught". A detector verdict is a reason for a manual check — not a sentence.

By the way, building your own tool today is easier than it seems — see our AI tool development service. And for using AI in SEO the smart way — our ChatGPT for SEO guide and the 50 mega-prompts collection.

Practice

The manual check checklist: 7 steps

Search for marker phrases. Ctrl+F: "in today's world", "it is worth noting", "thus". 3+ hits — yellow flag.
Fact check. Are there concrete numbers, names, examples? Generalizations without facts are AI filler's main sign.
Paragraph rhythm. Step back from the screen: if every paragraph looks identical — that's machine symmetry.
The rule of three. Count the triples: three adjectives, three bullets, three blocks per section.
Code audit. Open the HTML and look for: the em dash (&mdash;) more than three times, curly quotes, data attributes, <hr />.
Run it through a detector. Unmiss / Copyleaks / GPTZero — for confirmation, not instead of your head.
The usefulness test. Google's key question: will the reader learn something the top-3 results don't offer? If not — it doesn't matter who wrote it.
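The countable steps of this checklist can be folded into a small triage helper. The sketch below is only an illustration: the `Signals` fields and the `yellow_flags` function are hypothetical names, the inputs are counts you have already gathered by hand or by script, and the only thresholds used are the two stated in the checklist (3+ marker phrases, the em dash more than three times). The usefulness test stays a human judgment and is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Counts collected during the manual check (all names are illustrative)."""
    marker_hits: int   # step 1: marker phrases found
    has_facts: bool    # step 2: concrete numbers, names, examples present
    em_dashes: int     # step 5: em dash count in the HTML
    curly_quotes: int  # step 5: curly quote count
    data_attrs: int    # step 5: tags with data-start / data-end
    hr_tags: int       # step 5: <hr /> dividers


def yellow_flags(s: Signals) -> list[str]:
    """Return the checklist items this text trips, in checklist order."""
    flags = []
    if s.marker_hits >= 3:
        flags.append("3+ marker phrases")
    if not s.has_facts:
        flags.append("no concrete facts")
    if s.em_dashes > 3:
        flags.append("em dash more than three times")
    if s.curly_quotes or s.data_attrs or s.hr_tags:
        flags.append("machine typography or markup")
    return flags


draft = Signals(marker_hits=4, has_facts=False, em_dashes=7,
                curly_quotes=2, data_attrs=0, hr_tags=1)
print(yellow_flags(draft))
```

Treat the result as a to-do list for the editor rather than a score: each flag points at something to rewrite, fact-check, or clean up.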

The same approach works in reverse — to "humanize" an AI draft: remove the markers, add facts and personal experience, break the symmetry. How to write commercial copy that sells — in our commercial content article.

The 2026 context

Why it matters: AI text and visibility in Google and AI search

The paradox of 2026: AI search engines (AI Overviews, ChatGPT, Perplexity) themselves dislike citing raw AI content. They rely on sources with expertise, facts, and authority — we covered this in detail in our pieces on GEO optimization and backlink sources.

Raw AI text → templates, zero facts → never cited, risks falling under "scaled content abuse" in Google's spam policies.
AI draft + editor + facts + experience → full-fledged content that ranks and gets cited. Google doesn't care about the production method.

So checking text for AI is really a check of your content process. The detector catches not "AI" but the absence of human work on the text.

Takeaways

In short: a three-level system

Language: marker phrases, bureaucratic glue, fact-free generalizations.
Syntax: the rule of three, symmetrical paragraphs, excessive politeness, perfect grammar.
Code: special characters (—, “ ”, →, &nbsp;) and machine markup (data attributes, <hr />).
Tools speed things up but don't replace you: always verify a detector's verdict against the evidence above.
The goal isn't to "catch AI" — it's to never publish useless content: that's what Google penalizes and AI search ignores.

Data sources

Google — official position on AI content: Search and AI content; spam policies (scaled content abuse): spam policies.
Unmiss — our free AI detector with evidence breakdown: ai-content-detector.
Copyleaks — AI content detector; Originality.ai — originality.ai; GPTZero — gptzero.me.

The percentages at the top (85% template phrases, 90% perfect grammar, ~25% of sites dropping after AI spam) are SEOquick's internal observations across checked texts and client projects — practical reference points, not an academic study. The marker and special-character lists come from our work on the Unmiss detector.
