AI Writing

The distinctive properties of text generated by large language models (LLMs), and why they matter for critical readers.


Core Properties

1. Hedging and Qualification Saturation

AI over-qualifies claims, rarely committing without escape hatches:

  • “It’s worth noting that this is a complex issue with many perspectives…”
  • “While there are certainly valid arguments on both sides…”

This creates an illusion of nuance while actually avoiding commitment to any position.

2. Formulaic Structure

Predictable introduction-body-conclusion with numbered lists or bullet points, even when the content doesn’t call for it. AI defaults to structured formatting as a substitute for clear thinking.

3. Sycophantic Framing

Leading with validation before substance:

  • “That’s a great question!”
  • “You raise an excellent point about…”

This pattern reflects reinforcement learning from human feedback (RLHF), where models are trained to be agreeable.

4. Flattened Register

A consistent mid-formal tone regardless of topic. A poem, a code review, and a cooking recipe all sound like they were written by the same middle-manager. Human writing naturally shifts voice, formality, and rhythm depending on context and audience.

5. Semantic Emptiness

Phrases that sound substantive but carry no information:

  • “In today’s rapidly evolving landscape…”
  • “It’s important to remember that…”
  • “This is a multifaceted issue…”

These are filler — they occupy space without advancing an argument.

6. Exhaustive Completeness

AI tries to cover every angle rather than making editorial choices about what matters most. A human expert omits the obvious; AI includes everything it can generate. This reflects a fundamental difference: writing is as much about what you leave out as what you include.

7. Symmetric Treatment of Asymmetric Claims

Treating a well-evidenced position and a fringe one as equally weighted: “Some argue X, while others argue Y” — even when X has overwhelming support and Y doesn’t. This false balance distorts epistemic reality.

8. Absence of Genuine Uncertainty

AI rarely says “I don’t know” or “the evidence is too thin to say.” Instead it produces confident-sounding text that papers over gaps. The appearance of knowledge without the substance.

9. Characteristic Lexical Choices

Certain words appear at far higher frequency in AI text than human text. These have been called “aidiolects” — vocabulary fingerprints specific to different models:

  • Common across models: delve, nuanced, multifaceted, landscape, leverage, foster, holistic, robust, tapestry, underscore, pivotal
  • The em-dash appears with remarkable frequency, dubbed the “ChatGPT dash” — used as a universal connector between ideas
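
The vocabulary list above can be turned into a rough, countable signal: occurrences of marker words per 1,000 words. The sketch below is illustrative only, not a validated detector; the word list comes from this section, and the sample text is invented.

```python
import re

# Marker words reported as over-represented in LLM output
# (the list from the section above; illustrative, not exhaustive).
MARKERS = {
    "delve", "nuanced", "multifaceted", "landscape", "leverage",
    "foster", "holistic", "robust", "tapestry", "underscore", "pivotal",
}

def marker_rate(text: str) -> float:
    """Occurrences of marker words per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in MARKERS)
    return 1000.0 * hits / len(words)

sample = ("In today's rapidly evolving landscape, it is pivotal to "
          "delve into this multifaceted issue and foster a robust, "
          "holistic understanding.")
print(marker_rate(sample))  # 350.0 occurrences per 1,000 words
```

Typical human prose scores near zero against this list, but a high rate is a weak signal rather than proof: every one of these words also appears in ordinary formal writing.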

10. Formatting Overuse

Bolding, headers, bullet points, and markdown deployed reflexively rather than purposefully. Structure as a substitute for clarity.


The Meta-Signal

The most reliable indicator isn’t any single property — it’s the combination of high fluency with low editorial judgment. Human writers make choices about what to include, what tone to strike, what to leave unsaid. AI produces everything it can, polished but undiscriminating.

Stylometric research confirms this: AI text exhibits balanced, neutral phrasing with predictable rhetorical patterns, while human writing shows a broader range of narrative voice, personal expression, and natural variability.


Linguistic Research Findings

Academic analysis has identified several measurable differences:

  • Lexical diversity — AI text tends toward lower lexical diversity, reusing vocabulary more predictably
  • Epistemic markers — AI-generated academic text lacks the depth of epistemic hedging found in human scientific discourse (paradoxically, it hedges socially but not epistemically)
  • Syntactic patterns — Significant differences in use of adjectives, pronouns, modifiers, and sentence complexity
  • Informality — AI-generated text uses informality features differently from human-authored text, even when attempting casual tone
  • Consistency — AI maintains remarkably uniform style throughout a document; humans show natural variation including occasional awkward phrases and tone shifts
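
Two of the differences above, lexical diversity and stylistic consistency, are easy to approximate. The sketch below uses type-token ratio and the spread of sentence lengths as crude proxies; these specific metrics are illustrative stand-ins, not the exact measures used in the stylometric literature.

```python
import re
import statistics

def type_token_ratio(text: str) -> float:
    """Distinct words / total words: a crude proxy for lexical diversity."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0

def sentence_length_spread(text: str) -> float:
    """Population std. dev. of sentence lengths (in words): a crude proxy
    for stylistic variation. Uniformly sized sentences score near zero."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if lengths else 0.0

uniform = "This is a sentence. Here is another one. This is a third."
varied = ("Short. This one, by contrast, rambles on for quite a few "
          "more words before stopping. Done.")
print(sentence_length_spread(uniform) < sentence_length_spread(varied))
```

On real documents these would be computed over long samples and compared against reference corpora; a single paragraph is far too little text for either metric to mean much.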

Detection Limitations

No AI detector is fully reliable. The best tools correctly identify AI-generated text roughly 80% of the time, and they carry significant false positive rates. The boundary between human and AI writing is also blurring as:

  1. Models improve at mimicking human variation
  2. Humans increasingly write with AI assistance
  3. Some humans naturally write in ways that overlap with AI patterns

Human judgment and contextual analysis remain more reliable than automated detection.
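
The base-rate arithmetic shows why an 80% detection rate is less reassuring than it sounds: the share of flagged documents that are actually AI-written depends heavily on how common AI text is in the pool being scanned. The 10% false positive rate and the pool compositions below are assumed values for illustration; the section above gives only the approximate detection rate.

```python
def flagged_precision(detection_rate: float, false_positive_rate: float,
                      ai_share: float) -> float:
    """P(text is AI-written | detector flags it), by Bayes' rule."""
    true_flags = detection_rate * ai_share          # AI text, correctly flagged
    false_flags = false_positive_rate * (1.0 - ai_share)  # human text, wrongly flagged
    return true_flags / (true_flags + false_flags)

# Detection rate from the section above; false positive rate and
# AI-text shares are assumed for illustration.
for ai_share in (0.05, 0.50):
    p = flagged_precision(0.80, 0.10, ai_share)
    print(f"AI share {ai_share:.0%}: flagged text is AI with p = {p:.2f}")
```

Under these assumptions, when only 5% of the pool is AI-written, fewer than a third of flagged documents are actually AI text, which is one concrete reason automated flags need human review.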


Why This Matters

The properties above aren’t just detection heuristics — they represent epistemic risks. Text that hedges everything, treats all claims symmetrically, and never admits uncertainty isn’t just identifiable as AI; it’s actively misleading about the state of knowledge. Critical readers should watch for these patterns regardless of whether the author is human or machine.

