#2736: Why AI Flagged Your Em Dash

Punctuation isn't a fixed system handed down by grammarians. It's a two-thousand-year story of contraction, invention, and now AI suspicion.

Episode Details
Episode ID: MWP-2897
Duration: 26:46
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: deepseek-v4-pro

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

Punctuation is one of those things you only notice when it fails. A missing comma changes a sentence's meaning, and suddenly you're parsing the same line three times. But most people never think about where any of it came from — and the story they learned in school, that punctuation is a fixed system handed down by grammarians, is almost entirely wrong.

What actually happened is messier. For centuries, classical Greek and Latin used scriptio continua — blocks of capital letters with no spaces, no periods, no commas at all. The earliest punctuation marks weren't for readers; they were performance annotations for orators. Around 200 BCE, Aristophanes of Byzantium invented a system of three dots at different heights to indicate how long a speaker should pause. The comma and colon are literally named after breathing patterns.

The shift from scroll to codex between the second and fourth centuries made silent reading more common, and punctuation began migrating from performance cues to comprehension aids. Irish monks in the seventh and eighth centuries introduced word spacing into Latin manuscripts — they needed visual help parsing a non-native language, and their innovation made silent reading possible at scale.

The modern system crystallized with the printing press, particularly through Aldus Manutius in 1490s Venice. He systematized punctuation for print, invented or at least popularized the semicolon, and gave us the modern comma. But the punctuation inventory of the 17th and 18th centuries was enormous compared to today. Marks that failed to catch on or retreated to specialist use include the percontation point (for rhetorical questions), the irony mark, the pilcrow, and the dagger. The contraction was driven by prescriptive grammar, mass education, and the typewriter's limited character set, which froze punctuation habits for a century.

That brings us to the em dash. It's been around since the 18th century, used to indicate breaks in thought or emphatic parentheticals. But large language models trained on formal writing reproduce em dashes at a higher rate than most human writers. Now editors are asking writers to reduce em dash usage because it "looks AI generated" — a bizarre inversion where one of the most human marks, conveying hesitation and speech rhythm, has become suspicious.


Corn
Daniel sent us this one — he's been thinking about punctuation history since we talked about silent reading and those old scripts that ran without any marks at all. He wants to know whether English punctuation has actually contracted over time, whether the marks we think of as standard emerged gradually or all at once, and there's a personal grievance in here too. Apparently he's been using em dashes for years and now they're being flagged as proof of AI generated text, which he finds irritating.
Herman
And the question is better than most people realize, because the story most of us got in school — that punctuation is this fixed system handed down by grammarians — is almost entirely wrong. What actually happened is messier and much more interesting.
Corn
By the way, today's episode is powered by DeepSeek V four Pro.
Herman
Welcome aboard, DeepSeek. Now, punctuation — where do you even start with something that's literally invisible when it's working?
Corn
That's the thing, isn't it? You only notice it when it fails. A missing comma changes a meaning, and suddenly you're parsing a sentence three times. But most people never think about where any of this came from.
Herman
The real answer is that punctuation emerged over about two thousand years, from multiple places, for completely different reasons, and the set we use today is radically smaller than what was available even two hundred years ago. So yes, Daniel's contraction instinct is correct. The inventory shrank enormously.
Corn
Let's start with the zero-punctuation world then. What did text actually look like?
Herman
Imagine a block of capital letters with no spaces between words, no periods, no commas, nothing. This is called scriptio continua, and it was the standard in classical Greek and Latin for centuries. You'd have something like THISISASENTENCEANDYOUHAVETOPARSEITBYREADINGITALOUD.
Corn
Which connects directly to what we discussed before — reading aloud was the decoding mechanism. Your voice found the boundaries your eyes couldn't see.
Herman
The earliest punctuation marks weren't for readers at all. They were performance annotations for orators. Aristophanes of Byzantium, around two hundred BCE, working at the Library of Alexandria, invented a system of three dots placed at different heights to indicate how long a speaker should pause. A dot at the top of the line, the periodos, marked a full stop. A dot at the bottom, the kolon, was a medium pause. A dot in the middle, the komma, was a short breath.
Corn
The comma and the colon are literally named after breathing patterns. That's wonderful.
Herman
The period literally means a completed circuit, a full cycle. But here's the key thing — these marks were for reading aloud, for public speaking. They weren't grammatical. They were rhythmic, almost musical.
Corn
When does that shift? When does punctuation become about meaning rather than breathing?
Herman
The big inflection point is the shift from scroll to codex — from rolled manuscripts to bound books — between the second and fourth centuries. Once you have a codex, you can flip pages, you can reference things non-linearly, and suddenly silent reading becomes more common. Punctuation starts migrating from performance cues to comprehension aids.
Corn
It's still not standardized. We're not talking about a system anyone agreed on.
Herman
Throughout the medieval period, punctuation was wildly inconsistent. Every scriptorium had its own conventions. Irish monks in the seventh and eighth centuries — and this is one of my favorite details — they're the ones who introduced word spacing into Latin manuscripts. They were copying texts in a language that wasn't their native tongue, and they needed the visual help parsing words. So they just started putting spaces between them.
Corn
Irish monks invented the space bar. That feels like something that should be on a t-shirt.
Herman
It genuinely changed everything. Word spacing made silent reading possible at scale. Before that, even literate people mostly read aloud or at least murmured. After spacing, you could scan silently, which is faster and — crucially — private. Your reading became your own.
Corn
We've got dots for pauses, we've got spaces, we've got monks experimenting. Where does the modern system actually crystallize?
Herman
The printing press. And one name in particular — Aldus Manutius, a Venetian printer working in the fourteen nineties and early fifteen hundreds. He's the one who really systematized punctuation for print. His typefaces included the comma, the period, the colon, the semicolon — which he essentially invented or at least popularized — and the question mark.
Corn
Let's pause on the semicolon. Daniel mentioned it as one of those idiosyncratic choices. What was Manutius actually trying to do with it?
Herman
He wanted something between a comma and a colon — a pause that's stronger than a comma but doesn't fully separate two independent clauses the way a period would. And he used it extensively in his editions of classical texts. The semicolon was born as a tool for managing complex, multi-clause sentences — the kind of sentence structure that was fashionable in Renaissance Latin and Italian.
Corn
Which means the semicolon was always a bit of a luxury item. It's for sentences that are already architecturally ambitious.
Herman
And that's why it's been controversial for five hundred years. People have been arguing about semicolons since semicolons existed. Kurt Vonnegut famously said the only thing a semicolon proves is that you went to college. But I think that's unfair — it's a tool for expressing a specific logical relationship between thoughts, and when it works, nothing else quite does the same job.
Corn
I use them sparingly. They feel like a spice that overpowers the dish if you're not careful.
Herman
That's a reasonable position. Now, Manutius also gave us the modern comma — he took Aristophanes' breathing marks and turned them into the curved symbol we recognize. And his typefaces included the first italic type, which he used partly as a space-saving measure but which also created a visual distinction between emphasis and body text.
Corn
One Venetian printer in the fourteen nineties basically defined what English punctuation would look like for the next five centuries.
Herman
Not just English. His conventions spread across European printing. But here's where Daniel's question about contraction gets really interesting. The punctuation inventory of the seventeenth and eighteenth centuries was enormous compared to today.
Corn
Give me examples. What did we lose?
Herman
The percontation point — a backwards question mark proposed in the fifteen eighties by Henry Denham, an English printer, specifically for rhetorical questions. The idea was that a rhetorical question isn't really a question, so it shouldn't use the same mark. It never caught on widely, but it was used in some printed works for a few decades.
Corn
I actually love that. There's a real difference between "what time is it" and "who really knows what time it is." Having a mark for that distinction makes sense.
Herman
Then there's the irony mark — proposed multiple times throughout history, most notably by the French poet Alcanter de Brahm in the late nineteenth century. He suggested a mark that looked like a backwards question mark to indicate that a sentence should be read with ironic intent.
Corn
Which would solve approximately forty percent of internet arguments.
Herman
And there have been serious proposals for an irony mark as recently as the twenty-tens. But none of them stuck. And that's the pattern — people keep inventing punctuation marks to solve specific communicative problems, and almost all of them fail to be adopted.
Corn
What about marks that actually did exist in common use and then died?
Herman
The pilcrow — the paragraph mark, that backwards P symbol. It was once used in virtually every manuscript and early printed book to mark paragraph breaks. Over time, it migrated to the margins, then disappeared entirely from running text. Now it's mostly a formatting tool in word processors.
Corn
The dagger and double dagger — the obelus and diesis — those were once common in general writing, not just academic footnotes.
Herman
In the eighteenth century, you'd see daggers used to mark dubious passages or to indicate a footnote. Now they're almost exclusively used for death dates in biographies and for statistical significance in academic papers. The usage contracted dramatically.
Corn
What drove the contraction? Why did English punctuation shrink rather than expand?
Herman
A few forces working together. One is the rise of prescriptive grammar in the eighteenth century — Bishop Robert Lowth's grammar in seventeen sixty-two, Lindley Murray's grammar in seventeen ninety-five. These books tried to codify "correct" English, and part of that project was simplifying and standardizing punctuation.
Corn
The grammarians as gatekeepers.
Herman
They were effective. The nineteenth century saw a further push toward standardization through mass education. When you're teaching millions of children to read and write, you need a system that's teachable. A smaller set of rules is easier to transmit than a sprawling inventory of special-case marks.
Corn
Standardization was partly a pedagogical decision, not a linguistic one.
Herman
And then the typewriter delivered another blow. Most typewriters had a very limited character set — you got a period, a comma, a colon, a semicolon, a question mark, an exclamation point, quotation marks, apostrophes, parentheses, and hyphens. That's about it. No em dash, no en dash, no ellipsis as a single character.
Corn
People made do. Two hyphens became a dash. Three periods became an ellipsis.
Herman
Those workarounds became conventions in their own right. But the typewriter keyboard essentially froze the punctuation inventory for a century. Even after word processors gave us access to a much wider character set, the habits were set.
Corn
This brings us to Daniel's em dash grievance. Let's talk about that directly, because it's a fascinating case study in how punctuation becomes culturally loaded.
Herman
The em dash has been around since the eighteenth century — it's named because it's roughly the width of the letter M in a given typeface. It's used to indicate a break in thought, an interruption, or a parenthetical that's more emphatic than parentheses. And for most of its history, it was just another tool in the writer's kit.
Corn
Then large language models arrived, and suddenly the em dash became a tell.
Herman
There's been a lot of discussion about this in writing and editing circles over the past couple of years. AI generated text, particularly from certain models, uses em dashes much more frequently than most human writers do. Not because the AI "likes" em dashes, but because its training data — formal writing, journalism, academic prose — uses them at a certain rate, and the model reproduces that pattern.
Corn
It's a statistical artifact, not a stylistic choice by the AI.
Herman
And the result is that human writers who happen to use em dashes — people like Daniel, people like me, frankly — are now getting their work flagged or questioned. I've seen editors asking writers to reduce em dash usage because it "looks AI generated." Which is absurd if you think about it for more than five seconds.
Corn
It's guilt by punctuation association. And it's doubly ironic because the em dash is one of the most human marks — it conveys hesitation, interruption, the rhythm of actual speech. It's the opposite of mechanical.
Herman
Emily Dickinson used em dashes constantly. She was not a language model. James Joyce, Virginia Woolf, David Foster Wallace — all heavy em dash users. The mark has a long literary pedigree.
Corn
The perception has shifted. It's now a marker of suspicion in a way it never was before.
Herman
That's going to have downstream effects. If writers start avoiding em dashes to dodge false AI accusations, the mark's usage in human writing will actually decline. The AI detection tail is wagging the stylistic dog.
Corn
I'd argue that's already happening. I've noticed myself second-guessing em dashes in my own writing, which is infuriating because they serve a function that parentheses and commas don't quite cover.
Herman
What function do you think that is?
Corn
Em dashes interrupt. A parenthetical is an aside you're meant to register but not dwell on — it lowers the volume. An em dash creates a break that demands attention. It's more dramatic, more conversational. If I'm writing something that's meant to sound like thought happening in real time, em dashes are the right tool.
Herman
That's exactly the distinction. And commas can't do the same work because they're too lightweight — they don't create enough separation. A comma between two independent clauses is often just a grammatical error waiting to happen.
Corn
We're potentially losing a useful mark because of a statistical quirk in AI training data.
Herman
Which brings us back to Daniel's broader question about contraction. We're living through another period of punctuation change right now, and it's being driven by technology — just like the printing press and the typewriter drove change in their eras.
Corn
What other marks are shifting right now?
Herman
The period at the end of a text message has become aggressive. There's actual research on this — a study from Binghamton University in twenty fifteen found that text messages ending with a period are perceived as less sincere, more abrupt, even angry.
Corn
I've felt this. If someone texts me "okay" with no period, it's neutral. If they text "okay." with a period, I assume they're upset about something.
Herman
That's a genuine linguistic change happening in real time. The period, which was the first punctuation mark ever invented, is acquiring a new pragmatic meaning in digital contexts. It's no longer just a neutral full stop — it carries emotional weight.
Corn
The exclamation point has shifted too. It used to indicate genuine excitement or emphasis. Now it's often a politeness marker — "thanks!" — where the unadorned version reads as curt.
Herman
The ellipsis has been completely repurposed by older generations in ways that confuse younger readers. When someone over fifty writes "I'll see you there..." they often mean it as a neutral trailing off. Younger readers interpret it as passive-aggressive or ominous.
Corn
There's a genuine generational punctuation gap. I've seen email threads where the punctuation choices alone caused miscommunication.
Herman
None of this is being dictated by grammarians or style guides. It's emerging organically from millions of daily interactions, the same way punctuation originally emerged from the needs of orators and scribes.
Corn
Let's circle back to the history for a moment. You mentioned the question mark earlier, but I want to dig into its origin story specifically. Where did that symbol actually come from?
Herman
The leading theory is that it evolved from the Latin word "quaestio," meaning question. Scribes would write "qo" at the end of an interrogative sentence as an abbreviation. Over time, the "q" migrated above the "o," and the shapes merged and stylized into the curly symbol we use today. So the question mark is literally a compressed word.
Corn
That's remarkably efficient. Two letters collapsing into a single mark over centuries of handwriting.
Herman
The exclamation point has a similar origin story. It likely comes from the Latin "io," an exclamation of joy. Scribes wrote the "i" above the "o," and the combination eventually became the mark we recognize.
Corn
Both of our sentence-terminal marks are medieval emoji, essentially. Compressed emotional expressions.
Herman
That's not a bad way to think about it. And the ampersand — the and sign — is a ligature of the letters "e" and "t" from the Latin "et." If you look closely at some ampersand designs, you can still see the e and the t.
Corn
The ampersand has had an interesting trajectory. It was once considered a letter of the alphabet, literally the twenty-seventh letter. Schoolchildren in the early nineteenth century would recite the alphabet ending with "x, y, z, and per se and." The phrase "and per se and" got slurred into "ampersand."
Herman
Which is one of my favorite etymologies in all of English. A whole word was born from children reciting the alphabet. And the ampersand itself has survived while other ligatures died out — you don't see the "ct" ligature or the "st" ligature in modern English text, but the ampersand persists.
Corn
What about quotation marks? Those feel like they should have an interesting history.
Herman
They do, and it's surprisingly recent. Quotation marks as we know them — the double inverted commas — emerged in the late seventeenth century. Before that, writers indicated quoted speech in various ways — underlining, marginal notes, different typefaces. The diple, a caret-like mark in the margin, was used in medieval manuscripts to indicate something noteworthy, but it wasn't a quotation mark in the modern sense.
Corn
When did quotation marks become standardized?
Herman
The eighteenth century, alongside the rest of the punctuation system. But there's a persistent transatlantic divide. American English uses double quotation marks for primary quotes and single for nested quotes. British English traditionally does the reverse. And neither system is more logical — they're just conventions that happened to crystallize differently.
Corn
I've always found it strange that we use the same mark for opening and closing in most typefaces. Curly quotes solve that, but straight quotes don't.
Herman
Again, blame the typewriter. Straight quotes were a space-saving compromise. A single key for both open and close, for both single and double quotes, and for the apostrophe. Four functions, one key.
Corn
The typewriter really did a number on typographic richness.
Herman
It did, and we're only now recovering some of what was lost. Modern word processors can auto-convert straight quotes to curly quotes, insert proper em dashes, handle ellipses as single characters. The technical capacity is there. But the cultural knowledge — knowing when and why to use these marks — is eroding.
Corn
That's an interesting tension. We have more typographic capability than ever before, but probably less punctuation literacy than a century ago.
Herman
I think that's fair. And it connects to something Daniel hinted at — the idea that punctuation is partly idiosyncratic, that there can be multiple correct ways to punctuate the same sentence. That's true, but it's also true that the range of acceptable variation has narrowed over time.
Corn
Because of style guides?
Herman
Style guides, education systems, and now grammar checking software. If you write in Microsoft Word or Google Docs, the software is constantly nudging you toward a particular punctuation style. It flags missing commas, suggests removing "unnecessary" punctuation, and generally enforces a fairly narrow view of correctness.
Corn
Which is probably making our punctuation more uniform over time.
Herman
And that's not entirely bad — consistency helps readability. But it does mean we're losing some of the expressive range that punctuation can provide.
Corn
Let's talk about an example of that expressive range. The semicolon in particular — you mentioned it's been controversial forever. Why does it provoke such strong feelings?
Herman
I think because it's the most visibly "writerly" mark. A period or a comma is invisible to most readers. But a semicolon signals that the writer made a deliberate choice. And for some readers, that signal reads as pretension rather than precision.
Corn
Vonnegut's college comment basically accuses semicolon users of showing off.
Herman
There's a class dimension to that criticism that doesn't get discussed much. The semicolon is associated with formal education, with academic writing, with a certain kind of literary prestige. Rejecting it is partly a rejection of those institutions.
Corn
Which is ironic given that the semicolon was invented by a commercial printer trying to sell books.
Herman
It wasn't born in a university. It was born in a Venetian print shop. But it acquired cultural baggage over centuries.
Corn
Do you use semicolons in your own writing?
Herman
I do, but selectively. I think the semicolon is best when it connects two clauses that are in tension or conversation with each other — where the relationship between the thoughts matters. If you're just using it to avoid a period, it's probably the wrong choice.
Corn
That's a good heuristic. Use the semicolon when the connection between ideas is the point.
Herman
That's actually a good general principle for punctuation. Each mark should do work that another mark can't do as well. If you can replace your em dash with a comma and lose nothing, you probably don't need the em dash.
Corn
We've covered the history, the contraction, the cultural shifts. Let's address Daniel's question about whether punctuation marks emerged gradually or all at once.
Herman
The answer is definitively gradually, from multiple sources. The period, comma, and colon trace back to Aristophanes in Alexandria. Word spacing came from Irish monks. The semicolon from a Venetian printer. The question mark and exclamation point evolved from Latin abbreviations. Quotation marks emerged in the seventeenth century. The em dash in the eighteenth. And we're still inventing new marks — the interrobang in the nineteen sixties, the sarcasm tilde in internet culture, the hashtag which has become a kind of meta-punctuation.
Corn
The hashtag is an interesting case. It started as a technical marker on Twitter and has evolved into a genuine rhetorical device. You can use a hashtag to indicate tone, to undermine your own statement, to add a layer of irony.
Herman
Pound sign becomes metadata tag becomes rhetorical punctuation. That's a new thing in the history of written language.
Corn
Emoji are arguably a form of punctuation too. They're not replacing words — they're adding paralinguistic information the way tone of voice does in speech.
Herman
There's a serious argument that emoji are the first major expansion of the punctuation system in centuries. They do what the percontation point and the irony mark were trying to do — convey emotional and tonal information that plain text misses.
Corn
The contraction may be reversing. We lost a bunch of marks in the eighteenth and nineteenth centuries, and now we're inventing new ones again.
Herman
But there's an important difference. The old marks were standardized through print. The new marks are emerging through networked digital communication, and they're not being codified by any central authority. It's a more organic, more chaotic process.
Corn
Which is actually closer to how punctuation originally evolved — through the practices of individual scribes and printers, not through top-down standardization.
Herman
We're returning to a medieval model of punctuation evolution, just at internet scale and speed.
Corn
Before we wrap up, I want to touch on one more thing — the relationship between punctuation and breath. You mentioned that the earliest marks were for orators managing their breathing. Do you think there's still a connection between punctuation and the physical experience of language?
Herman
When I'm writing, I'm subvocalizing — I can feel where the pauses go, where the intonation rises. Punctuation is a way of notating that internal voice. And when punctuation is wrong, you can feel it physically — the sentence doesn't breathe right.
Corn
That's why reading your own writing aloud is so revealing. The places where you stumble are almost always punctuation problems.
Herman
There's neuroscience behind this. Reading activates the same brain regions as speech production, even when you're reading silently. Punctuation is part of how the brain simulates the spoken voice. A period triggers a brief neural pause. A question mark triggers a rise in simulated intonation.
Corn
Punctuation isn't just a visual convention. It's mapping onto something deep in how we process language.
Herman
Which is why the history of punctuation matters. These marks aren't arbitrary decorations. They're the residue of centuries of humans trying to capture voice on a page. Every comma is a breath, every period is a silence, every question mark is a rising tone. The technology of writing has changed, but the underlying problem — how do you encode a living voice in dead marks — is the same one Aristophanes was trying to solve two thousand years ago.
Corn
That's a good place to land. Punctuation is fossilized speech.
Herman
I like that.
Corn
Now: Hilbert's daily fun fact.

Hilbert
During the interwar period, katabatic wind patterns over the Drake Passage were clocked at velocities sufficient to displace an estimated one point three million metric tons of surface water per second during peak outflow events.
Corn
One point three million metric tons per second. That's not a wind, that's a freight train made of air.
Herman
I don't even know what to do with that number.
Corn
Here's the forward-looking thought. Daniel's em dash problem is a preview of something bigger. As AI generated text becomes more common, the markers we use to judge authenticity — punctuation, word choice, sentence structure — are all going to shift. We're going to have to figure out what "human writing" even means when the statistical fingerprints keep changing. That's not just a punctuation problem. That's an epistemology problem.
Herman
The history suggests we'll figure it out messily, over decades, with a lot of wrong turns. But we'll figure it out. We always have.
Corn
Thanks to our producer, Hilbert Flumingtop. This has been My Weird Prompts. Find us at myweirdprompts dot com or wherever you get your podcasts. We'll be back with another one soon.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.