So Daniel sent us this one, and it is a good one. He's asking about the Voynich Manuscript — a fifteenth-century illustrated codex written in an unknown script that has defeated every cryptographer, linguist, and AI model that has ever tried to crack it. The physical object is real: vellum carbon-dated to between fourteen-oh-four and fourteen thirty-eight, roughly two hundred and forty surviving pages, with sections on botany, astronomy, biological imagery, pharmaceutical recipes. The script it's written in shows all the statistical hallmarks of a real language. And yet nobody has decoded a single word in over a century of serious attempts. Daniel's core question isn't really "what does it say" — it's the deeper one: why has this specific thing defeated everyone who has ever tried?
That reframe is doing a lot of work, and it's the right one to start with. Because the manuscript has this gravitational pull — every few years someone announces they've cracked it, the press goes wild, and then within about six months the academic community quietly dismantles the claim and we're back to zero. That pattern itself is almost the most interesting data point.
It's like a roach motel for credibility. Brilliant people check in...
And their reputations check out, yeah. By the way, today's episode is being written by Claude Sonnet four point six — just flagging that early so the meta-layers are accounted for. Anyway. The object itself. Let's actually spend some time there because I think people underestimate how strange the physical artifact is before you even get to the text.
Walk me through it.
So the vellum — that's treated animal skin, typically calf — was carbon-dated in two thousand nine by the University of Arizona. The range came back fourteen-oh-four to fourteen thirty-eight. That's a tight window. It places the physical material firmly in the early fifteenth century. Now, that doesn't mean it was written then — you could theoretically write on old vellum — but faking aged vellum to that degree of consistency would have been extraordinarily difficult to do before modern chemistry. The ink chemistry has also been analyzed, and it's consistent with the dating. So the object is almost certainly genuinely medieval.
Right, so the hoax hypothesis, if it exists, has to be a medieval hoax. Not a Victorian forgery.
That distinction matters enormously, and it's one that gets muddied in popular coverage. Wilfrid Voynich — the rare book dealer who acquired the manuscript in nineteen twelve and whose name it now carries — was occasionally suspected of fabricating the whole thing. But the carbon dating essentially rules him out as the forger of the physical object. He might have misrepresented its provenance, but he didn't make the vellum.
What do we actually know about provenance before Voynich got hold of it?
There's a letter from sixteen sixty-six — from Johannes Marcus Marci to Athanasius Kircher, the Jesuit polymath — that accompanied the manuscript and claimed it had previously been owned by Holy Roman Emperor Rudolf the Second, who supposedly paid six hundred ducats for it. Rudolf was famous for collecting curiosities and oddities, which fits. Before that, the trail goes cold. We have a possible Roger Bacon attribution floating around — Bacon was a thirteenth-century English friar known for his interest in ciphers — but that attribution doesn't hold up to serious scrutiny.
Six hundred ducats for something nobody could read. Rudolf was either very optimistic or very credulous.
Or he knew something we don't, which is a hypothesis I'd love to dismiss but can't quite. The manuscript was clearly valued by people who had access to it. That doesn't mean it was legible to them either.
So let's get into the text itself. Because the statistical properties are where this gets genuinely weird.
Genuinely weird is the right framing. So the script — researchers call it Voynichese — has been subjected to every quantitative linguistic analysis you can imagine. And it passes the tests that real languages pass. Zipf's law, for instance: in natural language, word frequency follows a power law distribution. The most common word appears roughly twice as often as the second most common, three times as often as the third, and so on. Voynichese follows this distribution. Word-length distributions also fall within the range of natural languages. There's internal structure — certain glyphs tend to appear at word beginnings, others at endings, others in the middle. That's morphological structure, which is a feature of real languages.
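For the show notes: the Zipf check described above takes only a few lines of Python. The tokens below are invented EVA-style stand-ins, not actual transcription data, so only the shape of the computation is meant seriously.

```python
from collections import Counter

def zipf_profile(tokens):
    """Rank word types by frequency. Under Zipf's law, count is
    roughly proportional to 1/rank, so rank * count should not
    vary wildly across the top ranks."""
    counts = Counter(tokens)
    ranked = sorted(counts.values(), reverse=True)
    return list(enumerate(ranked, start=1))

# Invented EVA-style word shapes; the frequencies are illustrative only.
tokens = ("daiin ol daiin chedy ol qokeedy daiin shedy qokeedy ol "
          "chedy daiin qokaiin ol daiin chedy shedy daiin ol chol").split()

for rank, count in zipf_profile(tokens)[:4]:
    print(rank, count, rank * count)
```

Run on a real transliteration, this same rank-frequency profile is what the Zipf claims in the literature rest on.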
So if you just handed someone the statistics without telling them what they were looking at, they'd say this is a language.
They would. And that's exactly what makes it so disorienting. Because when you go deeper, the patterns start to break down in ways that no known language does. The repetition rate is anomalously high. Words repeat within the same line, sometimes consecutively, in ways that would be grammatically impossible in virtually any natural language. There are almost no words that appear only once — what linguists call hapax legomena. In a real text of this length, you'd expect a significant tail of unique words. Voynichese barely has one.
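Those two anomalies, the thin hapax tail and the consecutive repeats, are easy to quantify. A sketch with an invented token list, where the numbers are illustrative only:

```python
from collections import Counter

def repetition_stats(tokens):
    """Return (hapax_ratio, consecutive_repeats).
    hapax_ratio: fraction of the vocabulary occurring exactly once.
    consecutive_repeats: positions where a word immediately repeats
    itself, which is vanishingly rare in natural prose."""
    counts = Counter(tokens)
    hapax = sum(1 for c in counts.values() if c == 1)
    repeats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)
    return hapax / len(counts), repeats

# Invented tokens: 'shedy' is the only hapax; 'daiin daiin' and
# 'ol ol' are immediate repeats.
tokens = "daiin daiin ol chedy ol ol daiin shedy chedy daiin".split()
hapax_ratio, repeats = repetition_stats(tokens)
print(hapax_ratio, repeats)  # 0.25 2
```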
Which would suggest either a very restricted vocabulary, or that the surface text is encoding something at a level of abstraction that erases rare words.
Or that whoever produced it was simulating language without fully understanding how rare words work. That's one of the arguments for the hoax theory, actually. A sophisticated medieval scholar might know that real writing has frequent words and less frequent words, might try to build that in, but might not have intuited that real language also has a long tail of words that appear only once or twice.
That's an interesting tell. Like a forger who gets the overall patina right but misses the microscopic grain structure.
Good analogy, and I'll note that was our one-analogy budget for the episode, so I'm glad we spent it well. The other deeply strange property is what Prescott Currier identified in the nineteen seventies — he was a U.S. Navy cryptanalyst, serious credentials — he found that the manuscript appears to be written in at least two distinct dialects or hands. He called them Currier A and Currier B. They have different statistical properties, different preferred glyphs, and they cluster in different sections of the manuscript. That suggests either multiple authors, or one author with a very unusual compositional method, or something about the encoding scheme that changes across sections.
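Currier's separation can be illustrated as a toy computation for the show notes: build a glyph-frequency profile per page and compare pages by cosine similarity. The pages below are invented, and real analyses use proper transliterations and many more features, but the clustering idea is the same:

```python
from collections import Counter
from math import sqrt

def glyph_profile(page_text):
    """Normalized character-frequency vector for one page."""
    counts = Counter(page_text.replace(" ", ""))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)

# Invented pages: the "A-like" pages favor one glyph set, the
# "B-like" page another, loosely mimicking Currier's observation.
page_a1 = "chol chor chol cthy chol chor"
page_a2 = "chor chol cthy chol chor chol"
page_b1 = "chedy shedy qokeedy chedy shedy"

print(cosine(glyph_profile(page_a1), glyph_profile(page_a2)))  # near 1.0
print(cosine(glyph_profile(page_a1), glyph_profile(page_b1)))  # lower
```

Pages that cluster into two groups under a measure like this are exactly what the A/B distinction amounts to statistically.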
Multiple authors writing in the same unknown script across what would have been a significant period of time. That's hard to fake casually.
It is. And it's one of the reasons the dismissive "it's obviously a hoax" takes frustrate me. The object resists easy dismissal in both directions — it's not obviously a language, and it's not obviously a hoax. It sits in this genuinely uncomfortable middle space.
Let's talk about the career graveyard aspect, because I find this almost more interesting than the manuscript itself. William Friedman.
William Friedman is probably the most accomplished cryptanalyst who ever lived, without much exaggeration. He broke PURPLE — the Japanese diplomatic cipher — which was a genuine cryptographic achievement of the first order. He essentially built modern American signals intelligence. He spent years on the Voynich Manuscript, on and off, and his conclusion was that it was probably a constructed language — an artificial language — rather than a cipher of a natural language. He died without cracking it. And importantly, he organized a team effort — brought together other professional cryptanalysts — and they also failed.
What does it mean that someone who could break PURPLE couldn't break this?
It means one of a few things. Either the manuscript doesn't contain the kind of redundancy and structural regularity that cipher-breaking relies on — in which case it might not be a cipher at all — or the cipher is so unusual that conventional cryptanalytic approaches simply don't get purchase on it. Friedman's team was working with mid-twentieth-century methods, but even modern computational approaches haven't succeeded, so I don't think the methods are the bottleneck.
John Tiltman.
Brigadier John Tiltman, British cryptanalyst at GCHQ and its wartime predecessor, also one of the best who ever lived — he made the first break into the German Lorenz teleprinter cipher, the one Bletchley Park later built the Colossus machines to attack. He spent about thirty years on the Voynich Manuscript across his career. His conclusion was essentially that the text has a word-by-word structure unlike anything he'd encountered in any cipher system. He noted that the way words follow each other has almost no long-range dependencies — no sentence-level grammar that persists across lines. Which is bizarre. Natural language has long-range structure. Even heavily encrypted text tends to preserve some statistical shadows of the underlying language's structure. Voynich has almost none of that.
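Tiltman's observation about long-range structure has a standard quantitative form: mutual information between words at increasing distances. In natural language it decays slowly with distance; in Voynichese it reportedly drops off almost immediately. A sketch on a toy English snippet, purely to show the measurement:

```python
from collections import Counter
from math import log2

def mutual_information(tokens, distance):
    """Empirical mutual information between words `distance` apart.
    Always nonnegative; it decays toward zero once the separation
    exceeds the reach of whatever grammar the text has."""
    pairs = list(zip(tokens, tokens[distance:]))
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    n = len(pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi

# Toy English text standing in for a real corpus.
tokens = ("the cat sat on the mat and the dog sat on the rug "
          "and the cat sat on the mat again").split()
print(mutual_information(tokens, 1))
print(mutual_information(tokens, 5))
```

Plotting this value against distance for a real transliteration versus a natural-language control is one way to make "no long-range dependencies" a concrete, checkable claim.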
So it's not just that the code is hard. It's that the structural properties of the text don't match what you'd expect from encoded text.
Which is the crux of it. A cipher is a transformation applied to a message. The message has structure, and ciphers preserve some of that structure even while obscuring the content. Voynich either has no underlying message with that kind of structure, or it's been encoded through a process that's genuinely unlike anything we've classified.
Or the underlying message is in a language whose structure is so different from anything in the European tradition that the shadows look like noise.
That's a real possibility. There were proposals — some serious, some less so — that the underlying language might be something like Nahuatl, or an East Asian language, or something entirely outside the European linguistic family. The statistical properties would look different from what Tiltman was trained to expect. The problem is that none of those proposals have survived contact with the actual statistical analysis. When you try to map Voynichese onto Nahuatl phonology, for instance, the fit is terrible.
Let's get to the recent failures, because I think this is where it gets almost philosophically interesting. Gerard Cheshire in twenty nineteen.
Cheshire published a paper claiming the manuscript was written in proto-Romance — a reconstructed ancestor of the Romance languages — and that he'd decoded significant portions. The paper got enormous press coverage. Within weeks, professional linguists and historians had torn it apart comprehensively. The proposed translations were internally inconsistent, the grammatical framework was invented rather than documented, and the claimed proto-Romance forms didn't correspond to any attested or reconstructed proto-Romance phonology. It was a case of someone pattern-matching hard enough to find apparent meaning without the disciplinary rigor to test whether that meaning was real.
Which is a failure mode that's very human and very understandable. You stare at something long enough, you will find patterns.
Pareidolia for language. And the problem is that Voynichese is just structured enough to encourage this. It has enough regularity that you can convince yourself you're seeing something. The bar for "I've found a pattern" is low. The bar for "this pattern is meaningful and reproducible and consistent" is extremely high, and nobody has cleared it.
What about the AI and machine learning attempts? Because this is where I'd expect our audience to have the most current interest.
So there have been several serious attempts to apply machine learning to the manuscript. The most cited one is probably a twenty sixteen paper from the University of Alberta — Bradley Hauer and Greg Kondrak — which got a big wave of press coverage in twenty eighteen. They used language-identification models trained on samples of hundreds of languages to try to determine which language Voynichese might be closest to. Their best result was a suggestion that it might be encoded Hebrew. The method was clever: they hypothesized that vowels had been dropped — which is a known scribal abbreviation technique — and then tried to map the remaining consonantal skeleton to Hebrew roots.
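The vowel-dropping idea is simple to illustrate: strip the vowels from a candidate word and look the consonantal skeleton up in a lexicon. The three-root lexicon below is a hypothetical stand-in; the real method matched against a full Hebrew wordlist and allowed for further transformations like anagramming:

```python
def skeleton(word, vowels="aeiou"):
    """Strip vowels, leaving the consonantal skeleton, mimicking the
    abjad-style abbreviation the Alberta team hypothesized."""
    return "".join(ch for ch in word if ch not in vowels)

# Hypothetical romanized roots, invented for illustration.
lexicon = {"ktb": "write", "mlk": "king", "spr": "book"}

def lookup(candidate):
    """Match a de-voweled candidate against the lexicon."""
    return lexicon.get(skeleton(candidate))

print(lookup("kitab"), lookup("melek"), lookup("daiin"))  # write king None
```

The catch, as the episode goes on to say, is that a permissive enough pipeline of transformations will "match" almost anything, which is why the resulting translations still have to cohere semantically.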
And?
And the proposed translations were, to put it diplomatically, not convincing. They found phrases that could be interpreted as Hebrew words if you applied enough transformations, but the proposed meanings didn't cohere into anything semantically sensible across multiple lines. Hebrew scholars looked at it and were not persuaded.
What about large language models specifically? Because the intuition would be that something trained on a vast enough corpus of human language might pick up on something that rule-based approaches miss.
This is where it gets genuinely interesting as a benchmark question. LLMs work by learning statistical patterns of co-occurrence across enormous text corpora. The hypothesis would be that if Voynichese is a natural language or a transformation of one, an LLM might recognize structural fingerprints that human analysts miss. Several groups have tried this. The results have been consistently negative in the sense that no LLM has produced translations that survive scrutiny. But what's interesting is the failure mode — LLMs tend to produce fluent-sounding translations that are internally consistent but completely unverifiable. They're very good at generating plausible text. Whether that generated text corresponds to anything in the manuscript is a different question.
So the LLM failure is almost the opposite of the Cheshire failure. Cheshire found apparent patterns through human pattern-matching. LLMs generate apparent meaning through statistical generation. Neither approach can get traction on whether the manuscript actually contains meaning.
That's a sharp way to put it. And it points to a genuine epistemological problem: how would you even verify a correct translation? If someone produces a translation of the botanical section that says "this plant treats kidney ailments with the following preparation," how do you check that? You can't go find the plant — the plants are unidentifiable. You can't test the pharmaceutical claims. You can't compare the astronomical sections to dated observations. The manuscript is almost hermetically sealed from external verification.
That's actually a really important structural point. Most ancient texts, even ones we couldn't read for centuries, had external anchors. Egyptian hieroglyphics had the Rosetta Stone. Linear B had the reference point of known Mycenaean Greek culture. Voynich has almost nothing.
The two thousand twenty-five Yale digitization project — which produced extremely high-resolution multispectral imaging of the entire manuscript — was genuinely valuable from a material standpoint. Multispectral imaging can reveal ink layers, erasures, corrections, things invisible to the naked eye. And it confirmed some things: there are correction marks, there are places where glyphs appear to have been modified, there are faint traces of text that was partially erased. That's evidence of an author who was composing or editing, not just mechanically copying. But it didn't produce a decipherment. The higher resolution images just give us better data about an object we still can't read.
Let's go through the four main theories seriously, because I want to give each one its due rather than just listing them.
Good. Start with the one I find most intellectually honest, which is a genuine unknown or constructed language. The case for this is that the statistical properties are real, that the manuscript shows signs of genuine composition, that the multi-author evidence suggests a community of practice rather than a single forger, and that the fifteenth century was a period of genuine interest in philosophical and constructed languages — Hildegard of Bingen had created a constructed language called Lingua Ignota nearly three centuries earlier, and there were serious humanist projects around language construction in the Renaissance. The case against is that no constructed language of this complexity has ever been completely lost — they leave traces, references, explanations. Somebody who built a language this elaborate would presumably have wanted to explain it somewhere.
Unless the explanation was itself lost, or was intentionally suppressed. Mystery traditions of the period weren't exactly in the habit of writing user manuals.
True. And if this was associated with an alchemical or hermetic tradition — which the imagery in the pharmaceutical and biological sections is at least consistent with — deliberate obscurity was a feature, not a bug.
Second theory: elaborate hoax.
The hoax theory has to contend with the carbon dating, as we said. The vellum is genuinely fifteenth century. So you need a medieval hoaxer, not a Victorian one. The case for a medieval hoax is actually pretty interesting: the manuscript appears in Rudolf the Second's court, where there was a market for curiosities and where the price of six hundred ducats would have been a significant incentive. A skilled forger who understood enough about writing systems to create something that looked like a language could have produced this. The anomalous repetition patterns — the low hapax legomena count — could be evidence of someone simulating language without fully understanding its statistical properties. The case against: the sheer labor involved. This is two hundred and forty pages of consistent, carefully illustrated text. The illustrations are detailed. The consistency of the script is high. Producing this as a hoax would have been an enormous investment for an uncertain return.
Unless the hoaxer was working over a long period, or had collaborators. Which circles back to the Currier A and B evidence.
It does. A multi-author hoax is actually more plausible than a single-author one, in some ways, but it also requires more conspiracy and more coordination, which makes it harder to sustain.
Third theory: glossolalia or asemic writing. This one I find both the most dismissible and the most unsettling.
Why unsettling?
Because if it's true, then the statistical regularities we've been talking about are either a coincidence or they're evidence that human pattern generation — even uninstructed, even in an altered state — produces language-like structures spontaneously. Which says something strange about cognition.
That's actually a serious point. There's research on glossolalia — speaking in tongues — that shows it follows phonological constraints of the speaker's native language even when the speaker believes they're producing something divine and language-free. So if someone were producing Voynichese in a visionary or trance state, they might be generating something that has the statistical fingerprints of a language precisely because it's produced by a language-using brain. The case against glossolalia as an explanation is the sheer consistency and the visual complexity of the illustrations. Glossolalia is typically spontaneous and variable. This is a carefully produced manuscript. Those things don't fit together easily.
Fourth theory: heavily encrypted real text.
This is the one that professional cryptanalysts have historically found most technically interesting, and also the one they've been least able to make progress on. The case for it is that if you assume a sophisticated enough encryption scheme — something like a Cardan grille, or a null cipher where only certain characters carry meaning, or a homophonic substitution cipher with a very large symbol set — then the statistical properties we observe could be artifacts of the encryption rather than properties of the underlying text. The case against is that the text is too long for most known historical cipher systems to maintain coherence across. Homophonic substitution ciphers of the period typically had much smaller symbol sets. A null cipher of this length would require extraordinary discipline to produce consistently.
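To see why a large symbol set matters, here is a show-notes sketch of homophonic substitution with an invented homophone table. Frequent plaintext letters get several cipher symbols, so no single symbol dominates the ciphertext the way "e" dominates the plaintext, which flattens exactly the frequency statistics classical cryptanalysis relies on:

```python
import random
from collections import Counter

# Invented homophone table: frequent plaintext letters get several
# cipher symbols; rare letters get one. Letters outside the table
# (and spaces) are simply dropped.
HOMOPHONES = {
    "e": ["12", "34", "56", "78"],
    "t": ["21", "43", "65"],
    "a": ["13", "57"],
    "o": ["24", "68"],
    "n": ["31"], "s": ["42"], "r": ["53"], "h": ["64"],
}

def encrypt(plaintext, rng):
    """Replace each letter with a randomly chosen homophone."""
    return [rng.choice(HOMOPHONES[ch]) for ch in plaintext if ch in HOMOPHONES]

rng = random.Random(0)  # seeded for reproducibility
symbols = encrypt("the rate of the treats near the east shore" * 20, rng)
freqs = Counter(symbols)

# 'e' is about a quarter of the kept plaintext letters, but its count
# is split across four symbols, so the cipher's frequency profile is
# much flatter than the plaintext's.
print(freqs.most_common(3))
```

Whether a scheme in this family could produce Voynichese's specific statistics is, as the episode says, unresolved; the sketch only shows the flattening mechanism.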
And the problem with all four theories is that none of them generate testable predictions that the manuscript satisfies.
That's the core of it. A good theory of the Voynich Manuscript would say: if X is true, then we should observe Y in the text, and Y should be absent if X is false. Nobody has produced a theory that clears that bar. The Cheshire failure is a template for what goes wrong: you produce a theory, you find apparent confirmations, but the confirmations are all post-hoc pattern matches rather than novel predictions.
So let's talk about why this object specifically has this property. Because there are other undeciphered scripts — Linear A, Proto-Sinaitic, the Indus Valley script. What makes Voynich different?
Several things. Linear A, for instance, has a known relative — Linear B — and we know it was used for administrative records in a known cultural context. The Indus Valley script has a known archaeological context, known artifact types, some understanding of the culture that produced it. Voynich has essentially none of that. We don't know who wrote it, where it was written, what tradition it belongs to, what language it might encode if it encodes one at all. Every other undeciphered script has at least some external anchors. Voynich has almost none.
And the illustrations don't help as much as you'd think.
They actively confuse the issue, in some ways. The botanical illustrations show plants that don't correspond to any known species. Some of them have been tentatively matched to real plants — sunflowers have been proposed, morning glories — but the matches are loose enough to be unconvincing. The astronomical diagrams show what look like zodiac symbols but in arrangements that don't correspond to any known astrological tradition. The biological section with the women in pools connected by tubes has been interpreted as everything from anatomical diagrams to alchemical processes to cosmological imagery. The illustrations are clearly meaningful to whoever produced them, but they don't map onto any external reference system we can identify.
The women in pools connected by tubes is genuinely one of the stranger images in the history of manuscripts. I want to just sit with that for a moment.
It is. And what's interesting from a production standpoint is that the biological section is one of the most labor-intensive parts of the manuscript. Whoever made this was not dashing it off. The care and consistency of execution across the whole document is remarkable. There's a kind of seriousness to it that's hard to reconcile with either pure hoax or pure glossolalia.
Here's the question I keep coming back to: is the manuscript getting harder to decode over time, or easier?
And I'm genuinely not sure of the answer. On one hand, we have better tools — computational analysis, multispectral imaging, access to vastly larger corpora of comparative texts, machine learning approaches that didn't exist fifty years ago. On the other hand, the social and intellectual context of the fifteenth century is getting further away from us. Whoever produced this was working within a set of assumptions, traditions, and reference points that were live for them and are increasingly obscure to us. The further we get from that context, the more we might be missing things that would have been obvious to a contemporary reader.
Which suggests that if there's a key to this, it might be sitting in an archive somewhere. Not a computational breakthrough, but a historical one. A reference to this manuscript, or to its author, or to the tradition it belongs to.
That's actually what I think is most likely, if there's a solution to be found. The manuscript has been in the Yale Beinecke Library since nineteen sixty-nine. The digitization projects have made the images widely available. The computational analysis has been thorough. What hasn't been exhaustively searched is the archival record — the letters, inventories, account books, and notarial records of the fifteenth-century communities where this might have been produced. That's painstaking archival work, not algorithm work.
Let's talk about the AI benchmark angle, because Daniel flagged this and I think it's genuinely important for where the conversation is going.
So there's a framing that's been gaining traction in AI research circles, which is that the Voynich Manuscript represents a kind of ideal test case for genuine linguistic reasoning in AI systems. Here's why: if an LLM produces a convincing translation of a known ancient text, you can't be sure it hasn't just memorized the translation from its training data. Voynich has no known translation. If a model produces a coherent, internally consistent, externally verifiable decipherment of Voynichese, that would be strong evidence of something genuinely new — an ability to reason about linguistic structure from first principles rather than pattern-matching to known content.
But you said "externally verifiable," and we already established that external verification is almost impossible.
Which is the catch. Even if an AI produced a translation, how would you know it was right? You'd need some kind of independent confirmation, and almost no such confirmation exists. The manuscript is designed — whether intentionally or accidentally — to resist the kind of verification that would let you know you'd succeeded. It's not just a hard problem. It's a problem where success is almost as hard to confirm as the problem itself is to solve.
That's a genuinely unusual property for a benchmark. Most benchmarks have ground truth.
Most benchmarks have ground truth. Voynich doesn't, and that might be the deepest reason it's resisted solution for so long. It's not just that the cipher is hard. It's that the conditions under which you could confirm a correct solution are almost entirely absent. And that asymmetry — between the effort required to produce a credible-sounding solution and the impossibility of verifying it — is exactly the environment that produces the Cheshire-style failures. It's easy to produce something that sounds like a solution. It's almost impossible to demonstrate that your solution is actually correct.
Which means every "I've solved it" announcement is, structurally, unfalsifiable. You can't prove them right, and you can't fully prove them wrong either.
You can prove them wrong in the sense that you can show their proposed translations are internally inconsistent, or that their proposed grammar doesn't cohere, or that their claimed proto-language doesn't have the phonological properties they claim. That's what happened to Cheshire. But you can't prove someone wrong by saying "the correct translation says something different," because nobody knows what the correct translation says.
So what does the century of failure actually tell us? What's the affirmative conclusion, if there is one?
I think there are a few. One: whatever the manuscript is, it's not a simple cipher of a well-known European language. If it were, Friedman and Tiltman would have found it. Two: the statistical properties are too consistent and too sophisticated to be pure noise. Something deliberate produced them. Three: the failure of every external anchor — no known plant, no known astronomical system, no known language family — suggests either that the producer deliberately constructed a self-referential system, or that the cultural context was so marginal or esoteric that it left no other traces. Four: the manuscript is genuinely anomalous. It's not just hard. It occupies a category that our existing analytical frameworks don't map onto cleanly.
And that last point is maybe the most important one for thinking about what it would take to solve it. If it's genuinely anomalous, then applying harder versions of existing approaches probably won't work. You'd need a conceptual breakthrough — a new framework — not just more compute or more careful analysis.
Which is rare. And which is why I'm genuinely uncertain whether it will ever be solved. Not because I think it's beyond human ingenuity, but because conceptual breakthroughs in cryptography and linguistics don't arrive on demand. You can't schedule one.
Practical question: if you were advising someone who wanted to seriously engage with this problem, what would you tell them to actually do?
I'd say three things. First, read the Currier papers carefully. The Currier A and B analysis is the most rigorous quantitative work done on the manuscript, and understanding why the two hands are different is foundational. Second, engage with the Voynich Manuscript community — there's a serious online community of researchers who have built extremely detailed databases of the text, statistical analyses, comparative studies. The work that's been done there is genuinely sophisticated and freely available. Third, and this is the one that I think is most underrated: do the archival work. If you have Latin, if you have access to fifteenth-century European archives, the marginal notes and ownership marks on the manuscript itself are a starting point. The six-hundred-ducat Rudolf claim traces back to the Marci letter. There may be other letters. There may be other references in contemporaneous documents that haven't been found yet.
And what about the AI angle for someone building systems?
If you're working in AI and you're interested in this as a benchmark, the honest framing is that it's a benchmark for something we don't fully know how to measure yet. We don't have ground truth. What you can test is whether your system produces internally consistent structures — whether its proposed grammar is coherent, whether its proposed translations don't contradict each other, whether the statistical properties of its proposed underlying language are consistent with what Voynichese looks like after decryption. That's not the same as being right, but it's a more honest version of progress than "we found a translation."
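One concrete version of that internal-consistency testing: check that a claimed decipherment actually uses a single consistent symbol substitution. The claimed readings below are hypothetical, not from any real proposal, but this is the kind of minimal filter that Cheshire-style claims fail:

```python
def check_mapping_consistency(claimed_pairs):
    """Given (voynich_word, claimed_reading) pairs, verify that the
    claim amounts to one consistent glyph-for-letter substitution.
    Returns the inferred mapping and a list of conflicts."""
    mapping, conflicts = {}, []
    for word, reading in claimed_pairs:
        if len(word) != len(reading):
            conflicts.append((word, reading, "length mismatch"))
            continue
        for glyph, letter in zip(word, reading):
            if mapping.setdefault(glyph, letter) != letter:
                conflicts.append(
                    (word, reading,
                     f"{glyph!r} read as both {mapping[glyph]!r} and {letter!r}"))
    return mapping, conflicts

# Hypothetical claimed readings: the third claim reads the glyph 'i'
# differently than the first one did, so it should be flagged.
claims = [("dain", "rose"), ("dor", "rot"), ("dain", "rope")]
mapping, conflicts = check_mapping_consistency(claims)
print(conflicts)
```

Passing a filter like this doesn't make a decipherment right, but failing it makes the decipherment wrong, and that asymmetry is the most honest form of progress available without ground truth.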
The manuscript as a mirror. What you see in it tells you more about your analytical framework than about the manuscript itself.
That's maybe the most accurate summary of a century of attempts. Every methodology that's been applied to it has revealed its own assumptions and limitations more clearly than it's revealed anything about the text.
I want to end on this: do you think it's solvable?
I genuinely don't know, and I want to be honest about that rather than hedging toward optimism or pessimism for rhetorical effect. If it's a hoax — and I think the probability is non-trivial — then there's nothing to solve. There's no underlying message. The statistical properties are artifacts of a sophisticated simulation, and the correct answer is "this is an elaborate constructed object with no content." That would be a valid solution, but it might be unprovable. If it's a genuine language or cipher, then I think it's solvable in principle, but it may require the kind of lucky archival discovery that you can't engineer. The honest answer is: maybe, and the conditions under which it becomes solvable are not fully under our control.
Which is actually a more interesting answer than "yes, AI will crack it eventually" or "it's definitely a hoax." The uncertainty is the real content.
The uncertainty is the real content. And I think that's why it keeps drawing serious people in. It's not just a puzzle. It's a puzzle that sits at the intersection of cryptography, linguistics, history, and philosophy of knowledge. What does it mean to have a text you can't read? What does it mean to have a system that looks like language but might not be? Those questions don't have easy answers, and the manuscript is a very concrete, very physical way to confront them.
Big thank you to Hilbert Flumingtop for producing the show — as always, the thing exists because of him. And Modal keeps our pipeline running; if you're building something that needs serverless GPU infrastructure, they're genuinely worth a look.
This has been My Weird Prompts. Find all two thousand one hundred and forty-nine episodes at myweirdprompts.com, or follow us on Spotify if that's your habitat.
See you next time.