Artificial Wasteland · a combine · Language seam

The Invariant of Relabeling

Two famous cracks — a cipher called unbreakable for three centuries, and a script that sat unread for three thousand years — turn out to be the same move. A code or a script assigns arbitrary labels to an underlying system. The labels are free. But some structure among the labels is forced by the system, and so it cannot be moved by relabeling. You grab the system by that invariant — before you can read a single symbol — and then one anchor breaks the rest.

This page joins two earlier layers and tries to make the whole exceed the stack:
What the Cipher Couldn't Hide — the index of coincidence breaks Vigenère.
The Grid That Spoke Greek — Linear B cracked from pure structure.
Nothing here is an original finding; the new thing is the join — and one number it lets you state exactly.

Movement I · the cipher's invariant

The shape a costume can't change

A simple substitution cipher swaps every letter for another symbol — A always becomes Q, B always W, and so on. It can change which letter wears which mask. It cannot change how often the masks recur. Below is one English passage. Scramble its alphabet as hard as you like and watch the two histograms: the one sorted by letter jumps; the one sorted by height never moves.

[Interactive figure: scramble the alphabet, the sorted shape holds. Two histograms of the passage: by letter A → Z (the labels move) and sorted by height (the shape is fixed). Readouts: index of coincidence; random floor (1/26) = 0.0385; unique scrambles tried.]

The index of coincidence is the chance that two letters picked from the text are the same — a one-number summary of that sorted shape. It is identical, to twelve decimal places, under every one of the 26! ≈ 4×10²⁶ ways to relabel the alphabet. That invariance is the moat: the cipher hides the letters; it cannot hide the language.
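The invariance is easy to check by hand. The following is a minimal sketch, not the page's own routine: the sample passage and the function names are assumptions made for illustration. It computes the index of coincidence from letter counts and confirms that a random relabeling of the alphabet leaves it unchanged.

```python
import random
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

# Stand-in English passage (an assumption; any English text will do).
SAMPLE = (
    "A cipher can change the mask that every letter wears, but it cannot "
    "change how often the masks come back. The commonest symbol in the "
    "message is still the commonest letter of the language, and the rarest "
    "is still the rarest."
)

def index_of_coincidence(text: str) -> float:
    """Chance that two letters drawn without replacement from text match."""
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    n = len(letters)
    counts = Counter(letters)
    # Only the multiset of counts matters, not which letter owns which count.
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def scramble(text: str, rng: random.Random) -> str:
    """Apply one random simple substitution (a relabeling of A..Z)."""
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    table = str.maketrans(ALPHABET, "".join(shuffled))
    return text.upper().translate(table)

if __name__ == "__main__":
    rng = random.Random(0)
    print("plain    ", index_of_coincidence(SAMPLE))
    for _ in range(3):
        print("scrambled", index_of_coincidence(scramble(SAMPLE, rng)))
```

Because the sum runs over integer counts and a relabeling only changes which letter owns each count, the scrambled values match the plain one exactly, not just to a few decimal places.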

Movement II · hiding the shape — and its shadow

Even the hiding leaves a trace

The Vigenère cipher defeats the attack above by rotating L alphabets in turn, so the same letter becomes different symbols at different positions. That genuinely flattens the lumpiness toward random. But the flattening is periodic — it repeats every L letters — and that period is a shadow the hiding casts. Slide the key length and watch the shape collapse; then read the period back off the columns.

[Interactive figure: raise the period, the whole-text shape flattens toward the floor. Readouts: whole-text IC; plaintext (L=1) = 0.0714; random floor = 0.0385.]

At L=1 the text is in the clear and the shape is fully English. Raise L and the whole-text IC slides down toward the random floor — the costume is working.
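The flattening can be reproduced with a few lines. This is a sketch under stated assumptions: the sample passage is a stand-in, the keys are simply prefixes of VENTRIS, and the helper names are hypothetical. A one-letter key is a Caesar shift, which is itself a relabeling and so leaves the index of coincidence untouched; longer keys mix several alphabets and pull the whole-text value down.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

# Stand-in English passage (an assumption; any English text will do).
SAMPLE = (
    "A cipher can change the mask that every letter wears, but it cannot "
    "change how often the masks come back. The commonest symbol in the "
    "message is still the commonest letter of the language, and the rarest "
    "is still the rarest. Sort the counts by height and the shape of English "
    "is sitting there in plain view, whatever names the symbols have been given."
)

def letters_only(text: str) -> str:
    """Keep A..Z only, uppercased."""
    return "".join(ch for ch in text.upper() if ch in ALPHABET)

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter by the key letter at the same position, cyclically."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(letters_only(text)):
        shift = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % 26])
    return "".join(out)

def index_of_coincidence(letters: str) -> float:
    """Chance that two letters drawn without replacement match."""
    n = len(letters)
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

if __name__ == "__main__":
    full_key = "VENTRIS"
    print("floor 1/26:", round(1 / 26, 4))
    for length in range(1, len(full_key) + 1):
        key = full_key[:length]
        ic = index_of_coincidence(vigenere(SAMPLE, key))
        print(f"L={length} key={key:<7} whole-text IC={ic:.4f}")
```

On a passage this short the values will wobble rather than fall smoothly, so the printout should be read as a trend, not a curve.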

read the period back — split into L columns, score each for English

Cut the ciphertext (enciphered with a real 7-letter key) into L columns — every L-th letter together. If L is the true period, each column was enciphered with a single alphabet, so its own IC springs back to English. Wrong L: still a mixture, still flat. The first column-set to read English is the period.
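The column test described above can be sketched as follows. Several things here are assumptions rather than the page's own code: the sample passage, the 0.06 threshold for "reads English" (chosen to sit between the 0.0385 floor and typical English values), and the shortcut of repeating the short sample seven times so that each column has enough letters to score.

```python
import string
from collections import Counter
from typing import Optional

ALPHABET = string.ascii_uppercase

# Stand-in English passage (an assumption; any long English text works).
SAMPLE = (
    "A cipher can change the mask that every letter wears, but it cannot "
    "change how often the masks come back. The commonest symbol in the "
    "message is still the commonest letter of the language, and the rarest "
    "is still the rarest. Sort the counts by height and the shape of English "
    "is sitting there in plain view, whatever names the symbols have been given."
)

def letters_only(text: str) -> str:
    """Keep A..Z only, uppercased."""
    return "".join(ch for ch in text.upper() if ch in ALPHABET)

def vigenere(text: str, key: str) -> str:
    """Encipher the letters of text by adding key letters cyclically."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + ALPHABET.index(key[i % len(key)])) % 26]
        for i, ch in enumerate(letters_only(text))
    )

def index_of_coincidence(letters: str) -> float:
    """Chance that two letters drawn without replacement match."""
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def mean_column_ic(cipher: str, period: int) -> float:
    """Split into `period` columns (every period-th letter) and average their ICs."""
    columns = [cipher[i::period] for i in range(period)]
    return sum(index_of_coincidence(col) for col in columns) / period

def smallest_english_period(cipher: str, max_period: int = 14,
                            threshold: float = 0.06) -> Optional[int]:
    """First candidate period whose columns score at or above the threshold."""
    for period in range(1, max_period + 1):
        if mean_column_ic(cipher, period) >= threshold:
            return period
    return None

if __name__ == "__main__":
    # Repeating the sample is a demo shortcut to lengthen the ciphertext.
    cipher = vigenere(SAMPLE * 7, "VENTRIS")
    for period in range(1, 15):
        print(period, round(mean_column_ic(cipher, period), 4))
    print("smallest English-like period:", smallest_english_period(cipher))
```

Taking the smallest period that clears the threshold matters, because multiples of the true period (here 14) also produce single-alphabet columns and score just as well.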

[Interactive figure: candidate period slider, L = 1 to L = 14. Readouts: smallest English-like period; the cipher's actual key = VENTRIS.]

The key is VENTRIS — seven letters, for the man who broke Linear B, here hiding a cipher. The scan finds 7 with no help. Once the period is known, each column is a plain Caesar shift and a frequency table reads off its letter — the anchor that finishes the break.
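Once the period is known, reading each key letter off a frequency table might look like the sketch below. It swaps the simplest rule ("the commonest symbol is E") for a slightly sturdier one: try all 26 shifts per column and keep the one whose letters best line up with an English frequency table. The table holds commonly quoted approximate percentages, and the sample passage, its repetition, and the function names are assumptions for illustration.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

# Approximate English letter frequencies in percent, A..Z (commonly quoted values).
ENGLISH_FREQ = [
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074,
]

# Stand-in English passage (an assumption; any long English text works).
SAMPLE = (
    "A cipher can change the mask that every letter wears, but it cannot "
    "change how often the masks come back. The commonest symbol in the "
    "message is still the commonest letter of the language, and the rarest "
    "is still the rarest. Sort the counts by height and the shape of English "
    "is sitting there in plain view, whatever names the symbols have been given."
)

def letters_only(text: str) -> str:
    """Keep A..Z only, uppercased."""
    return "".join(ch for ch in text.upper() if ch in ALPHABET)

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter by the key letter at the same position, cyclically."""
    sign = -1 if decrypt else 1
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + sign * ALPHABET.index(key[i % len(key)])) % 26]
        for i, ch in enumerate(letters_only(text))
    )

def best_shift(column: str) -> int:
    """Caesar shift under which the column's letters best match English."""
    counts = Counter(column)

    def score(shift: int) -> float:
        # Undo the candidate shift and weight each letter by its English frequency.
        return sum(n * ENGLISH_FREQ[(ALPHABET.index(ch) - shift) % 26]
                   for ch, n in counts.items())

    return max(range(26), key=score)

def recover_key(cipher: str, period: int) -> str:
    """One key letter per column: the shift that makes that column look English."""
    return "".join(ALPHABET[best_shift(cipher[i::period])] for i in range(period))

if __name__ == "__main__":
    # Repeating the sample is a demo shortcut to lengthen the ciphertext.
    cipher = vigenere(SAMPLE * 7, "VENTRIS")
    key = recover_key(cipher, 7)
    print("recovered key:", key)
    print(vigenere(cipher, key, decrypt=True)[:60])
```

Each column has only 26 candidate shifts, so the search is tiny; the frequency table serves only to pick the right one.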

Movement III · the same move, a smaller symmetry

The grid that needed only a name

Now the harder case. In a cipher the language is known — only the labels are hidden. In a lost script, the labels and the language are unknown. And yet the same handle is there. Alice Kober showed that Linear B's signs fall on a grid — you can tell which signs share a consonant and which share a vowel from how words inflect, with no idea what any sign says. That relational structure is an invariant of whatever sounds you assign later.

Here is the principle as a toy you can run: a 3×3 grid of signs, the rows consonants, the columns vowels. The structure alone — same-row, same-column — can't tell you which sounds go where. It leaves the grid fixed up to relabeling the rows among themselves and the columns among themselves. Pin a sign to a sound — guess a name — and the freedom collapses.

[Interactive figure: how many sound-assignments does the structure still allow? Readout: consistent sound-assignments = 36 = 3!·3! (relabel rows, relabel columns).]

Structure alone: 36 ways to read the grid. None of them is wrong yet — the signs simply don't say which sound they are. (Counted by brute force, every relation-preserving permutation enumerated — the same routine the verifier runs.)
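A brute-force count of this kind can be sketched as below. It is not the page's verifier, only an illustration under assumptions: the consonants p, t, k and vowels a, i, o are hypothetical sound values, and signs are named by their (row, column) position. Every one of the 9! ways to hand the nine sounds to the nine signs is tried, and only those that keep "same row means same consonant" and "same column means same vowel" are counted.

```python
from itertools import combinations, permutations

ROWS, COLS = 3, 3
SIGNS = [(r, c) for r in range(ROWS) for c in range(COLS)]

# Hypothetical sound values, chosen only for illustration.
CONSONANTS, VOWELS = "ptk", "aio"
SOUNDS = [k + v for k in CONSONANTS for v in VOWELS]

PAIRS = list(combinations(range(len(SIGNS)), 2))

def preserves_structure(assignment) -> bool:
    """True if same-row matches same-consonant and same-column matches same-vowel."""
    for i, j in PAIRS:
        (r1, c1), (r2, c2) = SIGNS[i], SIGNS[j]
        s1, s2 = assignment[i], assignment[j]
        if (r1 == r2) != (s1[0] == s2[0]):
            return False
        if (c1 == c2) != (s1[1] == s2[1]):
            return False
    return True

def count_readings(pins=None) -> int:
    """Count structure-preserving assignments that also honor pinned signs.

    pins maps a sign position (row, col) to the sound it is guessed to have.
    """
    pinned = [(SIGNS.index(sign), sound) for sign, sound in (pins or {}).items()]
    total = 0
    for assignment in permutations(SOUNDS):
        if any(assignment[i] != sound for i, sound in pinned):
            continue
        if preserves_structure(assignment):
            total += 1
    return total

if __name__ == "__main__":
    print(count_readings())                              # 36
    print(count_readings({(0, 0): "pa"}))                # 4
    print(count_readings({(0, 0): "pa", (1, 1): "ti"}))  # 1
```

One pinned sign fixes its row's consonant and its column's vowel, leaving 2!·2! = 4 readings; a second pin in a different row and column forces everything else.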

That is the whole shape of the 1952 decipherment, in miniature. Michael Ventris guessed that four often-repeated sign-groups were Cretan place-names — Knossos, Amnisos, Tylissos, Phaistos. Each pinned a few rows and columns; signs that appeared in no place-name then came out forced, spelling real Greek — ti-ri-po, "tripod." The anchors broke the symmetry the structure had left.

the cipher

26!

relabelings the structure can't distinguish — every way to swap the alphabet.

The invariant: the index of coincidence, fixed over all of them.

The anchor: a frequency table — E is commonest — breaks it inside each column.

the grid

r!·c!

relabelings the structure can't distinguish — rename the consonants, rename the vowels.

The invariant: the grid of shared rows and columns, fixed over all of them.

The anchor: one guessed place-name pins specific sounds and breaks it.

Same move, different group. And the difference is the point: a substitution alphabet's residual ambiguity is astronomical, 26! relabelings, so you need a whole frequency table to break it (a Vigenère column, being a plain Caesar shift, narrows that to 26 candidate shifts once the period is known). The grid's residual ambiguity is tiny, r!·c!, or just 36 in the 3×3 toy, which is exactly why a single lucky name could do so much. The harder-looking problem hid a smaller symmetry.
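To put the two group sizes side by side, here is a small arithmetic sketch. The function names are hypothetical, and the base-2 logarithm is included only as one convenient way to express each ambiguity as a number of bits an anchor has to supply.

```python
from math import factorial, log2

def substitution_symmetry(alphabet_size: int = 26) -> int:
    """Number of ways to relabel an alphabet: n!."""
    return factorial(alphabet_size)

def grid_symmetry(rows: int, cols: int) -> int:
    """Relabel rows among themselves and columns among themselves: r! * c!."""
    return factorial(rows) * factorial(cols)

if __name__ == "__main__":
    cipher_size = substitution_symmetry()
    grid_size = grid_symmetry(3, 3)
    print(cipher_size)  # 403291461126605635584000000
    print(grid_size)    # 36
    print("cipher ambiguity in bits:", round(log2(cipher_size), 1))
    print("grid ambiguity in bits:  ", round(log2(grid_size), 1))
```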

The honest edges