In this section:
Glyphs - Preamble
Majuscule Discussion
Minuscule Discussion
Numerals
The "Cipher"
Glyphs - Preamble
The beauty of the script is its arabesque flow. The majority of the glyphs are composed of one continuous line which keeps the author’s hand on the page. The notable exceptions would be in the crossing (like a ‘t’) of one glyph, and the addition of diacritical marks. Some of the glyphs are expansive in their curvature, while others are more constrained and knotted. There also seems to be a morphology to the cipher letters where they gain in complexity by adding loops to existing, base cipher letters, generally on their ascenders or descenders.
Given the vast number of glyphs in the majuscules alone, it has proven difficult to employ any transposition or substitution decipherment techniques. Off and on since 1999, I have attempted several variations on these, as well as attempting to assign numerical values to each glyph. I think in one year, while in the depths of decipherment trial and error, I missed most of Christmas. There is also the possibility that some of the glyphs are dummies or nulls, or that the value of a glyph differs depending on which page it appears on, or on whether it is repeated. I have come at the majuscules with no fewer than 500 techniques, some of which have involved brute forcing. To no avail, yet. Or, should I say, ever. I'll explain below.
Serafini himself has said that there is no cipher; that it is asemic writing. This may be true (which would mean a lot of lost hours on my part as well as for several others), but it also may technically be false. Part of the mystique of the book is its cipher, and if it were to be deciphered then not only would it lose its highly enigmatic character, but the solved text might actually prove to be banal. If Serafini wanted to keep the mystery of the book alive, he would have more to lose in saying it was written as an intentional cipher, because that would effectively be a challenge for someone to solve it, and perhaps that is not something he would care to risk. Who knows.
The other side, the one that employs the principle of charity in taking Serafini at his word, is that there is simply no complex encipherment at all. I have seen assertions by those we might call charlatans à la Athanasius Kircher claiming to have "broken" the Codex. Unfortunately for these over-eager and possible victims of Newbold Syndrome, their "solutions" take enormous speculative leaps, and lay down assumptions that are not only too big to swallow, but are methodologically unsound. I think Nick Pelling puts it best in his blog post when he states:
It certainly resembles a cipher, with all the structure and nuances of calligraphy and page layout, though with the page numbering scheme the least confounding part (this turned out to be a contorted base-21 with a whole load of special rules to mess with your head). For what it’s worth, Luigi Serafini has claimed in several interviews that the text has no meaning, though curiously pretty much nobody believes he’s telling the truth.
So, is there a code or cipher in this? The short answer is no. Serafini himself stated that he intended to construct a book that mimicked how a pre-literate child might view an illustrated encyclopedia. Although the book is filled with tantalizing if not anachronistic allusions to mysterious incunabula, the Codex is more a concept book that plays with the phenomenology of perception. Keeping in mind the history of codes and ciphers whereby there was intentional obfuscation to protect knowledge from being intercepted by non-initiates or those who had the power to persecute certain groups, there is no such intention in the Codex whatsoever. There is no hidden knowledge in the Codex as such, no secrets or mysteries in a way that would betoken a need to protect sacred knowledge from the eyes of the profane or persecutive. There is a far simpler, innocent, and less dramatic explanation for the Codex: it was constructed as a kind of transposition of our perception, to return us to a state prior to the development of literacy. Although others may find in its pages evocative images that appear to have symbolic value tied to esoteric traditions, we have to remind ourselves that Serafini most likely drew from a variety of historical precedents in the development of this book, a kind of historical text remix that shares only an aesthetic resemblance to medieval herbals, bestiaries, and crypto-texts.
What we do know, courtesy of those who are mentioned in the numerals section below, is that the number system is decipherable. The fact that we could build an enormous and elaborate story around the base-21 numeric system, and how it may allude to numerological significance, does not lend said fictions any more solid basis in fact. Just like the afflicted shut-in mathematician in the movie Pi, one can see patterns anywhere, but that does not make the pattern connect to reality: it is simply a matter of perception. The same goes for the speculations we could conjure with base-21, its divisibility by seven, and the illustrations portraying rainbows with the seven colours of the visible spectrum: these too are far too fanciful and lead us on wild speculative tangents instead of enjoying the book as an artistic artifact, and acknowledging the more mundane truth of its intention and purpose.
I have compiled a list of all majuscule glyphs which is then followed by frequency analysis. This section is entirely dedicated to the majuscules, minuscules, and numerals Serafini uses. Please refer to the charts below.
[Charts: majuscule glyph list and frequency analysis (embedded documents, not shown)]
Deciphering notes - False Leads and Leads to Follow
As you can see here, although I am fairly certain that there is nothing to decipher, out of due diligence I was still guided by the need to test a few hypotheses.
- Attempted to use the majuscule grid in the linguistic section. There is an inconsistency after the first two listed characters which led me to believe that perhaps these two were to be segregated from the rest, or else denoted a key in a transposition cipher. If so, it would have been immaterial for quick inspection purposes to trouble about which two alphabetic characters they might represent (I opted for LS - the initials of the author’s name). After unsuccessful attempts at running through the alphabet via this method (in some cases removing the two letters I substituted earlier so as to remove redundancy, and then leaving them in to be doubled), I attempted several other variations - this time with the author’s full name, and then by removing the redundant letters (“LUIGSERAFN”), reversing these, and doing counts of 13, 21, 21, 21, 21 to reach the full number of present cipher characters (97). As this worked out numerically, I set about testing words which begin with the same two cipher letters, but “stepping” or transposing these by a certain number of letters by means of rotation (I attempted 6, 7, 13, and 21). This appeared to work in a preliminary examination by making plausible arrangements, but did not hold for other examples of the same doubled cipher letter. An example of transposition/rotation: if cipher a is A in first position, then cipher a in second position would be H if we add seven to A. So, in this example, if the cipher letters were “aaaa”, this would mean (using the Italian alphabet) AHQA. This has yet to yield favourable results, but this could be on account of assuming a) that the alphabet is Italian, or any other, and b) the order of the alphabet Serafini uses.
- Assumption that the majuscule cipher could be similar to the Syriac alphabet, and thus possess no vowels (in Syriac, vowels are mostly implied, marked only by matres lectionis or optional diacritical points, and this could account for the heavy use of diacritics). If the majuscule cipher is indeed simply composed of consonants, this would explain the trebling of some cipher letters. However, so would the transposition/rotation method, which may still be viable. Yet the idea that a) it is a syllabary, or b) it is composed only of consonants may be easily jeopardized by the length of some of the words.
- It may be fruitful to consider that the majuscule glyphs may be either truncations of words or an example of erratic scriptura continua.
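The positional rotation described in the first note above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a claim about how Serafini worked: the ordering of the 21-letter Italian alphabet, the step of seven, and the function names `dedupe_keyword` and `rotate_by_position` are all assumptions made for the sake of the example.

```python
# Assumption: the standard 21-letter Italian alphabet, in its usual order.
ITALIAN = "ABCDEFGHILMNOPQRSTUVZ"


def dedupe_keyword(keyword):
    """Drop repeated letters, keeping the first occurrence of each."""
    seen = []
    for ch in keyword.upper():
        if ch not in seen:
            seen.append(ch)
    return "".join(seen)


def rotate_by_position(letters, step, alphabet=ITALIAN):
    """Shift the letter in position i (counting from 0) forward by step * i places."""
    n = len(alphabet)
    return "".join(
        alphabet[(alphabet.index(ch) + step * i) % n]
        for i, ch in enumerate(letters.upper())
    )


print(dedupe_keyword("LUIGISERAFINI"))  # LUIGSERAFN
print(rotate_by_position("AAAA", 7))    # AHQA
```

With a step of seven and a 21-letter alphabet the rotation returns to its starting letter every third position, which is why the fourth letter of AHQA is A again.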
Majuscule Discussion
The majuscule glyphs are not conjoined, and are usually to be found - with a few exceptions in the body text - as the title-headers of each page, and on the title pages themselves. Some of these take on shapes that resemble characters of the standard alphanumeric set we use today, such as 6, E, D, G, C, and so forth. [more to come]
Minuscule Discussion
It proves difficult to perform a meticulous analysis of the minuscules due to their cursive script. Some of the glyphs are so similar that when they are conjoined it is hard to decide where to make the split. An MA thesis by Jeffrey Christopher Stanley attempted to use a computer to parse out the glyphs for the purposes of frequency analysis. To illustrate the problem, think of cursive writing in the Latin alphabet: if I conjoin a group of n's and m's, how can someone else tell which is an n and which an m? Generally, when we know the language, context provides us with the answer; in this case, we don't know the language, so our decision-making process is liable to err.
There are some commonalities in the minuscules worth noting. The highest frequency single glyph would be what looks like a reposing E or a slightly tilted W with a dot inside it. Some have speculated this could be similar to an ampersand, or that it represents a preposition like to, or, of. It is also possible that it represents punctuation (this is not off the table since there seems to be a curious dearth of recognizable punctuation in the script, unless Serafini is consistently employing run-on sentences). The most common minuscule glyph-pairs begin with the single loop D-shaped glyph.
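For readers who want to reproduce this kind of count, here is a minimal sketch of single-glyph and glyph-pair frequency analysis. It assumes the hard part, segmenting the cursive into discrete glyphs, has already been done by hand; the labels in `sample` ("w" for the dotted E/W shape, "d1" for the single-loop D shape, and so on) are invented placeholders, not a real transcription of the Codex.

```python
from collections import Counter


def glyph_frequencies(words):
    """Count single glyphs and adjacent glyph-pairs within each word."""
    singles = Counter(glyph for word in words for glyph in word)
    pairs = Counter(
        (a, b) for word in words for a, b in zip(word, word[1:])
    )
    return singles, pairs


# Hypothetical transcription: each word is a list of hand-assigned glyph labels.
sample = [["d1", "e", "w"], ["d1", "e"], ["w"], ["d1", "o", "w"], ["w"]]

singles, pairs = glyph_frequencies(sample)
print(singles.most_common(2))  # [('w', 4), ('d1', 3)]
print(pairs.most_common(1))    # [(('d1', 'e'), 2)]
```

Pairs are counted only within a word, so the result depends entirely on where the transcriber chose to split words and glyphs, which is exactly the weak point noted above.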
There is a considerable amount of looping, especially in the descenders of the glyphs.
The minuscule glyph set has far fewer variations than its larger cousin. It is vaguely reminiscent of standard cursive writing in that all glyphs are joined, and many of them are constructed of loops and tails. Most of the words end with a rising curled tail, often with a dot inside the curve. It is possible that Serafini has merged the function of the glyph-as-letter with that of punctuation. One may come to notice that there is very little evidence of recognizable punctuation save for the ending of a paragraph which commonly sports a dot followed by a second dot beneath a tilde. Other marks include a variation of the number system, mostly repeating series of unqualified I, V, and /\s. I say “unqualified” on account of these numerals appearing without the usual number set of 1-22 as it appears in the page numbers. There are, of course, multiple exceptions to this, but it can be quite common to find a long series of V/\V/\... etc. throughout.
Focusing solely on the minuscules, we find that many of the words also contain what appear to be diacritical marks, namely diaereses (e.g. ä), tildes (e.g. ñ), and carons (e.g. č). At times, the diaeresis may contain up to three instead of just two raised dots. We should also take note of one of the most common formations in this glyph set, a character that resembles either a tilted ‘w’ or a cursive capital ‘E’ with its lower limb curving upward and partially enclosing a dot. It is tempting to see this glyph as a possible ampersand given how common it is, or as some prepositional word such as to, in, or, of. However, one can just as easily speculate that these do not represent words or letters at all, but instead may function as punctuation.
In the writing/linguistic section of the book, we are presented with a chart of both minuscules and majuscules. The number of minuscules is 50 in total, although their appearance in the actual script numbers slightly higher than this given that the chart appears incomplete (ditto for the majuscule set since I have counted 412 unique glyphs, 626 if I include all the special glyphs that precede a dash and only appear once).
Numerals
Gloss on numerals: As Derzhanski and others have determined, the number system is base-21. Yet it is also riddled with a few exceptions, and the lack of data makes guessing future numbers difficult. It should be noted that, despite some aesthetic similarities with some of the numbers in the text proper, the in-text numbers do not necessarily follow the page count numerical system. There are also at least two other “numerical” systems in the Codex text proper that are composed of entirely different symbols; unlike the page number system, which is countable, we lack any way of knowing what numbers these represent.
The numerical system here is somewhat similar to the Roman numeral system. Whereas Roman numerals allow only I, V, X, L, C, D, and M in any of their constructions, Serafini's numeral system possesses 21 distinct numerals, but its count above 21 involves I, V, and A-shapes. It should be noted that these shapes do not necessarily mean "21", and this can be seen as we extend the number sequence. In the Serafinian system, as with the Roman numerals, no digit can appear more than three times in a row.
*I am leaving out the additional pages in the Rizzoli edition (some of which substitute for a few pages from the numerical sequence). Leaving aside the first 14 new pages at the beginning, Serafini added a few more pages within the Codex text proper that do not appear in the original editions of the book. It is uncertain whether these were "cutting room floor" scraps, or new pages Serafini had constructed subsequent to the release of the prior editions.
NOTES:
1. The page numbering restarts in the second volume of the Codex (which starts with the "Anthropology" section). Codex A is shorter than Codex B by a few pages. The maximum page count for either of the volumes is 186.
2. The symbols for the page number 170 in Codex B are proper to form, whereas in Codex A the symbols are inverted.
3. The page subsequent to 170 (we will call it 170b) does not correspond to the numbering system. The numbering system resumes on page 172 (which we will take as 171 to keep with our count).
4. Some of the numerals are vertical "mirrors" of one another: 1 and 10, 2 and 9, 7 and 11, 13 and 14.
5. There appears to be an interesting rule to explain what happens at numbers 111, 132, 153, and 174. These do not follow the regular sequence since having two V-shaped or two A-shaped digits in contiguity is not allowed.
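One way to look at note 5 is to write those page numbers in ordinary positional base-21. The sketch below does only that: it does not model Serafini's special rules or glyph shapes, and `to_base21` is a hypothetical helper name. Read this way, the four exceptional numbers are exactly 21 apart and share the same final digit.

```python
def to_base21(n):
    """Return the digits of n in plain positional base-21, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 21)
        digits.append(remainder)
    return digits[::-1]


for page in (111, 132, 153, 174):
    print(page, to_base21(page))
# 111 [5, 6]
# 132 [6, 6]
# 153 [7, 6]
# 174 [8, 6]
```

Whether that shared remainder of 6 is what triggers the contiguity rule is not something this arithmetic can settle; it only shows where the exceptions fall in a plain base-21 count.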
The Chart
Find below a comprehensive chart of the page count numerical system. Some qualifying notes apply here. Firstly, since the page count ends at 186, anything after this (in blue) is speculation. Secondly, for ease of display, I have used the font set that I constructed from my own hand drawings of Serafini's numerals, so it will not be absolutely perfect. Since I cannot embed the fonts on this website, I did so via a PDF document. The numbers in yellow represent unexplained deviations from what one might suppose would be the next in the numerical sequence.
[Chart: page count numerical system (embedded PDF, not shown)]
Extending the Number Sequence Via Pattern Extension
The chart below makes an attempt to get beyond 186.
[Chart: extended number sequence beyond 186 (embedded document, not shown)]
The Cipher
To avoid the kind of wishful thinking that marks “Newbold Syndrome” in imposing meaning upon a text that it does not possess, before one can set out to accomplish the decryption of the Codex, a few ground rules might be in order.
Assuming that Luigi Serafini took two full years to complete the task of both illustrating and “writing” the Codex, that would mean roughly two months per section, or perhaps a page every two days (two volumes of at most 186 pages each over some 730 days). We have to subtract from this any time he might have spent researching to obtain some inspiration (perhaps the Voynich, or medieval codices) as well as the time it would have taken him to devise the ciphers for the numerals, majuscules, and minuscules. Given their nearly flawless execution and consistency, this might suggest that he practiced writing the cipher beforehand. Given that some of the illustrations are fairly complex, such a task represents a considerable time commitment. Two years does not seem like a lot of time, even if one does nothing else. As well, we still lack any proof that the cipher is a legitimate one; it could be false.
Firstly, if we assume that the cipher is legitimate, we might find it to be a stronger assumption that the enciphering process did not involve a highly complex or multi-stage process. If the process of encipherment involves several steps to transform, say, a, into “A” by means of having to check each character against a table, and then to perform this over again with a different table, it is highly unlikely that Serafini would have had the time to encipher a book of this size in two years. Secondly, for Serafini to encipher at all, it must be based on a language he knows. Serafini may be highly knowledgeable in a variety of areas, especially with respect to art, design, and architecture, but it may not be likely that he is a seasoned linguist as well. This may seem to be a minor point, but in deciphering it makes all the difference: what is the root language or languages the encipherer is using? This will be narrowed by the number of languages the encipherer knows. What we do know is that Serafini knows Italian, some English, and perhaps French.
Since there are considerably fewer instances of majuscules than minuscules, it might be reasonable to assume that Serafini afforded himself a slightly more complex enciphering process, or he may have applied the same simpler encipherment to both sets. The only lead we have to depend on would be the numerals which function as base 21, but there is no way of confirming whether or not this will apply to the majuscules and minuscules any more than our own base 10 numeral system means we have only 10 letters in our alphabet, or that our alphabet is based on some power of 10.
If Serafini has enciphered the majuscules using a keyword, we would have to discover what that keyword is, which could be something as basic as LUIGISERAFINI, SERAFINI, or LGSRFN. What is intriguing about the majority of the majuscule headers is that they are seemingly composed of a single word (with the exception of chapter headings). Either these represent nouns, as in entries in an encyclopaedia (such as “BOATS”, “FISH”, etc.) or they are compound words without spaces. If the latter, it would not be unreasonable to conjecture that the majuscules are solely composed of consonants with the vowels and spaces removed. If, for example, we employed a 21-letter alphabet and removed the vowels, we get BCDFGHLMNPQRSTVZ: 16 letters. Imagine for a moment that Serafini is enciphering from his native Italian for “ladybugs” (coccinelle). We would be left with a plaintext of CCCNLL (incidentally, the page in the Codex with the ladybugs has seven glyphs, not six as in this example, and the preceding chapter-heading page has five, those five glyphs being repeated in the seven-glyph construction). If, for example, we had reason to suspect that Serafini’s keyword was LGSRFN, we would construct a Vigenère square using just consonants. How do we connect the keyword and the plaintext to derive the ciphertext in order to confirm our hypothesis? We would look to the Serafinian glyphs themselves. The glyphs on this page are peepgpg [note: on my computer, I am able to use the Serafini font, but cannot do so here]. For the sake of this exercise, we would assign them Latin alphabet equivalents: BCCBDBD. We would then use our keyword LGSRFN on our 16 x 16 Vigenère square. The first glyph, “B”, read in row L under column B, gives L; “C” in row G gives H; and so on until we come up with LHTRHNN (the last letter assuming that the keyword LGSRFN repeats). This method is entirely built on assumptions.
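The worked example above can be checked mechanically. The sketch below is a standard Vigenère encipherment restricted to the consonant string quoted in the text, which has 16 letters and therefore gives a 16 x 16 square. The keyword LGSRFN, the stand-in letters BCCBDBD, and the function names are all part of the thought experiment, not anything established about the Codex.

```python
# Assumption: Italian consonants in alphabetical order (21 letters minus A, E, I, O, U).
CONSONANTS = "BCDFGHLMNPQRSTVZ"


def strip_vowels(word):
    """Keep only the letters that appear in the consonant alphabet."""
    return "".join(ch for ch in word.upper() if ch in CONSONANTS)


def vigenere_encipher(text, keyword, alphabet=CONSONANTS):
    """Add each keyword letter's index to the text letter's index, modulo the alphabet size."""
    n = len(alphabet)
    out = []
    for i, ch in enumerate(text):
        key_ch = keyword[i % len(keyword)]  # the keyword repeats as needed
        out.append(alphabet[(alphabet.index(ch) + alphabet.index(key_ch)) % n])
    return "".join(out)


print(len(CONSONANTS))                         # 16
print(strip_vowels("coccinelle"))              # CCCNLL
print(vigenere_encipher("BCCBDBD", "LGSRFN"))  # LHTRHNN
```

Because B is the first letter of the consonant alphabet, every B in the text simply reproduces the keyword letter above it, which is why L, R, and N from the keyword surface unchanged in LHTRHNN.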
Moreover, one of the difficulties in removing vowels from words is that the deciphering recipient will be presented with ambiguity as to what vowels to use, how many there are in the word, and where to place them. This is not a problem if Serafini did not intend for decipherment, simply constructing a one-way cipher. The subtraction of vowels, or the use of Vigenère squares, is one of the few explanations as to why we are presented with the occasional word that has three of the same glyph in a row. However, this is not the only possible explanation.
Another possible complexity that Serafini might have added was in enciphering anagrams of the original plaintext. This does not present too difficult a problem if we are already certain of our keyword. So, if we were presented with a plaintext that read NSICTE on a page with bugs on it, then it is possible that we’d rearrange the letters to read INSECT. However, if there is a keyword, we do not know it. So, if we came up with a possible keyword that resulted in a plaintext that read PUMEVG, and we used the Caesar shift over the 21-letter Italian alphabet to come up with all 21 variations, NSICTE would be part of that group, but it would not make any more sense than PUMEVG, QVNFZH, or any other in our list. That is, even if we suspected that the plaintext was an anagram, it would mean exhausting the list of all 21 variations testing for anagrams. It is more likely that we would consider the keyword incorrect and try a different one.
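To make the cost of that search concrete, here is a small sketch that lists every Caesar shift of a candidate string over the 21-letter Italian alphabet and tests each one as an anagram of a guessed word. The starting string PUMEVG is chosen for illustration so that one of its shifts is exactly NSICTE; the alphabet ordering and the function names are assumptions.

```python
# Assumption: the standard 21-letter Italian alphabet, in its usual order.
ITALIAN = "ABCDEFGHILMNOPQRSTUVZ"


def caesar_shifts(text, alphabet=ITALIAN):
    """Return every shift of text, one per possible offset (offset 0 first)."""
    n = len(alphabet)
    return [
        "".join(alphabet[(alphabet.index(ch) + k) % n] for ch in text)
        for k in range(n)
    ]


def is_anagram(a, b):
    """True when a and b contain the same letters with the same counts."""
    return sorted(a) == sorted(b)


candidates = caesar_shifts("PUMEVG")
print(len(candidates))  # 21
print(candidates[1])    # QVNFZH
print(candidates[19])   # NSICTE
print([c for c in candidates if is_anagram(c, "INSECT")])  # ['NSICTE']
```

The check only succeeds because the target word INSECT was supplied in advance. Without a guess at the plaintext, each of the 21 candidates would have to be tested against a whole dictionary, and that for every keyword tried.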
If we do not know the keyword in a Vigenere Square cipher, this does not make it unbreakable. There are methods in cryptanalysis that use patterns where frequency analysis will not work. However, this works when there is a one-to-one correspondence between the cipher alphabet and a known alphabet. In the case of Serafini’s cipher, we are presented with 412 unique glyphs in the majuscule cipher alphabet. We might speculate that the same Latin alphabet letter is represented, say, twenty-one times in different cipher glyphs. A might be represented by 21 Serafini glyphs. However, we cannot necessarily test for this accurately since there are 412 glyphs, and for each Latin alphabet letter to be represented 21 times, that would mean we would need a Serafini alphabet of 441 glyphs, suggesting that Serafini did not have the occasion to use 29 of them. For all we know, Serafini might have decided to represent the letters of the Latin alphabet (if that is in fact the plaintext he enciphered) according to a different distribution. So, for example, he may have represented A with 47 different cipher glyphs while Q with only 12. And this is just one of hundreds if not thousands of possibilities that may have factored in the original encipherment. Here is a very basic starting list:
Before setting out to decrypt the Codex, a few ground rules are in order, if only to avoid the wishful thinking that marks “Newbold Syndrome”: imposing upon a text a meaning that it does not possess.
Assuming that Luigi Serafini took two full years to both illustrate and “write” the Codex, that would mean roughly two months per section, or perhaps two pages a day. We have to subtract from this any time he might have spent on research and inspiration (perhaps the Voynich manuscript, or medieval codices) as well as the time it would have taken him to devise the ciphers for the numerals, majuscules, and minuscules. Their nearly flawless execution and consistency suggest that he practiced writing the cipher beforehand. Since some of the illustrations are fairly complex, the task represents a considerable time commitment, and two years does not seem like a lot of time, even if one does nothing else. We also still lack any proof that the cipher is a genuine one; it may be no cipher at all.
Firstly, if we assume that the cipher is legitimate, the stronger assumption is that the enciphering did not involve a highly complex or multi-stage process. If enciphering takes several steps to transform, say, “a” into “A” by checking each character against a table, and then doing so again with a different table, it is highly unlikely that Serafini would have had the time to encipher a book of this size in two years. Secondly, for Serafini to encipher at all, the plaintext must be in a language he knows. Serafini may be highly knowledgeable in a variety of areas, especially art, design, and architecture, but he is probably not a seasoned linguist as well. This may seem a minor point, but in deciphering it makes all the difference: what is the root language, or languages, the encipherer is using? The candidates are limited to the languages the encipherer knows. What we do know is that Serafini knows Italian, some English, and perhaps French.
Since there are considerably fewer instances of majuscules than minuscules, it might be reasonable to assume that Serafini afforded himself a slightly more complex enciphering process for the majuscules, though he may just as well have applied the same simpler encipherment to both sets. The only lead we have is the numerals, which function in base 21, but there is no way of confirming that this applies to the majuscules and minuscules, any more than our own base 10 numeral system means that we have only 10 letters in our alphabet, or that our alphabet is based on some power of 10.
If Serafini has enciphered the majuscules using a keyword, we would have to discover what that keyword is; it could be something as basic as LUIGISERAFINI, SERAFINI, or LGSRFN. What is intriguing about the majority of the majuscule headers is that they seem to be composed of a single word (with the exception of chapter headings). Either these represent nouns, as in entries in an encyclopaedia (such as “BOATS”, “FISH”, etc.), or they are compound words without spaces. If the latter, it would not be unreasonable to conjecture that the majuscules are composed solely of consonants, with the vowels and spaces removed. If, for example, we took the 21-letter Italian alphabet and removed the vowels, we would get BCDFGHLMNPQRSTVZ: 16 letters. Imagine for a moment that Serafini is enciphering from his native Italian for “ladybugs” (coccinelle). We would be left with a plaintext of CCCNLL (incidentally, the page in the Codex with the ladybugs has seven glyphs, not six as in this example, and the preceding chapter-heading page has five, those five glyphs being repeated in the seven-glyph construction).
If we had reason to suspect that Serafini’s keyword was LGSRFN, we would construct a Vigenère square using just the consonants. How do we connect the keyword and the plaintext to derive the ciphertext in order to confirm our hypothesis? We would look to the Serafinian glyphs themselves. The glyphs on this page are peepgpg [note: on my computer, I am able to use the Serafini font, but cannot do so here]. For the sake of this exercise, we would assign them Latin alphabet equivalents: BCCBDBD. We would then use our keyword LGSRFN on our 16 x 16 Vigenère square. The first glyph, “B”, read in row L under column B, gives L; “C” in row G gives H; and so on until we come up with LHTRHNN (the last glyph assuming that the keyword LGSRFN repeats). This method is built entirely on assumptions.
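The square lookup in this exercise can be sketched in a few lines of Python. This is only an illustration under the same assumptions as the text: the keyword LGSRFN is hypothetical, the stand-ins BCCBDBD are arbitrary, and the function name `vigenere` is mine. The consonant string contains 16 letters, so the square used here is 16 x 16. Reading a row against a column is the enciphering direction of the square; the `decipher` flag reads it the other way, which is what one would do if the glyphs were in fact ciphertext.

```python
# The 21-letter Italian alphabet with its five vowels removed.
CONSONANTS = "BCDFGHLMNPQRSTVZ"


def vigenere(text: str, keyword: str, decipher: bool = False) -> str:
    """Combine text with a repeating keyword on a consonant-only square."""
    n = len(CONSONANTS)
    sign = -1 if decipher else 1
    out = []
    for i, ch in enumerate(text):
        row = CONSONANTS.index(keyword[i % len(keyword)])  # keyword letter
        col = CONSONANTS.index(ch)                         # text letter
        out.append(CONSONANTS[(col + sign * row) % n])
    return "".join(out)


glyphs = "BCCBDBD"   # arbitrary Latin stand-ins for the seven glyphs
keyword = "LGSRFN"   # hypothetical keyword, as in the text

print(len(CONSONANTS))                           # 16
print(vigenere(glyphs, keyword))                 # LHTRHNN
print(vigenere(glyphs, keyword, decipher=True))  # the reverse reading
```

Neither output is a word, which is the point of the exercise: every step (the stand-ins, the keyword, the consonant-only alphabet) is a guess layered on another guess.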
Moreover, one of the difficulties in removing vowels from words is that the deciphering recipient is left with ambiguity as to which vowels to use, how many there are in the word, and where to place them. This is not a problem if Serafini did not intend the text to be deciphered and was simply constructing a one-way cipher. The removal of vowels, or the use of a Vigenère square, is one of the few explanations for why we are presented with the occasional word that has three of the same glyph in a row. However, it is not the only possible explanation.
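A small Python sketch makes the ambiguity concrete. The word list is my own illustrative choice of ordinary Italian words, not anything drawn from the Codex, and the helper name `skeleton` is an assumption. It strips the vowels and groups words that collapse to the same consonant string.

```python
from collections import defaultdict

VOWELS = set("AEIOU")


def skeleton(word: str) -> str:
    """Uppercase the word and drop vowels and any non-letters."""
    return "".join(c for c in word.upper() if c.isalpha() and c not in VOWELS)


# Illustrative word list (an assumption, not taken from the Codex).
words = ["coccinelle", "cane", "cena", "cono", "pane", "pino"]

groups = defaultdict(list)
for w in words:
    groups[skeleton(w)].append(w)

for skel, members in groups.items():
    print(skel, members)
# CCCNLL ['coccinelle']
# CN ['cane', 'cena', 'cono']
# PN ['pane', 'pino']
```

Three unrelated words share the skeleton CN, so a reader given only CN cannot recover the original. The skeleton CCCNLL also shows how removing vowels can produce three identical letters in a row.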
Another possible complexity that Serafini might have added is to encipher anagrams of the original plaintext. This does not present too difficult a problem if we are already certain of our keyword. If we were presented with a plaintext that read NSICTE on a page with bugs on it, we might well rearrange the letters to read INSECT. However, if there is a keyword, we do not know it. So, if we came up with a possible keyword that resulted in a plaintext that read PUMEVG, and we used the Caesar shift on the 21-letter alphabet to come up with all 21 variations, NSICTE would be part of that group, but at first glance it would not make any more sense than PUMEVG, QVNFZH, or any other in our list. That is, even if we suspected that the plaintext was an anagram, we would have to go through all 21 variations testing each for anagrams. It is more likely that we would consider the keyword incorrect and try a different one.
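The exhaustive check described here is tedious by hand but short in code. The sketch below assumes the 21-letter Italian alphabet and a one-word list of expected plaintexts, which is a hypothetical stand-in for whatever vocabulary the illustration on the page might suggest. To keep the example self-consistent, the trial string is generated by shifting NSICTE two places rather than typed in.

```python
ALPHABET = "ABCDEFGHILMNOPQRSTUVZ"  # 21-letter Italian alphabet


def shift(text: str, k: int) -> str:
    """Caesar-shift every letter k places along the 21-letter alphabet."""
    n = len(ALPHABET)
    return "".join(ALPHABET[(ALPHABET.index(c) + k) % n] for c in text)


def is_anagram(a: str, b: str) -> bool:
    return sorted(a) == sorted(b)


# Hypothetical vocabulary for a page showing insects (an assumption).
expected_words = ["INSECT"]

# A trial plaintext that lies in the same shift family as NSICTE.
trial = shift("NSICTE", 2)

hits = []
for k in range(len(ALPHABET)):          # all 21 variations
    variant = shift(trial, k)
    for word in expected_words:
        if is_anagram(variant, word):
            hits.append((k, variant, word))

print(trial)  # PUMEVG
print(hits)   # [(19, 'NSICTE', 'INSECT')]
```

The search only succeeds because the vocabulary already contains the answer. Without a trusted keyword and a trusted word list, a hit like this would be hard to tell apart from coincidence.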
Not knowing the keyword of a Vigenère cipher does not make it unbreakable. There are methods in cryptanalysis, such as Kasiski examination, that exploit repeated patterns where simple frequency analysis will not work. However, these work only when there is a one-to-one correspondence between the cipher alphabet and a known alphabet. In the case of Serafini’s cipher, we are presented with 412 unique glyphs in the majuscule cipher alphabet. We might speculate that each Latin alphabet letter is represented by, say, twenty-one different cipher glyphs. However, we cannot test for this accurately: for each of 21 letters to be represented 21 times, we would need a Serafini alphabet of 441 glyphs, which would mean that Serafini did not have occasion to use 29 of them. For all we know, Serafini might have decided to represent the letters of the Latin alphabet (if that is in fact the plaintext he enciphered) according to a different distribution. So, for example, he may have represented A with 47 different cipher glyphs and Q with only 12. And this is just one of hundreds if not thousands of possibilities that may have factored into the original encipherment. Here is a very basic starting list:
- The glyphs could represent bigrams
- The glyphs could be compound phonograms
- The glyphs might all be vowels, consonants, or some shortened combination of the two
- The glyphs might mean nothing at all (this is my conclusion)
- The cipher could rely on a keyword, but there seems to be no way of reverse engineering the cipher to produce it
- The cipher could contain nulls
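The glyph-count arithmetic behind the "21 glyphs per letter" speculation can be checked in a few lines. The figures 412 and 21 come from the discussion above; the variable names are mine, and the even split is only one of the many possible distributions.

```python
GLYPHS = 412   # unique majuscule glyphs counted in the text
LETTERS = 21   # size of the assumed plaintext alphabet

# If every letter had 21 glyphs, how many glyphs would be needed?
full_square = LETTERS * LETTERS
print(full_square, full_square - GLYPHS)   # 441 29

# If the 412 glyphs were instead shared out as evenly as possible:
per_letter, left_over = divmod(GLYPHS, LETTERS)
print(per_letter, left_over)               # 19 13
```

An even split gives 19 glyphs per letter with 13 left over, so neither 21 nor any other single number fits cleanly. That is consistent with an uneven distribution, with nulls, or with no cipher at all.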