Kasiski examination

The Kasiski test is a cryptanalytic tool for deciphering ciphertexts that were produced with the Vigenère cipher. It can be used to determine the length of the keyword.

History

In 1854 the Englishman Charles Babbage (1791-1871) succeeded in deciphering a Vigenère-encrypted text. However, he kept his method secret. In 1863 the Prussian infantry major Friedrich Wilhelm Kasiski (1805-1881) published the method, which he had devised independently of Babbage, in his book "Die Geheimschriften und die Dechiffrir-Kunst" ("Secret Writing and the Art of Deciphering"). The method is called the Kasiski test in his honor.

General Procedure

Given is a cryptogram, that is, a text encrypted with the Vigenère cipher. First, search the ciphertext for letter sequences of length 3 or more that occur more than once. Then determine the distance between each pair of identical sequences: count the letters from the first letter of the first occurrence (inclusive) to the first letter of the second occurrence (exclusive). Do this for all sequences found and write down the distances. The result is a list of natural numbers, which are then decomposed into prime factors so that common divisors can be found quickly. Chance repetitions are then easy to recognize because their distances do not fit the pattern. The exact key length is still not known at this point, because the distances are only multiples of the key length. For a more precise determination, the Friedman test can be used.
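The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names repeat_distances and prime_factors are made up for this sketch, only sequences of one fixed length are collected, and the sample string is just a short test input.

```python
from collections import defaultdict


def repeat_distances(ciphertext, length=3):
    """Map each repeated sequence of the given length to the distances
    between its consecutive occurrences (spaces are ignored)."""
    text = "".join(c for c in ciphertext.upper() if c.isalpha())
    positions = defaultdict(list)
    for i in range(len(text) - length + 1):
        positions[text[i:i + length]].append(i)
    return {
        seq: [b - a for a, b in zip(pos, pos[1:])]
        for seq, pos in positions.items()
        if len(pos) > 1
    }


def prime_factors(n):
    """Return the prime factors of n in ascending order (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


sample = "SPL DZPCNXLI HYKRT RYASXXNXLI"  # hypothetical short input
for seq, dists in repeat_distances(sample).items():
    for d in dists:
        print(seq, d, prime_factors(d))
# NXL 15 [3, 5]
# XLI 15 [3, 5]
```

A longer repeat such as NXLI shows up here as two overlapping three-letter repeats with the same distance.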

Idea behind the Kasiski test

Why does the Kasiski test give fairly reliable information about the length of the keyword? Consider the following encryptions:

The plaintext (1st line) is Vigenère-encrypted with the keyword PLUTO (length 5), which is repeated in the 2nd line. The ciphertext is in the 3rd line. The plaintext is German, because the ciphertext was computed from the German sentence ("the plaintext becomes the ciphertext"); the words KLARTEXT (plaintext) and GEHEIMTEXT (ciphertext) both contain the string TEXT.

DER KLARTEXT WIRD ZUM GEHEIMTEXT
PLU TOPLUTOP LUTO PLU TOPLUTOPLU
SPL DZPCNXLI HCKR OFG ZSWPCFHTIN

The string TEXT occurs twice in the plaintext, yet the corresponding strings in the ciphertext differ (NXLI and HTIN). The reason is that TEXT is encrypted with UTOP the first time but with OPLU the second time. This happens because the distance between the two occurrences of TEXT is 17 letters. The keyword has 5 letters, and because 5 is not a divisor of 17, the two passages are not encrypted with the same part of the keyword, so identical letter sequences cannot be expected in the ciphertext. Let us change the example slightly.

DER KLARTEXT WERDE GEHEIMTEXT
PLU TOPLUTOP LUTOP LUTOPLUTOP
SPL DZPCNXLI HYKRT RYASXXNXLI

This time TEXT is encrypted with UTOP both times, so the ciphertext sequences agree as well (NXLI). The distance between the two occurrences of TEXT is now 15, a multiple of 5, the length of the keyword.

In summary: identical letter strings (words, syllables, word stems, etc.) produce identical letter strings in the cryptogram only if the distance between them is a multiple of the length of the keyword. Put differently: if a letter sequence occurs twice in the cryptogram and the same word was encrypted both times, then the distance between the two sequences is a multiple of the length of the keyword.

The Kasiski test searches the cryptogram for identical letter sequences and assumes, for the moment, that they encode the same word. If that is right, the distance is a multiple of the keyword length. If the same word was not encrypted, the distance is in general not a multiple of the keyword length, and the two passages in the ciphertext are equal only by chance. Of course, one cannot tell immediately whether a repeated string arose "by chance" or whether the same word really was encrypted. Therefore common factors are sought at the end in order to spot the distances that do not fit. Chance repetitions occur especially with short sequences. This is also why sequences of length 2 are usually not examined: the probability that the underlying plaintext letters do not match is simply too great.
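The two ciphertext lines above can be reproduced with a few lines of Python. They decrypt under PLUTO to the German sentences DER KLARTEXT WIRD ZUM GEHEIMTEXT and DER KLARTEXT WERDE GEHEIMTEXT, which are used as input here. The sketch assumes the usual Vigenère convention (A = 0, ..., Z = 25, addition modulo 26, spaces skipped); the function name vigenere_encrypt is chosen for this illustration only.

```python
def vigenere_encrypt(plaintext, keyword):
    """Encrypt the letters A-Z of plaintext with a repeating keyword.
    Spaces and other characters are dropped before encryption."""
    letters = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    out = []
    for i, c in enumerate(letters):
        shift = ord(keyword[i % len(keyword)]) - ord("A")
        out.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)


first = vigenere_encrypt("DER KLARTEXT WIRD ZUM GEHEIMTEXT", "PLUTO")
second = vigenere_encrypt("DER KLARTEXT WERDE GEHEIMTEXT", "PLUTO")

# TEXT starts at letter positions 7 and 24 (distance 17) in the first
# sentence, and at positions 7 and 22 (distance 15) in the second.
print(first[7:11], first[24:28])    # NXLI HTIN
print(second[7:11], second[22:26])  # NXLI NXLI
```

Only in the second case is the distance a multiple of the keyword length 5, so only there do the two ciphertext fragments coincide.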

Examples

Suppose the following Vigenère-encrypted ciphertext is given.

SPL DZPCNXLI HYKRT RYASXXNXLI

The sequence NXLI occurs twice in the ciphertext. The distance between the two occurrences is 15 characters, and 15 = 3 x 5. Assuming the repetition is not a coincidence, the same word (or syllable, letter group, etc.) was encrypted both times. It follows that the keyword has length 3, 5 or 15.
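As a small check of this arithmetic, the distance and its divisors can be computed directly. The helper candidate_key_lengths is a name invented for this sketch; it simply lists every divisor greater than 1.

```python
def candidate_key_lengths(distance):
    """Return all divisors of distance greater than 1, in ascending order."""
    return [k for k in range(2, distance + 1) if distance % k == 0]


text = "SPLDZPCNXLIHYKRTRYASXXNXLI"  # the ciphertext without spaces
first = text.find("NXLI")
second = text.find("NXLI", first + 1)
distance = second - first
print(distance, candidate_key_lengths(distance))  # 15 [3, 5, 15]
```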

Of course, more precise statements about the length of the keyword can be made for longer ciphertexts. The main reasons are that several letter sequences occur twice, and that some sequences (especially those from frequent words such as articles, pronouns and conjunctions) occur three or more times in the cryptogram.

The following Vigenère-encrypted ciphertext is given (the German text of Genesis, chapter 1, verses 1-4, encrypted with the keyword ALTESTESTAMENT, German for "Old Testament", 14 letters). The Kasiski test is to be used to determine the length of the keyword.

AXTRX TRYLC TYSZO EMLAF QWEUZ HRKDP NRVWM WXRPI
JTRHN IKMYF WLQIE NNOXW OTVXB NEXRK AFYHW KXAXF
QYAWD PKKWB WLZOF XRLSN AAWUX WTURH RFWLL WWKYF
WGAXG LPCTG ZXWOX RPIYB CSMYF WIKPA DHYBC SMYFW
KGMTE EUWAD LHSLP AVHFK HMWLK

Find identical letter sequences of length at least 3, mark them and determine the distances between them.

XTR: distance 3
XRPI: distance 98
YFW: distance 70
YBCSMYFW: distance 14

Decompose the distances into prime factors.

3 = 3
98 = 2 x 7 x 7
70 = 2 x 5 x 7
14 = 2 x 7

Evaluation

As the prime factorizations show, all distances except the first are multiples of 14. The distance 3 is probably a coincidence. This leads to the following candidates for the keyword length: 2, 7 or 14.
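One possible way to automate this evaluation is to count, for each candidate length, how many of the distances it divides, and to keep the candidates with the most hits. The sketch below takes the four distances from the example as given. The function name divisor_votes and the upper bound of 20 on the key length are assumptions made for this illustration.

```python
from functools import reduce
from math import gcd

# Distances taken from the example above.
distances = {"XTR": 3, "XRPI": 98, "YFW": 70, "YBCSMYFW": 14}


def divisor_votes(values, max_length=20):
    """For each candidate length 2..max_length, count how many of the
    given distances it divides."""
    values = list(values)
    return {k: sum(v % k == 0 for v in values) for k in range(2, max_length + 1)}


votes = divisor_votes(distances.values())
most = max(votes.values())
candidates = [k for k, n in votes.items() if n == most]
print(candidates)  # [2, 7, 14]

# Ignoring the outlier 3, the greatest common divisor of the rest is 14.
consistent = [d for d in distances.values() if d % 14 == 0]
print(reduce(gcd, consistent))  # 14
```

The candidates 2, 7 and 14 each divide three of the four distances, while 3 divides only the outlier.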
