Substitution cipher#Homophonic substitution

Homophonic encryption (from Ancient Greek ὅμος homos "same" and φωνή phone "voice", i.e. "same-sounding") is a monoalphabetic encryption method that was already widespread in the 17th century. In contrast to simple monoalphabetic substitution, each plaintext character (usually a letter) can be replaced by any of several different ciphertext characters.

The main weakness of simple monoalphabetic substitution is that each plaintext letter is always encrypted to the same single ciphertext character. The resulting ciphertext is therefore vulnerable to statistical attack. For example, a simple frequency count of the ciphertext characters is enough to quickly identify the letter E, which is the most common letter in most languages (frequency in German about 17.7%).
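The frequency count described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete attack: the sample sentence, the Caesar shift used as a stand-in for a simple monoalphabetic substitution, and the function name letter_frequencies are all assumptions made for the example.

```python
from collections import Counter


def letter_frequencies(text):
    """Return (letter, percentage) pairs, most frequent first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    if not letters:
        return []
    counts = Counter(letters)
    total = len(letters)
    return [(ch, 100.0 * n / total) for ch, n in counts.most_common()]


# Hypothetical sample: a German sentence encrypted with a Caesar shift of 3,
# used here only as a stand-in for a simple monoalphabetic substitution.
plaintext = "DER REGEN FAELLT LEISE AUF DIE FELDER UND DIE WEGE"
ciphertext = "".join(
    chr((ord(c) - ord("A") + 3) % 26 + ord("A")) if c.isalpha() else c
    for c in plaintext
)

# The most frequent ciphertext letter is the likely substitute for E.
most_common_letter = letter_frequencies(ciphertext)[0][0]
print(most_common_letter)
```

In this sample the most frequent ciphertext letter is the image of E under the shift, which is exactly the starting point the attacker is looking for.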

Homophonic encryption counters this attack by providing several substitutes for frequently used letters such as E or N. Conversely, seen from the perspective of the ciphertext, different ciphertext characters stand for the same plaintext letter (hence the name homophonic), which makes unauthorized decryption of the ciphertext considerably harder. Homophonic encryption is thus a cryptographic improvement on simple monoalphabetic substitution, yet it remains easier to handle than polyalphabetic substitution, in which several different cipher alphabets are used.

The opposite approach to homophonic encryption is polyphonic encryption.

Example

As with all monoalphabetic substitution methods, homophonic encryption uses only a single fixed substitution alphabet for encryption and decryption. To achieve its goal, namely leveling out the different frequencies of the plaintext letters, each letter of the alphabet can, for example, be assigned as many ciphertext characters as its relative frequency in percent, resulting in a ciphertext alphabet of 100 characters. The typical frequencies of letters in the German language are illustrated in the following diagram:

Suppose the 26 letters of the alphabet are mapped onto 100 cipher characters, in the simplest case the numbers 00 to 99, in such a way that A is assigned six cipher characters, B two, C two, D five, and so on. Every cipher number then occurs in the ciphertext with an average frequency of 1%. A frequency analysis of single characters no longer yields any starting point for deciphering.
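The mapping onto the numbers 00 to 99 can be sketched as follows. This is a minimal illustration under stated assumptions: only the counts for A, B, C and D come from the text, while the remaining counts in HOMOPHONE_COUNTS are rounded, illustrative values chosen so that the total is exactly 100. The names make_key, encrypt and decrypt are hypothetical, and Python's random module is used with a fixed seed purely for reproducibility; it is not suitable for real cryptographic use.

```python
import random

# Number of homophones per letter. A=6, B=2, C=2, D=5 follow the text;
# all other values are illustrative assumptions that sum to 100 overall.
HOMOPHONE_COUNTS = {
    "E": 17, "N": 10, "I": 7, "S": 7, "R": 7, "A": 6, "T": 6, "D": 5,
    "H": 4, "U": 4, "L": 3, "G": 3, "O": 3, "C": 2, "M": 2, "B": 2,
    "W": 2, "F": 2, "K": 1, "Z": 1, "P": 1, "V": 1, "J": 1, "Y": 1,
    "X": 1, "Q": 1,
}
assert sum(HOMOPHONE_COUNTS.values()) == 100


def make_key(rng):
    """Distribute the shuffled numbers 00..99 among the letters."""
    codes = [f"{n:02d}" for n in range(100)]
    rng.shuffle(codes)
    key, pos = {}, 0
    for letter, count in HOMOPHONE_COUNTS.items():
        key[letter] = codes[pos:pos + count]
        pos += count
    return key


def encrypt(plaintext, key, rng):
    """Replace each letter by one of its homophones, chosen at random."""
    return [rng.choice(key[ch]) for ch in plaintext.upper() if ch in key]


def decrypt(codes, key):
    """Invert the key: every number belongs to exactly one letter."""
    reverse = {c: letter for letter, group in key.items() for c in group}
    return "".join(reverse[c] for c in codes)


rng = random.Random(2024)  # fixed seed for reproducibility only
key = make_key(rng)
cipher = encrypt("GEHEIMNIS", key, rng)
recovered = decrypt(cipher, key)
print(" ".join(cipher), "->", recovered)
```

Decryption stays unambiguous because each number is assigned to exactly one letter, even though each letter may be written in several ways.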

To crack the text nonetheless, the attacker must apply more sophisticated methods. To that end, the analysis is extended from individual characters (monograms) to bigrams (character pairs), trigrams or tetragrams. Possible targets are characteristic bigrams such as CH, CK or QU, and pairs that also occur reversed, such as EN and NE or ER and RE. However, this requires significantly longer texts. Sufficiently short homophonically encrypted texts (fewer than eighty letters) are fairly well protected against unauthorized decoding.
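Counting bigrams or trigrams of cipher numbers, the first step of such an extended analysis, might look like the sketch below. The function name ngram_counts and the short list of cipher numbers are assumptions made up for illustration; a real attack would need a much longer ciphertext and further steps to match the counts against known letter pairs.

```python
from collections import Counter


def ngram_counts(symbols, n=2):
    """Count every run of n adjacent cipher symbols."""
    return Counter(
        tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)
    )


# Hypothetical ciphertext as a list of two-digit cipher numbers.
codes = ["17", "42", "08", "17", "42", "99", "08", "17", "42"]

bigrams = ngram_counts(codes, 2)
trigrams = ngram_counts(codes, 3)

# Repeated pairs hint at frequent plaintext bigrams such as CH or EN.
print(bigrams.most_common(2))
```

Pairs that recur far more often than chance would suggest are candidates for common plaintext bigrams, which is why the attack only becomes practical on longer texts.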
