140 likes | 244 Views
Cryptography I. Lecture 3 Dimitrios Delivasilis Department of Information and Communication Systems Engineering University of Aegean. Some modular arithmetic. When we write r = (x mod m)
E N D
Cryptography I Lecture 3 Dimitrios Delivasilis Department of Information and Communication Systems Engineering University of Aegean
Some modular arithmetic • When we write r = (x mod m) We mean that r is the remainder when x is divided by m. In the above equation r satisfies the following two constraints: • x = km +r, for some arbitrary k • 0 ≤ r ≤ m
Polyalphabetic Ciphers • A polyalphabetic cipher is just a set of monoalphabetic ciphers applied sequentially • A monoalphabetic cipher enciphers each character ci in the plaintext in exactly the same way as illustrated below: Plaintext co c1 … cp-1 cp cp+1 Ciphertext f(co) f(c1) … f(cp-1) f(cp) f(cp+1) • A polyalphabetic cipher cycles through a sequence of distinct transformations, with the result that plaintext characters are enciphered in several different ways as below: Plaintext co c1 … cp-1 cp cp+1 Ciphertext f0(co) f1(c1) … fp-1(cp-1) f0cp f1(cp+1)
Attacks on Polyalphabetic Ciphers… • Index of Coincidence • Natural language has an uneven distribution of character probabilities. • The use of polyalphabetic ciphers tends to smooth out the distribution of characters in the ciphertext • If a polyalphabetic cipher has a long period and if the individual substitution alphabets are chosen randomly, then all letters in the ciphertext will occur with equal frequencies • One way to measure evenness of a distribution is to calculate the sums of the squares of the probabilities of the characters where p(ai) is the probability of character ai
… Attacks on Polyalphabetic Ciphers… • More on the Index of Coincidence • For an even distribution with n characters, p(ai) = 1/n and this sum will be: • This value will range from 1/n to 1, with a higher value indicating a less even distribution. • This values is the probability that two characters are the same if they are chosen at random according to the distribution. It is called the index of coincidence and is written Ic. • For a 26 letter alphabet an even distribution will result in a value of 1/26 = 0.038. The actual frequencies with which the 26 letters of the alphabet actually occur in English approximate 0.065
… Attacks on Polyalphabetic Ciphers… • Obtaining the Index of Coincidence • Suppose c(ai) is the number of occurrences of a character ai in a text. Then we can estimate • With • Where c is the total number of characters in the text
… Attacks on Polyalphabetic Ciphers… • More on Obtaining the Index of Coincidence • The value of the Index of Coincidence can be used to estimate the period of a polyalphabetic cipher • Given sufficiently long piece of ciphertext, the probabilities of the individual characters can be estimated from their frequencies and therefore the index of coincidence can be calculated • If the plaintext is known to be English and the value we obtain is close to 0.065, then we can safely assume that a short period cipher was used • If the value is close to 1/n, then we can guess that a long period polyalphabetic cipher was used.
… Attacks on Polyalphabetic Ciphers… • Lets carefully consider the following example Plaintext: foraciphertextonlyattackweneedsomeknowledgeofthestatisticsofthesourceoftheplaintextthesourcemightbecomputerdataEnglishtextimagesdigitsedvoicesignalsandsoonitmightevenbepositionalinformationbroadcastbymilitarynavigationsatellitestheamericanmilitaryhavesuchasysteminwhichtheinformationallowingacoarsepositiontobecalculatedisbroadcastplainandextrainformationallowinganaccuratefixisbroadcastencrypted -Total of 397 characters -Character frequencies: a b c d e f g h i j k l m n o p q r s t u v w x y 41 7 19 12 39 8 10 14 40 0 2 17 12 27 31 7 0 18 25 42 6 4 5 5 6 -Summing up the square of frequencies we obtain 10743 -Dividing by 3972 = 157609 we get 0.068 -Then encipher the plaintext using a) polyalphabetic cipher of period 3, b) polyalphabetic cipher with period six and c) polyalphabetic cipher with period 10. - The index of coincidence became: 0.049, 0.046 and 0.042 respectively
… Attacks on Polyalphabetic Ciphers… • Estimating Period with the Index of Coincidence • We cannot make a sharp estimate of the period of polyalphabetic cipher from the letter frequencies in a ciphertext • Lets assume that a ciphertext was created by a polyalphabetic cipher with period m. The ciphertext can then be rearranged into m columns, one for each substitution alphabet • The probability that two characters chosen from the same column are the same is 0.065 • The probability that two characters chosen from different columns is the same is just 1/26 = 0.038
… Attacks on Polyalphabetic Ciphers… • Estimating Period with the Index of Coincidence (Cntd) • If we choose two characters from anywhere in the text, the probability that they belong to the same column is 1/m • The probability that they are in different columns is (m-1)/m • Therefore the probability that they are the same is:
… Attacks on Polyalphabetic Ciphers… • Estimating Period with the Index of Coincidence (Cntd) • We can estimate Ic directly from the character frequencies in the text and thus obtain m, since • Thus finally we get:
… Attacks on Polyalphabetic Ciphers… • Determining the period with the index of coincidence In the previous example we obtained Ic= 0.068, 0.049, 0.046 and 0.042 for m=1, 3, 6, and 10 respectively. Pluggin the values for Ic in the last equation we get estimates for the period of 0.9, 2.45, 3.375 and 6.75 respectively. These estimates are approximately correct. This illustrates the importance of having sufficient ciphertext, and of not relying in a single method for estimating the period.
… Attacks on Polyalphabetic Ciphers… • Confirmation methods for the estimated period: • The method of Displacement According to this method the text is compared against a shifted version of itself and the number of coincidences is recorded for a range of displacements. If the number of coincidences is plotted against the displacement, then there will be peaks of multiples of the cipher period. Thus by inspection, or by calculating the autocorrelation function, or by using the Fourier spectrum we may be able to find a frequency component with period equal to the period of the cipher • The Kasiski Test In the Kasiski test the cryptanalyst looks for repeated sequences of letters in the ciphertext. Any pair of sequences which are at positions in the plaintext which are a multiple of the cipher period apart will be enciphered with the same sequence of substitution alphabets. If such a pair of sequences are identical, then they will be enciphered identically. To estimate the period we look at factors of distances between identical sequences of letters. Displacements with the largest number of occurrences of identical n-grams will tend to have the cipher period as a factor.
… Attacks on Polyalphabetic Ciphers • Exaple: The Kasiski test Suppose that the cipher period were 9 and the plaintext were: the time they ate them As it happens, the three instances of the are separated by exaclty 9 characters as shown in the diagram below: Each of the three instances of the will be enciphered identically