Encoding and Ciphering
Tomáš Vaníček
Faculty of Civil Engineering, CTU
Thákurova 7, Praha-Dejvice, B407
vanicek@fsv.cvut.cz
Ciphering (symmetrical)
• Alice → ciphering → deciphering → Bob
• Ciphering and deciphering use the same key
• Eva (Enemy) sits on the channel between them
Encoding
• Alice → encoding → decoding → Bob
• The message is disturbed during transmission
• Goal: automatic repair, or at least detection, of transmission errors
Alphabet
• A finite set of symbols
• A={0,1}
• A={أبدفجحئكلمنوقرستثز}
• A={AÁBCČDEÉĚFGHIÍJKLMNŇOÓPQRŘSŠTŤUÚVWXYÝZŽ}
• A={ABCDEFGHIJKLMNOPQRSTUVWXYZ} (26 symbols)
• A+ : the set of all non-empty words (finite sequences of symbols from A)
• A* : the set of all words over A, including the empty word
Cipher
• A cryptographic transformation (cipher) is a mapping Φ: A* × K → B*, where K is the key set
• For each fixed key the mapping must be injective, so that the message can be recovered
Caesar cipher
• f(x) = (x + K) mod N
• Key K = 3
• ABCDEFGHIJKLMNOPQRSTUVWXYZ
• DEFGHIJKLMNOPQRSTUVWXYZABC
• Plaintext: Tomorrow morning we will cross the Rubicon.
• Ciphertext: Wrpruurz pruqlqj zh zloo furvv wkh Uxelfrq.
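The shift on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `caesar` and the choice to leave non-letters unchanged are assumptions made for the example.

```python
import string

ALPHABET = string.ascii_uppercase  # A..Z, so N = 26


def caesar(text: str, key: int) -> str:
    """Shift every ASCII letter by `key` positions: f(x) = (x + key) mod N."""
    n = len(ALPHABET)
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            shifted = ALPHABET[(ALPHABET.index(ch.upper()) + key) % n]
            # Preserve the case of the original letter.
            out.append(shifted if ch.isupper() else shifted.lower())
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)


plain = "Tomorrow morning we will cross the Rubicon."
cipher = caesar(plain, 3)
print(cipher)              # Wrpruurz pruqlqj zh zloo furvv wkh Uxelfrq.
print(caesar(cipher, -3))  # deciphering is a shift by -key
```

Deciphering reuses the same function with the negated key, which works because the shift is taken modulo N.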
Multiplicative cipher
• f(x) = (x * K) mod N
• Key K = 3
• ABCDEFGHIJKLMNOPQRSTUVWXYZ
• ADGJMPSVYBEHKNQTWZCFILORUX
Multiplicative „cipher“, key K = 2
• A: 0 → 0 = A
• B: 1 → 2 = C
• …
• N: 13 → 26 → 0 = A
• O: 14 → 28 → 2 = C
• This is not a bijection
Multiplicative cipher
• If d(K,N) = 1 (K and N have greatest common divisor 1), then there exists exactly one L such that K*L = 1 mod N.
• For K = 3 and N = 26 it is L = 9.
• K is the ciphering key and L is the deciphering key.
• For example, W = 22 is ciphered to 22*3 mod 26 = 14 = O
• and deciphered: 14*9 mod 26 = 126 mod 26 = 22 = W
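The relation between K and L can be checked with a short Python sketch. It assumes Python 3.8 or later, where the built-in `pow(k, -1, n)` returns the modular inverse; the function names are chosen for this example only, and the sketch handles upper-case A to Z only.

```python
from math import gcd

N = 26  # size of the alphabet A..Z


def multiplicative(text: str, k: int, n: int = N) -> str:
    """Apply f(x) = (x * k) mod n to every letter of an upper-case A..Z string."""
    if gcd(k, n) != 1:
        # Without gcd(k, n) = 1 the mapping is not a bijection (see K = 2).
        raise ValueError(f"key {k} is not invertible modulo {n}")
    return "".join(chr((ord(ch) - ord("A")) * k % n + ord("A")) for ch in text)


def deciphering_key(k: int, n: int = N) -> int:
    """Return the unique L with k * L = 1 mod n (requires Python 3.8+)."""
    return pow(k, -1, n)


L = deciphering_key(3)
print(L)                       # 9
print(multiplicative("W", 3))  # O
print(multiplicative("O", L))  # W

# For K = 2 only 13 distinct images exist, so letters collide.
print(len({(x * 2) % N for x in range(N)}))  # 13
```

Deciphering is the same multiplication with L in place of K, which is why the one function serves both directions.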
General Affine Cipher
• f(x) = (K*x + P) mod N, with d(K,N) = 1
• The ciphering key is the pair K, P
• The deciphering key is the pair L, Q, where L is the only number such that K*L = 1 mod N, and Q = (N - P) mod N
• Deciphering: g(y) = L*(y + Q) mod N
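A minimal Python sketch of the affine cipher follows. It assumes that deciphering first adds Q = (N - P) mod N and then multiplies by L, which is the reading under which the pair L, Q undoes f; the function names are illustrative, and only upper-case A to Z is handled.

```python
from math import gcd

N = 26  # size of the alphabet A..Z


def affine_encipher(text: str, k: int, p: int, n: int = N) -> str:
    """f(x) = (k * x + p) mod n for every letter of an upper-case A..Z string."""
    if gcd(k, n) != 1:
        raise ValueError(f"key {k} is not invertible modulo {n}")
    return "".join(chr((k * (ord(ch) - 65) + p) % n + 65) for ch in text)


def affine_decipher(text: str, k: int, p: int, n: int = N) -> str:
    """g(y) = l * (y + q) mod n, with k * l = 1 mod n and q = (n - p) mod n."""
    l = pow(k, -1, n)  # modular inverse, Python 3.8+
    q = (n - p) % n
    return "".join(chr(l * ((ord(ch) - 65) + q) % n + 65) for ch in text)


secret = affine_encipher("RUBICON", 3, 5)
print(secret)
print(affine_decipher(secret, 3, 5))  # RUBICON
```

With P = 0 this reduces to the multiplicative cipher, and with K = 1 to the Caesar cipher.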
General Monoalphabetic Cipher
• The ciphering key is the complete function (table) giving the image of each individual letter:
• ABCDEFGHIJKLMNOPQRSTUVWXYZ
• VMAIVLDRHQCSYKBXGOTZPEUVFN
General Monoalphabetic Cipher
• Plaintext: Tommorrow morning we will cross the Rubicon.
• Ciphertext: Tbssbiibq sbikxkz qw qxuu mibcc oew Irgxmbk.
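A general monoalphabetic cipher can be sketched with Python's `str.maketrans` and `str.translate`. The key below is a hypothetical permutation generated from a seed purely for illustration; it is not the table printed on the slide, and the function names are assumptions of this example.

```python
import random
import string

ALPHABET = string.ascii_uppercase


def make_key(seed: int) -> str:
    """Return a permutation of A..Z (hypothetical key, derived from a seed)."""
    letters = list(ALPHABET)
    random.Random(seed).shuffle(letters)
    return "".join(letters)


def encipher(text: str, key: str) -> str:
    """Replace each letter by its image under the key table, keeping case."""
    table = str.maketrans(ALPHABET + ALPHABET.lower(), key + key.lower())
    return text.translate(table)


def decipher(text: str, key: str) -> str:
    """Apply the inverse table."""
    table = str.maketrans(key + key.lower(), ALPHABET + ALPHABET.lower())
    return text.translate(table)


key = make_key(42)
secret = encipher("Tomorrow morning we will cross the Rubicon.", key)
print(secret)
print(decipher(secret, key))
```

Because the key is a permutation, every letter has exactly one image and the inverse table always exists.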
Frequency analysis (in %)
Letter     EN      FR      GE      CZ      SK
A         7,96    7,68    5,52    8,99    9,49
B         1,60    0,80    1,56    1,86    1,90
C         2,84    3,32    2,94    3,04    3,45
D         4,01    3,60    4,91    4,14    4,09
E        12,86   17,76   19,18   10,13    9,16
F         2,62    1,06    1,96    0,33    0,31
G         1,99    1,10    3,60    0,48    0,40
H         5,39    0,64    5,02    2,06    2,35
I         7,77    7,23    8,21    6,92    6,81
J         0,16    0,19    0,16    2,10    2,12
K         0,41    0,00    1,33    3,44    3,80
L         3,51    5,89    3,48    4,20    4,56
Frequency analysis (in %)
Letter     EN      FR      GE      CZ      SK
M         2,43    2,72    1,69    2,99    2,97
N         7,51    7,61   10,20    6,64    6,34
O         6,62    5,34    2,14    8,39    9,34
P         1,81    3,24    0,54    3,54    2,87
Q         0,17    1,34    0,01    0,00    0,00
R         6,83    6,81    7,01    5,33    5,12
S         6,62    8,23    7,07    5,74    5,94
T         9,72    7,30    5,86    4,98    5,06
U         2,48    6,05    4,22    3,94    3,70
V         1,15    1,27    0,84    4,50    4,85
W         1,80    0,00    1,38    0,06    0,06
X         0,17    0,54    0,00    0,04    0,03
Y         1,52    0,21    0,00    2,72    2,57
Z         0,05    0,07    1,17    3,44    2,72
E. A. Poe: The Gold-Bug
53‡‡†305))6*;4826)4‡.)4‡);806*;48†8π60))85;1‡(;:‡*8†83(88)5*†;46(;88*96*?;8)*‡(;485);5*†2:*‡(;4956*2(5*-4)8 π8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡1;48†85;4)485†528806*81(‡9;48;(88;4(‡?34;48)4‡;161;:188;‡?;
Frequency Analysis
53‡‡†305))6*;4826)4‡.)4‡);806*;48†8π60))85;1‡(;:‡*8†83(88)5*†;46(;88*96*?;8)*‡(;485);5*†2:*‡(;4956*2(5*-4)8 π8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡1;48†85;4)485†528806*81(‡9;48;(88;4(‡?34;48)4‡;161;:188;‡?;
Symbol counts:
• 8 33x, ; 26x, 4 19x, ‡ 16x, ) 16x
• * 13x, 5 12x, 6 11x, ( 10x, † 8x
• 1 6x, 0 6x, 2 5x, 9 5x, 3 4x
• : 4x, ? 3x, π 2x, - 1x, . 1x
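Counting symbols by hand is error-prone, so a small Python sketch may help. It uses `collections.Counter` from the standard library; the function name `symbol_counts` and the decision to ignore whitespace are assumptions of this example, and the sample is only the opening of the cryptogram.

```python
from collections import Counter


def symbol_counts(text: str) -> list[tuple[str, int]]:
    """Return (symbol, count) pairs, most frequent first; whitespace is ignored."""
    counts = Counter(ch for ch in text if not ch.isspace())
    return counts.most_common()


# Opening of the cryptogram, used here only as a short sample.
sample = "53‡‡†305))6*;4826)4‡.)4‡);806*;48†8π60))85"
for symbol, count in symbol_counts(sample)[:5]:
    print(symbol, count)
```

Running the same function on the full cryptogram gives the ranking that the next steps rely on.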
The Side Channel
• The signature (a little goat = kid = Captain Kidd) reveals that the language of the text is English
Most common English letters
• E 12,86%, T 9,72%, A 7,96%, I 7,77%, N 7,51%, O 6,56%, S 6,56%
Most common symbols in the text
• 8 33x, ; 26x, 4 19x, ‡ 16x
The hypothesis: 8 = E
Confirmation
• In English the bigram EE is very common.
• In the text the bigram 88 occurs 5 times.
53‡‡†305))6*;4826)4‡.)4‡);806*;48†8π60))85;1‡(;:‡*8†83(88)5*†;46(;88*96*?;8)*‡(;485);5*†2:*‡(;4956*2(5*-4)8 π8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡1;48†85;4)485†528806*81(‡9;48;(88;4(‡?34;48)4‡;161;:188;‡?;
Try to substitute e for 8 53‡‡†305))6*;4e26)4‡.)4‡);e06*;4e†eπ60))e5;1‡(;:‡*e†e3(ee)5*†;46(;ee*96*?;e)*‡(;4e5);5*†2:*‡(;4956*2(5*-4)e πe*;40692e5);)6†e)4‡‡;1(‡9;4e0e1;e:e‡1;4e†e5;4)4e5†52ee06*e1(‡9;4e;(ee;4(‡?34;4e)4‡;161;:1ee;‡?;
Continue
• In English the trigram THE is common.
• In the text the trigram ;48 occurs 7 times, so it now reads ;4e.
• Moreover, ; is the second most common symbol, which corresponds to the letter T.
• Try ; = t, 4 = h
53‡‡†305))6*;4e26)4‡.)4‡);e06*;4e†eπ60))e5;1‡(;:‡*e†e3(ee)5*†;46(;ee*96*?;e)*‡(;4e5);5*†2:*‡(;4956*2(5*-4)e πe*;40692e5);)6†e)4‡‡;1(‡9;4e0e1;e:e‡1;4e†e5;4)4e5†52ee06*e1(‡9;4e;(ee;4(‡?34;4e)4‡;161;:1ee;‡?;
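The bigram and trigram counts used in these steps can be obtained with a short Python sketch. The function name `ngram_counts` is an assumption of this example; it counts overlapping occurrences and skips whitespace, and the sample is only a fragment of the cryptogram.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count overlapping n-grams of symbols, ignoring whitespace."""
    symbols = [ch for ch in text if not ch.isspace()]
    return Counter(
        "".join(symbols[i:i + n]) for i in range(len(symbols) - n + 1)
    )


# A fragment of the cryptogram, used here only as a short sample.
sample = "53‡‡†305))6*;4826)4‡.)4‡);806*;48†8π60))85"
print(ngram_counts(sample, 2).most_common(3))  # most frequent bigrams
print(ngram_counts(sample, 3).most_common(3))  # most frequent trigrams
```

Applied to the full text, `ngram_counts(text, 2)["88"]` and `ngram_counts(text, 3)[";48"]` give the counts that support the guesses 8 = E and ;48 = THE.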
So we have 53‡‡†305))6*the26)h‡.)h‡)te06*the†eπ60))e5t1‡(t:‡*e†e3(ee)5*†th6(tee*96*?te)*‡(the5)t5*†2:*‡(th956*2(5*-h)e πe*th0692e5)t)6†e)h‡‡t1(‡9the0e1te:e‡1the†e5th)he5†52ee06*e1(‡9thet(eeth(‡?3hthe)h‡t161t:1eet‡?t
Continue
53‡‡†305))6*the26)h‡.)h‡)te06*the†eπ60))e5t1‡(t:‡*e†e3(ee)5*†th6(tee*96*?te)*‡(the5)t5*†2:*‡(th956*2(5*-h)e πe*th0692e5)t)6†e)h‡‡t1(‡9the0e1te:e‡1the†e5th)he5†52ee06*e1(‡9thet(eeth(‡?3hthe)h‡t161t:1eet‡?t
• Another common symbol is ‡.
• The bigram ‡‡ occurs twice in the text, so ‡ seems to be the letter O.
53oo†305))6*the26)ho.)ho)te06*the†eπ60))e5t1o(t:o*e†e3(ee)5*†th6(tee*96*?te)*o(the5)t5*†2:*o(th956*2(5*-h)e πe*th0692e5)t)6†e)hoot1(o9the0e1te:eo1the†e5th)he5†52ee06*e1(o9thet(eeth(o?3hthe)hot161t:1eeto?t
Continue
53oo†305))6*the26)ho.)ho)te06*the†eπ60))e5t1o(t:o*e†e3(ee)5*†th6(tee*96*?te)*o(the5)t5*†2:*o(th956*2(5*-h)e πe*th0692e5)t)6†e)hoot1(o9the0e1te:eo1the†e5th)he5†52ee06*e1(o9thet(eeth(o?3hthe)hot161t:1eeto?t
• We can guess the words "thirteen" and "the tree", so 6 = i and ( = r
• 53oo†305))i*the2i)ho.)ho)te0i*the†eπi0))e5t1ort:o*e†e3ree)5*†thirtee*9i*?te)*orthe5)t5*†2:*orth95i*2r5*-h)e πe*th0i92e5)t)i†e)hoot1ro9the0e1te:eo1the†e5th)he5†52ee0i*e1ro9thetreethro?3hthe)hot1i1t:1eeto?t
Continue
• 53oo†305))i*the2i)ho.)ho)te0i*the†eπi0))e5t1ort:o*e†e3ree)5*†thirtee*9i*?te)*orthe5)t5*†2:*orth95i*2r5*-h)e πe*th0i92e5)t)i†e)hoot1ro9the0e1te:eo1the†e5th)he5†52ee0i*e1ro9thetreethro?3hthe)hot1i1t:1eeto?t
• Another common bigram is )), which corresponds to the letter S and the bigram SS.
• We can also guess the word "through", so ? = u, 3 = g
• 5goo†g05ssi*the2isho.shoste0i*the†eπi0sse5t1ort:o*e†egrees5*†thirtee*9i*utes*orthe5st5*†2:*orth95i*2r5*-hse πe*th0i92e5stsi†eshoot1ro9the0e1te:eo1the†e5thshe5†52ee0i*e1ro9thetreethroughtheshot1i1t:1eetout
Continue
• 5goo†g05ssi*the2isho.shoste0i*the†eπi0sse5t1ort:o*e†egrees5*†thirtee*9i*utes*orthe5st5*†2:*orth95i*2r5*-hse πe*th0i92e5stsi†eshoot1ro9the0e1te:eo1the†e5thshe5†52ee0i*e1ro9thetreethroughtheshot1i1t:1eetout
• 5 is a common symbol but never occurs in the bigram 55, which corresponds to A.
• At the beginning of the text, 5 = a, † = d, 0 = l ("a good glass") can be guessed.
• agoodglassi*the2isho.shosteli*thedeπilsseat1ort:o*edegreesa*dthirtee*9i*utes*ortheasta*d2:*orth9ai*2ra*-hse πe*thli92eastsideshoot1ro9thele1te:eo1thedeathsheada2eeli*e1ro9thetreethroughtheshot1i1t:1eetout
Continue
• agoodglassi*the2isho.shosteli*thedeπilsseat1ort:o*edegreesa*dthirtee*9i*utes*ortheasta*d2:*orth9ai*2ra*-hse πe*thli92eastsideshoot1ro9thele1te:eo1thedeathsheada2eeli*e1ro9thetreethroughtheshot1i1t:1eetout
Further guesses
• * = n, 2 = b, . = p, π = v, 1 = f
• agoodglassinthebishopshostelinthedevilsseatfort:onedegreesandthirteen9inutesnortheastandb:north9ainbran-hse venthli9beastsideshootfro9thelefte:eofthedeathsheadabeelinefro9thetreethroughtheshotfift:feetout
Finishing
• agoodglassinthebishopshostelinthedevilsseatfort:onedegreesandthirteen9inutesnortheastandb:north9ainbran-hse venthli9beastsideshootfro9thelefte:eofthedeathsheadabeelinefro9thetreethroughtheshotfift:feetout
• : = y, 9 = m, - = c
• a good glass in the bishops hostel in the devils seat forty one degrees and thirteen minutes northeast and by north main branch seventh limb east side shoot from the left eye of the deaths head a bee line from the tree through the shot fifty feet out
• And the treasure could be found
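The step-by-step substitution can be reproduced with a small Python sketch. The dictionary collects the guesses made on the preceding slides; the function name `substitute` and the rule that unknown symbols stay unchanged are assumptions of this example.

```python
# Guesses collected from the slides: cipher symbol -> plaintext letter.
KEY = {
    "8": "e", ";": "t", "4": "h", "‡": "o", "6": "i", "(": "r", ")": "s",
    "?": "u", "3": "g", "5": "a", "†": "d", "0": "l", "*": "n", "2": "b",
    ".": "p", "π": "v", "1": "f", ":": "y", "9": "m", "-": "c",
}


def substitute(ciphertext: str, key: dict[str, str]) -> str:
    """Replace every symbol that has a guess; leave all other symbols as they are."""
    return "".join(key.get(ch, ch) for ch in ciphertext)


opening = "53‡‡†305))6*;4826)4‡.)4‡);806*"
print(substitute(opening, {"8": "e"}))  # 53‡‡†305))6*;4e26)4‡.)4‡);e06*
print(substitute(opening, KEY))         # agoodglassinthebishopshostelin
```

Passing a partial dictionary shows the intermediate states of the analysis, which is how each new guess can be checked against the text.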
Coincidence Index
• A method to determine whether a text was enciphered with a monoalphabetic cipher, and even to determine the original language of the text, without deciphering it.
The graph still looks the same
• Only the columns are shuffled
• How can this be expressed numerically?
• A natural choice is the variance of the quantity, i.e. the mean squared deviation from the expected value
Variance
$$\mathrm{Var}(X) = E\left[(X - E(X))^2\right]$$
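For frequency analysis
$$
\begin{aligned}
n \cdot \mathrm{Var}(p) &= \sum_i \left(p(i) - \frac{1}{n}\right)^2 \\
&= \sum_i p(i)^2 - \sum_i \frac{2\,p(i)}{n} + \sum_i \frac{1}{n^2} \\
&= \sum_i p(i)^2 - \frac{2}{n} + \frac{1}{n} \\
&= \sum_i p(i)^2 - \frac{1}{n}
\end{aligned}
$$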
Coincidence Index
$$IC(T) = \sum_i p(i)^2 = n \cdot \mathrm{Var}(T) + \frac{1}{n}$$
• Always greater than or equal to 1/n = 1/26 = 0,03846
• Close to 0,03846 for random text
• Invariant under a monoalphabetic cipher
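The definition IC(T) = ∑p(i)² translates directly into Python. This sketch follows the slide's formula, not the variant based on n(n-1) pair counts that some textbooks use; the function name and the restriction to the letters A to Z are assumptions of this example.

```python
from collections import Counter


def coincidence_index(text: str) -> float:
    """IC(T) = sum of p(i)^2 over the relative frequencies of the letters A..Z."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        raise ValueError("text contains no letters A..Z")
    counts = Counter(letters)
    return sum((count / total) ** 2 for count in counts.values())


plain = "HELLOWORLD"
# Caesar shift by 3, used here only to show the invariance.
shifted = "".join(chr((ord(ch) - 65 + 3) % 26 + 65) for ch in plain)

print(coincidence_index(plain))
print(coincidence_index(shifted))  # same value: only the columns are shuffled
print(coincidence_index("ABCDEFGHIJKLMNOPQRSTUVWXYZ"))  # the minimum, 1/26
```

A substitution only renames the letters, so the multiset of frequencies, and with it the sum of squares, does not change.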
Coincidence Indices for Languages
• CZ 0,0577
• SK 0,0581
• EN 0,0676
• FR 0,0801
• GE 0,0824
• IT 0,0754
• ES 0,0769
• RU 0,0470
• Random text 0,0385
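One simple way to use this table is to pick the language whose reference value lies closest to the measured index. The sketch below is only an illustration of that idea: the function name `nearest_language` is hypothetical, and a nearest-value rule is an assumption of this example, not a method stated on the slides.

```python
# Reference values copied from the table above.
REFERENCE_IC = {
    "CZ": 0.0577, "SK": 0.0581, "EN": 0.0676, "FR": 0.0801, "GE": 0.0824,
    "IT": 0.0754, "ES": 0.0769, "RU": 0.0470, "Random text": 0.0385,
}


def nearest_language(ic: float) -> str:
    """Return the table entry whose coincidence index is closest to `ic`."""
    return min(REFERENCE_IC, key=lambda name: abs(REFERENCE_IC[name] - ic))


print(nearest_language(0.0676))  # EN
print(nearest_language(0.0400))  # Random text
```

Some values in the table are very close to each other (CZ and SK, for example), so such a rule can only suggest a candidate, and short texts will give noisy estimates.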