CIT 380: Securing Computer Systems
Classical Cryptography
Overview
• Modular Arithmetic Review
• What is Cryptography?
• Transposition Ciphers
• Substitution Ciphers
• Cæsar cipher
• Vigenère cipher
• Cryptanalysis: frequency analysis
• Block Ciphers
• DES
Modular Arithmetic
Congruence
• a ≡ b (mod N) iff a = b + kN for some integer k
• Equivalently, a ≡ b (mod N) iff N | (a – b), i.e. N divides a – b
• ex: 37 ≡ 27 (mod 10)
• b is a residue of a modulo N
• The integers 0..N-1 form a complete set of residues mod N
Laws of Modular Arithmetic
• (a + b) mod N = (a mod N + b mod N) mod N
• (a - b) mod N = (a mod N - b mod N) mod N
• ab mod N = (a mod N)(b mod N) mod N
• a(b+c) mod N = ((ab mod N)+(ac mod N)) mod N
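The laws above can be spot-checked numerically. This is a minimal Python sketch, not part of the course material: the helper name `laws_hold` and the test range are illustrative assumptions, and it relies on Python's `%` returning a non-negative result for a positive modulus.

```python
from itertools import product

def laws_hold(n, lo=-10, hi=10):
    """Return True if the four laws hold for all a, b, c in [lo, hi]."""
    for a, b, c in product(range(lo, hi + 1), repeat=3):
        if (a + b) % n != (a % n + b % n) % n:
            return False
        if (a - b) % n != (a % n - b % n) % n:
            return False
        if (a * b) % n != ((a % n) * (b % n)) % n:
            return False
        if (a * (b + c)) % n != ((a * b) % n + (a * c) % n) % n:
            return False
    return True

print(laws_hold(10))        # exhaustive check over a small range, modulo 10
print(37 % 10 == 27 % 10)   # 37 and 27 leave the same residue mod 10
```

A finite check like this is only a sanity test, not a proof of the laws.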
What is Cryptography?
Cryptography: the art and science of keeping messages secure.
Cryptanalysis: the art and science of breaking encrypted messages, i.e. recovering the plaintext without the key.
Cryptology: cryptography + cryptanalysis
Terminology
Plaintext → Encryption Procedure → Ciphertext
• Plaintext: message to be encrypted. Also called cleartext.
• Encryption: altering a message to keep its contents secret.
• Ciphertext: encrypted message.
History of Cryptography
Egyptian hieroglyphics ~ 2000 B.C.E.
• Cryptic tomb inscriptions for regality.
Spartan skytale cipher ~ 500 B.C.E.
• Wrapped thin sheet of papyrus around staff.
• Messages written down length of staff.
• Decrypted by wrapping around a staff of equal diameter.
Cæsar cipher ~ 50 B.C.E.
• Simple alphabetic substitution cipher.
al-Kindi ~ 850 C.E.
• Cryptanalysis using letter frequencies.
History of Cryptography
Alberti’s polyalphabetic cipher 1467
Decryption of Zimmermann telegram 1917
• Leads US into World War I
Japanese Purple Machine cracked 1940
• US breaks rotor machine for highest secrets.
German Enigma machine cracked 1933-45
• Initially broken by Polish mathematician Rejewski
• Variants broken at Bletchley Park in UK
• Colossus, world’s 1st electronic computer.
Cryptosystem Formal Definition
5-tuple (E, D, M, K, C)
• M set of plaintexts
• K set of keys
• C set of ciphertexts
• E set of encryption functions e: M × K → C
• D set of decryption functions d: C × K → M
Example: Cæsar cipher
Letter shifting cipher (A=>D, B=>E, C=>F, …)
5-tuple
• M = { all sequences of letters }
• K = { i | i is an integer and 0 ≤ i ≤ 25 }
• E = { E_k | k ∈ K and for all letters m, E_k(m) = (m + k) mod 26 }
• D = { D_k | k ∈ K and for all letters c, D_k(c) = (26 + c – k) mod 26 }
• C = M
History: Cæsar’s key was 3.
Example: Cæsar cipher
• Plaintext is HELLO WORLD
• Change each letter to the third letter following it (X goes to A, Y to B, Z to C)
• Key is 3, usually written as letter ‘D’
• Ciphertext is KHOOR ZRUOG
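The E_k and D_k definitions can be tried directly on the HELLO WORLD example. This is a minimal Python sketch; the function name `caesar` and the choice to pass non-letters through unchanged are assumptions made for illustration.

```python
def caesar(text, key):
    """Shift each uppercase letter by key positions; leave other characters alone."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

ciphertext = caesar("HELLO WORLD", 3)
# Decryption is a shift by 26 - k, matching D_k(c) = (26 + c - k) mod 26.
plaintext = caesar(ciphertext, 26 - 3)
print(ciphertext)
print(plaintext)
```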
A Transposition Cipher
Rearrange letters in plaintext.
Example: Rail-Fence Cipher
• Plaintext is HELLO WORLD
• Rearrange as
    H L O O L
    E L W R D
• Ciphertext is HLOOL ELWRD
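The two-row rearrangement can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes exactly two rails and drops spaces before enciphering, as the slide's example does.

```python
def rail_fence_encrypt(text):
    """Two-rail fence: letters at even positions, then letters at odd positions."""
    letters = [ch for ch in text if ch.isalpha()]
    return "".join(letters[0::2] + letters[1::2])

def rail_fence_decrypt(ciphertext):
    """Undo the two-rail fence by interleaving the two halves."""
    half = (len(ciphertext) + 1) // 2
    top, bottom = ciphertext[:half], ciphertext[half:]
    out = []
    for i in range(half):
        out.append(top[i])
        if i < len(bottom):
            out.append(bottom[i])
    return "".join(out)

print(rail_fence_encrypt("HELLO WORLD"))
print(rail_fence_decrypt("HLOOLELWRD"))
```

Note that the letters themselves are unchanged; only their order differs, which is what distinguishes transposition from substitution.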
Cryptosystem Security
Dependencies
• Quality of shared encryption algorithm E
• Secrecy of key K
Cryptanalysis
Goals
• Decrypt a given message.
• Recover encryption key.
Adversarial models vary based on
• Type of information available to adversary
• Interaction with cryptosystem.
Cryptanalysis Adversarial Models
• ciphertext only: adversary has only ciphertext; goal is to find plaintext, possibly key.
• known plaintext: adversary has ciphertext, corresponding plaintext; goal is to find key.
• chosen plaintext: adversary may supply plaintexts and obtain corresponding ciphertext; goal is to find key.
Classical Cryptography
Sender & receiver share common key
• Keys may be the same, or trivial to derive from one another.
• Sometimes called symmetric cryptography.
Substitution Ciphers
Replace each plaintext character with a ciphertext character.
• Simple: Always use same substitution function.
• Polyalphabetic: Use different substitution functions based on position in message.
Cryptanalysis of Cæsar Cipher
Exhaustive search
• If the key space is small enough, try all possible keys until you find the right one.
• Cæsar cipher has 26 possible keys.
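Exhaustive search over the 26 keys is short enough to write out. This is a minimal sketch with hypothetical helper names (`shift`, `all_decryptions`); a human, or a dictionary check not shown here, still has to pick the readable candidate.

```python
def shift(text, key):
    """Shift uppercase letters by key (may be negative); keep other characters."""
    return "".join(
        chr((ord(ch) - 65 + key) % 26 + 65) if "A" <= ch <= "Z" else ch
        for ch in text
    )

def all_decryptions(ciphertext):
    """Map each candidate key to the plaintext it would produce."""
    return {key: shift(ciphertext, -key) for key in range(26)}

candidates = all_decryptions("KHOOR ZRUOG")
for key, guess in candidates.items():
    print(key, guess)
```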
General Simple Substitution Cipher
Key Space: All permutations of alphabet.
Encryption:
• Replace each plaintext letter x with K(x)
Decryption:
• Replace each ciphertext letter y with K⁻¹(y)
Example:
     A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
K =  F U B A R D H G J I L K N M P O S Q Z W X Y V T C E
CRYPTO → BQCOWP
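The example key can be applied with a translation table. This is a sketch using Python's built-in `str.maketrans`; the constant names are illustrative.

```python
import string

# The permutation K from the slide, written as the image of A..Z.
KEY = "FUBARDHGJILKNMPOSQZWXYVTCE"
ENCRYPT = str.maketrans(string.ascii_uppercase, KEY)   # x -> K(x)
DECRYPT = str.maketrans(KEY, string.ascii_uppercase)   # y -> K^-1(y)

ciphertext = "CRYPTO".translate(ENCRYPT)
recovered = ciphertext.translate(DECRYPT)
print(ciphertext, recovered)
```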
General Substitution Cryptanalysis
Exhaustive search infeasible
• Key space size is 26! ≈ 4 × 10^26
• Historically thought to be unbreakable.
• Yet people solve them as newspaper puzzles every day…
Solution: frequency analysis.
Lesson: A large key space is necessary but not sufficient for security of a cryptosystem.
Cryptanalysis: Frequency Analysis
Languages have different frequencies of
• letters
• digrams (groups of 2 letters)
• trigrams (groups of 3 letters)
• etc.
Simple substitution ciphers preserve frequency distributions.
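The claim that simple substitution preserves the frequency distribution can be illustrated with a letter count. This sketch reuses the substitution key and the sample message that appear elsewhere in these slides; the variable names are assumptions.

```python
from collections import Counter
import string

KEY = "FUBARDHGJILKNMPOSQZWXYVTCE"
plaintext = "THEBOYHASTHEBALL"
ciphertext = plaintext.translate(str.maketrans(string.ascii_uppercase, KEY))

plain_counts = Counter(plaintext)
cipher_counts = Counter(ciphertext)
print(plain_counts.most_common(3))
print(cipher_counts.most_common(3))

# The letters differ, but the multiset of counts is identical.
same_shape = sorted(plain_counts.values()) == sorted(cipher_counts.values())
print(same_shape)
```

Only the labels on the histogram change, which is why matching the most frequent ciphertext letters against the most frequent letters of the language is a useful starting point.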
English Letter Frequencies (figure not reproduced)
Additional Frequency Features
• Digram frequencies
  • Common digrams: EN, RE, ER, NT, TH
• Trigram frequencies
  • Common trigrams: THE, ING, THA, ENT
• Vowels other than E rarely followed by another vowel.
• The letter Q is followed only by U.
• Many others.
Countering Frequency Analysis
Nulls
• Insert additional symbols (numbers) which have no meaning in random places.
Idiosyncratic spellings
• Hacker speak: www.google.com/intl/xx-hacker
Homophonic substitution
• Each letter has multiple substitutions.
These techniques increase difficulty of frequency analysis but don’t make it impossible.
Countering Frequency Analysis
Primary weakness of simple substitution:
• Each ciphertext letter corresponds to only one letter of plaintext.
Solution: polyalphabetic substitution
• Use multiple cipher alphabets.
• Switch between cipher alphabets from character to character in the plaintext.
Letter Frequency Distributions (figure not reproduced)
Vigenère Cipher
Use phrase instead of letter as key.
Example:
• Message THE BOY HAS THE BALL
• Key VIG
• Encipher using Cæsar cipher for each letter:
    key     VIGVIGVIGVIGVIGV
    plain   THEBOYHASTHEBALL
    cipher  OPKWWECIYOPKWIRG
Key space size is 26^m.
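The per-letter Cæsar shifts can be written out as follows. This is a minimal sketch that assumes uppercase input with spaces already removed; the function name is illustrative.

```python
def vigenere_encrypt(plaintext, key):
    """Shift the i-th letter by the key letter at position i mod len(key)."""
    out = []
    for i, ch in enumerate(plaintext):
        k = ord(key[i % len(key)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") + k) % 26 + ord("A")))
    return "".join(out)

ciphertext = vigenere_encrypt("THEBOYHASTHEBALL", "VIG")
print(ciphertext)
```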
Relevant Parts of Tableau
      G  I  V
  A   G  I  V
  B   H  J  W
  E   K  M  Z
  H   N  P  C
  L   R  T  G
  O   U  W  J
  S   Y  A  N
  T   Z  B  O
  Y   E  G  T
Tableau shown has relevant rows and columns only.
Example encipherments:
• Key V, letter T: follow V column down to T row (giving “O”)
• Key I, letter H: follow I column down to H row (giving “P”)
Useful Terms
period: length of key
• In earlier example, period is 3
tableau: table used to encipher and decipher
• Vigenère cipher has key letters on top, plaintext letters on the left.
Simple Attacks
• Chosen Plaintext
  • Choose plaintext of all a’s.
  • If long enough, it will be encrypted to the key.
• Dictionary Attack
  • Guess key from dictionary and try decryption.
• Brute Force
  • Try every possible key in turn.
• Is there a ciphertext only attack that’s faster?
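The chosen-plaintext attack works because the letter A corresponds to a shift of zero, so the ciphertext is just the key repeated. A small sketch, with the Vigenère helper redefined here so the block stands alone (names are illustrative):

```python
def vigenere_encrypt(plaintext, key):
    """Shift the i-th letter by the key letter at position i mod len(key)."""
    return "".join(
        chr((ord(ch) - 65 + ord(key[i % len(key)]) - 65) % 26 + 65)
        for i, ch in enumerate(plaintext)
    )

# The attacker submits all A's and reads the repeating key off the output.
leaked = vigenere_encrypt("A" * 9, "VIG")
print(leaked)
```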
Vigenère Cryptanalysis
• Find key length (period).
• Break message into n parts, each part being enciphered using the same key letter.
• Use frequency analysis to solve resulting simple substitution ciphers.
    key     VIGVIGVIGVIGVIGV
    plain   THEBOYHASTHEBALL
    cipher  OPKWWECIYOPKWIRG
Kasiski Test
• Conjunction of key repetition with repeated portion of plaintext produces repeated ciphertext.
• Example:
    key     VIGVIGVIGVIGVIGV
    plain   THEBOYHASTHEBALL
    cipher  OPKWWECIYOPKWIRG
  Key and plaintext line up over the repetitions.
• Distance between repetitions is 9
  • Repeated phrase “OPK” at 1st and 10th positions.
• Period is a factor of 9 (1, 3 or 9).
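Finding repeated ciphertext fragments and the distances between them is easy to automate. This sketch uses a hypothetical helper `repeat_distances` and a default fragment length of 3, which is an assumption rather than anything the slides prescribe.

```python
from collections import defaultdict

def repeat_distances(ciphertext, length=3):
    """Map each repeated substring to the gaps between its successive occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {
        gram: [b - a for a, b in zip(pos, pos[1:])]
        for gram, pos in positions.items()
        if len(pos) > 1
    }

repeats = repeat_distances("OPKWWECIYOPKWIRG")
print(repeats)
```

Candidate periods are then the common factors of the reported distances.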
Example Vigenère Ciphertext
ADQYS MIUSB OXKKT MIBHK IZOOO EQOOG IFBAG KAUMF VVTAA CIDTW
MOCIO EQOOG BMBFV ZGGWP CIEKQ HSNEW VECNE DLAAV RWKXS VNSVP
HCEUT QOIOF MEGJS WTPCH AJMOC HIUIX
Repetitions in Example (table not reproduced)
Estimate of Period
• OEQOOG is probably not a coincidence
  • Two character repetitions may be chance.
• Period may be 1, 2, 3, 5, 6, 10, 15, or 30 (the factors of 30, the distance between the two occurrences of OEQOOG)
• Most other repetition distances (7/10) have 2 in their factors
• Almost as many (6/10) have 3 in their factors.
• Begin with period of 2 × 3 = 6.
Letter Coincidence
• Coincidence: Picking two letters at random from a message that are identical.
• Probability of picking two a’s
  • Let there be n letters in the ciphertext.
  • Let there be n_a a’s in the ciphertext.
  • The probability of selecting two a’s at random is n_a(n_a – 1) / (n(n – 1)).
Index of Coincidence
Probability of choosing two identical letters: IC = Σ n_i(n_i – 1) / (n(n – 1)), summed over the 26 letters i.
Coincidence probabilities for two letters:
• English plaintext: 0.0667
• Random English letters: 1/26 ≈ 0.0385
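The index of coincidence is a direct sum over letter counts. This is a minimal sketch with an illustrative function name; the sample string is the first alphabet from the worked example later in these slides, for which the slides report an IC of 0.069.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two letters drawn without replacement are identical."""
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

ic = index_of_coincidence("AIKHOIATTOBGEEERNEOSAI")
print(round(ic, 3))
```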
English Letter Frequencies (figure not reproduced)
Coincidence Counting
Simple Language: f(A)=0.75, f(B)=0.25
Simple Cipher: Swap A’s and B’s
(Figure comparing plaintext with plaintext and plaintext with ciphertext not reproduced.)
Index of Coincidence: Friedman Test
Expected IC
• Random: 0.0385
• Plaintext: 0.0667
Expected IC by period
• 2: 0.052
• 3: 0.047
• 4: 0.045
• 5: 0.044
• 10: 0.041
The expected IC falls from 0.0667 (shorter key) toward 0.0385 (longer key).
Compute I.C. for Example
For our ciphertext, IC = 0.043
• Indicates a key length of slightly more than 5.
• A statistical measure, so it can be in error, but it agrees with the previous estimate (6).
If the key has m characters, then every m-th character is enciphered with the same shift.
• The string of letters won’t be recognizable.
• But its letter frequencies should be the same as English as it’s a monoalphabetic ciphertext.
Splitting Into Alphabets
Divide cipher into 6 (period) alphabets.
  #   Alphabet                 IC
  1   AIKHOIATTOBGEEERNEOSAI   0.069
  2   DUKKEFUAWEMGKWDWSUFWJU   0.078
  3   QSTIQBMAMQBWQVLKVTMTMI   0.078
  4   YBMZOAFCOOFPHEAXPQEPOX   0.056
  5   SOIOOGVICOVCSCASHOGCC    0.124
  6   MXBOGKVDIGZINNVVCIJHH    0.043
IC indicates single alphabet, except #4 and #6.
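Splitting by period is a matter of taking every sixth letter. This sketch types in the example ciphertext from the earlier slide; the helper name `split_alphabets` is an assumption.

```python
CIPHERTEXT = (
    "ADQYS MIUSB OXKKT MIBHK IZOOO EQOOG IFBAG KAUMF VVTAA CIDTW "
    "MOCIO EQOOG BMBFV ZGGWP CIEKQ HSNEW VECNE DLAAV RWKXS VNSVP "
    "HCEUT QOIOF MEGJS WTPCH AJMOC HIUIX"
).replace(" ", "")

def split_alphabets(text, period):
    """Alphabet i holds the letters at positions i, i + period, i + 2*period, ..."""
    return [text[i::period] for i in range(period)]

alphabets = split_alphabets(CIPHERTEXT, 6)
for number, alphabet in enumerate(alphabets, start=1):
    print(number, alphabet)
```

Each resulting string was enciphered with a single key letter, so it can be attacked as a Cæsar cipher.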
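Frequency Examination
      A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  1   3 1 0 0 4 0 1 1 3 0 1 0 0 1 3 0 0 1 1 2 0 0 0 0 0 0
  2   1 0 0 2 2 2 1 0 0 1 3 0 1 0 0 0 0 0 1 0 4 0 4 0 0 0
  3   1 2 0 0 0 0 0 0 2 0 1 1 4 0 0 0 4 0 1 3 0 2 1 0 0 0
  4   2 1 1 0 2 2 0 1 0 0 0 0 1 0 4 3 1 0 0 0 0 0 0 2 1 1
  5   1 0 5 0 0 0 2 1 2 0 0 0 0 0 5 0 0 0 3 0 0 2 0 0 0 0
  6   0 1 1 1 0 0 2 2 3 1 1 0 1 2 1 0 0 0 0 0 0 3 0 1 0 1
      H M M M H M M H H M M M M H H M L H H H M L L L L L
Last row: unshifted frequencies (H high, M medium, L low)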
Begin Decryption
• First matches characteristics of unshifted alphabet
• Third matches if I shifted to A
• Sixth matches if V shifted to A
• Substitute into ciphertext (substituted letters were shown in bold on the slide)
ADIYS RIUKB OCKKL MIGHK AZOTO EIOOL IFTAG PAUEF VATAS CIITW
EOCNO EIOOL BMTFV EGGOP CNEKI HSSEW NECSE DDAAA RWCXS ANSNP
HHEUL QONOF EEGOS WLPCM AJEOC MIUAX
Look For Clues
AJE in last line suggests “are”, meaning second alphabet maps A into S:
ALIYS RICKB OCKSL MIGHS AZOTO MIOOL INTAG PACEF VATIS CIITE
EOCNO MIOOL BUTFV EGOOP CNESI HSSEE NECSE LDAAA RECXS ANANP
HHECL QONON EEGOS ELPCM AREOC MICAX
Next Alphabet
MICAX in last line suggests “mical” (a common ending for an adjective), meaning fourth alphabet maps O into A:
ALIMS RICKP OCKSL AIGHS ANOTO MICOL INTOG PACET VATIS QIITE
ECCNO MICOL BUTTV EGOOD CNESI VSSEE NSCSE LDOAA RECLS ANAND
HHECL EONON ESGOS ELDCM ARECC MICAL
Got It!
QI means that U maps into I, as Q is always followed by U:
ALIME RICKP ACKSL AUGHS ANATO MICAL INTOS PACET HATIS QUITE
ECONO MICAL BUTTH EGOOD ONESI VESEE NSOSE LDOMA RECLE ANAND
THECL EANON ESSOS ELDOM ARECO MICAL
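The six shifts recovered step by step can be collected into one key and checked against the whole ciphertext. The key string ASIMOV is not stated on the slides; it is inferred here from the mappings they give (first alphabet unshifted, A into S, I, O into A, U into I, V), so treat it as an assumption of this sketch, along with the function name.

```python
CIPHERTEXT = (
    "ADQYS MIUSB OXKKT MIBHK IZOOO EQOOG IFBAG KAUMF VVTAA CIDTW "
    "MOCIO EQOOG BMBFV ZGGWP CIEKQ HSNEW VECNE DLAAV RWKXS VNSVP "
    "HCEUT QOIOF MEGJS WTPCH AJMOC HIUIX"
).replace(" ", "")

def vigenere_decrypt(ciphertext, key):
    """Subtract the key letter at position i mod len(key) from the i-th letter."""
    out = []
    for i, ch in enumerate(ciphertext):
        k = ord(key[i % len(key)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") - k) % 26 + ord("A")))
    return "".join(out)

PLAINTEXT = vigenere_decrypt(CIPHERTEXT, "ASIMOV")
print(PLAINTEXT)
```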
Countering Frequency Analysis
• Observation: If Vigenère key is very long, frequency analysis won’t work.
• Problem: Long keys are hard to remember.
• Solution: Use multiple encryptions.
• Encrypting with a key of length m and then a key of length n is the same as encrypting with a single key whose length is the least common multiple of m and n.
• If m and n are relatively prime, then the least common multiple is mn.
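The equivalence between double encryption and a single longer key can be demonstrated by adding the two key streams letter by letter. This sketch uses the illustrative keys VIG (length 3) and NO (length 2); the helper `combined_key` is hypothetical.

```python
from math import gcd

def vigenere_encrypt(plaintext, key):
    """Shift the i-th letter by the key letter at position i mod len(key)."""
    return "".join(
        chr((ord(ch) - 65 + ord(key[i % len(key)]) - 65) % 26 + 65)
        for i, ch in enumerate(plaintext)
    )

def combined_key(k1, k2):
    """Single key equivalent to encrypting with k1 and then k2."""
    period = len(k1) * len(k2) // gcd(len(k1), len(k2))   # least common multiple
    return "".join(
        chr((ord(k1[i % len(k1)]) + ord(k2[i % len(k2)]) - 2 * 65) % 26 + 65)
        for i in range(period)
    )

message = "THEBOYHASTHEBALL"
twice = vigenere_encrypt(vigenere_encrypt(message, "VIG"), "NO")
once = vigenere_encrypt(message, combined_key("VIG", "NO"))
print(len(combined_key("VIG", "NO")), twice == once)
```

Two short keys with relatively prime lengths are easier to remember than one key of length mn, which is the point of the slide.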
Rotor Machines
Use multiple rounds of Vigenère substitution.
• Machine contains multiple cylinders.
• Each cylinder has 26 states (ciphers).
• Cylinders rotate to change states on different schedules.
• m-cylinder machine has 26^m substitution ciphers.
Enigma Machine
• 3 rotors: 26^3 = 17576 substitutions.
• 3 rotors can be used in any order: 6 combinations.
• Plug board: 6 pairs of letters can be swapped.
• Total keys ~ 10^16
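The total of about 10^16 follows from multiplying the three factors on the slide. This worked calculation assumes the plug board count is the number of ways to choose 6 disjoint letter pairs out of 26, which is the usual reading of the slide but is not spelled out there.

```python
from math import factorial

rotor_positions = 26 ** 3            # starting positions of 3 rotors
rotor_orders = factorial(3)          # 3 rotors placed in any order
# Ways to pick 6 unordered, disjoint pairs from 26 letters:
# 26! / (14! * 6! * 2^6)
plugboard = factorial(26) // (factorial(14) * factorial(6) * 2 ** 6)

total = rotor_positions * rotor_orders * plugboard
print(rotor_positions, rotor_orders, plugboard)
print(f"{total:.2e}")
```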