430 likes | 694 Views
Secret Codes, CD players, and Missions to Mars: An Introduction to Cryptology and Coding Theory. Sarah Spence Adams Associate Professor of Mathematics Franklin W. Olin College of Engineering. AKA. Sit back and relax (and learn something, of course) after a hard test. Cryptology. Bob. Alice.
E N D
Secret Codes, CD players, and Missions to Mars:An Introduction to Cryptology and Coding Theory Sarah Spence Adams Associate Professor of Mathematics Franklin W. Olin College of Engineering
AKA • Sit back and relax (and learn something, of course) after a hard test
Cryptology Bob Alice Eve • Cryptography • Inventing cipher systems • Cryptanalysis • Breaking cipher systems
Hidden Messages Through the Ages • 440 BC – The Histories of Herodotus • messages concealed beneath wax on wooden tablets • tattoos on a slave's head concealed by regrown hair • WWII – microdots • Modern day – hidden information within digitial pictures
The Scytale of Ancient Greece Used by the Spartan military 5th Century B.C.
Caesar’s Substitution Cipher 1st Century B.C. = 3 Example • Plaintext: OLINCOLLEGE • Encryption: Shift forward by 3 • Ciphertext: ROLQFROOHJH • Decryption: Shift backwards by 3 (or forwards by 23)
Modular Arithmetic • 15 mod 12 = 3 • 27 mod 26 = 1 • 90 mod 10 = 0 • a mod m is the remainder obtained upon dividing a by m • Can be obtained by subtracting off multiples of m from a until the result is between 0 and m-1
Caesar Using Modular Arithmetic • Map A,B,C,…Z to 0,1,2,…,25 • View cipher as function e(pi) = pi + k (mod 26) • Example with key = k = 15 • Plaintext: OLINCOLLEGE=14,11,8,13,2,14,11,11,4,6,4 • Encryption: e(pi) = pi + 15 (mod 26) • Ciphertext: 3,0,23,2,17,3,0,0,19,21,19 = DAXCRDAATVT • Decryption: d(ci) = ci -15 (mod 26) or d(ci) = ci +11 (mod 26) p
Kerckhoff’s Principle Must assume the enemy knows the system Also known as Shannon’s Maxim
Cryptanalysis of Substitution Ciphers • Caesar’s 26 shifts to test • Generally 26! permutations to test • If parsed, guess at common words or cribs • Frequency analysis
Frequency Analysis (Al-Kindi ~850A.D.) www.wikipedia.com
VigenèreCipher (1586) • Polyalphabetic cipher • Involves multiple Caesar shifts • Example • Plaintext: O L I N C O L L E G E • Key: S U N S U N S U N S U • Encryption: e(pi) = pi + ki (mod 26) • Decryption: d(ci) = ci - ki (mod 26)
One-Time Pads: The Ultimate Substitution Cipher Plaintext: MATHISUSEFULANDFUN Key: NGUJKAMOCTLNYBCIAZ Encryption: e(pi) = pi + ki (mod 26) Ciphertext: BGO….. Decryption: d(ci) = ci - ki (mod 26)
One-Time Pads Unconditionally secure Problem: Exchanging the key
Public-Key Cryptography Diffie & Hellman (1976) Known at GCHQ years before Uses one-way (asymmetric) functions, public keys, and private keys
Public Key Algorithms • Based on two hard problems (traditionally) • Factoring large integers (RSA) • The discrete logarithm problem (ElGamal)
3 x 10114 ciphering possibilities; polyalphabetic Destroyed frequency counts Cracked thanks to traitors, captured machines, mathematicians, and human error WWII: The Weather-Beaten Enigma
Not only do you want secrecy… …but you also want reliability! Enter coding theory…..
What is Coding Theory? • Coding theory is the study of error-control codes, which are used to detect and correct errors that occur when data are transferred or stored
General Problem • We want to send data from one place to another over some channel • telephone lines, internet cables, fiber-optic lines, microwave radio channels, cell phone channels, etc. • BUT the data may be corrupted • hardware malfunction, atmospheric disturbances, attenuation, interference, jamming, etc.
General Solution • Introduce controlled redundancy to the message to improve the chances of recovering the original message
Introductory Example • We want to communicate YES or NO • Message 1 represents YES • Message 0 represents NO If I send my message and there is an error, then you will decode incorrectly
Repetition Code of Length 2 • Encode message 1 as codeword 11 • Encode message 0 as codeword 00 • If one error occurs, you can detect that something went wrong
Repetition Code of Length 5 • Encode message 1 as codeword 11111 • Encode message 0 as codeword 00000 • If up to two errors occur, you can correct them using a majority vote
Evaluating and Comparing Codes • Important Code Parameters • Minimum distance • Determines error-control capability • Code rate • Ratio of information bits to codeword bits • Measure of efficiency • Length of codewords • Number of codewords
Complicated Problem • Want • Large minimum distance for reliability • Large number of codewords • High rate for efficiency • Conflicting goals • Require trade-offs • Inspire more sophisticated mathematical solutions
The ISBN-10 Code • x1 x2…x10 • x10 is a check digit chosen so that S=x1 + 2x2 + … + 9x9 + 10x10 =0 mod 11 • If check digit should be 10, use X instead • Can detect all single and all transposition errors
ISBN-10 Example • Cryptology by Thomas Barr: 0-13-088976-? • Want 1(0) + 2(1) + 3(3) + 4(0) + 5(8) + 6(8) + 7(9) + 8(7) + 9(6) + 10(?) = multiple of 11 • Compute 1(0) + 2(1) + 3(3) + 4(0) + 5(8) + 6(8) + 7(9) + 8(7) + 9(6) = 272 • Ponder 272 + 10(?) = multiple of 11 • The check digit must be 8
New ISBN-13 x1 x2…x13 x13 = 10-((1x1 + 3x2 + … + 1x11 + 3x12) mod 10) If check digit should be 10, use 0 instead Convert ISBN-10 to ISBN-13 using 978 prefix and new check digit
Universal Product Code (UPC) • First number: Type of product • 0, 6, 7: Standard • 2: Random-weight items (fruits, meats, etc) • 4: In-store • 5: Coupons • Next chunk: Manufacturer identification • Next chunk: Item identification • Last number: Check digit
UPC Check Digit • x1 x2…x12 • x12 is a check digit chosen so that S = 3x1 + 1x2 + … + 3x11 + 1x12 =0 mod 10 • Can detect all single and most transposition errors • Which transposition errors go undetected?
Hadamard Code Used in NASA Mariner Missions Pictures divided into pixels Pixels are assigned a level a darkness on a scale of 0 to 63
Binary Representation of Messages • Express 64 levels of darkness (our messages) using binary strings • 0 000000 • 1 000001 • 2 000010 • 3 000011 • …. • 63 111111
Hadamard Matrices HHT = n I Map length 6 messages to length 32 codewords obtained from rows of Hadamard matrices Hadamard matrices (Sylvester, 1867) have special properties that give these codewords a minimum distance of 16
Compare with Length 5 Repetition Code • Send a message of length 6 • Probability of bit error p= .01 • Length 5 Repetition Code • Sending 6 info bits requires 30 coded bits: Rate = 6/30 • P(decode incorrectly) = .00006 • Hadamard Code • Sending 6 info bits requires 32 coded bits: Rate = 6/32 • P(decode incorrectly) = .0000000008
Voyager Missions (1980’s-90’s) • Reed-Solomon codes use abstract algebra to get even better results • Ideals of polynomial rings (Dedekind,1876) • Same codes used to protect CDs from scratches!
Summary • Cryptology and coding theory are all around us • From Caesar to RSA…. from Repetition to Reed-Solomon Codes…. • More sophisticated mathematics better ciphers/codes • New uses for old mathematics, motivation for new mathematics • Cryptology has existed for thousands of years… what ciphers and codes will be next?
Jefferson Disk - Bazeries Cylinder • Used by US Army from early 1920’s to early 1940’s • Alice and Bob agree on order of disks – the key • Bob • spells out the ciphertext on wheel • looks around the rows until he sees the (coherent) message • Alice • rotates wheels to spell message in one row • sends any other row of text
The Code Talkers • Refers primarily to Navajo speakers in WWII • Only unbroken wartime cipher • Fast and efficient • Interesting connections between cracking secret ciphers and cracking ancient languages
Major Questions • What are the ideal trade-offs between rate, error-correcting capability, and number of codewords? • What is the biggest distance you can get given a fixed rate or fixed number of codewords? • What is the best rate you can get given a fixed distance or fixed number of codewords?
A Parity Check Code • Suppose we want to send 4 messages 00, 01, 10, 11 • Form codewords by appending a parity check bit to the end of each message • 00 000 • 01 011 • 10 101 • 11 110 • Compare with length 2 repetition code • Both detect all single errors using minimum distance 2 • Now rate 2/3 compared with rate 1/2 • Now 4 codewords compared with 2 codewords