Authentication Methods: From Digital Signatures to Hashes
Lecture Motivation • We have looked at confidentiality services, and also examined the information theoretic framework for security. • Confidentiality between Alice and Bob only guarantees that Eve cannot read the message; it does not address: • Is Alice really talking to Bob? • Is Bob really talking to Alice? • In this lecture, we will look at the following problems: • Entity Authentication: Proof of the identity of an individual • Message Authentication (data origin authentication): Proof that the source of information really is what it claims to be • Message Signing: Binding information to a particular entity • Data Integrity: Ensuring that information has not been altered by unknown entities
Lecture Outline • Discrete Logarithms and ElGamal • Primitive elements and some more number theory (quickly) • DLOG • ElGamal, another Public Key Algorithm… • Digital Signatures: • The basic idea • RSA Signatures and ElGamal Signatures • Inefficiencies: Hashing and Signing • Hash Functions: • Definitions and terminology • CHP Hash • SHA-1 • Message Authentication Codes Note: Some attacks will be discussed. More attacks and cryptanalysis will come later in the semester
Primitive Roots • Consider the following powers of 3 (mod 7): $3^1 \equiv 3$, $3^2 \equiv 2$, $3^3 \equiv 6$, $3^4 \equiv 4$, $3^5 \equiv 5$, $3^6 \equiv 1$. Note that we obtain all non-zero numbers mod 7. When this happens, we call 3 a primitive root (or generator) mod 7. • Is every number a primitive root? No. For example, the powers of 2 (mod 7) are only 2, 4, and 1. • If p is prime there are $\phi(p-1)$ primitive roots mod p. • How to find them? Good homework problem… • Proposition: Let g be a primitive root for the prime p. (1) If n is an integer, then $g^n \equiv 1 \pmod p$ if and only if $n \equiv 0 \pmod{p-1}$. (2) If j and k are integers, then $g^j \equiv g^k \pmod p$ if and only if $j \equiv k \pmod{p-1}$. Proof: We sketch (1) on the board.
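The definition can be checked by brute force for small primes. The sketch below is illustrative only; the helper names `is_primitive_root` and `primitive_roots` are made up for this example, and the loop is far too slow for cryptographic sizes.

```python
def is_primitive_root(g, p):
    """Return True if the powers of g hit every nonzero residue mod the prime p."""
    seen = set()
    x = 1
    for _ in range(p - 1):
        x = (x * g) % p      # x runs through g^1, g^2, ..., g^(p-1)
        seen.add(x)
    return len(seen) == p - 1


def primitive_roots(p):
    """List all primitive roots mod a small prime p by exhaustive search."""
    return [g for g in range(1, p) if is_primitive_root(g, p)]


print(primitive_roots(7))   # [3, 5]
```

For p = 7 the search finds two primitive roots, matching the count $\phi(6) = 2$.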
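Discrete Logarithms • Let p be a prime, and let $\alpha$ and $\beta$ be nonzero integers (mod p) with $\beta \equiv \alpha^x \pmod p$. • The problem of finding x is called the discrete logarithm problem, and is written: $x = L_\alpha(\beta)$. • Often $\alpha$ will be a primitive root mod p. • The discrete log behaves like the normal log in many ways; for a primitive root $\alpha$: $L_\alpha(\beta_1 \beta_2) \equiv L_\alpha(\beta_1) + L_\alpha(\beta_2) \pmod{p-1}$. • Generally, finding the discrete log is a hard problem. • $f(x) = \alpha^x \pmod p$ is an example of a one-way function.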
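To make the asymmetry concrete: computing a power mod p is a single call to Python's built-in `pow`, while the only fully general way to invert it in this sketch is to try exponents one by one. The function name `discrete_log` is hypothetical, and the exhaustive search only works because the prime is tiny.

```python
def discrete_log(alpha, beta, p):
    """Smallest x >= 0 with alpha**x = beta (mod p), or None if there is none."""
    value = 1                      # alpha^0
    for x in range(p - 1):
        if value == beta % p:
            return x
        value = (value * alpha) % p
    return None


p, alpha = 7, 3                    # 3 is a primitive root mod 7
x = discrete_log(alpha, 6, p)      # exhaustive search: cost grows with p
print(x, pow(alpha, x, p))         # forward direction is one fast call
```

With a primitive root, the log of a product is the sum of the logs mod p-1, which the same toy parameters can confirm.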
ElGamal Public Key Cryptosystem • One-way functions are often used to construct public key cryptosystems. We saw one in RSA; we now show an example using the DLOG problem. • Alice wants to send m to Bob. Bob chooses a large prime p and a primitive root $\alpha$. We assume 0 < m < p. Bob also chooses a secret integer a and computes $\beta \equiv \alpha^a \pmod p$. • Bob’s public key is $(p, \alpha, \beta)$. • Alice does: • Chooses a secret random integer k and computes $r \equiv \alpha^k \pmod p$. • Computes $t \equiv \beta^k m \pmod p$. • Sends (r,t) to Bob. • Bob decrypts by: $t r^{-a} \equiv \beta^k m (\alpha^k)^{-a} \equiv m \pmod p$.
ElGamal Public Key Cryptosystem, pg. 2 • Important issues… • a must be kept secret, else Eve can decrypt. • Eve sees (r,t): t is the product of two random numbers and is hence random. Knowing r does not really help, as Eve would need to be able to solve DLOG in order to get k. • Very important: A different random k must be used for each message! • If we have $m_1$ and $m_2$, and use the same k, then the ciphertexts will be $(r,t_1)$ and $(r,t_2)$, and $t_1/m_1 \equiv t_2/m_2 \pmod p$ because both messages are multiplied by the same mask. • If Eve ever finds $m_1$ then she has $m_2$ also: $m_2 \equiv t_2 m_1 t_1^{-1} \pmod p$.
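The scheme and the danger of reusing k can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the toy prime 2579, the base 2, the secret exponent 765, and the function names. In the code, `ALPHA` plays the role of the primitive root, `A` is Bob's secret integer, and `BETA` is his public value.

```python
import secrets

P, ALPHA = 2579, 2           # toy parameters (assumed); far too small for real use
A = 765                      # Bob's secret exponent
BETA = pow(ALPHA, A, P)      # Bob's public value


def encrypt(m, k=None):
    """Alice's side: return (r, t) for a message 0 < m < P."""
    if k is None:
        k = 1 + secrets.randbelow(P - 2)   # fresh random k in [1, P-2]
    r = pow(ALPHA, k, P)
    t = (pow(BETA, k, P) * m) % P
    return r, t


def decrypt(r, t):
    """Bob's side: m = t * r^(-a); r^(-a) equals r^(P-1-a) by Fermat's little theorem."""
    return (t * pow(r, P - 1 - A, P)) % P


def recover_with_reused_k(t1, m1, t2):
    """Eve's side when k was reused: m2 = t2 * m1 * t1^(-1) mod P."""
    return (t2 * m1 * pow(t1, P - 2, P)) % P
```

Passing an explicit `k` to `encrypt` exists only to reproduce the reuse mistake; a real sender would always let the function draw a fresh value.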
Overview of Digital Signatures • Suppose you have an electronic document (e.g. a Word file). How do you sign the document to prove to someone that it belongs to you? • You can’t use a scanned signature at the end: it is easy to forge and reuse elsewhere. • Conventional signing can’t work in the digital world. • We require a digital signature to satisfy: • Digital signatures can’t be separated from the message and attached to another message. • The signature can be verified by others.
An Application for Digital Signatures • Suppose we have two countries, A and B, that have agreed not to test any nuclear bombs (which produce seismic waves when detonated). How can A monitor B by using seismic sensors? • The sensors need to be in country B, but A needs to access them. There is a conflict here. • Country B wants to make sure that the message sent by the seismic sensor does not contain “other” data (espionage). • Country A, however, wants to make sure that the data has not been altered by country B. (Assumption: the sensor itself is tamper proof). How can we solve this problem?
Treaty Verification Example • RSA provides a solution: • Country A makes an RSA public/private key. (n,e) are given to B but (p,q,d) are kept private in the tamper-proof sensor. • Sensor collects data x and uses d to encrypt: $y \equiv x^d \pmod n$, and sends x and y to country B. • Country B takes x and y and calculates $z \equiv y^e \pmod n$. • If z=x, then B can be sure that the encrypted message corresponds to x. B then forwards (x,y) to A. • Country A checks that $y^e \equiv x \pmod n$. If so, then A is sure that x has not been modified, and A can trust x as being authentic. • In this example, it is hard for B to forge (x,y), and hence if (x,y) verifies, A can be sure that the data came unaltered from the sensor.
RSA Signatures • The treaty example is an example of RSA signatures. We now formalize it with Alice and Bob. • Alice publishes $(n,e_A)$ and keeps private $(p,q,d_A)$. • Alice signs m by calculating $y \equiv m^{d_A} \pmod n$. The pair (m,y) is the signed document. • Bob can check that Alice signed m by: • Downloading Alice’s $(n,e_A)$ from a trusted third party. Guaranteeing that he gets the right $(n,e_A)$ is another problem (we’ll talk about this in a later lecture). • Calculating $z \equiv y^{e_A} \pmod n$. If z=m then Bob (or anyone else) can be guaranteed that Alice signed m.
RSA Signatures, pg. 2 • Suppose Eve wants to attach Alice’s signature to another message $m_1$. She cannot simply use $(m_1, y)$ since $y^{e_A} \equiv m \not\equiv m_1 \pmod n$. • Therefore, she needs $y_1$ with $y_1^{e_A} \equiv m_1 \pmod n$. • $m_1$ looks like a ciphertext and $y_1$ like a plaintext. In order for Eve to make a fake $y_1$ she needs to be able to decrypt $m_1$ to get $y_1$. She can’t, due to the hardness of RSA. • Existential Forgery: Eve could choose $y_1$ first and then calculate an $m_1$ using $(n,e_A)$ via $m_1 \equiv y_1^{e_A} \pmod n$. Now $(m_1, y_1)$ will look like a valid message and signature that Alice created. • Problem with existential forgery: Eve has made an $m_1$ that has a signature, but $m_1$ might be gibberish! • Usefulness of existential forgery depends on whether there is an underlying “language” structure.
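Signing, verification, and the existential forgery can all be tried with a toy key. The key below (n = 61 · 53 = 3233, e = 17, d = 2753) and the function names are assumptions chosen for illustration; real RSA signatures use much larger moduli and sign a hash of the message.

```python
N, E, D = 3233, 17, 2753     # toy key: n = 61 * 53, with e*d = 1 mod 3120


def sign(m):
    """Alice: y = m^d mod n, using her private exponent."""
    return pow(m, D, N)


def verify(m, y):
    """Anyone: accept if y^e mod n equals m."""
    return pow(y, E, N) == m % N


def existential_forgery(y1):
    """Eve: pick the signature first and derive the message that matches it."""
    return pow(y1, E, N), y1
```

The forged pair verifies, but Eve has no control over the message value that comes out, which is exactly the limitation noted on the slide.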
Blind RSA Signatures • Sometimes we might want Alice to sign a document without knowing its contents (e.g. privacy concerns: purchaser does not want Bank to know what is being purchased, but wants Bank to authorize purchase). • We can accomplish this with RSA signatures (Bob wants Alice to sign a document m): • Alice generates an RSA public and private key pair. • Bob generates a random k mod n with gcd(k,n)=1. • Bob computes $t \equiv k^{e_A} m \pmod n$, and sends t to Alice. • Alice signs t following the normal RSA signature procedure by calculating $s \equiv t^{d_A} \pmod n$. Alice sends Bob s. • Bob computes $k^{-1}s \pmod n$. This is the signed message $m^{d_A} \pmod n$. Verification: $k^{-1}s \equiv k^{-1}(k^{e_A} m)^{d_A} \equiv k^{-1} k\, m^{d_A} \equiv m^{d_A} \pmod n$. Does Alice learn anything about m from t?
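The blinding and unblinding steps can be checked numerically. This is a sketch under assumed toy parameters (the key n = 3233, e = 17, d = 2753 and the function names are not from the slides); `pow(k, -1, N)` computes a modular inverse and needs Python 3.8 or later.

```python
from math import gcd

N, E, D = 3233, 17, 2753     # Alice's toy key (assumed): n = 61 * 53


def blind(m, k):
    """Bob: hide m behind the random factor k^e before sending it to Alice."""
    if gcd(k, N) != 1:
        raise ValueError("k must be invertible mod n")
    return (pow(k, E, N) * m) % N


def sign(t):
    """Alice: ordinary RSA signature on whatever she is given."""
    return pow(t, D, N)


def unblind(s, k):
    """Bob: strip the blinding factor, leaving m^d mod n."""
    return (pow(k, -1, N) * s) % N
```

Alice only ever sees the blinded value, yet the unblinded result is the same signature she would have produced on m directly.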
ElGamal Signatures • We may modify the ElGamal public key procedure to become a signature scheme. • Alice wants to sign m. Alice chooses a large prime p and a primitive root $\alpha$. Alice also chooses a secret integer a and computes $\beta \equiv \alpha^a \pmod p$. • Alice’s public key is $(p, \alpha, \beta)$. Security of the signature depends on a being kept private. • Alice does: • Chooses a secret random integer k with gcd(k,p-1)=1, and computes $r \equiv \alpha^k \pmod p$. • Computes $s \equiv k^{-1}(m - ar) \pmod{p-1}$. • The signed message is the triple (m,r,s).
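ElGamal Signatures, pg. 2 • Bob can verify by: • Downloading Alice’s public key $(p, \alpha, \beta)$. • Computing $v_1 \equiv \beta^r r^s \pmod p$ and $v_2 \equiv \alpha^m \pmod p$. • The signature is valid if and only if $v_1 \equiv v_2 \pmod p$. • Verification: We have $m \equiv sk + ar \pmod{p-1}$. Therefore $v_2 \equiv \alpha^m \equiv \alpha^{sk+ar} \equiv (\alpha^a)^r (\alpha^k)^s \equiv \beta^r r^s \equiv v_1 \pmod p$. • This scheme is believed to be secure, as long as DLOG is hard to solve. • Don’t: Choose a p with (p-1) the product of small primes, and don’t reuse k.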
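A small numeric sketch of signing and verifying follows. It assumes the signing equation s = k⁻¹(m − a·r) taken mod p−1, and the toy values (prime 467, base 2, secret exponent 127) and function names are chosen only for illustration. `pow(k, -1, P - 1)` needs Python 3.8 or later.

```python
from math import gcd

P, ALPHA = 467, 2            # toy prime and base (assumed); not secure sizes
A = 127                      # Alice's secret exponent
BETA = pow(ALPHA, A, P)      # Alice's public value


def sign(m, k):
    """Return (r, s) for message m using the per-message secret k."""
    if gcd(k, P - 1) != 1:
        raise ValueError("k must be invertible mod p-1")
    r = pow(ALPHA, k, P)
    s = (pow(k, -1, P - 1) * (m - A * r)) % (P - 1)
    return r, s


def verify(m, r, s):
    """Accept when beta^r * r^s equals alpha^m mod p."""
    if not 0 < r < P:
        return False
    v1 = (pow(BETA, r, P) * pow(r, s, P)) % P
    v2 = pow(ALPHA, m, P)
    return v1 == v2
```

A caller would draw a fresh k for every message; it is a parameter here only so the example is reproducible.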
Wastefulness of plain signatures • In signature schemes with appendix, where we attach the signature to the end of the document, we increase the communication overhead. • If we have a long message $m=[m_1,m_2,\ldots,m_N]$, then our signed document is $\{[m_1,m_2,\ldots,m_N],[\mathrm{sig}_A(m_1),\ldots,\mathrm{sig}_A(m_N)]\}$. • This doubles the overhead! • We don’t want to do this when communication resources are precious (which is always!). • Solution: We need to shrink the message into a smaller representation and sign that. • Enter: Hash functions
Hash Functions • Straightforward application of digital signatures can be expensive when the message is large • In general, many security protocols benefit from using a “digested” or “compressed” representative of a message • We typically need additional cryptographic properties in order for the compression operation to be useful • This “compression function” is a hash function. [Figure: a hash function h maps a large domain of messages m to a smaller range of digests h(m).]
Hash Functions, pg. 2 • Formally, a cryptographic hash function h takes an input message of arbitrary length and produces a message digest of fixed length, and satisfies: • Given a message m, h(m) is quick to calculate • One-Way (preimage resistance): Given a digest y, it is computationally infeasible to find an m with h(m)=y. • Strongly Collision Free: It is computationally infeasible to find messages m1 and m2 with h(m1)=h(m2). • Can we ever have h(m1)=h(m2)? Yes. Why? • We will look at a couple examples.
Chaum, van Heijst, Pfitzmann Hash • We may use the DLOG problem to construct a hash function • Choose a prime p such that q=(p-1)/2 is also prime. (There’s an algorithm for doing this, but that’s not our goal today). Choose two primitive roots $\alpha$ and $\beta$. • The hash function h(m) will take integers (mod $q^2$) to integers (mod p), so the digest has roughly half as many bits as the input. • Write $m = x_0 + x_1 q$ with $0 \le x_0, x_1 \le q-1$. • Define the hash by: $h(m) \equiv \alpha^{x_0} \beta^{x_1} \pmod p$.
CHP Hash is strongly collision-free • Proposition: If we know messages $m \neq m'$ with $h(m) = h(m')$, then we can determine the discrete logarithm $a = L_\alpha(\beta)$. • Proof: Will be given on the board after we cover all of the slides.
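The proposition can be tried out at toy size, where collisions are easy to find by search. This sketch assumes the usual definition h(m) = α^x0 · β^x1 mod p with m = x0 + x1·q; the prime 2579 (with q = 1289 also prime), the choice of the two smallest primitive roots, and all function names are assumptions for illustration. A collision gives a·(x1 − y1) ≡ y0 − x0 (mod p−1), which has at most two solutions for a.

```python
from math import gcd

P = 2579                     # toy prime (assumed) with Q = (P-1)/2 also prime
Q = (P - 1) // 2


def is_primitive_root(g):
    """For P = 2Q+1, g is a primitive root iff g^2 != 1 and g^Q != 1 mod P."""
    return pow(g, 2, P) != 1 and pow(g, Q, P) != 1


ALPHA, BETA = [g for g in range(2, P) if is_primitive_root(g)][:2]


def chp_hash(m):
    """Hash 0 <= m < Q^2 by splitting it as m = x0 + x1*Q."""
    if not 0 <= m < Q * Q:
        raise ValueError("message out of range")
    x0, x1 = m % Q, m // Q
    return (pow(ALPHA, x0, P) * pow(BETA, x1, P)) % P


def find_collision():
    """Exhaustive search; only feasible because P is tiny."""
    seen = {}
    for m in range(Q * Q):
        digest = chp_hash(m)
        if digest in seen:
            return seen[digest], m
        seen[digest] = m
    return None


def log_from_collision(m1, m2):
    """Recover a with ALPHA^a = BETA (mod P) from h(m1) == h(m2), m1 != m2."""
    if m1 == m2 or chp_hash(m1) != chp_hash(m2):
        raise ValueError("need a genuine collision")
    x0, x1 = m1 % Q, m1 // Q
    y0, y1 = m2 % Q, m2 // Q
    n = P - 1
    d = gcd(x1 - y1, n)                       # 1 or 2, since 0 < |x1 - y1| < Q
    inv = pow(((x1 - y1) // d) % (n // d), -1, n // d)   # Python 3.8+
    base = ((y0 - x0) // d) * inv % (n // d)
    for j in range(d):                        # test each candidate solution
        a = base + j * (n // d)
        if pow(ALPHA, a, P) == BETA:
            return a
    return None
```

Read the other way round, this is the security argument: if discrete logs are hard, collisions must be hard to find.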
SHA-1 • In order to get fast hash functions, we need to operate at the bit-level. SHA-1 is one such algorithm. • Many of the popular hash functions (e.g. MD5, SHA-1) use an iterative design: • Start with a message m of arbitrary length and break it into n-bit blocks, $m=[m_1,m_2,\ldots,m_l]$. The last block is padded to fill out a full block size. • Message blocks are processed via a sequence of rounds using a compression function h’ which combines the current block and the result of the previous round: $X_j = h'(X_{j-1}, m_j)$. • $X_0$ is an initial value, and $X_l$ is the message digest.
SHA-1, pg. 2 • In SHA-1, we pad according to the rule: • Start with a message m of arbitrary length and break it into n-bit blocks. • The last block is padded with a 1 followed by enough 0 bits to make the new message 64 bits short of a multiple of 512 bits in length. • Into the 64 unfilled bits of the last block, we append the 64-bit representation of the length T of the message. • Overall, we have blocks of 512 bits. • The appended message becomes m=[m1,m2,…,mL].
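The padding rule can be written out directly. This sketch assumes the message is a whole number of bytes, so the single 1 bit followed by 0 bits becomes the byte 0x80 followed by zero bytes; the function name `sha1_pad` is made up for the example.

```python
def sha1_pad(message: bytes) -> bytes:
    """Pad a byte string to a multiple of 512 bits (64 bytes), SHA-1 style."""
    bit_length = 8 * len(message)
    padded = message + b"\x80"                       # the 1 bit, then seven 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)    # stop 64 bits short of a block
    padded += bit_length.to_bytes(8, "big")          # 64-bit message length T
    return padded
```

A 55-byte message still fits in one block, while a 56-byte message spills into a second block because there is no room left for the length field.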
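SHA-1, pg. 3 (Basic Operations) • We will need the following bit operations on 32-bit words: bitwise AND ($X \wedge Y$), bitwise OR ($X \vee Y$), bitwise XOR ($X \oplus Y$), bitwise complement ($\neg X$), addition mod $2^{32}$ ($X + Y$), and circular left shift of X by r positions.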
SHA-1, pg. 5 (Inside the Alg.) Initial 160-bit register X0=[H0,H1,H2,H3,H4]
SHA-1, pg. 6 (Subregister Operations) • The operations done by $f_t(B,C,D)$ depend on the round number t • The word $W_t$ depends on the round number t • The constant $K_t$ depends on the round number t
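For reference, the whole algorithm fits in a short sketch. The round functions, round constants, message schedule, and initial register values below are taken from the published SHA-1 standard (FIPS 180), not from these slides; the function names are made up, and the code is meant for study rather than production use. The result is compared against Python's `hashlib`.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF            # keep every word at 32 bits


def rotl(x, r):
    """Circular left shift of a 32-bit word by r positions."""
    return ((x << r) | (x >> (32 - r))) & MASK


def f_and_k(t, b, c, d):
    """Round-dependent function f_t(B, C, D) and constant K_t."""
    if t < 20:
        return ((b & c) | (~b & d)) & MASK, 0x5A827999
    if t < 40:
        return b ^ c ^ d, 0x6ED9EBA1
    if t < 60:
        return (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    return b ^ c ^ d, 0xCA62C1D6


def compress(state, block):
    """One application of the compression function to a 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):                      # message schedule W_t
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        f, k = f_and_k(t, b, c, d)
        temp = (rotl(a, 5) + f + e + w[t] + k) & MASK
        a, b, c, d, e = temp, a, rotl(b, 30), c, d
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d, e)))


def sha1(message: bytes) -> str:
    """Pad, iterate the compression function, and return the hex digest."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += (8 * len(message)).to_bytes(8, "big")
    state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return "".join(f"{h:08x}" for h in state)


print(sha1(b"abc") == hashlib.sha1(b"abc").hexdigest())
```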
Message Authentication Codes • A message authentication code (MAC) is a function that is used to detect alteration of messages: • MACs use a shared key K between Alice and Bob • Alice will send not only the message m, but also $\mathrm{MAC}_K(m)$. • Bob checks whether the attached MAC matches what he calculates • Eve cannot alter the message without detection because she does not have K and so cannot compute a matching MAC. • The MAC takes two inputs: the key K and a message m of arbitrary size. • Ideally, a MAC should be a random mapping from all possible inputs to n bits of output. • The uncertainty (and security) of the MAC is directly associated with the size of the key K • Remember: to Eve, the message is known, so it’s the key that contains the security
CBC-MAC • CBC-MAC is a method for turning a block cipher into a MAC: • Idea: encrypt m using CBC mode and throw away all but the last block of ciphertext. • For message blocks $P_1, P_2, \ldots, P_k$, the MAC is calculated by $C_0 = IV$ (a fixed value), $C_i = E_K(P_i \oplus C_{i-1})$ for $i = 1, \ldots, k$, and $\mathrm{MAC} = C_k$. • Do not use the same key for encryption (confidentiality) and authentication!
CBC-MAC, pg. 2 • Be careful when using CBC-MAC. Here’s a possible protocol failure: • Observe: Fix K. If MAC(a)=MAC(b), then MAC(a||c)=MAC(b||c), where c is a single block in length. • Now, suppose the attacker collects many MAC values and finds a collision. This gives a and b for which MAC(a)=MAC(b). • If the attacker can get the sender to authenticate (a||c) (how is another matter…) then the attacker can replace the message being sent to the receiver with (b||c). Comment: It’s not an easy attack to do, but it is a possible weakness!
CBC-MAC, pg. 3 • Practical Implementation Details: • Generally, if your message is m, do not just calculate MAC(m), rather you should make an intermediate message s=(l||m), where l is the length of m in a fixed-length format. • Pad s to be a multiple of block size • Apply CBC-MAC to the padded string s • Output the last ciphertext block. Do not output any intermediate block values! • CBC-MAC can reuse same code as confidentiality (encryption) functions • CBC-MAC is generally tough to use correctly, though.
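The recipe above (prefix the length, pad, chain, keep only the last block) can be sketched as follows. Python's standard library has no block cipher, so `toy_encrypt_block` is a hypothetical stand-in built as a small Feistel network; it is a permutation on 16-byte blocks but is not a vetted cipher, and a real implementation would call AES from a cryptographic library. The 8-byte length field, zero padding, and all-zero starting value are also assumptions of this sketch.

```python
import hashlib

BLOCK = 16                   # block size in bytes


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for E_K: a 4-round Feistel network with a hash-based round function."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, bytes(x ^ y for x, y in zip(left, f))
    return left + right


def cbc_mac(key: bytes, message: bytes) -> bytes:
    """CBC-MAC over s = (length || message), zero-padded to whole blocks."""
    s = len(message).to_bytes(8, "big") + message    # fixed-length length prefix
    s += b"\x00" * (-len(s) % BLOCK)                 # pad to a multiple of BLOCK
    c = bytes(BLOCK)                                 # fixed all-zero starting value
    for i in range(0, len(s), BLOCK):
        p = s[i:i + BLOCK]
        c = toy_encrypt_block(key, bytes(x ^ y for x, y in zip(p, c)))
    return c                                         # only the last block is output
```

Because the length comes first, a message and the same message with a trailing zero byte produce different inputs even though zero padding alone could not tell them apart.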
HMAC • We may also use hash functions to build MACs. • We cannot simply use $\mathrm{MAC}_K(m)=h(K\|m)$ or $h(m\|K)$: • Having the key at the front allows for length extension attacks • Having the key at the end allows for key-recovery attacks • Designers of HMAC considered these issues • HMAC computes $\mathrm{HMAC}_K(m) = h\big((K \oplus a) \,\|\, h((K \oplus b) \,\|\, m)\big)$, where a and b are constants that are specified. • HMAC has been around for a while and has been cryptanalyzed. It’s the preferred MAC to use.
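The nested construction can be written out and compared against Python's `hmac` module. The constants 0x5C (outer) and 0x36 (inner), the 64-byte block size, and the rule for over-long keys come from the HMAC specification (RFC 2104) rather than from the slides; SHA-256 is chosen here only as an example hash, and `hmac_sha256` is a name made up for this sketch.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC written out by hand: h((K xor a) || h((K xor b) || m))."""
    block_size = 64                                  # SHA-256 input block size
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()           # long keys are hashed first
    key = key.ljust(block_size, b"\x00")             # then zero-padded to a block
    outer = bytes(k ^ 0x5C for k in key)             # K xor a
    inner = bytes(k ^ 0x36 for k in key)             # K xor b
    inner_digest = hashlib.sha256(inner + message).digest()
    return hashlib.sha256(outer + inner_digest).digest()


tag = hmac_sha256(b"shared key", b"hello Bob")
reference = hmac.new(b"shared key", b"hello Bob", hashlib.sha256).digest()
print(hmac.compare_digest(tag, reference))
```

In practice one would call `hmac.new` directly and compare tags with `hmac.compare_digest`, which avoids leaking information through timing.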
Using MACs • We must be careful using MACs. • If Alice sends Bob $[m \,\|\, \mathrm{MAC}_K(m)]$ and Eve records this, she may send it again at a later time (the replay attack!) • Generally, you want to authenticate not just the message, but the context. That is, you want to authenticate m and additional data d (such as message number, source, destination, protocol identifier, sizes for different fields, etc.) • Why all these possibilities? If you tie the message to the specific context, then it is harder for an adversary to manipulate context fields to forge. • Make certain, though, that you have clear rules on how to split concatenations (d||m) back into d and m.
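One way to follow this advice is sketched below: every field is length-prefixed so the concatenation splits back unambiguously, the message number is part of the authenticated data, and the receiver rejects any number it has already passed. The field layout, the use of HMAC-SHA-256, and the names `encode_fields`, `make_tag`, and `Receiver` are all assumptions for illustration.

```python
import hashlib
import hmac
import struct


def encode_fields(*fields: bytes) -> bytes:
    """Prefix each field with its 4-byte length so field boundaries are unambiguous."""
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)


def make_tag(key: bytes, seq: int, src: bytes, dst: bytes, message: bytes) -> bytes:
    """MAC over the context (message number, source, destination) and the message."""
    data = encode_fields(struct.pack(">Q", seq), src, dst, message)
    return hmac.new(key, data, hashlib.sha256).digest()


class Receiver:
    """Accepts a message only if the MAC is valid and the number is new."""

    def __init__(self, key: bytes):
        self.key = key
        self.last_seq = -1

    def accept(self, seq, src, dst, message, tag) -> bool:
        expected = make_tag(self.key, seq, src, dst, message)
        if not hmac.compare_digest(tag, expected):
            return False                 # altered message or altered context
        if seq <= self.last_seq:
            return False                 # replayed or stale message number
        self.last_seq = seq
        return True
```

A recorded message replayed later carries a valid MAC but an old message number, so the receiver drops it.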
Problems with Hashes • We must be careful when using hash functions; they are subject to some “attacks”. • Length Extension Attack: Consider a block-based hash like SHA-1, with input blocks $m=(m_1, m_2, \ldots, m_k)$, and hash h(m). A new message $m'=(m_1, m_2, \ldots, m_k, m_{k+1})$ will have hash $h(m')=h'(h(m),m_{k+1})$, where h’ is the compression sub-function. In systems, such as authentication applications, where we calculate h(X||m) for a secret X, Eve can append extra text to m and also update the hash without knowing X. • Partial Message Collision Attack: Suppose we are able to find m and m’ such that h(m)=h(m’). If a system uses h(m||X) as an authentication parameter, then due to the iterative nature h(m||X)=h(m’||X). An adversary can replace m with m’ during authentication. • In general hashing practice, these attacks are countered by using $f(m)=h(h(m)\|m)$ or $f(m)=h(h(m))$ as the hash.
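The length extension attack is easiest to see on a stripped-down iterated hash. The toy hash below is an assumption made for illustration: it chains SHA-256 as the compression sub-function over 16-byte blocks and deliberately omits the length padding that SHA-1 adds (against a real hash the attacker must also reproduce that padding). All names are made up for the example.

```python
import hashlib

BLOCK = 16                   # toy block size in bytes
IV = bytes(32)               # fixed initial value


def compress(state: bytes, block: bytes) -> bytes:
    """Compression sub-function h': combine the running state with one block."""
    return hashlib.sha256(state + block).digest()


def toy_hash(message: bytes) -> bytes:
    """Iterated hash with no padding; the message must be whole blocks."""
    if len(message) % BLOCK:
        raise ValueError("toy hash expects whole blocks")
    state = IV
    for i in range(0, len(message), BLOCK):
        state = compress(state, message[i:i + BLOCK])
    return state


def double_hash(message: bytes) -> bytes:
    """The countermeasure f(m) = h(h(m)); the 32-byte digest is two toy blocks."""
    return toy_hash(toy_hash(message))


secret = b"K" * BLOCK                    # X, unknown to Eve
message = b"pay Bob 10 coins"            # one block, known to Eve
extra = b" and Eve 99 more"              # one block, chosen by Eve

tag = toy_hash(secret + message)         # what Eve observes
forged = compress(tag, extra)            # computed without the secret
print(forged == toy_hash(secret + message + extra))
```

With the double hash, the value Eve observes is no longer the internal chaining state, so the same one-step extension does not produce a matching tag.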