
Lecture5 – Introduction to Cryptography 3/ Implementation

Rice ELEC 528/ COMP 538, Farinaz Koushanfar, Spring 2009. Rivest, Shamir, Adleman (RSA): number theory plus the difficulty of determining the prime factors of a large number. Two keys, d and e, are used for encryption and decryption.


Presentation Transcript


  1. Lecture5 – Introduction to Cryptography 3/ Implementation Rice ELEC 528/ COMP 538 Farinaz Koushanfar Spring 2009

  2. Rivest, Shamir, Adleman (RSA) • Number theory + difficulty of determining prime factors of a large number • Two keys d and e are used for encryption and decryption • Plaintext message P is encrypted to ciphertext C: $C = P^e \bmod n$ • The plaintext is recovered by $P = C^d \bmod n$ • Encryption and decryption are mutual inverses and commute: $P = C^d \bmod n = (P^e)^d \bmod n = (P^d)^e \bmod n$
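
The encrypt/decrypt relations on this slide can be tried out with a toy key. This is a minimal sketch, not part of the lecture: the primes 61 and 53, the exponents 17 and 2753, and the message 65 are illustrative assumptions, far too small for real use.

```python
# Toy RSA round trip. All numeric values are assumed for illustration only.
p, q = 61, 53
n = p * q            # public modulus, 3233
e = 17               # public exponent
d = 2753             # private exponent; e*d = 1 mod (p-1)*(q-1)

P = 65                       # plaintext, must satisfy 0 <= P < n
C = pow(P, e, n)             # encrypt: C = P^e mod n
recovered = pow(C, d, n)     # decrypt: P = C^d mod n

# Applying the two exponents in the other order gives the same result.
swapped = pow(pow(P, d, n), e, n)

print(C, recovered, swapped)
```

Python's three-argument `pow` performs modular exponentiation without building the full power, which is what makes the operation practical for large exponents.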

  3. RSA – Key Choice • Starting point: select a value for n • n is the product of two large primes p and q; each is about 100 decimal digits, so n is about 200 decimal digits • Select a relatively large e that is relatively prime to $(p-1)(q-1)$; one easy way is to choose e as a prime larger than both $(p-1)$ and $(q-1)$ • Finally, select d such that $e \cdot d \equiv 1 \pmod{(p-1)(q-1)}$
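
The key-choice steps above can be sketched as follows. The helper names `is_prime` and `make_keys` are hypothetical, trial division is only workable for toy primes, and the search simply takes the first prime above both p-1 and q-1 as e; a real implementation would use large random primes and a probabilistic primality test.

```python
from math import gcd

def is_prime(m):
    """Trial-division primality test; adequate only for toy-sized numbers."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True

def make_keys(p, q):
    """Return (n, e, d) following the slide's recipe for primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    # Search upward for a prime e larger than both p-1 and q-1.
    e = max(p, q) + 1
    while not (is_prime(e) and gcd(e, phi) == 1):
        e += 1
    # d is the inverse of e modulo (p-1)*(q-1); pow(e, -1, m) needs Python 3.8+.
    d = pow(e, -1, phi)
    return n, e, d

n, e, d = make_keys(61, 53)
print(n, e, d)
```

A prime e larger than both p-1 and q-1 cannot share a factor with either of them, so the `gcd` check is a safeguard rather than a necessity here.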

  4. Mathematical Foundation • The Euler totient function $\varphi(n)$ is the number of positive integers less than n that are relatively prime to n; if p is prime, then $\varphi(p) = p-1$ • If $n = p \cdot q$, where p and q are both prime, $\varphi(n) = \varphi(p) \cdot \varphi(q) = (p-1)(q-1)$ • Euler and Fermat proved that $x^{\varphi(n)} \equiv 1 \pmod{n}$ for any integer x, if n and x are relatively prime
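
The definitions on this slide can be checked numerically for small n. The function below is a brute-force sketch written for this purpose (the name `totient` and the test values 7 and 15 are assumptions); it counts directly from the definition and is not how the totient is computed for RSA-sized numbers.

```python
from math import gcd

def totient(n):
    """Count the integers in 1..n-1 that are relatively prime to n (n > 1)."""
    return sum(1 for x in range(1, n) if gcd(x, n) == 1)

# phi(p) = p - 1 for a prime p
phi_7 = totient(7)

# phi(p*q) = (p-1)*(q-1) for distinct primes p and q
phi_15 = totient(3 * 5)

# Euler/Fermat: x^phi(n) = 1 mod n whenever x and n are relatively prime
euler_holds = all(
    pow(x, phi_15, 15) == 1 for x in range(1, 15) if gcd(x, 15) == 1
)

print(phi_7, phi_15, euler_holds)
```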

  5. Mathematical Foundation -- RSA • Encrypt by RSA: $E(P) = P^e$ • The value of e is selected such that the inverse d can be easily formed (inverses mod $\varphi(n)$): $e \cdot d \equiv 1 \pmod{\varphi(n)}$ • Or, $e \cdot d = k \cdot \varphi(n) + 1$ for some integer k • Because of the Euler/Fermat results, assuming P and p are relatively prime, $P^{p-1} \equiv 1 \pmod{p}$

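  6. RSA Math (Cont’d) • Since $(p-1)$ is a factor of $\varphi(n)$, $P^{k\varphi(n)} \equiv 1 \pmod{p}$ • Multiplying by P produces $P^{k\varphi(n)+1} \equiv P \pmod{p}$ • The same is true for q: $P^{k\varphi(n)+1} \equiv P \pmod{q}$ • So $(P^e)^d = P^{ed} = P^{k\varphi(n)+1}$ is congruent to P both mod p and mod q • Thus, because p and q are distinct primes and $n = p \cdot q$, $(P^e)^d \equiv P \pmod{n}$ • So exponentiation by e and by d are inverse operations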
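
The chain of congruences in this argument can be confirmed exhaustively for a toy modulus. This is a sketch under assumed values (p = 61, q = 53, e = 17, d = 2753, none of which come from the lecture); `check_message` is a hypothetical helper name.

```python
# Toy parameters, assumed for illustration only.
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e, d = 17, 2753

# e*d = k*phi(n) + 1 for some integer k
k, remainder = divmod(e * d, phi)

def check_message(P):
    """Return the three congruences from the argument for one message P."""
    holds_mod_p = pow(P, e * d, p) == P % p
    holds_mod_q = pow(P, e * d, q) == P % q
    holds_mod_n = pow(pow(P, e, n), d, n) == P
    return holds_mod_p, holds_mod_q, holds_mod_n

# Check every possible message 0 <= P < n, including multiples of p or q.
all_hold = all(all(check_message(P)) for P in range(n))

print(k, remainder, all_hold)
```

The loop also covers messages that are multiples of p or q, where the Euler/Fermat step does not apply directly but the congruence still holds because both sides are 0 modulo that prime.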

  7. Crypto Processors • There are many HW implementations of the standard security protocols, e.g., AES, DES, PKP • Please check: http://www.hardware-ciphers.com/en/index.html • Our goal is not to design a new one, or to teach you to design a new one, but to show you what implementations look like • What are the basic building blocks, and what are the potential weaknesses/vulnerabilities of each block?

  8. Recommended reading • A. Hodjat, I. Verbauwhede. Minimum area cost for a 30 to 70 Gbits/s AES processor. IEEE Computer Society Annual Symposium on VLSI, pp. 83-88, 2004. • T. Good and M. Benaissa. AES on FPGA from the fastest to the smallest, 2005. • L. Batina, S. Berna Ors, B. Preneel and J. Vandewalle. Hardware architectures for public key cryptography, 2003.

  9. Minimum Area Cost for a 30 to 70 Gbits/s AES Processor Alireza Hodjat Ingrid Verbauwhede Department of Electrical Engineering University of California, Los Angeles {ahodjat, ingrid} @ ee.ucla.edu IEEE Computer Society Symposium on VLSI (ISVLSI 04) February 2004 This material is based upon work supported by the Space and Naval Warfare Systems Center - San Diego under contract No. N66001-02-1-8938.

  10. Outline • Motivation • Ultra high throughput AES implementation • Area efficient byte substitution • High speed AES with online key scheduling • High speed AES with offline key scheduling • Conclusion

  11. Motivation • Cryptographically secure random number generation for optical link switches • Advanced Encryption Standard algorithm in the Counter mode of operation • Non-feedback mode of operation (pipelining is allowed)

  12. Ultra High Throughput AES • The key length: the critical path is in the key scheduling path; fixed key size, 128-bit only • Loop unrolling • Pipelining: inner-round pipelining and outer-round pipelining • Choice of byte-substitution phase: LUT implementation, or implementation using GF operations (further pipelining)

  13. Byte substitution optimization • Byte substitution on $GF(2^8)$ • First: multiplicative inverse in $GF(2^8)$ • Second: affine transformation (over $GF(2)$) • Multiplicative inverse in $GF(2^8)$ is expensive • Area efficient implementation using $GF(2^4)$ operations
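
The two steps of byte substitution can be sketched in software as below. Note the swap: this sketch computes the inverse directly in GF(2^8) by repeated multiplication, not with the composite-field GF(2^4) decomposition that the paper uses to save area. The reduction polynomial 0x11B and the affine constant 0x63 are the standard AES values; the function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse as a^254; by AES convention 0 maps to 0."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def affine(b):
    """AES affine transformation over GF(2), with constant 0x63."""
    out = 0
    for i in range(8):
        bit = (
            (b >> i)
            ^ (b >> ((i + 4) % 8))
            ^ (b >> ((i + 5) % 8))
            ^ (b >> ((i + 6) % 8))
            ^ (b >> ((i + 7) % 8))
            ^ (0x63 >> i)
        ) & 1
        out |= bit << i
    return out

def sbox(x):
    """First the field inverse, then the affine transformation."""
    return affine(gf_inv(x))

print(hex(sbox(0x00)), hex(sbox(0x01)), hex(sbox(0x53)))
```

A LUT implementation would store all 256 outputs of `sbox`; the GF-based hardware designs compute the same function with logic instead of memory.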

  14. Area Efficient Byte Substitution (figure) • a: Byte substitution using LUT implementation • b: Non-pipelined Sbox using GF operations • c: Two-stage pipelined Sbox using GF operations • d: Three-stage pipelined Sbox using GF operations

  15. Area-Delay Trade-off for Sbox • The area cost of the Sbox using two-stage and three-stage composite field implementation is 23% and 32% less than the LUT design with the same speed

  16. High Speed AES with Online Key Scheduling (figure) • 2 pipeline stages per round • 3 pipeline stages per round • 4 pipeline stages per round

  17. Throughput-Area Trade-off for AES • Area cost for the design with three pipeline stages is 35% less than the design with LUT Sbox implementation • Area cost for the design with four pipeline stages is 30% less than the design with LUT Sbox implementation

  18. High Speed Design with Offline Key Scheduling • The key does not vary as frequently as the data • Pre-calculate the key schedule and store it in the round key registers • Key schedule is done in 20 cycles
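
The idea of computing the round keys once and reusing them for many data blocks can be sketched in software. This is an illustrative model of the standard AES-128 key expansion, not the paper's hardware unit, and it says nothing about cycle counts; the names `build_sbox` and `expand_key_128` are hypothetical, and the example key is the one used in the FIPS-197 key expansion example.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox():
    """Build the AES S-box table: field inverse followed by affine transform."""
    table = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        s = inv
        for shift in (1, 2, 3, 4):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        table.append(s ^ 0x63)
    return table

SBOX = build_sbox()

def expand_key_128(key):
    """Return the 11 round keys (16 bytes each) for a 16-byte AES key."""
    assert len(key) == 16
    words = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]          # RotWord
            temp = [SBOX[b] for b in temp]      # SubWord
            temp[0] ^= rcon                     # round constant
            rcon = gf_mul(rcon, 2)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [bytes(sum(words[r * 4:r * 4 + 4], [])) for r in range(11)]

# Computed once per key; every data block then reuses the stored round keys.
ROUND_KEYS = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(ROUND_KEYS[10].hex())
```

Because the stored `ROUND_KEYS` only change when the key changes, the expansion logic stays off the per-block data path, which is the property the offline design exploits.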

  19. Throughput-Area Trade-Off • Offline key scheduling unit can reduce the area up to 28%. • Area cost for the design with three pipeline stages is 37% less than the design with LUT Sbox implementation • Area cost for the design with four pipeline stages is 33% less than the design with LUT Sbox implementation

  20. Conclusion • Area efficient architectures for 30 to 70 Gbits/s AES processor • Loop unrolling and inner and outer round pipelining were used • Pipelined design of composite field implementation of the byte substitute phase reduces the area cost up to 35% • Offline key scheduling unit reduces the area cost up to 28% • Total area cost of the final architecture was reduced up to 48%
