340 likes | 566 Views
A Reconfigurable System on Chip Implementation for Elliptic Curve Cryptography over GF(2 n ). Michael Jung 1 , M. Ernst 1 , F. Madlener 1 , S. Huss 1 , R. Blümel 2. 1 Integrated Circuits and Systems Lab Computer Science Department Darmstadt University of Technology Germany.
E N D
A Reconfigurable System on Chip Implementation for Elliptic Curve Cryptography over GF(2n) Michael Jung1, M. Ernst1, F. Madlener1, S. Huss1, R. Blümel2 1 Integrated Circuits and Systems LabComputer Science DepartmentDarmstadt University of TechnologyGermany 2 cv cryptovision GmbHGelsenkirchenGermany
ECC over GF(2n) • GF(2n) operations implemented in HW • Square, Multiply, Add • Inversion with Fermat‘s little Theorem • Polynomial Base • Multiplier based on a novel, generalized version of the Karatsuba Ofman Algorithm1 • EC level algorithms implemented in SW • 2P Algorithm2for EC point multiplication • Projective Point Coordinates • Algorithmic flexibility combined with performance [1] Karatsuba and Ofman, Multiplication of multidigit numbers on automata, 1963[2] Lopez/Dahab, Fast multiplication on elliptic curves over GF(2m) without precomputations, 1999
8-Bit RISC Microcontroller • 20+ MIPS @ 25 MHz Atmel FPSLIC Reconfigurable System on Chip
FPGA • 40k Gate Equivalents • Distributed FreeRAMTM Cells Atmel FPSLIC Reconfigurable System on Chip • 8-Bit RISC Microcontroller
Dual Ported • Simultaneously accessible by MCU and FPGA Atmel FPSLIC Reconfigurable System on Chip • 8-Bit RISC Microcontroller • FPGA • 36 kByte RAM
Peripherals • UARTs • Timers • etc. Atmel FPSLIC Reconfigurable System on Chip • 8-Bit RISC Microcontroller • FPGA • 36 kByte RAM
Atmel FPSLIC Reconfigurable System on Chip • 8-Bit RISC Microcontroller • FPGA • 36 kByte RAM • Peripherals • Low cost, low complexity device
with 1 1 01 0 0 A*B Polynomial Karatsuba Multiplication
with , and Multi-Segment Karatsuba (MSK) • Generalization of the Karatsuba Multiplication Algorithm • Polynomials are split into an arbitrary number of segments
2 12 2 1 012 12 01 01 1 0 0 A*B MSKk for k=3 with
2 12 2 1 012 12 01 01 1 0 0 A*B Reordering of Partial Products • Pattern Grouping • Reorder pattern by decreasing xn • Minimize differences in the set of indices of adjacent pattern
2 2 2 Pattern Grouping 1.) Grouping the partial products to one of three possible pattern 2 12 2 1 012 12 01 01 1 0 0 A*B
2 12 12 12 Pattern Grouping 1.) Grouping the partial products to one of three possible pattern 12 1 012 12 01 01 1 0 0 A*B
2 12 1 1 1 Pattern Grouping 1.) Grouping the partial products to one of three possible pattern 1 012 01 01 1 0 0 A*B
2 12 012 012 Pattern Grouping 1.) Grouping the partial products to one of three possible pattern 1 012 01 01 0 0 A*B
2 01 12 01 01 Pattern Grouping 1.) Grouping the partial products to one of three possible pattern 1 012 01 01 0 0 A*B
2 0 01 12 0 0 Pattern Grouping 1.) Grouping the partial products to one of three possible pattern 1 012 0 0 A*B
2 0 01 12 Reordering of Partial Products • Pattern Grouping • Reorder pattern by decreasing xn • Minimize differences in the set of indices of adjacent pattern 1 012 A*B
2 0 01 12 012 Reordering of Partial Products 2.) Ordering of the pattern in a top-left to bottom-right fashion 1 012 A*B
2 0 01 12 Reordering of Partial Products 2.) Ordering of the pattern in a top-left to bottom-right fashion 012 1 A*B
2 0 01 12 Reordering of Partial Products • Pattern Grouping • Reorder pattern by decreasing xn • Minimize differences in the set of indices of adjacent pattern 012 1 A*B
2 0 01 12 01 Reordering of Partial Products 3.) Ensuring that the set of added up segments in the partial products differs only by one element between two adjacent patterns 012 1 A*B
2 0 01 12 Reordering of Partial Products 3.) Ensuring that the set of added up segments in the partial products differs only by one element between two adjacent patterns 012 1 A*B
A2 A1 A0 Multiplication Sequence B2 B1 B0 Mult
A2 A1 A0 Multiplication Sequence B2 B1 B0 Mult Clock Cycle: 1
A2 A1 A0 Multiplication Sequence B2 B1 B0 2 2 Mult Clock Cycle: 2
2 A2 A1 A0 Multiplication Sequence B2 B1 B0 12 12 Mult Clock Cycle: 3
2 12 A2 A1 A0 Multiplication Sequence B2 B1 B0 012 012 Mult Clock Cycle: 4
2 12 A2 A1 A0 Multiplication Sequence B2 B1 B0 01 01 Mult 012 Clock Cycle: 5
2 01 12 A2 A1 A0 Multiplication Sequence B2 B1 B0 1 1 Mult 012 Clock Cycle: 6
2 01 12 A2 A1 A0 Multiplication Sequence B2 B1 B0 0 0 Mult 012 1 Clock Cycle: 7
2 0 01 12 A2 A1 A0 Multiplication Sequence B2 B1 B0 Mult 012 1 Clock Cycle: 8
Op. A Op. B FF-ALU Coprocessor Interface AVR EC level Algorithms FPGA Finite Field Arithmetic MULT ADD SQUARE RAM Controller ...
Results • Prototype Implementation: • 5-Segment Karatsuba (MSK5) • 23-Bit combinational Multiplier • GF(2113) • One EC point multiplication takes 10.9 ms @ 12 MHz • Speed-up factor of about 40 compared to an assembler optimized software implementation