Fast Truncated Multiplication for Cryptographic Applications

Fast Truncated Multiplication for Cryptographic Applications. 2002−2006 Laszlo Hars ( Laszlo@Hars.US ) Seagate Research. Outline. History of the paper, Applications Examples Truncated Products Time complexity Carry Half products LS and MS products Middle-third products Squaring.


Presentation Transcript


  1. Fast Truncated Multiplication for Cryptographic Applications 2002−2006 Laszlo Hars (Laszlo@Hars.US) Seagate Research

  2. Outline • History of the paper, Applications • Examples • Truncated Products • Time complexity • Carry • Half products • LS and MS products • Middle-third products • Squaring

  3. History, Applications • Written in 2002/03 • ’03 Missed deadline • ’04 Reviewers failed to read • ’05 Page and time limitations ⇒ ½ of accepted paper printed • Applications: http://www.hars.us/Papers/TrunApps.pdf

  4. Example: Reciprocal • ⌊d^{2n}/x⌋ = integer reciprocal of n-digit x • Newton iteration doubles #accurate bits: r ← r∙(2 − r∙x), i.e. r ← r + r∙(1 + r∙(−x)) • Proof: r_k = (1/x)∙(1 − ε) ⇒ r_{k+1} = (1/x)∙(1 − ε^2) • r∙x = 1 − ε = 0.999…, so only digits 2^k+1 … 2^{k+1} of r∙(−x) are needed (−x in 2’s complement) • r_{2k} = r_k || r_k ⋉ (r_k ⋈ −x_{(2k+1)}), where || is concatenation, ⋈ is the middle third of the |3∙2^k| product and ⋉ is the MS half of the |2∙2^k| product
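The squaring of the relative error claimed in the proof bullet can be checked with exact rational arithmetic. The following is a minimal sketch, not the truncated-product version from the talk: it uses full multiplications, and the operand `x`, the starting guess, and the name `newton_step` are illustrative assumptions.

```python
from fractions import Fraction


def newton_step(r, x):
    """One Newton step for the reciprocal of x: r <- r*(2 - r*x)."""
    return r * (2 - r * x)


x = Fraction(87654321, 10**8)  # illustrative operand, scaled into (0, 1]
r = Fraction(1)                # deliberately crude starting guess
errors = []                    # relative errors eps_k = 1 - r_k*x
for _ in range(5):
    errors.append(1 - r * x)
    r = newton_step(r, x)

# Each error is the exact square of the previous one,
# so the number of accurate digits doubles per step.
for eps in errors:
    print(float(eps))
```

Because 1 − r∙x∙(2 − r∙x) = (1 − r∙x)^2, the printed errors square at every step, which is the digit-doubling behaviour the slide refers to.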

  5. Numerical Example: Reciprocal • x = 87654321, ⌊10^16/x⌋ = 114084507 • r = 11408, −x = 10^8 − x = 12345679 (complement) • r∙(−x) = 140839506032, y = r ⋈ (−x) = 3951 (middle third) • z = r∙y = 45073008, r ⋉ y = 4507 (MS half) • r′ = r || r ⋉ y = 11408 4507
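The numbers on this slide can be replayed with ordinary Python integers. This is only a check of the arithmetic, using full products and then slicing out digits; the variable names are assumptions. Note that plain truncation of the middle four digits gives 3950, while the slide uses 3951; the sketch takes the slide's value as given and treats the difference as a rounding or carry adjustment (an assumption, not something the slide states).

```python
x = 87654321
r = 11408                    # current approximation, from the slide
neg_x = 10**8 - x            # ten's complement of x

full = r * neg_x             # full 12-digit product r*(-x)
middle_truncated = (full // 10**4) % 10**4  # middle four digits, plain truncation

y = 3951                     # middle-third value as listed on the slide
z = r * y                    # 8-digit product
ms_half = z // 10**4         # most significant four digits of z
r_new = r * 10**4 + ms_half  # concatenation r || ms_half

print(neg_x, full, middle_truncated, z, ms_half, r_new)
print(r_new == 10**16 // x)
```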

  6. Examples: modular multiplication • Barrett multiplication, with µ = ⌊d^{2n}/m⌋: ab mod m = ab − ⌊ab/m⌋m = LS(ab) − (MS(ab) µ) m • With b constant, β := MS_{2n}(b/m): ab mod m = (a β) m • Montgomery multiplication, with −m^{−1} := inverse of −m mod d^n: ab∙d^{−n} mod m = MS(ab) − (LS(ab) (−m^{−1})) m • With b constant, β := b (−m^{−1}): ab∙d^{−n} mod m = ab − (a β) m
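To make the roles of µ, MS(ab) and LS(ab) concrete, here is a minimal sketch of the textbook Barrett reduction in base d = 10. It is an assumption-laden illustration: it computes full products with Python integers and then discards digits, whereas the talk replaces those products with truncated ones. The function names and the exact digit offsets (n − 1 and n + 1) follow the common textbook formulation, not necessarily the slide.

```python
def barrett_setup(m, n, d=10):
    """Precompute mu = floor(d^(2n) / m) for an n-digit modulus m."""
    return d ** (2 * n) // m


def barrett_mulmod(a, b, m, mu, n, d=10):
    """Return a*b mod m for 0 <= a, b < m and d^(n-1) <= m < d^n (d > 3)."""
    ab = a * b
    # Quotient estimate: only the most significant digits of ab and of (.. * mu) matter.
    q = ((ab // d ** (n - 1)) * mu) // d ** (n + 1)
    # Remainder: only the least significant n+1 digits of ab and q*m matter.
    mod = d ** (n + 1)
    r = (ab % mod) - (q * m) % mod
    if r < 0:
        r += mod
    # The estimate q can be slightly too small, so a few subtractions may remain.
    while r >= m:
        r -= m
    return r


m = 87654321
mu = barrett_setup(m, 8)
print(barrett_mulmod(12345678, 76543210, m, mu, 8))
```

The two digit-discarding steps are exactly where an MS half product and an LS half product could be substituted for the full products.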

  7. Truncated Product: a contiguous subsequence of the digits of the product • Specialized algorithms • Cover with polygons of black-box algorithms • Ignore extra digits • Subtract overlap • Pad input for excess area

  8. Time complexity • Measured in the number of digit-multiplications • × is more expensive than +, −, <, load/store… • Can be performed in parallel with the others • Fast multiplication algorithms take ≈ n^α time • Speed relations: M_1/M_2 ≈ T_1/T_2 (Mult, TrctMult) • No more auxiliary digit operations than in the corresponding black-box multiplication!
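The digit-multiplication count used as the cost measure can be illustrated with the simplest truncated product, the LS half. The sketch below is schoolbook only (quadratic), not one of the sub-quadratic algorithms the talk is about; `to_digits` and `ls_half_product` are hypothetical helper names.

```python
def to_digits(v, n, d=10):
    """Little-endian list of the n lowest base-d digits of v."""
    return [(v // d**i) % d for i in range(n)]


def ls_half_product(a, b, n, d=10):
    """Return (a*b mod d^n, number of digit multiplications used)."""
    A, B = to_digits(a, n, d), to_digits(b, n, d)
    acc, count = 0, 0
    for i in range(n):
        for j in range(n - i):  # only digit pairs with i + j < n
            acc += A[i] * B[j] * d ** (i + j)
            count += 1
    # Pairs with i + j >= n are multiples of d^n and cannot affect the low n digits.
    return acc % d**n, count


low, mults = ls_half_product(87654321, 12345679, 8)
print(low, mults)  # 36 digit multiplications instead of 64 for the full product
```

For n-digit operands this uses n(n+1)/2 digit multiplications, roughly half of the n^2 needed by the full schoolbook product.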

  9. Carry • Omitted LS product digits may cause a carry • Some algorithms tolerate it (Barrett, Newton iteration) • Others must be accurate • Maximal potential carry: at the main diagonal (n−1)dn+1 +(d−n−1)dn+1 • Last 2 digits can be “very” wrong • Carry can propagate to the first digit (9→0, x→ x+1) • Use 2 extra guard digits to the right • Almost always they absorb the carry • If they are large (might not absorb the carry) ⇒ compute the full product
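The effect of omitted LS digits and of guard digits can be seen in a schoolbook MS half product that skips the low columns. This is a sketch under stated assumptions: base 10, a hypothetical function `ms_half_product`, and a `guard` parameter for the number of extra columns kept to the right of the cut. It does not implement the fallback to the full product mentioned in the last bullet.

```python
def ms_half_product(a, b, n, guard=2, d=10):
    """Approximate floor(a*b / d^n), skipping digit pairs with i + j < n - guard."""
    A = [(a // d**i) % d for i in range(n)]
    B = [(b // d**i) % d for i in range(n)]
    lowest = max(n - guard, 0)  # lowest column that is still computed
    acc = 0
    for i in range(n):
        for j in range(n):
            if i + j >= lowest:
                acc += A[i] * B[j] * d ** (i + j)
    # The skipped columns can only have added to the sum, so the result never overshoots.
    return acc // d**n


a, b, n = 87654321, 12345679, 8
exact = (a * b) // 10**n
print(exact - ms_half_product(a, b, n, guard=0))  # missing carry without guard digits
print(exact - ms_half_product(a, b, n, guard=2))  # missing carry with two guard digits
```

With more guard columns the skipped part shrinks, so the missing carry becomes both smaller and rarer; with `guard=n` every column is computed and the result is exact.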

  10. Half Product • MS or LS half product • Same speed ± linear term • Find optimal β, Speedup

  11. LS and MS products • MS products are calculated faster than the full product

  12. Middle-third product • Center Square + 2 small triangles • Karatsuba: direct recursion • 4 overlapping smaller cases • 3 are enough

  13. Squaring • Squaring short operands is about twice as fast as multiplication • Complexity recursions end at short operands • The speed ratio of short square/mult is (almost) the same as for long operands • Squaring ∉ Truncated Products
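The factor of about two for short operands comes from the symmetry of the digit products. A minimal schoolbook sketch, assuming base 10 and a hypothetical function name `schoolbook_square`, counts the digit multiplications:

```python
def schoolbook_square(a, n, d=10):
    """Return (a*a, number of digit multiplications) for an n-digit a."""
    A = [(a // d**i) % d for i in range(n)]
    acc, count = 0, 0
    for i in range(n):
        acc += A[i] * A[i] * d ** (2 * i)  # diagonal term
        count += 1
        for j in range(i + 1, n):
            # A[i]*A[j] equals A[j]*A[i]: multiply once, then double (a cheap shift/add).
            acc += 2 * (A[i] * A[j]) * d ** (i + j)
            count += 1
    return acc, count


square, mults = schoolbook_square(87654321, 8)
print(square, mults)  # 36 digit multiplications versus 64 for a general product
```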

  14. Conclusion • Fast truncated multiplication algorithms • Black-box covering • Optimal configurations • Specialized algorithms • Speed up many crypto algorithms • Constant factor (≈ 20…50% typical) • Encourage use of sub-quadratic algorithms • No speedup for FFT-based algorithms?
