Accusation probabilities in Tardos codes. Antonino Simone and Boris Škorić, Eindhoven University of Technology. CWG, Dec 2010. Outline: Introduction to forensic watermarking • Collusion attacks • Aim • Attack models • Tardos scheme • Code length history • q-ary version • Properties
Accusation probabilities in Tardos codes Antonino Simone and Boris Škorić Eindhoven University of Technology CWG, Dec 2010
Outline • Introduction to forensic watermarking • Collusion attacks • Aim • Attack models • Tardos scheme • Code length history • q-ary version • Properties • New parameterization • Majority voting effect • Performance of the Tardos scheme • False accusation probability • Results & Summary
Forensic Watermarking [Diagram labels: original content, payload, WM secrets, Embedder, content with hidden payload, ATTACK, Detector, payload] Payload = some secret code identifying the recipient
"Coalition of pirates" = "detectable positions" pirate #1 1 1 1 0 1 0 1 0 0 0 0 1 1 0 1 0 1 0 1 0 1 0 1 1 #2 1 0 1 0 1 0 1 0 0 0 1 1 #3 1 1 1 0 0 0 1 1 0 0 0 1 #4 AttackedContent 1 0/1 1 0 0/1 0 1 0/1 0/1 0 0/1 1 Collusion attacks
Aim • Trace at least one pirate from detected watermark • BUT • Resist large coalition → longer code • Low probability of innocent accusation (FP) (critical!) → longer code • Low probability of missing all pirates (FN) (not critical) → longer code • AND • Limited bandwidth available for watermarking code
Attack models • Once pirates detect watermark positions, what can they do? • Restricted digit model: choice from available symbols only • Unreadable digit model: erasure allowed • Arbitrary digit model: arbitrary symbol (but not erasure) • General digit model • Example alphabet = {A,B,C,D} • [Slide annotations, placement not recoverable: "equivalent for binary symbols"; "More realistic scenario"; "Simpler to analyze"]
Code length history [Slide columns: Construction, Lower bound; the code-length formulas were not recovered] Staddon et al 2001 • Tardos 2003 • Boneh and Shaw 1998 • Chor et al 2000 • Tardos 2003 • Boneh and Shaw 1998 • Huang + Moulin; Amiri + Tardos 2009 • Notation: c0 = #pirates; n = #users; m = code length in symbols; q = alphabet size; ε1 = Prob[accuse specific innocent]; [symbol lost] = Prob[not all accused are guilty]; ε2 = False Negative prob.
q-ary Tardos scheme (2008) • Arbitrary alphabet size q • Dirichlet distribution F • Symbol-symmetric [Figure labels: m content segments; symbol biases drawn from distribution F; embedded symbols; n users; c pirates; symbols allowed; y = watermark after attack]
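The code generation step can be sketched as follows. The slide names a Dirichlet distribution F but not its parameters, so the symmetric concentration `kappa`, the function names, and the omission of any cutoff on the biases are all assumptions made for illustration.

```python
import random

def draw_bias_vector(q, kappa, rng):
    """Draw p ~ symmetric Dirichlet(kappa, ..., kappa) by normalising Gamma samples.
    kappa is a hypothetical parameter; the slide does not specify it."""
    gammas = [rng.gammavariate(kappa, 1.0) for _ in range(q)]
    total = sum(gammas)
    return [g / total for g in gammas]

def generate_code(n_users, m_segments, q, kappa, seed=0):
    """Return (biases, code): one bias vector per segment and one codeword per user.
    In segment i, each user's symbol is drawn independently with probabilities biases[i]."""
    rng = random.Random(seed)
    symbols = list(range(q))
    biases = [draw_bias_vector(q, kappa, rng) for _ in range(m_segments)]
    code = [
        [rng.choices(symbols, weights=biases[i])[0] for i in range(m_segments)]
        for _ in range(n_users)
    ]
    return biases, code

biases, code = generate_code(n_users=5, m_segments=8, q=3, kappa=0.5)
for row in code:
    print(row)
```

The bias vectors stay secret with the tracer; only the embedded symbols reach the users.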
Tardos scheme continued • Accusation: • Every user gets a score • User is accused if score > threshold • Score = sum of scores per content segment • Given that pirates have y in segment i: [per-segment score formula in terms of g1 and g0 not recovered] • Symbol-symmetric [Plot: g0(p) and g1(p) versus p]
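A minimal sketch of the accusation step. The slide's formulas were not recovered, so the code assumes the symmetric score functions of the published q-ary Tardos scheme, g1(p) = sqrt((1-p)/p) for a match with the pirate symbol and g0(p) = -sqrt(p/(1-p)) otherwise; whether the talk used exactly these forms is an assumption, and the function names are hypothetical.

```python
import math

def g1(p):
    """Score added when the user's symbol equals the pirate symbol (assumed form)."""
    return math.sqrt((1.0 - p) / p)

def g0(p):
    """Score added when the user's symbol differs from the pirate symbol (assumed form)."""
    return -math.sqrt(p / (1.0 - p))

def user_score(codeword, y, biases):
    """Sum the per-segment scores; biases[i][a] is the bias of symbol a in segment i."""
    total = 0.0
    for x_i, y_i, p_i in zip(codeword, y, biases):
        p = p_i[y_i]                      # bias of the symbol the pirates output
        total += g1(p) if x_i == y_i else g0(p)
    return total

def accused(codeword, y, biases, threshold):
    """A user is accused if the total score exceeds the threshold."""
    return user_score(codeword, y, biases) > threshold

# Tiny binary example with all biases equal to 0.5, so g1 = 1 and g0 = -1.
example_biases = [[0.5, 0.5]] * 4
example_y = [0, 1, 0, 1]
print(user_score([0, 1, 1, 1], example_y, example_biases))  # three matches, one mismatch
```

With these forms an innocent user's per-segment score has mean p·g1(p) + (1-p)·g0(p) = 0 and variance 1, which is what makes a fixed threshold meaningful.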
Properties of the Tardos scheme • Asymptotically optimal • Random code book • No framing: no risk of accusing innocent users if the coalition is larger than anticipated • F, g0 and g1 chosen 'ad hoc' (can still be improved)
Accusation probabilities • m = code length • c = #pirates • μ̃ = expected coalition score per segment • Pirates want to minimize μ̃ and lengthen the innocent tail • Curve shapes depend on: • F, g0, g1 (fixed 'a priori') • Code length • # pirates • Pirate strategy • Central Limit Theorem → asymptotically Gaussian shape (how fast?) • 2003 → 2010: innocent accusation curve shape unknown… till now! [Plot: innocent and guilty score distributions versus total score (scaled), with the threshold marked]
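The Gaussian baseline that the talk compares against can be written down directly. This sketch assumes an innocent user's per-segment score has zero mean and unit variance (true for the symmetric score functions, which the slide does not spell out), so the total score over m segments is approximated by N(0, m); the function name is hypothetical.

```python
import math

def gaussian_fp(threshold, m):
    """Gaussian estimate of the false accusation probability:
    P[N(0, m) > threshold] = 0.5 * erfc(threshold / sqrt(2 m))."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0 * m))

# Scaled thresholds (threshold / sqrt(m)) between 4 and 7 for m = 100 segments.
for scaled in (4.0, 5.0, 6.0, 7.0):
    print(scaled, gaussian_fp(scaled * math.sqrt(100), 100))
```

The estimate depends only on threshold/√m, which is why the result plots use that quantity on the horizontal axis.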
New parameterization • A new parameterization is necessary! • Kb = quantity that depends on the pirate strategy • Kb can be pre-computed • Which strategy minimizes μ̃? • Symbol-symmetric → only the symbol occurrences matter • Pirate occurrence vector: entry α = number of occurrences of symbol α in the segment among the c pirates; the entries sum to c [Plot: W(b) versus b]
Some attack definitions • Majority voting • yi = symbol that occurs most often in segment i • Interleaving attack • Prob[yi = α] = (number of pirates holding α in segment i)/c • Example: [figure not recovered]
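Both strategies are easy to state in code. In this sketch the function names are hypothetical and the tie-break in majority voting (smallest symbol wins) is an assumption, since the slide does not say how ties are resolved.

```python
import random
from collections import Counter

def majority_voting(segment_symbols):
    """Output the symbol held by the most pirates in this segment.
    Ties are broken by taking the smallest symbol (an assumption)."""
    counts = Counter(segment_symbols)
    top = max(counts.values())
    return min(s for s, k in counts.items() if k == top)

def interleaving(segment_symbols, rng):
    """Copy the symbol of a uniformly chosen pirate, so that
    Prob[y = a] = (number of pirates holding a) / c."""
    return rng.choice(segment_symbols)

rng = random.Random(1)
segment = ["A", "A", "A", "B"]          # c = 4 pirates, one segment
print(majority_voting(segment))
print(interleaving(segment, rng))
```

Applying either function segment by segment yields the attacked watermark y.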
Majority voting • Theorem: Majority voting strategy minimizes μ̃ • Proof (intuitive): • Case 1: • only 2 symbols detected [Plot: W(b) versus b for c = 19, with the best choice marked]
Majority voting • Theorem: Majority voting strategy minimizes μ̃ • Proof (intuitive): • Case 2: • more than two symbols detected • one symbol occurs more than c/2 times [Plot: W(b) versus b for c = 19, with the best choice marked]
Majority voting • Theorem: Majority voting strategy minimizes μ̃ • Proof (intuitive): • Case 3: • more than two symbols detected • all symbols occur less than c/2 times [Plot: W(b) versus b for c = 19, with the best choice marked]
Innocent curve behaviour • Motivations: • Most critical part in the Tardos scheme (FP ≈ 10^-10) • Still unknown • Unknown innocent curve → unknown real code length • Is the Gaussian approximation good?
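Approach • Fourier transform property: [formula not recovered] • Steps: • S = Σi Si, with Si the score in segment i • [symbol lost] = pdf of total score S • pdf of S = InverseFourier[…] • Compute [quantity lost] • Depends on strategy • New parameterization for attack strategy • Compute [quantity lost] • Taylor expansions (marked three times on the slide) • Trouble doing numerics (integral does not converge)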
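The Fourier step relies on a standard fact: for independent terms, the Fourier transform of the distribution of the sum is the product of the individual transforms. The toy below checks this on a small integer-valued distribution, which is an assumption made purely for illustration; the real per-segment scores are continuous and strategy-dependent, and the Taylor expansions and convergence issues mentioned on the slide are not addressed here. All names are hypothetical.

```python
import cmath

def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); enough for a toy example."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out

def pmf_of_sum_fourier(pmf, m):
    """Distribution of the sum of m i.i.d. terms: transform, raise to the power m, invert."""
    size = m * (len(pmf) - 1) + 1            # support of the sum, avoids wrap-around
    padded = list(pmf) + [0.0] * (size - len(pmf))
    transformed = [z ** m for z in dft(padded)]
    return [max(z.real, 0.0) for z in dft(transformed, inverse=True)]

def pmf_of_sum_direct(pmf, m):
    """Same distribution by repeated convolution, as a cross-check."""
    result = [1.0]
    for _ in range(m):
        new = [0.0] * (len(result) + len(pmf) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(pmf):
                new[i + j] += a * b
        result = new
    return result

toy_pmf = [0.2, 0.5, 0.3]                    # toy per-segment distribution on {0, 1, 2}
via_fourier = pmf_of_sum_fourier(toy_pmf, 6)
via_convolution = pmf_of_sum_direct(toy_pmf, 6)
print(max(abs(a - b) for a, b in zip(via_fourier, via_convolution)))
```

The tail of the resulting distribution beyond the threshold is the quantity of interest for false accusations.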
Main result: false accusation probability curve • Example: interleaving attack [Plot: log10 FP versus threshold/√m; curves: exact FP, result from Gaussian]
Main result: false accusation probability curve • Example: interleaving attack • Better than Gaussian! • Conclusion: Gaussian approximation is worse for larger q
Main result: false accusation probability curve • Example: majority voting attack [Plot: log10 FP versus threshold/√m; curves: exact FP, result from Gaussian] • FP is 70 times smaller than the Gaussian approximation in this example • But the code is 2-5% shorter than predicted by the Gaussian approximation
Summary • Results: • introduced a new parameterization of the attack strategy • majority voting minimizes μ̃ • first to compute the innocent score pdf • quantified how close FP probability is to Gaussian • sometimes better than Gaussian! • safe to use Gaussian approx • larger q → Gaussian approximation less good • Future work: • study more general attacks • different parameter choices • Thank you for your attention!