Watermarking with Side Information

Watermarking with Side Information Information Technologies for IPR Protection

Outline • Information-theoretic basis • Channels with side information • Writing on dirty paper • Dirty paper codes

H(X,Y) H(Y) H(X) Basic Definitions H(X|Y) H(Y|X) I(X;Y)

Communication Rates • Code rates: a property of a code • Achievable rate: a property of a communication channel • Due to the noise in the channel, even the best code will have some probability of errors. • A rate is achievable if we can obtain arbitrarily low probability of errors by using codes with arbitrarily long symbol sequences and correspondingly large sets of messages. • |M|: the number of distinct messages that • can be encoded with sequences of • length L • R: the number of bits of information encoded • in each symbol

Channel Capacity • The capacity of a channel is defined as the maximum achievable rate for that channel, or more precisely, the supremum of all achievable rates. • Capacity of arbitrary channels • Any channel can be characterized by a conditional distribution Py|x(y), which gives the probability of receiving y when x is transmitted • The capacity of a channel is

n x y Capacity of AWGN Channels • x: a transmitted symbol, limited by a power constraint • n: the additive noise, drawn from a Gaussian distribution with variance • The capacity of such channel is

Viewing the Capacity of AWGN Channels Geometrically • Each code word of length L can be viewed as a point in an L-dimensional space. • The power constraint restricts all code words to lie within a sphere, centered at zero. • y lies within a sphere around x. //p365 Figure A.1

Writing on Dirty Paper • Imagine that we have a piece of paper covered with independent dirt spots of normally distributed intensity, and we write a message on it with a limited amount of ink. • The dirty paper with the message on it, is then sent to someone else, and acquire more normally distributed dirt. • If the recipient cannot distinguish between the ink and the dirt, how much information can we reliably send?

First noise source Second noise source s n Transmitter Receiver m mn u x y Dirty Paper Channels • The channel has two independent white Gaussian noise sources, generating s and n. • Before the transmitter chooses a signal to send, x, it is told the exact noise s that will be added to the signal by the first source. • The signal x is limited by a power constraint. The first noise source has no effect on the channel capacity!

A Dirty-paper Code for a Simple Channel • The first noise is restricted to produce only one of two specific noise vectors: s=s1 or s=s2. • Assume that s=s1, the transmitter must choose a vector x, limited by the power constraint p, to encode a message. • Since s1 is known, it can choose a desired value of the sum u=x+s1 instead, and then sets x=u-s1. u must lie within a ball of (LP)1/2 centered at s1. • The problem of choosing u is essentially the same as that of choosing a signal to transmit over a AWGN channel. //P137 Figure 5.14

A Dirty Paper Code for A Simple Channel • …. • For the case the first noise source generates s2, the transmitter choose a u from within a radius (LP)1/2 ball centered at s2. • If s1 and s2 are sufficiently different from each other, the transmitter can simply use two different sets of code vectors. • The transmitter decides on a value of u based on the message m, and the vector will be added by s. It than transmit x=u-s.

A Dirty Paper Code for A Simple Channel • The receiver receives y=x+s+n=u+n, and it will search for the code vector closest to y, and output the associated message. • The resulting message is reliable.And the capacity is unaffected by the first noise.

Dirty Paper Codes for More Complex Channels (I) • We would like to generate a set of code vectors such that, whatever the value of s, we will always find at least one code for each message within a distance of (LP)1/2 around it. • Finding a set U of vectors and dividing them into subsets (cosets) Um corresponding to possible messages m. • All codes in any given coset represent the same message, and they are widely distributed so that one of them will be probably close to s, whatever value s may take.

Dirty Paper Codes for More Complex Channels (I) • Given a message m and a known noise vector s, the transmitter finds the member u of Um that is closest to s.It then transmit x=u-s. • The receiver finds the member of U that is closest to the received vector y, and then identifies the received message by determining which coset contains the code vector.

Dirty Paper Codes for More Complex Channels (II) • To achieve full capacity, exactly only one code word for each message within the ball around every given value of s is desired. • We must ensure that there is always a code word for message G, that is, when one G code word is on the edge of the ball, another is always inside it.

Now we transmit x=a(u-s) where 0<a<1. • After adding s, the result is u’=au+(1-a)s, which is a weighted average of u and s. • The power constraint is now relaxed to (LP)1/2/a. • Trade-off: the closer a is to 1, the less noise is added to u, but the more strict the power constraint during search is. The closer a is to 0, the greater the noise, but the more relax the power constraint is.

Similarity between Dirty Paper Channel and Watermarking System • The first noise (dirty paper) s plays the role of the original signal. • The transmitted signal x is the added pattern. • The power constraint p expresses the fidelity constraint. • The second noise n is the distortion occurred due to normal processing or malicious tampering. Original signal Noise source co n Encoder Decoder m m’ cw w c’w If a watermarking system behaves enough like the dirty-paper channel, its maximum data payload should not depend on the cover media.

Difference between Dirty Paper Channel and Watermarking System • The original signal and subsequent noise are rarely Gaussian. • The fidelity constraint represented as a power constraint, implying that the distortion is measured by the MSE distance measure, is a poor predictor of perceptual distance.

s m Transmitter Py|x,s(y) Receiver mn y x General Form of Channels with Side Information • For each use of the channel, t, a random value s[t] is drawn from a set of possible values, S. • Before the transmitter choose a codeword to transmit, it is told what all values of s[t] will be during the entire transmission. • The transmitter then choose a sequence of L values, x, and transmits them. • The sequence received by the receiver, y[t], is a random value dependent on both x[t] and s[t]. The dependency is described by a conditional probability distribution, Py|x=x[t],s=s[t](y)

Capacity of Channels with Side Information • s: the value of the side information known to the transmitter. • u: the auxiliary variable. We can regard it as an element of a code word, u, chosen from a set U. It depends on the

Watermarking with Side Information