Dualities in Digital Audio Compression and Digital Audio Watermarking. Yi-Wen Liu, Postdoc/Research Engineer, Boys Town National Research Hospital, Omaha, NE
Keywords: masking, floating-point quantization, noise shaping, water filling, channel coding, spread spectrum, writing on dirty paper. June 28, 2007
High-fidelity AAC encoding at 128 kbps (stereo). [Remark] Uncompressed CD audio: 2 channels × 44,100 samples/s × 16 bits ≈ 1.41 Mbps
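The remark above implies a compression ratio that is easy to work out. The short sketch below does the arithmetic; it assumes "kbps" means 1000 bits per second, and the variable names are illustrative only.

```python
# Uncompressed PCM bit rate versus the 128 kbps AAC target from the slide.
CHANNELS = 2             # stereo
SAMPLE_RATE_HZ = 44_100  # samples per second per channel
BITS_PER_SAMPLE = 16
AAC_RATE_BPS = 128_000   # assumption: 1 kbps = 1000 bits per second

pcm_rate_bps = CHANNELS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE
compression_ratio = pcm_rate_bps / AAC_RATE_BPS
coded_bits_per_sample = AAC_RATE_BPS / (CHANNELS * SAMPLE_RATE_HZ)

print(f"PCM rate: {pcm_rate_bps / 1e6:.2f} Mbps")
print(f"Compression ratio: {compression_ratio:.1f} : 1")
print(f"Average coded bits per sample: {coded_bits_per_sample:.2f}")
```

The encoder therefore has roughly one and a half bits per sample on average, which is why the later slides ask where quantization noise can be hidden.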
Masking: Presence of a tone* increases the threshold in its vicinity** [Figure labels: dz, dM] • 1.0 Bark ≈ 1.3 mm on the basilar membrane • Spreading function resembles the envelope of travelling waves. • * Tonal vs. noise masker • ** Also check: forward and backward masking.
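To make the Bark scale and the spreading function concrete, here is a minimal sketch. It is not taken from the slides: it uses the Zwicker-Terhardt approximation for the Hz-to-Bark mapping and a Schroeder-style spreading function, both common textbook choices. Reading `dz` as the Bark distance from masker to probe is an assumption, and `offset_db` is a hypothetical masker-to-threshold offset chosen only for illustration.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """Zwicker-Terhardt approximation of the critical-band rate in Bark."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def schroeder_spreading_db(dz: float) -> float:
    """Schroeder-style spreading in dB; dz = probe minus masker position (Bark)."""
    x = dz + 0.474
    return 15.81 + 7.5 * x - 17.5 * math.sqrt(1.0 + x * x)


def masked_threshold_db(masker_hz, masker_level_db, probe_hz, offset_db=10.0):
    """Masked threshold at probe_hz caused by a single masker.

    offset_db is a hypothetical constant; real models vary it with the
    tonal or noise-like character of the masker.
    """
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    return masker_level_db + schroeder_spreading_db(dz) - offset_db


for probe in (500.0, 900.0, 1000.0, 1100.0, 2000.0):
    thr = masked_threshold_db(1000.0, 70.0, probe)
    print(f"probe {probe:6.0f} Hz -> masked threshold {thr:6.1f} dB")
```

The function falls off more steeply below the masker than above it, which matches the asymmetric shape usually drawn for tone masking.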
Removing masked components: Does it achieve sufficient compression?
The noise shaping principle: the quantization error X(q)[k] − X[k] is to be masked. [Encoder block diagram: x[n] → Window → Modified Discrete Cosine Transform → X[k] → Quantization → X(q)[k] → Huffman coding → Bit packing → 01001011…; Psycho-acoustics → Bit allocation → Parameters for the quantizer]
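The transform and quantization stages of the diagram can be sketched as follows. This is only an illustration under stated assumptions, not the AAC filterbank: it uses a direct matrix MDCT with a sine window, a single uniform step size `step` for all coefficients (a real coder would pick step sizes per band from the masking threshold), and it omits Huffman coding and bit packing. The function names are hypothetical.

```python
import numpy as np


def mdct_basis(N):
    """N x 2N matrix of MDCT cosines for a block of 2N samples."""
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))


def encode_decode(x, N=64, step=None):
    """Window, MDCT, optional uniform quantization, inverse MDCT, overlap-add.

    len(x) must be a multiple of N. With step=None the output equals x up to
    rounding error (time-domain aliasing cancels between adjacent blocks).
    """
    C, w = mdct_basis(N), sine_window(N)
    padded = np.concatenate([np.zeros(N), x, np.zeros(N)])
    y = np.zeros_like(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        X = C @ (w * padded[start:start + 2 * N])             # X[k]
        Xq = X if step is None else step * np.round(X / step)  # X(q)[k]
        y[start:start + 2 * N] += (2.0 / N) * w * (C.T @ Xq)   # overlap-add
    return y[N:N + len(x)]


rng = np.random.default_rng(0)
x = rng.standard_normal(512)
print("max error, no quantization:", np.max(np.abs(encode_decode(x) - x)))
print("max error, step = 0.5:     ", np.max(np.abs(encode_decode(x, step=0.5) - x)))
```

Because the error is introduced on X[k], its spectral shape is controlled coefficient by coefficient, which is what lets a coder place the noise under the masking threshold.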
Band-wise floating-point (FP) quantization: bit allocation minimizes the squared error weighted by the signal-to-mask ratio (SMR).
Jack and Jill went up the hill to fetch a pail of water… [Water-filling figure; labels: R_b, SMR/6 dB, N_b] • Fixed- vs. variable-rate implementation. • What if R_b is fractional? Negative?
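A water-filling allocation of this kind might be sketched as below. The interpretation is an assumption: R_b is taken as the number of bits per coefficient in band b, N_b as the number of coefficients in that band, and each bit is taken to buy about 6.02 dB of signal-to-noise ratio. Under those assumptions the unconstrained optimum is R_b = SMR_b / 6.02 dB plus a constant, and negative values are clamped to zero before the constant is re-solved by bisection. Fractional R_b are left as they are; a real coder would still have to round them.

```python
import numpy as np

DB_PER_BIT = 20 * np.log10(2)  # about 6.02 dB of SNR per bit


def allocate_bits(smr_db, n_coeffs, budget_bits, iters=60):
    """Water-filling bit allocation (illustrative sketch).

    smr_db      : signal-to-mask ratio per band, in dB
    n_coeffs    : number of coefficients per band (assumed meaning of N_b)
    budget_bits : total bits available for the block
    Returns bits per coefficient for each band (assumed meaning of R_b).
    """
    smr_db = np.asarray(smr_db, dtype=float)
    n_coeffs = np.asarray(n_coeffs, dtype=float)
    demand = smr_db / DB_PER_BIT

    def bits_used(level):
        return np.sum(n_coeffs * np.maximum(0.0, demand - level))

    # The water level is bracketed: at `hi` no bits are used, at `lo` too many.
    lo = demand.min() - budget_bits / n_coeffs.sum() - 1.0
    hi = demand.max()
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bits_used(mid) > budget_bits:
            lo = mid
        else:
            hi = mid
    return np.maximum(0.0, demand - hi)


smr = [18.0, 12.0, 6.0, -3.0]  # the last band is already below the mask
n_b = [4, 4, 8, 16]
r_b = allocate_bits(smr, n_b, budget_bits=20)
for s, n, r in zip(smr, n_b, r_b):
    print(f"SMR {s:5.1f} dB, N_b {n:2d} -> R_b {r:.2f} bits/coefficient")
print("total bits:", float(np.dot(n_b, r_b)))
```

Fixing `budget_bits` per block corresponds to a fixed-rate implementation; fixing the water level instead and letting the total vary would be the variable-rate counterpart.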
Image source: Bureau of Engraving and Printing, United States Department of the Treasury http://www.moneyfactory.com/
Applications of digital watermarks • Copyright protection • Copy protection (Philips Research, 2000) • Transaction tracking • Prohibiting upload of pirated materials (YouTube/Google, 2007) • Broadcast monitoring
Broadcast monitoring: the “portable people meter” (by Arbitron Inc., NYSE: ARB) • Programs (and commercials) are embedded with acoustical watermarks • A wearable device picks up the watermarks and identifies programs • System tested in Houston. Image source: http://www.arbitron.com/portable_people_meters/home.htm
Arbitron’s technology: pseudo-random watermarks spread below the masking threshold. [Figure labels: signal spectrum, watermark; signal, mark, signal + mark] Kirovski & Malvar (2003), “Spread spectrum watermarking of audio signals,” IEEE Trans. Signal Processing.
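The basic spread-spectrum idea can be illustrated with a toy example. The sketch below is not Arbitron's system or the method of the cited paper: it assumes a white Gaussian host, a flat embedding strength instead of a psychoacoustic masking model, a detector that shares the pseudo-random chips with the embedder, and additive noise as the only attack. All names and parameter values are illustrative.

```python
import numpy as np


def embed(host, bits, chips, strength):
    """Add +strength*chips for a 1 bit and -strength*chips for a 0 bit."""
    marked = host.copy()
    n = chips.shape[1]
    for i, b in enumerate(bits):
        sign = 1.0 if b else -1.0
        marked[i * n:(i + 1) * n] += strength * sign * chips[i]
    return marked


def detect(received, chips):
    """Correlate each segment with its chip sequence and take the sign."""
    n = chips.shape[1]
    return [int(np.dot(received[i * n:(i + 1) * n], chips[i]) > 0)
            for i in range(chips.shape[0])]


rng = np.random.default_rng(1)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
n_chips = 4096                                   # chips per embedded bit
chips = rng.choice([-1.0, 1.0], size=(len(bits), n_chips))
host = rng.standard_normal(len(bits) * n_chips)  # stand-in for the audio

marked = embed(host, bits, chips, strength=0.2)  # mark about 14 dB below host
attacked = marked + 0.5 * rng.standard_normal(marked.shape)
decoded = detect(attacked, chips)
print("embedded:", bits)
print("decoded: ", decoded)
```

The host itself acts as strong interference at the detector, which is why many chips per bit are needed; the later slides on dirty-paper coding address exactly this cost.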
Noise is Signal and Signal is Noise. [Block diagram: B → ENC → W; X = S + W; “Attack”: Y = X + N; Y → DEC → B*] B, B*: bit streams; W: watermark; S: original signal; X: watermarked signal; N: noise; Y: corrupted copy of X. Information capacity of the discrete-time Gaussian channel (Shannon 1948): $C = \tfrac{1}{2}\log_2(1 + \mathrm{SNR})$ bits per sample.
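To get a feel for the numbers, the capacity formula can be evaluated directly. The sketch below converts bits per sample to bits per second under the assumption of one channel use per audio sample at 44.1 kHz; the SNR values are arbitrary examples, not figures from the talk.

```python
import math


def gaussian_capacity(snr_linear: float) -> float:
    """Shannon capacity of the discrete-time Gaussian channel, bits per sample."""
    return 0.5 * math.log2(1.0 + snr_linear)


SAMPLE_RATE_HZ = 44_100  # assumption: one channel use per audio sample

for snr_db in (-30, -20, -10, 0):
    c = gaussian_capacity(10 ** (snr_db / 10))
    print(f"SNR {snr_db:4d} dB -> {c:.4f} bits/sample, "
          f"about {c * SAMPLE_RATE_HZ:8.1f} bits/s")
```

Even at strongly negative SNR the capacity is nonzero, which is what makes an inaudible watermark feasible in principle.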
Communication with random “state information” known at the encoder (Gel’fand & Pinsker, 1980). [Block diagram: B → ENC → W (watermark); S (music) is known to ENC; X = S + W; “Attack”: Y = X + N; Y → DEC → B*. Figure labels: $2^{nC}$ files, $2^{nI(U;S)}$ sequences] • Theorem (Costa, 1983, “Writing on dirty paper”; Cohen & Lapidoth, 2002): If N is Gaussian i.i.d. and S is ergodic, then capacity is as high as if S were also known to the decoder: $C = \max_{p(u,w|s)} \{ I(U;Y) - I(U;S) \} = \tfrac{1}{2}\log_2\left(1 + \langle W^2\rangle / \langle N^2\rangle\right)$
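The practical meaning of the theorem is easiest to see by comparing it with a decoder that simply treats the music as extra noise. The sketch below makes that comparison; the power values are made-up examples, and the scaling factor α = P_W / (P_W + P_N) is the one used in Costa's construction U = W + αS, quoted here from the standard statement of that result rather than from the slides.

```python
import math


def dirty_paper_comparison(p_w: float, p_s: float, p_n: float):
    """Compare watermark capacities in bits per sample.

    p_w : watermark power <W^2>
    p_s : host (music) power <S^2>
    p_n : attack noise power <N^2>
    Returns (host treated as noise, Costa capacity, Costa's alpha).
    """
    host_as_noise = 0.5 * math.log2(1.0 + p_w / (p_s + p_n))
    costa = 0.5 * math.log2(1.0 + p_w / p_n)
    alpha = p_w / (p_w + p_n)
    return host_as_noise, costa, alpha


# Example: watermark 20 dB below the music, attack noise as strong as the mark.
blind, costa, alpha = dirty_paper_comparison(p_w=0.01, p_s=1.0, p_n=0.01)
print(f"host treated as noise: {blind:.4f} bits/sample")
print(f"writing on dirty paper: {costa:.4f} bits/sample")
print(f"Costa's alpha: {alpha:.2f}")
```

The host power does not appear in the Costa capacity at all, which is the sense in which the interference from S is removed.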
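Quantization Index Modulation (Chen & Wornell, 1999, 2001; Chou et al., 2001). [Figure: lattice points labelled “= 0” and “= 1”; the input s_in is quantized to s_0 or s_1] Δs: step size. Should be large (for robustness to noise) but not too large (so the embedding distortion stays inaudible). Image acquired from http://cst-www.nrl.navy.mil/lattice/struk/pnma.html; Moulin et al. (2005), “Data hiding codes”.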
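A scalar version of quantization index modulation can be sketched in a few lines. This is a simplified one-dimensional illustration, not the lattice construction shown in the figure: bit 0 snaps each sample to a multiple of `delta`, bit 1 snaps it to a multiple of `delta` shifted by `delta / 2`, and the decoder picks whichever grid is closer. The names `qim_embed` and `qim_decode` and the parameter values are assumptions made for the example.

```python
import numpy as np


def qim_embed(s, bits, delta):
    """Quantize each sample onto the grid selected by its bit."""
    offsets = np.where(np.asarray(bits) == 1, delta / 2.0, 0.0)
    return delta * np.round((s - offsets) / delta) + offsets


def qim_decode(y, delta):
    """Return 1 where the shifted grid is closer to y than the unshifted grid."""
    d0 = np.abs(y - delta * np.round(y / delta))
    d1 = np.abs(y - (delta * np.round((y - delta / 2.0) / delta) + delta / 2.0))
    return (d1 < d0).astype(int)


rng = np.random.default_rng(0)
delta = 0.5
s_in = rng.standard_normal(16)           # stand-in for host samples
bits = rng.integers(0, 2, size=16)

x = qim_embed(s_in, bits, delta)
y = x + rng.uniform(-0.2 * delta, 0.2 * delta, size=x.shape)  # mild attack

print("max embedding distortion:", np.max(np.abs(x - s_in)))
print("bits recovered:", bool(np.array_equal(qim_decode(y, delta), bits)))
```

The trade-off in the step size is visible here: the embedding distortion is at most `delta / 2`, while decoding stays correct as long as the attack moves a sample by less than `delta / 4`.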
We just scratched the surface… • What if attack is smarter than additive noise? • Linear scaling & filtering • Audio compression • Time warping, pitch shifting. • Collusion attack, sensitivity attack, … • In general, it’s a game between encoder and attacker(s). • Encoder’s advantage: going first; to use psychoacoustics to the fullest extent. • Encoder’s disadvantage: going first. Attacker(s) can attempt to tamper or even erase watermark.
Acknowledgement • Boys Town National Research Hospital: Douglas Keefe, Steve Neely • Stanford University, Music: Julius Smith, Marina Bosi • Stanford University, EE: Tom Cover • MBI