Manipulating Lossless Video in the Compressed Domain
William Thies (1), Steven Hall (2), Saman Amarasinghe (2)
(1) Microsoft Research India, (2) Massachusetts Institute of Technology
ACM Multimedia, October 20, 2009
Processing in the Compressed Domain
• Multimedia archives are growing rapidly
  • Monsters vs. Aliens production: 100 TB
  • Facebook photos: 400 TB
  • YouTube: 600 TB
  • Note on slide: lossless prior to distribution
• How to analyze or modify the data?
  • Typical practice: Compressed Input → Uncompress → Process → Recompress → Compressed Output
  • Compressed-domain transformation: Compressed Input → Process → Compressed Output
Prior Work: Focus on Lossy Formats
• DCT-based spatial compression (JPEG, MPEG stills)
  • Resizing [Dugad & Ahuja 2001] [Mukherjee & Mitra 2002]
  • Edge detection [Shen & Sethi 1996]
  • Image segmentation [Feng & Jiang 2003]
  • Shearing and rotating inner blocks [Shen & Sethi 1998]
  • Linear combinations of pixels [Smith & Rowe 1996]
• DCT-based temporal compression (MPEG video)
  • Captioning [Nang, Kwon, & Hong 2000]
  • Reversal [Vasudev 1998]
  • Distortion detection [Dorai, Ratha, & Bolle 2000]
  • Transcoding [Acharya & Smith 1998]
• Almost no work on lossless formats
  • Transpose and rotation of black/white images [Shoji 1995; Misra et al. 1999]
  • Pattern matching in compressed text [Farach & Thorup 1998; Navarro 2003]
  • Modifying pitch and playback of audio [Levine 1998]
Our Focus: Regular Processing of LZ77-Compressed Data Streams
Example: Compressed-Domain Transformation (to lowercase)
Input: O O O O L A L A L A
Compressed Input: O ⟨1,3⟩ L A ⟨2,4⟩
Compressed Output: o ⟨1,3⟩ l a ⟨2,4⟩
Output: o o o o l a l a l a
A "repeat token" ⟨distance, count⟩ copies count items, starting distance items back in the decoded stream.
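The lowercase example can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the token representation (plain items for literals, `(distance, count)` tuples for repeat tokens) and the function names `decode` and `map_compressed` are assumptions made for this sketch.

```python
def decode(tokens):
    """Expand a stream of literals and (distance, count) repeat tokens."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                # Copy one item at a time so a repeat may overlap its own output.
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def map_compressed(fn, tokens):
    """Apply a 1-to-1 function to literals only; repeat tokens pass through."""
    return [tok if isinstance(tok, tuple) else fn(tok) for tok in tokens]


# Hypothetical encoding of the slide's input: O <1,3> L A <2,4>
compressed_in = ["O", (1, 3), "L", "A", (2, 4)]
compressed_out = map_compressed(str.lower, compressed_in)
```

Only three literals are touched instead of ten items; the repeat tokens stay valid because a 1-to-1 function maps equal inputs to equal outputs.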
Our Contributions
• Handle the general case
  • Produce and consume more than one data item
  • Split and join data streams
• Implement in a compiler
  • Programmer thinks in terms of uncompressed data
  • Compiler translates to work on compressed data
  • Relies on StreamIt programming language
• Evaluate on video processing tasks
  • 12 videos in Apple Animation format
  • Adjust colors or overlay two videos
  • Speedups proportional to compression ratio (median 15x)
In This Talk
• StreamIt Language
• Compressed Domain Transformation
• Experimental Evaluation
The StreamIt Language
    void->void pipeline FMRadio(float freq1, float freq2, int N) {
      add AtoD();
      add FMDemod();
      // N parallel band-pass branches, each fed a copy of the demodulated signal
      add splitjoin {
        split duplicate;
        for (int i=0; i<N; i++) {
          add pipeline {
            add LowPassFilter(freq1 + i*(freq2-freq1)/N);
            add HighPassFilter(freq2 + i*(freq2-freq1)/N);
          }
        }
        join roundrobin();
      }
      // Sum the branch outputs and play the result
      add Adder();
      add Speaker();
    }
Stream graph: AtoD → FMDemod → Duplicate splitter → (LPF1 → HPF1 | LPF2 → HPF2 | LPF3 → HPF3) → RoundRobin joiner → Adder → Speaker
The StreamIt Language
• Applications
  • DES and Serpent [PLDI 05]
  • MPEG-2 [IPDPS 06]
  • SAR, DSP benchmarks, JPEG, …
• Programmability
  • StreamIt Language (CC 02)
  • Teleport Messaging (PPOPP 05)
  • Programming Environment in Eclipse (P-PHEC 05)
• Domain Specific Optimizations
  • Linear Analysis and Optimization (PLDI 03)
  • Optimizations for bit streaming (PLDI 05)
  • Linear State Space Analysis (CASES 05)
• Architecture Specific Optimizations
  • Compiling for Communication-Exposed Architectures (ASPLOS 02 & 06, dasCMP 07)
  • Phased Scheduling (LCTES 03)
  • Cache Aware Optimization (LCTES 05)
  • Load-Balanced Rendering (Graphics Hardware 05)
  • Migrating Legacy Code to a Stream Representation Using a Dynamic Analysis (MICRO 07)
Language Primitives
• Filter: pop N, push M (example: pop 2, push 1)
• Splitter and Joiner: roundrobin(N,M) (examples: roundrobin(1,1), roundrobin(2,2))
• Model of computation also known as cyclo-static dataflow
Example: Video Compositing
• Source 1 and Source 2 feed a roundrobin(1,1) joiner
• MultiplyPixels filter: pop 2, push 1
• Output
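As a point of reference, the compositing graph can be modelled on uncompressed data in Python. This is only a sketch of the dataflow, not StreamIt code: `roundrobin_join` and `multiply_pixels` are hypothetical helper names, and pixels are represented as plain integers for illustration.

```python
def roundrobin_join(a, b, w1=1, w2=1):
    """Interleave two streams: w1 items from a, then w2 items from b, repeatedly."""
    out = []
    i = j = 0
    while i + w1 <= len(a) and j + w2 <= len(b):
        out.extend(a[i:i + w1])
        out.extend(b[j:j + w2])
        i += w1
        j += w2
    return out


def multiply_pixels(stream):
    """Filter with pop 2, push 1: multiply each adjacent pair of items."""
    return [stream[k] * stream[k + 1] for k in range(0, len(stream) - 1, 2)]


# Two tiny single-channel "frames" (made-up values for illustration).
frame1 = [1, 2, 3]
frame2 = [10, 10, 0]
composite = multiply_pixels(roundrobin_join(frame1, frame2))
```

The programmer writes the graph in this uncompressed style; the talk's contribution is translating such a graph so that it consumes and produces repeat tokens directly.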
Transforming Windows of Data
HyphenatePairs (pop 2, push 3): append a hyphen after every pair of items
Tokens are written ⟨distance, count⟩.
Input: O O O O L A L A L A
Compressed Input: O ⟨1,3⟩ L A ⟨2,4⟩
Coarsened, Expanded: O O ⟨2,2⟩ L A ⟨2,4⟩
Compressed Output: O O – ⟨3,3⟩ L A – ⟨3,6⟩
Output: O O – O O – L A – L A – L A –
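General Case: Filters
The filter pops I items and pushes O items per execution; the input repeat token has distance D and count N.
• Coarsen: D' = LCM(D, I), N' = N - (D' - D)
  • The first D' - D items of the repeat are expanded
• Translate: N'' = N' - (N' % I)
  • Output repeat token: distance D'·O/I, count N''·O/I
  • N' % I items remain after the token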
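The coarsen and translate arithmetic can be sketched as a single Python function. This is a hedged illustration under stated assumptions rather than the compiler's actual rule set: the function name `translate_repeat` and its return convention are made up here, the filter is assumed to be stateless, and the `pending` parameter (items already buffered toward the current window) is an added assumption used to keep the filter on a window boundary before the token is translated.

```python
from math import gcd


def translate_repeat(distance, count, pop, push, pending=0):
    """Translate one input repeat token for a filter with the given pop/push rates.

    Returns (expand, out_token, leftover):
      expand    - leading items to decode and run through the filter normally
      out_token - (distance, count) repeat token for the output, or None
      leftover  - trailing items to decode and run through the filter normally
    """
    coarse = distance * pop // gcd(distance, pop)   # D' = LCM(D, I)
    expand = coarse - distance                      # D' - D items are expanded
    # Assumption: expand further until the filter sits on a window boundary.
    expand += (-(pending + expand)) % pop
    if expand >= count:
        return count, None, 0                       # token too short: expand it all
    remaining = count - expand                      # N'
    leftover = remaining % pop                      # N' % I
    whole = remaining - leftover                    # N''
    if whole == 0:
        return count, None, 0
    return expand, (coarse * push // pop, whole * push // pop), leftover


# HyphenatePairs (pop 2, push 3) on O <1,3> L A <2,4>:
first = translate_repeat(1, 3, pop=2, push=3, pending=1)   # one "O" already buffered
second = translate_repeat(2, 4, pop=2, push=3, pending=0)
```

For the first token, one item is expanded (giving the window `O O`) and the rest becomes output token ⟨3,3⟩; the second token translates directly to ⟨3,6⟩, matching the slide's compressed output.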
Splitting Streams
Tokens are written ⟨distance, count⟩.
Input: L A L A L A L A L A
Compressed Input: L A ⟨2,8⟩
• roundrobin(1,1) splitter
  • Compressed Output 1: L ⟨1,4⟩
  • Compressed Output 2: A ⟨1,4⟩
• roundrobin(2,2) splitter
  • Coarsened, Expanded Input: L A L A ⟨4,6⟩
  • Compressed Output 1: L A ⟨2,4⟩
  • Compressed Output 2: L A ⟨2,2⟩
Splitting and Joining: Transpose
[Figure: step-by-step transpose of a compressed grid of O and X items using a splitter followed by a joiner, with repeat tokens carried through each stage]
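General Case: Joiners
The joiner is roundrobin(W1, W2); repeat token ⟨D1, N1⟩ arrives on the first input and ⟨D2, N2⟩ on the second (tokens are written ⟨distance, count⟩).
• Output: a single repeat token with distance D1·(W1+W2)/W1 and count N'
• Applies if D1 % W1 = 0 and D2 % W2 = 0 and D1/W1 = D2/W2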
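The joiner condition can be checked with a short Python sketch. The slide gives the output distance and the applicability condition; the output count used below (whole round-robin rounds covered by both tokens) is an assumption of this sketch, as are the function name `join_repeats`, its return convention, and the requirement that both tokens start at the beginning of a joiner round.

```python
def join_repeats(tok1, tok2, w1, w2):
    """Combine two input repeat tokens at a roundrobin(w1, w2) joiner.

    Tokens are (distance, count). Returns (out_token, rest1, rest2), where
    rest1/rest2 are the counts left over on each input, or None if the
    slide's condition does not hold.
    """
    d1, n1 = tok1
    d2, n2 = tok2
    # Condition from the slide: D1 % W1 = 0, D2 % W2 = 0, D1/W1 = D2/W2.
    if d1 % w1 != 0 or d2 % w2 != 0 or d1 // w1 != d2 // w2:
        return None
    # Assumption: emit only the whole rounds that both tokens cover.
    rounds = min(n1 // w1, n2 // w2)
    if rounds == 0:
        return None
    out_token = (d1 * (w1 + w2) // w1, rounds * (w1 + w2))
    return out_token, n1 - rounds * w1, n2 - rounds * w2


# Joining L <1,4> and A <1,4> with roundrobin(1,1): after the literals
# L and A have been interleaved, the two tokens merge into one.
merged = join_repeats((1, 4), (1, 4), 1, 1)
```

Here the merged token is ⟨2,8⟩, so the joined stream is `L A ⟨2,8⟩`, the reverse of the roundrobin(1,1) splitting example.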
Implementation
• Implemented a subset of the transformations in StreamIt
  • User can change graph connectivity and filter functions
  • [Figure: two graph shapes, a 1-to-1 filter and a 1-to-1 joiner with a 2-to-1 filter]
• Supported file format: Apple Animation (part of .MOV)
  • Standard format for interchange of lossless video
  • Compression: run-length encoding within a line + difference encoding between frames
• Emit executable plugins for MEncoder and Blender
  • Allows integration with standard video editing workflow
Experimental Methodology
• Evaluated on 12 videos drawn from Internet video, computer animation, and stock digital television content
• Two classes of transformations:
  1. Color adjustment: inverse, brightness, contrast
  2. Composite transformations: alpha-under, multiply
[Figure: sample frames illustrating alpha-under (+) and multiply (x) compositing]
Results: Execution Time
• Color adjustment: speedups of 2.5x to 471x (median 17x)
• Compositing: speedups of 1.1x to 32x (median 6.6x)
  • Compression factor was low (≤1.1x) for one of the source videos
[Figure: results plot; axis label "Compression Factor Following Re-compression"]
Results: File Bloat
• Masked-out areas not re-compressed
• Saturated colors not re-compressed
[Figure: results plot; axis label "Compression Factor Following Re-compression"]
Opportunity: Ignoring "Dead" Data
• Some pixels in composite frames do not depend on both input frames
  • Example: digital television mask (a low-performance case)
• If two data streams are multiplied, and one of them is repeatedly zero, then the repeat can be copied to the output (regardless of the values in the other stream)
  • We expect this would fix the performance of our outlier cases
  • Requires pattern matching on the stream graph
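The dead-data idea can be illustrated with a deliberately simplified Python sketch. It is not the proposed compiler analysis: the mask is assumed to be given as plain run-length pairs `(value, run_length)` rather than LZ77 repeat tokens, the other stream is assumed to be uncompressed, and the name `multiply_with_mask` and the `("run", ...)` / `("lit", ...)` output encoding are hypothetical.

```python
def multiply_with_mask(mask_runs, pixels):
    """Multiply a run-length-coded mask by an uncompressed pixel stream.

    mask_runs: list of (value, run_length) pairs
    pixels:    list of numbers, as long as the decoded mask
    Returns a list of ("run", value, n) and ("lit", value) entries.
    """
    out = []
    pos = 0
    for value, n in mask_runs:
        if value == 0:
            # Dead data: the n pixels under a zero run are never read.
            out.append(("run", 0, n))
        else:
            out.extend(("lit", value * p) for p in pixels[pos:pos + n])
        pos += n
    return out


# Made-up values: a mask that blanks the first three pixels.
result = multiply_with_mask([(0, 3), (1, 2)], [5, 6, 7, 8, 9])
```

The zero run is copied to the output in one step regardless of the pixel values beneath it, which is the shortcut the slide expects to help in the masked digital television case.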
Extension to Other File Formats
• High-efficiency mappings
  • Flic Video
  • Microsoft RLE
  • Targa (with run-length encoding)
• Medium-efficiency mappings (formats that re-arrange data by color or by byte)
  • OpenEXR
  • Planar RGB
• Low-efficiency mappings (formats that layer Huffman coding on top of LZ77, so it must be decoded before the LZ77 tokens are visible)
  • ZIP
  • GZIP
  • PNG
Conclusions
• New method for direct processing of lossless-encoded data streams
  • Relies on LZ77 compression and stream programming model
  • Supports operations on windows of data
  • Supports splitting, joining, and reordering data
• Preliminary implementation in an automatic compiler
  • Write program on uncompressed data, run on compressed data
• Good speedups in the context of video processing
  • 15x speedup (median) on color adjustment and compositing
  • Across 12 videos in Apple Animation format
• May prove useful as more content is authored in lossless formats
  • Scope for extending the technique and finding new applications
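General Case: Splitters
The splitter is roundrobin(U, V); the input repeat token has distance D and count N.
• Coarsen: D' = LCM(D, U+V), N' = N - (D' - D)
  • The first D' - D items of the repeat are expanded
• Translate: N'' = N' - (N' % (U+V))
  • Repeat token on the U output: distance D'·U/(U+V), count N''·U/(U+V)
  • Repeat token on the V output: distance D'·V/(U+V), count N''·V/(U+V)
  • N' % (U+V) items remain after the tokens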
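The splitter arithmetic follows the same pattern as the filter case and can be sketched in Python. As before, this is an illustration under assumptions, not the compiler's code: `translate_split_repeat` and its return convention are hypothetical, and the `pending` parameter (items already distributed in the current round-robin round) is an added assumption used to align the splitter to a round boundary.

```python
from math import gcd


def translate_split_repeat(distance, count, u, v, pending=0):
    """Translate one input repeat token for a roundrobin(u, v) splitter.

    Returns (expand, token_u, token_v, leftover); the tokens are
    (distance, count) pairs for the two outputs, or None if the input
    token is too short to translate.
    """
    cycle = u + v
    coarse = distance * cycle // gcd(distance, cycle)   # D' = LCM(D, U+V)
    expand = coarse - distance                          # D' - D items are expanded
    # Assumption: expand further until the splitter is at the start of a round.
    expand += (-(pending + expand)) % cycle
    if expand >= count:
        return count, None, None, 0
    remaining = count - expand                          # N'
    leftover = remaining % cycle                        # N' % (U+V)
    whole = remaining - leftover                        # N''
    if whole == 0:
        return count, None, None, 0
    rounds_back = coarse // cycle
    rounds = whole // cycle
    return expand, (rounds_back * u, rounds * u), (rounds_back * v, rounds * v), leftover


# L A <2,8> through roundrobin(1,1): the token splits cleanly in two.
even = translate_split_repeat(2, 8, 1, 1, pending=0)
# L A <2,8> through roundrobin(2,2): "L A" has already gone to the first output.
pairs = translate_split_repeat(2, 8, 2, 2, pending=2)
```

In the roundrobin(2,2) case the sketch expands two items, emits ⟨2,2⟩ on each output, and reports two leftover items; those leftovers go to the first output on the next round.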