ACM Multimedia October 20, 2009

Manipulating Lossless Video in the Compressed Domain
William Thies (1), Steven Hall (2), Saman Amarasinghe (2)
(1) Microsoft Research India
(2) Massachusetts Institute of Technology

Processing in the Compressed Domain
• Multimedia archives are growing rapidly
  • Monsters vs. Aliens production: 100 TB
  • Facebook photos: 400 TB
  • YouTube: 600 TB
• How to analyze or modify the data?
  • Typical practice: Compressed Input → Uncompress → Process → Recompress → Compressed Output
  • Compressed-domain transformation: Compressed Input → Process → Compressed Output
[Callout: lossless prior to distribution]

Prior Work: Focus on Lossy Formats
• DCT-based spatial compression (JPEG, MPEG stills)
  • Resizing [Dugad & Ahuja 2001] [Mukherjee & Mitra 2002]
  • Edge detection [Shen & Sethi 1996]
  • Image segmentation [Feng & Jiang 2003]
  • Shearing and rotating inner blocks [Shen & Sethi 1998]
  • Linear combinations of pixels [Smith & Rowe 1996]
• DCT-based temporal compression (MPEG video)
  • Captioning [Nang, Kwon, & Hong 2000]
  • Reversal [Vasudev 1998]
  • Distortion detection [Dorai, Ratha, & Bolle 2000]
  • Transcoding [Acharya & Smith 1998]
• Almost no work on lossless formats
  • Transpose and rotation of black/white images [Shoji 1995; Misra et al. 1999]
  • Pattern matching in compressed text [Farach & Thorup 1998; Navarro 2003]
  • Modifying pitch and playback of audio [Levine 1998]

Our Focus: Regular Processing of LZ77-Compressed Data Streams

Example: to lowercase
Input: O O O O L A L A L A
Compressed Input: O ⟨distance 1, count 3⟩ L A ⟨distance 2, count 4⟩
• A "repeat token" ⟨distance, count⟩ copies count items, starting distance items back in the decoded stream
Compressed-domain transformation: convert the literals to lowercase and pass the repeat tokens through unchanged
Compressed Output: o ⟨distance 1, count 3⟩ l a ⟨distance 2, count 4⟩
Output: o o o o l a l a l a
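The lowercase example can be sketched in a few lines of Python. This is an illustration only, not the StreamIt compiler's implementation: the names `decode` and `map_compressed` and the representation of a repeat token as a `(distance, count)` tuple are assumptions made for this sketch.

```python
def decode(stream):
    """Expand literals and (distance, count) repeat tokens into a flat list."""
    out = []
    for tok in stream:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                # Copy one item at a time so overlapping repeats work.
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def map_compressed(stream, fn):
    """Apply a one-in, one-out function to literals; repeat tokens pass through."""
    return [tok if isinstance(tok, tuple) else fn(tok) for tok in stream]


# Hypothetical encoding of the slide's stream: O <1,3> L A <2,4>
compressed_in = ["O", (1, 3), "L", "A", (2, 4)]
compressed_out = map_compressed(compressed_in, str.lower)

print(compressed_out)
print("".join(decode(compressed_out)))
```

Only three literals are touched instead of ten decoded items, which is where a speedup proportional to the compression ratio would come from.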

Our Contributions
• Handle the general case
  • Produce and consume more than one data item
  • Split and join data streams
• Implement in a compiler
  • Programmer thinks in terms of uncompressed data
  • Compiler translates to work on compressed data
  • Relies on StreamIt programming language
• Evaluate on video processing tasks
  • 12 videos in Apple Animation format
  • Adjust colors or overlay two videos
  • Speedups proportional to compression ratio (median 15x)

In This Talk • StreamIt Language • Compressed Domain Transformation • Experimental Evaluation

The StreamIt Language
void->void pipeline FMRadio(float freq1, float freq2, int N) {
    add AtoD();
    add FMDemod();
    add splitjoin {
        split duplicate;
        for (int i=0; i<N; i++) {
            add pipeline {
                add LowPassFilter(freq1 + i*(freq2-freq1)/N);
                add HighPassFilter(freq2 + i*(freq2-freq1)/N);
            }
        }
        join roundrobin();
    }
    add Adder();
    add Speaker();
}
[Stream graph: AtoD → FMDemod → Duplicate → (LPF1 → HPF1 | LPF2 → HPF2 | LPF3 → HPF3) → RoundRobin → Adder → Speaker]

The StreamIt Language
• Applications
  • DES and Serpent [PLDI 05]
  • MPEG-2 [IPDPS 06]
  • SAR, DSP benchmarks, JPEG, …
• Programmability
  • StreamIt Language (CC 02)
  • Teleport Messaging (PPOPP 05)
  • Programming Environment in Eclipse (P-PHEC 05)
• Domain Specific Optimizations
  • Linear Analysis and Optimization (PLDI 03)
  • Optimizations for bit streaming (PLDI 05)
  • Linear State Space Analysis (CASES 05)
• Architecture Specific Optimizations
  • Compiling for Communication-Exposed Architectures (ASPLOS 02 & 06, dasCMP 07)
  • Phased Scheduling (LCTES 03)
  • Cache Aware Optimization (LCTES 05)
  • Load-Balanced Rendering (Graphics Hardware 05)
  • Migrating Legacy Code to a Stream Representation Using a Dynamic Analysis (MICRO 07)

Language Primitives
• Filter: pop N, push M (e.g. pop 2, push 1)
• Splitter and Joiner: roundrobin(N,M) (e.g. roundrobin(1,1), roundrobin(2,2))
• Model of computation also known as cyclo-static dataflow

Example: Video Compositing
• Source 1 and Source 2 feed a roundrobin(1,1) joiner
• MultiplyPixels filter: pop 2, push 1
• Output
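For reference, the compositing graph behaves roughly as follows on uncompressed data. This is a Python sketch, not StreamIt code. The helper names `roundrobin_join` and `run_filter`, and the choice to model a pixel as a single 8-bit value multiplied as `a * b // 255`, are assumptions made for illustration.

```python
def roundrobin_join(first, second, w1=1, w2=1):
    """Interleave two streams: w1 items from first, then w2 items from second."""
    out = []
    i = j = 0
    while i + w1 <= len(first) and j + w2 <= len(second):
        out.extend(first[i:i + w1])
        out.extend(second[j:j + w2])
        i += w1
        j += w2
    return out


def run_filter(items, pop, fn):
    """Fire a stateless filter repeatedly; each firing consumes `pop` items."""
    out = []
    for start in range(0, len(items) - pop + 1, pop):
        out.extend(fn(items[start:start + pop]))
    return out


def multiply_pixels(window):
    """pop 2, push 1: multiply two 8-bit values (assumed pixel model)."""
    a, b = window
    return [a * b // 255]


source1 = [255, 128, 0]
source2 = [255, 255, 77]
joined = roundrobin_join(source1, source2)       # roundrobin(1,1)
output = run_filter(joined, 2, multiply_pixels)  # pop 2, push 1
print(output)
```

The programmer writes only this uncompressed view; the later slides describe how the same graph is translated to operate on repeat tokens.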

Transforming Windows of Data
HyphenatePairs (pop 2, push 3) appends a hyphen to every pair of items
Input: O O O O L A L A L A
Output: O O – O O – L A – L A – L A –
Compressed Input: O ⟨distance 1, count 3⟩ L A ⟨distance 2, count 4⟩
Coarsened, Expanded: O O ⟨distance 2, count 2⟩ L A ⟨distance 2, count 4⟩
Compressed Output: O O – ⟨distance 3, count 3⟩ L A – ⟨distance 3, count 6⟩

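General Case: Filters
A filter pops I items and pushes O items per execution; its input holds a repeat token with count N and distance D
• Coarsen: D' = LCM(D, I), N' = N - (D' - D)
• Translate: N'' = N' - N' % I
• Output: a repeat token with count N''·O/I and distance D'·O/I, followed by the remaining N' % I items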
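The coarsen and translate steps can be tried out with a small Python sketch. It is a simplified illustration under stated assumptions, not the compiler's algorithm: repeat tokens are `(distance, count)` tuples, the filter is stateless, and the function name `apply_filter_compressed` is invented here. The sketch also keeps the whole decoded input history in memory, so only the filter executions are skipped, not the bookkeeping. It uses N'' = N' - N' % I so that the translated count is a whole number of filter executions.

```python
from math import gcd


def decode(stream):
    """Expand literals and (distance, count) repeat tokens into a flat list."""
    out = []
    for tok in stream:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def apply_filter_compressed(stream, pop, push, fn):
    hist = []     # decoded input seen so far (unbounded here for simplicity)
    pending = []  # items waiting for a complete filter execution
    out = []      # compressed output stream

    def feed(item):
        hist.append(item)
        pending.append(item)
        if len(pending) == pop:
            result = fn(tuple(pending))
            assert len(result) == push
            out.extend(result)
            pending.clear()

    for tok in stream:
        if not isinstance(tok, tuple):
            feed(tok)
            continue
        dist, count = tok
        coarse = dist * pop // gcd(dist, pop)   # D' = LCM(D, I)
        expanded = 0
        # Expand items until the filter is on an execution boundary and
        # at least D' - D items of the repeat have been expanded.
        while count > 0 and (pending or expanded < coarse - dist):
            feed(hist[-dist])
            count -= 1
            expanded += 1
        whole = count - count % pop             # N'' = N' - N' % I
        if whole > 0:
            out.append((coarse * push // pop, whole * push // pop))
            for _ in range(whole):              # keep the history in sync
                hist.append(hist[-dist])
            count -= whole
        for _ in range(count):                  # leftover N' % I items
            feed(hist[-dist])
    return out


def hyphenate_pairs(window):
    return [window[0], window[1], "-"]


compressed_in = ["O", (1, 3), "L", "A", (2, 4)]
hyphenated = apply_filter_compressed(compressed_in, 2, 3, hyphenate_pairs)
print(hyphenated)
print("".join(decode(hyphenated)))
```

On the HyphenatePairs input, the filter function runs twice instead of five times; the remaining output is covered by the two translated repeat tokens.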

Splitting Streams: roundrobin(1,1) splitter
Input: L A L A L A L A L A
Compressed Input: L A ⟨distance 2, count 8⟩
Compressed Output (first): L ⟨distance 1, count 4⟩
Compressed Output (second): A ⟨distance 1, count 4⟩

Splitting Streams: roundrobin(2,2) splitter
Input: L A L A L A L A L A
Compressed Input: L A ⟨distance 2, count 8⟩
Coarsened, Expanded Input: L A L A ⟨distance 4, count 6⟩
Compressed Output (first): L A ⟨distance 2, count 4⟩
Compressed Output (second): L A ⟨distance 2, count 2⟩
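A roundrobin splitter can be sketched in Python along the same lines. The function name `split_compressed` and the `(distance, count)` tuple representation are assumptions for this sketch. As a simplification, leftover items that do not fill a whole splitter cycle are expanded into literals, so the token layout may differ from the slide even though the decoded outputs are the same.

```python
from math import gcd


def decode(stream):
    """Expand literals and (distance, count) repeat tokens into a flat list."""
    out = []
    for tok in stream:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def split_compressed(stream, u, v):
    """roundrobin(u, v) splitter over a stream of literals and repeat tokens."""
    hist = []          # decoded input seen so far (unbounded for simplicity)
    outs = ([], [])    # compressed first and second outputs
    cycle = u + v

    def route(item):
        phase = len(hist) % cycle
        hist.append(item)
        outs[0 if phase < u else 1].append(item)

    for tok in stream:
        if not isinstance(tok, tuple):
            route(tok)
            continue
        dist, count = tok
        coarse = dist * cycle // gcd(dist, cycle)   # D' = LCM(D, U+V)
        expanded = 0
        # Expand until a splitter cycle starts and D' - D items are expanded.
        while count > 0 and (len(hist) % cycle or expanded < coarse - dist):
            route(hist[-dist])
            count -= 1
            expanded += 1
        whole = count - count % cycle               # N'' = N' - N' % (U+V)
        if whole > 0:
            outs[0].append((coarse * u // cycle, whole * u // cycle))
            outs[1].append((coarse * v // cycle, whole * v // cycle))
            for _ in range(whole):                  # keep the history in sync
                hist.append(hist[-dist])
            count -= whole
        for _ in range(count):                      # leftover items as literals
            route(hist[-dist])
    return outs


compressed_in = ["L", "A", (2, 8)]
first_11, second_11 = split_compressed(compressed_in, 1, 1)
first_22, second_22 = split_compressed(compressed_in, 2, 2)
print(first_11, second_11)
print(first_22, second_22)
```

For roundrobin(2,2), the sketch emits `L A (2, 2) L A` on the first output where the slide shows a single token of count 4; merging the trailing literals back into the token is an optimization this sketch leaves out.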

Splitting and Joining: Transpose
[Figure: three-step animation of a transpose expressed with a splitter and a joiner operating on repeat tokens]

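General Case: Joiners
A roundrobin(W1, W2) joiner reads a repeat token with count N1 and distance D1 on its first input, and one with count N2 and distance D2 on its second input
• If D1 % W1 = 0 and D2 % W2 = 0 and D1/W1 = D2/W2, the joiner emits a repeat token with count N' and distance D1·(W1+W2)/W1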
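The joiner condition can be illustrated with a short Python sketch. The slide does not spell out N', so the count used here (the number of whole joiner cycles both tokens can cover, times W1+W2) is an assumption of this sketch, as are the function name `join_repeat_tokens` and the requirement that the joiner is at the start of a cycle when both tokens arrive.

```python
def decode(stream):
    """Expand literals and (distance, count) repeat tokens into a flat list."""
    out = []
    for tok in stream:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def join_repeat_tokens(tok1, tok2, w1, w2):
    """Combine two (distance, count) tokens at a roundrobin(w1, w2) joiner.

    Returns (output_token, rest_of_tok1, rest_of_tok2), or None when the
    tokens cannot be merged and would have to be expanded instead.
    """
    d1, n1 = tok1
    d2, n2 = tok2
    if d1 % w1 or d2 % w2 or d1 // w1 != d2 // w2:
        return None
    cycles = min(n1 // w1, n2 // w2)   # assumed definition of N' / (W1+W2)
    if cycles == 0:
        return None
    out_tok = (d1 * (w1 + w2) // w1, cycles * (w1 + w2))
    return out_tok, (d1, n1 - cycles * w1), (d2, n2 - cycles * w2)


# Inputs A <1,3> and B <1,3>: the literals join to A B, then the tokens merge.
merged = join_repeat_tokens((1, 3), (1, 3), 1, 1)
joined_stream = ["A", "B", merged[0]]
print(merged)
print("".join(decode(joined_stream)))
```

When the check fails, for example with distances 1 and 2 under roundrobin(1,1), the function returns `None` and a real implementation would need to fall back to another strategy such as expanding one of the tokens.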

Implementation
• Implemented subset of transformations in StreamIt
  • User can change graph connectivity + filter functions
• Supported file format: Apple Animation (part of .MOV)
  • Standard format for interchange of lossless video
  • Compression: run-length encoding within a line + difference encoding between frames
• Emit executable plugins for MEncoder and Blender
  • Allows integration with standard video editing workflow
[Figure labels: 1-to-1 filter; 1-to-1 joiner with 2-to-1 filter]

Experimental Methodology
• Evaluated on 12 videos drawn from Internet video, computer animation, and stock digital television content
• Two classes of transformations:
  1. Color adjustment: inverse, brightness, contrast
  2. Composite transformations: alpha-under, multiply
[Figures: alpha-under example (frame + frame = composite); multiply example (frame x frame = composite)]

Results: Execution Time
• Color adjustment: 2.5x to 471x (median 17x)
• Compositing: 1.1x to 32x (median 6.6x)
• Compression factor was low (≤1.1x) for one of the source videos
[Plot label: Compression Factor Following Re-compression]

Results: File Bloat
• Masked-out areas not re-compressed
• Saturated colors not re-compressed
[Plot label: Compression Factor Following Re-compression]

Opportunity: Ignoring “Dead” Data
• Some pixels in composite frames do not depend on both input frames
  • Example: digital television mask (a low-performance case)
• If two data streams are multiplied, and one of them is repeatedly zero, then the repeat can be copied to the output (regardless of the values in the other stream)
• We expect this would fix performance of our outlier cases
• Requires pattern matching on stream graph

Extension to Other File Formats
• High-efficiency mappings
  • Flic Video
  • Microsoft RLE
  • Targa (with run-length encoding)
• Medium-efficiency mappings
  • OpenEXR
  • Planar RGB
  → Re-arranges data by color or by byte
• Low-efficiency mappings
  • ZIP
  • GZIP
  • PNG
  → Performs Huffman coding prior to LZ77

Conclusions
• New method for direct processing of lossless-encoded data streams
  • Relies on LZ77 compression and stream programming model
  • Supports operations on windows of data
  • Supports splitting, joining, and reordering data
• Preliminary implementation in an automatic compiler
  • Write program on uncompressed data, run on compressed data
• Good speedups in the context of video processing
  • 15x speedup (median) on color adjustment and compositing
  • Across 12 videos in Apple Animation format
• May prove useful as more content is authored in lossless formats
  • Scope for extending technique, finding new applications

Extra Slides

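General Case: Splitters
A roundrobin(U, V) splitter sends U items to its first output and V items to its second; its input holds a repeat token with count N and distance D
• Coarsen: D' = LCM(D, U+V), N' = N - (D' - D)
• Translate: N'' = N' - N' % (U+V)
• First output: a repeat token with count N''·U/(U+V) and distance D'·U/(U+V)
• Second output: a repeat token with count N''·V/(U+V) and distance D'·V/(U+V)
• The remaining N' % (U+V) items follow the tokens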
