320 likes | 526 Views
Mid-level Representations for Images and Videos. Sylvain Paris, Adobe with Jiawen Chen , Michael Cohen, Frédo Durand, Wojciech Matusik , Jue Wang . Making good videos is hard, much harder than taking good photos. Better tools will help.
E N D
Mid-level Representationsfor Images and Videos Sylvain Paris, Adobe withJiawenChen,Michael Cohen, Frédo Durand, WojciechMatusik, Jue Wang
Making good videos is hard, much harder than taking good photos.Better tools will help.
Today's focus:How to represent video datato build better algorithms.
Outline • Mid-level representations • Bilateral grid • Video mesh
Low-level Representation R=255 G=50 B=35
High-level Representation building car street
Mid-level Information texture smooth edge
Mid-level Representations • Data are indexed by (x,y,t,r,g,b) • (x,y,t): location. For video, it's a grid. • (r,g,b): color. That is where the information is. • Many representations exist: HSV, Lab, [Kim 09]... • We are going to look at (x,y,t,r,g,b).
Outline • Mid-level representations • Bilateral grid • Video mesh
Example: the Edge-aware Brush • Classical paint brush • Ignores edges • Our edge-aware brush • Respects edges Stroke with classical brush Input image Stroke with bilateral brush
Bilateral Grid – Definition y Standard 2D Image • Bilateral grid = 3D array • x and y correspond to pixel position • z corresponds to pixel intensity • Euclidean distance accountsfor edges • space distance (x,y) andintensity distance (z) • Grid can be coarsely sampled • E.g., 70 x 70 x 10 for an8 megapixel image x Bilateral Grid
Bilateral grid enables aggressive downsampling Extra dimension preserves edges 2D vs. Bilateral Grid Downsampling y • Nearest neighbor • arbitrarily bright or dark • Bicubic • intermediate value not in original x Downsampling in 2D
Bilateral grid enables aggressive downsampling Extra dimension preserves edges 2D vs. Bilateral Grid Downsampling y x Downsampling in 2D Bilateral Grid
Bilateral Grid Painting • When mouse is held down, paint only at intensity level of initial mouse click Input image Bilateral Grid
Bilateral Grid Painting • Edge-aware brush used to change hue
Bilateral Filter [Tomasi 98] • Smooth image except across strong edges • Ubiquitous in computational photography [Oh 01, Durand 02, Eisemann 04, Petschnigg 04, Bennett 05, Bae 06, Fattal 07, Kopf 07, …] FilteredOutput Input
intensity space Bilateral Filter on the Bilateral Grid Image scanline BilateralGrid
intensity intensity space space Bilateral Filter on the Bilateral Grid Image scanline BilateralGrid Gaussian blur grid values Slice: query gridwith input image Filtered scanline
Model Input Output Many Operations and Applications • Local histogram equalization • Interactive tone mapping • Video abstraction [Winnemoller 06, DeCarlo 02] • Segmentation [Paris 07] • Photographic style transfer [Bae 06]
Works great frame by frame if no noise. What if there is noise?Not everything can be done frame by frame, e.g. segmentation.
Frame by Frame • Consider only the current frame. • Do not exploit all the available information. unused unused
Volumetric, Off-line • Good if we know the entire sequence in advance. • Time is the same as x and y. • Use a time Gaussian.
Causal, On-line • Good for real time • recursive algorithm • Formal equivalence between exp. decay and Gaussian[ECCV 08] fixed unused
Summary on Bilateral Grid • Image data as a coarse volumetric grid. • Edges are naturally represented. • Standard operations become edge-aware for free. • Fast: HD video in real time • Grid good up to 5D/6D, e.g. (x,y,t,r,g,b) • See work by Adams et al. at Stanfordfor higher dimensions.
Outline • Mid-level representations • Bilateral grid • Video mesh
Video Meshes • Coarse data: • motion, depth • represented by a triangle mesh • Finely detailed data: • colors, boundaries • represented by RGBA textures
Conclusion • Mid-level cues carry a lot of information • Can be "embedded" in image representation • Fast, easy-to-adaptalgorithms • Many existing structures • Unwrap mosaics, locally rigid triangles, edge-aware wavelets àtrous, video front…