Explore the process of detecting, modifying, and replacing video objects in post-TV production software. Learn about the challenges and solutions involved in enhancing video quality and altering content for various industries.
Video Object Tracking and Replacement for post TV production LYU0303 Final Year Project Fall 2003
Outline • Project Introduction • Basic parts of the proposed system • Working principles of individual parts • Future Work • Q&A
Introduction • Post-TV production software processes a video clip so that either the video quality improves or the content changes. • Reasons for changing the content of a video • Reducing video production cost • Performing dangerous actions • Producing effects that are impossible in reality • Especially important for the advertising and movie-making industries.
Introduction • Things appearing in a video are often separate from one another (e.g. books, boxes, humans); these are known as “video objects”. • If video objects are to be modified or replaced by something else, they must first be detected in the original video clip. • The problem is: HOW can they be detected?
Difficulties to be overcome • Video objects are mostly three-dimensional before they are recorded to video clips. • Videos are sequences of consecutive two-dimensional images. • Humans have no problem recognizing video objects in a video clip. • Can computers do the same?
Possible solutions… • Computers cannot perform object detection directly because… • Images are processed byte by byte • They have no prior knowledge of the video objects to be detected • Results are definite; there is no fuzzy logic. • Although computers cannot perform object detection directly, they can be programmed to do it indirectly.
Possible solutions… • Humans recognize an object mainly by looking at its shape and color.
Possible solutions… • If a computer can do similar things, then it can perform simple object detection. • The proposed post-TV production system includes several parts that guide the computer to deduce the presence of a video object step by step.
Basic parts of the proposed system • Simple bitmap reader/writer • RGB/HSV converter • Edge detector • Edge equation finder • Equation processor • Texture mapper
RGB/HSV converter • Human eyes are more sensitive to brightness than to the true color components of an object. • It is therefore more reasonable to represent colors in the HSV (Hue, Saturation, Value (brightness)) model. • After processing, the image is converted back to RGB and saved to disk.
RGB/HSV converter [Figure: RGB to HSV; HSV to RGB]
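The conversion step can be sketched with Python's standard `colorsys` module. This is only an illustration, not the project's converter: the function names are hypothetical, and scaling every HSV channel to 0..255 is an assumption inferred from the example values shown on the edge detector slide.

```python
import colorsys

def rgb_to_hsv255(r, g, b):
    """Convert 8-bit RGB to HSV, with each HSV channel scaled to 0..255 (assumed range)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 255), round(s * 255), round(v * 255)

def hsv255_to_rgb(h, s, v):
    """Convert HSV (each channel 0..255) back to 8-bit RGB."""
    r, g, b = colorsys.hsv_to_rgb(h / 255.0, s / 255.0, v / 255.0)
    return round(r * 255), round(g * 255), round(b * 255)

red_hsv = rgb_to_hsv255(255, 0, 0)     # pure red
black_hsv = rgb_to_hsv255(0, 0, 0)     # black
print(red_hsv, black_hsv, hsv255_to_rgb(*red_hsv))
```

Because the channels are rounded to integers, a round trip through HSV may not reproduce every RGB value exactly.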
Edge detector • Usually, a sharp change in hue, saturation or brightness indicates a boundary line. [Figure labels: HSV: (0,0,0); HSV: (0,255,255)]
Edge detector [Figure: before edge highlighting; after edge highlighting]
Edge detector • It produces a list of points considered to be “edge points” for further processing. • Scanning is both horizontal and vertical. • During edge point finding, a two-dimensional array records the points. • This allows duplicate edge points to be removed.
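A minimal sketch of the scanning idea, assuming the image is a list of rows of (h, s, v) tuples in 0..255 and that a "sharp change" means any channel differing by more than a fixed threshold. The function name, the threshold value and the pixel layout are all assumptions for illustration, and the circular nature of hue is ignored for simplicity.

```python
def find_edge_points(image, threshold=64):
    """Return (x, y) edge points found by horizontal and vertical scanning."""
    height, width = len(image), len(image[0])
    # The 2-D array marks each pixel at most once, so a point found by
    # both scans is not reported twice.
    is_edge = [[False] * width for _ in range(height)]

    def differs(p, q):
        return any(abs(a - b) > threshold for a, b in zip(p, q))

    # Horizontal scan: compare each pixel with its left neighbour.
    for y in range(height):
        for x in range(1, width):
            if differs(image[y][x - 1], image[y][x]):
                is_edge[y][x] = True

    # Vertical scan: compare each pixel with the neighbour above it.
    for y in range(1, height):
        for x in range(width):
            if differs(image[y - 1][x], image[y][x]):
                is_edge[y][x] = True

    return [(x, y) for y in range(height) for x in range(width) if is_edge[y][x]]

BLACK, RED = (0, 0, 0), (0, 255, 255)
# 4x4 black image with a red block in the lower-right 2x2 corner.
img = [[RED if x >= 2 and y >= 2 else BLACK for x in range(4)] for y in range(4)]
print(find_edge_points(img))
```

In the example, pixel (2, 2) is detected by both scans but appears only once in the result.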
Edge detector • Since there may be multiple parts in a single object, the input video may need to be processed several times. [Figure labels: Part 1; Part 2; Part 3]
Edge equation finder • Derives mathematical facts from the edge points. • Works with a simplified Hough Transform algorithm. • Automatically adjusts the tolerance value to minimize the effect of noise points. • This helps when the edge is blurred or not completely straight.
Edge equation finder • Angle (degrees) → frequency: 0 → 1, 45 → 3, 90 → 1, 135 → 1 • Desired linear equation through (x1,y1) in point-slope form: y - y1 = m(x - x1), where m is the slope of the line.
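The slides do not spell out the simplified Hough Transform, so the following is only one plausible reading of the angle/frequency table: from a reference edge point (x1, y1), compute the direction to every other edge point, quantize it into angle bins, and take the most frequent bin as the line direction. The function name, the bin size (standing in for the tolerance value) and the sample points are assumptions.

```python
import math
from collections import Counter

def angle_histogram(points, ref, bin_size=45):
    """Count the directions (degrees, modulo 180) from ref to every other point."""
    x1, y1 = ref
    votes = Counter()
    for x, y in points:
        if (x, y) == ref:
            continue
        angle = math.degrees(math.atan2(y - y1, x - x1)) % 180.0
        # Snap to the nearest bin; 180 degrees wraps around to 0.
        votes[int(round(angle / bin_size)) * bin_size % 180] += 1
    return votes

ref = (0, 0)
pts = [ref, (1, 1), (2, 2), (3, 3), (5, 0), (0, 4), (-2, 2)]
votes = angle_histogram(pts, ref)
best_angle = votes.most_common(1)[0][0]
slope = math.tan(math.radians(best_angle))  # undefined for a vertical line (90 degrees)
print(dict(votes), best_angle)
```

With these sample points the histogram reproduces the frequencies on the slide, and the winning 45-degree bin gives a slope close to 1 for the point-slope equation through the reference point. A larger bin size tolerates more noise at the cost of angular precision.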
Equation processor • Although the equation finder has chosen the most favorable tolerance value, some “extra” equations may still be generated due to the presence of noise points. • Geometrical facts about the video object may be used to remove these extra equations. • With enough prior knowledge, it is also possible to handle occluded parts.
Equation processor [Figure: before edge finding; after edge and equation finding; after extra equation removal]
Equation processor • After the extra equations are removed, the coordinates of the corner points are calculated and estimated. • Corner coordinates are essential for future texture mapping and object motion tracking.
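One way to obtain a corner coordinate is to intersect two of the remaining edge lines. The sketch below assumes each line is stored as coefficients (a, b, c) of a*x + b*y = c and solves the pair with Cramer's rule; this representation and the function name are illustrative choices, not taken from the project.

```python
def intersect(line1, line2, eps=1e-9):
    """Intersect two lines given as (a, b, c) with a*x + b*y = c.

    Returns (x, y), or None when the lines are parallel.
    """
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None  # parallel or identical lines: no single corner
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return (x, y)

# Vertical edge x = 2 meets horizontal edge y = 3 at the corner (2, 3).
print(intersect((1, 0, 2), (0, 1, 3)))
```

Storing lines in this general form avoids the infinite slope that point-slope form has for vertical edges.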
Texture Mapper • A graphics design process in which a 2-D surface, called a texture map, is "wrapped around" a 3-D object. • The 3-D object acquires a surface texture similar to the texture map.
Texture Mapper [Figure: mapping from the original position of a pixel to its new position]
Texture Mapper • Every polygon is assigned 2 sets of coordinates • Image coordinates (r, c): location of pixel in the image • Texture coordinates (u, v): location in texture image which contains color information for image coordinates
Texture Mapper • Mapping functions map texture coordinates to image coordinates or vice versa. • They are usually determined by image points whose texture coordinates are given explicitly.
Texture Mapper [Figure: a texture with corners (u1, v1), (u2, v2), (u3, v3), (u4, v4), and an image polygon whose corners (r1, c1), (r2, c2), (r3, c3), (r4, c4) are assigned those texture coordinates respectively]
Texture Mapper • Scan conversion: the process of scanning all the pixels and performing the necessary calculations. • Forward mapping maps from texture space to image space. • Inverse mapping maps from image space to texture space.
Scan conversion with forward mapping • Algorithm:
    for u = umin to umax
        for v = vmin to vmax
            r = R(u,v)
            c = C(u,v)
            copy pixel at source (u,v) to destination (r,c)
Scan conversion with forward mapping • Advantage: easy to compute as long as the forward mapping function is known. • Disadvantages: pixel-to-pixel mapping is not one-to-one; holes may appear; aliasing can result.
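The hole problem is easy to reproduce. The sketch below follows the forward-mapping loop, assuming the texture is a list of rows indexed as texture[u][v] and that R and C are caller-supplied mapping functions; the function name and the use of None to mark unfilled pixels are illustrative assumptions.

```python
def forward_map(texture, R, C, out_h, out_w):
    """Copy every texture pixel (u, v) to destination (R(u, v), C(u, v))."""
    out = [[None] * out_w for _ in range(out_h)]  # None marks a hole
    for u, row in enumerate(texture):
        for v, pixel in enumerate(row):
            r, c = round(R(u, v)), round(C(u, v))
            if 0 <= r < out_h and 0 <= c < out_w:
                out[r][c] = pixel
    return out

texture = [[1, 2],
           [3, 4]]
# Enlarge by a factor of 2: only 4 of the 16 destination pixels get written.
result = forward_map(texture, lambda u, v: 2 * u, lambda u, v: 2 * v, 4, 4)
holes = sum(pixel is None for row in result for pixel in row)
print(result, holes)
```

Whenever the mapping magnifies the texture, several destination pixels have no source pixel landing on them, which is why the 2x enlargement leaves most of the output empty.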
Scan conversion with inverse mapping • Algorithm:
    for (r,c) = polygon pixel
        u = TEXR(r,c)
        v = TEXC(r,c)
        copy pixel at source (u,v) to destination (r,c)
Scan conversion with inverse mapping • Advantages: every destination pixel is filled (no holes); pre-filtering and resampling operations can easily be incorporated to prevent aliasing.
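A matching sketch of the inverse-mapping loop, again with texture[u][v] indexing. For brevity the destination polygon is assumed to be the whole output rectangle, sampling is nearest-neighbour by truncation with clamping to the texture bounds, and the function name is hypothetical; no pre-filtering is attempted here.

```python
def inverse_map(texture, TEXR, TEXC, out_h, out_w):
    """Fill every destination pixel (r, c) from texture at (TEXR(r, c), TEXC(r, c))."""
    tex_h, tex_w = len(texture), len(texture[0])
    out = [[None] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            # Truncate to an integer texel index and clamp it into the texture.
            u = min(max(int(TEXR(r, c)), 0), tex_h - 1)
            v = min(max(int(TEXC(r, c)), 0), tex_w - 1)
            out[r][c] = texture[u][v]
    return out

texture = [[1, 2],
           [3, 4]]
# The same 2x enlargement as a forward map, expressed as its inverse.
result = inverse_map(texture, lambda r, c: r / 2, lambda r, c: c / 2, 4, 4)
print(result)
```

Because the loop runs over destination pixels, each one receives a value and no holes remain.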
Scan conversion with inverse mapping • Takes advantage of the scanline polygon fill algorithm. • For each row scan, maintain a list of scanline/polygon intersections. • The intersection at scanline r+1 is computed efficiently from the one at row r. [Figure: a polygon edge crossing scanline yk at {xk, yk} and scanline yk+1 at {xk+1, yk+1}]
Scan conversion with inverse mapping • Texture coordinates of non-boundary pixels are computed by linearly interpolating the (u,v) coordinates of the bounding pixels on the scanline. [Figure: a polygon edge crossing scanline yk at {xk, yk} and scanline yk+1 at {xk+1, yk+1}]
Scan conversion with inverse mapping • Suppose (ri,ci) maps to (ui,vi), i = 1,…, 5 • (r4,c4) = s(r1,c1) + (1-s)(r3,c3) {s is known} • (u4,v4) = s(u1,v1) + (1-s)(u3,v3) {so u4, v4 are known} • Similarly, (u5,v5) can be found. • t = (c-c4)/(c5-c4) • (u,v) = t*(u5,v5) + (1-t)*(u4,v4) [Figure: image polygon with vertices (r1, c1), (r2, c2), (r3, c3); scanline yk crosses its edges at (r4, c4) and (r5, c5), with pixel (r, c) between them]
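The two interpolation steps can be checked numerically. The sketch below assumes a triangle whose vertex 1 is at the top, with the scanline crossing edges 1-3 and 1-2; the helper names and sample coordinates are made up for illustration.

```python
def edge_weight(r, r_a, r_b):
    """Weight s such that r = s*r_a + (1 - s)*r_b (the edge must not be horizontal)."""
    return (r - r_b) / (r_a - r_b)

def blend(p, q, w):
    """Return w*p + (1 - w)*q component-wise."""
    return tuple(w * a + (1 - w) * b for a, b in zip(p, q))

def scanline_uv(c, c4, uv4, c5, uv5):
    """Texture coordinates of the pixel in column c between the two edge crossings."""
    t = (c - c4) / (c5 - c4)
    return blend(uv5, uv4, t)

# Image vertices (r, c) and their texture coordinates (u, v).
p1, uv1 = (0, 4), (0.0, 0.0)
p2, uv2 = (8, 8), (1.0, 1.0)
p3, uv3 = (8, 0), (1.0, 0.0)

r = 2                                   # current scanline
s = edge_weight(r, p1[0], p3[0])        # same weight on both edges in this example
c4, uv4 = s * p1[1] + (1 - s) * p3[1], blend(uv1, uv3, s)   # crossing on edge 1-3
c5, uv5 = s * p1[1] + (1 - s) * p2[1], blend(uv1, uv2, s)   # crossing on edge 1-2
print(scanline_uv(4, c4, uv4, c5, uv5))
```

At c = c4 the weight t is 0 and the result is (u4, v4); at c = c5 it is (u5, v5), so the interpolation agrees with the edge values at both ends of the span.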
Basic 2D linear mapping • Scaling & translation: u = ar + d, v = bc + e [Figure labels: upright rectangle; upright square] • Euclidean mapping: u = (cos θ)r - (sin θ)c + d, v = (sin θ)r + (cos θ)c + e [Figure labels: rotated unit square; upright square]
Basic 2D linear mapping • Similarity mapping: u = s(cos θ)r - s(sin θ)c + d, v = s(sin θ)r + s(cos θ)c + e [Figure labels: rotated square; upright unit square] • Affine mapping: u = f(cos θ)r - g(sin θ)c + d, v = h(sin θ)r + i(cos θ)c + e [Figure labels: rotated rectangle; upright unit square] DEMO!
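As a quick check of the similarity formulas, the sketch below evaluates them directly, assuming the rotation angle (written θ here) is given in radians; the function name is hypothetical. Setting s = 1 gives the Euclidean mapping, which preserves distances.

```python
import math

def similarity(r, c, s, theta, d, e):
    """Similarity mapping: rotate by theta, scale by s, then translate by (d, e)."""
    u = s * math.cos(theta) * r - s * math.sin(theta) * c + d
    v = s * math.sin(theta) * r + s * math.cos(theta) * c + e
    return u, v

# Rotate by 90 degrees, scale by 2, translate by (1, -1).
print(similarity(1, 0, s=2, theta=math.pi / 2, d=1, e=-1))

# Euclidean case (s = 1): the distance between two mapped points is unchanged.
a = similarity(0, 0, 1, 0.7, 5, 6)
b = similarity(3, 4, 1, 0.7, 5, 6)
print(math.dist(a, b))
```

The printed values differ from round numbers only by floating-point error in the sine and cosine.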
Basic 2D linear mapping • Projective mapping: the most general 2D linear map (linear in homogeneous coordinates); it maps a square to an arbitrary quadrangle. • u = (a11r+a12c+a13) / (a31r+a32c+1) • v = (a21r+a22c+a23) / (a31r+a32c+1) • The 8 coefficients a11, a12, …, a32 have to be determined.
Basic 2D linear mapping • Each of the four corner correspondences yields two equations, giving a system of 8 equations in 8 unknowns. [Figure label: (x1,y1)]
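Multiplying the projective formulas through by the denominator makes each correspondence linear in the coefficients: a11 r + a12 c + a13 - a31 r u - a32 c u = u, and likewise for v. The sketch below builds the resulting 8x8 system from four correspondences and solves it with a small Gauss-Jordan routine. The function names and the choice of solver are assumptions; the slides do not say how the project solves the system.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(n):
            if i != col:
                f = M[i][col] / M[col][col]
                for j in range(col, n + 1):
                    M[i][j] -= f * M[col][j]
    return [M[i][n] / M[i][i] for i in range(n)]

def projective_coeffs(image_pts, texture_pts):
    """Return [a11, a12, a13, a21, a22, a23, a31, a32] from 4 (r, c) -> (u, v) pairs."""
    A, b = [], []
    for (r, c), (u, v) in zip(image_pts, texture_pts):
        A.append([r, c, 1, 0, 0, 0, -r * u, -c * u]); b.append(u)
        A.append([0, 0, 0, r, c, 1, -r * v, -c * v]); b.append(v)
    return solve([[float(x) for x in row] for row in A], [float(x) for x in b])

def apply_projective(a, r, c):
    """Evaluate the projective mapping at image point (r, c)."""
    w = a[6] * r + a[7] * c + 1
    return (a[0] * r + a[1] * c + a[2]) / w, (a[3] * r + a[4] * c + a[5]) / w

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
quad = [(0, 0), (2, 0), (3, 3), (0, 1)]
coeffs = projective_coeffs(square, quad)
print([apply_projective(coeffs, r, c) for r, c in square])
```

The system has a unique solution only when no three of the four points are collinear in either point set; otherwise the solver raises an error.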
Future Work • Mapping cans • Speed optimization • Movie manipulation • Use of 3D markers
Q & A See the footnotes.