Digital Video • Digital video is essentially a sequence of digital images • Processing of digital video has much in common with digital image processing • First we review the basic principles of analog television
Television Fundamentals • Color television cameras and television receivers use the RGB (red, green, blue) color system to reproduce a wide range of colors • We have seen how raster scan devices operate • Commercial television systems, however, may use interlaced scanning in addition to the progressive scanning of computer monitors
Television Fundamentals • In interlaced scanning, one half of the horizontal scan lines (every other line) are transmitted and “drawn” by the receiver • Then the other half of the lines are transmitted and are drawn in between the first scan lines • Each half is known as a field, and two fields together are known as a frame
Television Fundamentals • Since the phosphors retain their values for longer than the time that it takes to transmit two fields, and since the time taken to transmit a field is shorter than the human eye can perceive, the viewer does not perceive this interlacing • If the frame rate is at least 25-30 frames per second, the viewer perceives motion in an image sequence as continuous rather than discrete
Interlaced Scanning • In the figures, the first field is transmitted at time t = 0 and displayed at time t = f / 2 • f is the frame period (the reciprocal of the frame rate) • The second field is transmitted at time t = f / 2 and displayed at time t = f • Note that the display at time t = f consists of information (scan lines) from two distinct points in time
Interlacing • Interlaced scanning was used in commercial television systems to decrease the bandwidth of the transmitted signal and to reduce the phenomenon known as large area flicker • These problems had been overcome by the time bit-mapped computer monitors were being developed
Deinterlacing • There are several common operations that you might want to perform on interlaced video • Producing stills • Resizing the video • Changing the frame rate, etc. • Performing these operations on raw interlaced video can produce undesirable artifacts
Deinterlacing • Deinterlacing provides a way around these problems • All deinterlacing methods involve turning the field-based image into a frame-based image by modifying one of the fields in the image • Popular methods include duplication and interpolation
Deinterlacing • Duplication • Interpolation
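The two methods named above can be sketched as follows. This is a minimal illustration, not the slides' own implementation: the function name `deinterlace`, its parameters, and the list-of-scan-lines frame layout are assumptions, and the frame is assumed to have at least two lines.

```python
def deinterlace(frame, keep="even", method="interpolate"):
    """Rebuild a full frame from one field of an interlaced frame.

    frame  : list of scan lines, each a list of pixel values
    keep   : field to keep, "even" (lines 0, 2, ...) or "odd" (lines 1, 3, ...)
    method : "duplicate" copies a neighbouring kept line;
             "interpolate" averages the kept lines above and below
    """
    kept_parity = 0 if keep == "even" else 1
    height = len(frame)
    out = [list(row) for row in frame]
    for y in range(height):
        if y % 2 == kept_parity:
            continue  # this line belongs to the kept field, leave it alone
        # Neighbours of a discarded line always belong to the kept field.
        # At the top or bottom edge, fall back to the only neighbour.
        above = y - 1 if y - 1 >= 0 else y + 1
        below = y + 1 if y + 1 < height else y - 1
        if method == "duplicate":
            out[y] = list(frame[above])
        else:
            out[y] = [(a + b) // 2 for a, b in zip(frame[above], frame[below])]
    return out
```

Duplication is cheaper but halves the effective vertical resolution; interpolation smooths the result at the cost of one average per pixel.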
Television Systems • The exact frame rate depends on the system, as does the number of scan lines per frame • Historically, three conventional commercial television systems enjoyed widespread use throughout the world • North America, parts of South America and Japan used NTSC • The United Kingdom, most of Western Europe, much of Africa and Australia used PAL • France, Eastern Europe and Russia used SECAM
Digital Television • Considering the bandwidth of the NTSC signal, how does digital transmission compare to analog? We have: • 30 frames/second × 130,000 pixels/frame × 24 bits/pixel = 93.6 Mbits/second • To be competitive with analog transmission, a data compression of more than 20:1 is required • All digital television standards therefore include some form of compression • The disadvantage of digital television, therefore, is the extra bandwidth required
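The arithmetic above can be checked with a short sketch. The function names are hypothetical; the numbers are the ones quoted on the slide.

```python
def raw_bit_rate(frames_per_second, pixels_per_frame, bits_per_pixel):
    """Uncompressed video bit rate in bits per second."""
    return frames_per_second * pixels_per_frame * bits_per_pixel


def compressed_bit_rate(raw_bits_per_second, compression_ratio):
    """Bit rate left after compressing by compression_ratio : 1."""
    return raw_bits_per_second / compression_ratio


raw = raw_bit_rate(30, 130_000, 24)        # 93,600,000 bits/second
at_20_to_1 = compressed_bit_rate(raw, 20)  # 4,680,000 bits/second
```

At 20:1 the stream drops to roughly 4.7 Mbits/second, which is why the slide treats that ratio as the threshold for competing with analog transmission.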
Digital Television • The real advantage may be seen by examining the signal-to-noise ratio of digital vs. analog television • This figure shows the approximate relationship between error rate and signal-to-noise ratio for digital transmission
Digital Television • An error rate of 10^-8, or one bit in 100 million bits, is practically undetectable • Channel error rates of 10^-5 still permit acceptable pictures, especially if error correction techniques are used • An analog TV signal requires a channel with a signal-to-error ratio (SER) of 55 dB
Digital Television • If we use PCM for a digital television signal, the principal source of error is quantization • The error is at most ±1/2 of the least significant bit • With 8-bit quantization, this is about ±0.2% of full scale • This “fine” quantization error would appear as white noise if viewed as a picture
Digital Television • Theoretically, the SER with 8 bits is 59 dB and for each 1 bit reduction in quantization, the SER is reduced 6 dB • The actual SER of a composite color TV signal is about 4 dB less • Thus, 8-bit PCM encoding of a noise-free NTSC composite color signal yields a SER of 55 dB
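The 59 dB figure and the 6 dB-per-bit rule can be reproduced with a short sketch. It assumes a uniform quantizer whose error is uniformly distributed over one step, and takes the SER as the ratio of peak-to-peak signal to rms quantization error; the slides do not state the derivation, so this model and the function names are assumptions.

```python
import math


def max_quantization_error_percent(bits):
    """Largest quantization error (half a step) as a percentage of full scale."""
    return 100.0 * 0.5 / (2 ** bits)


def quantization_ser_db(bits):
    """Peak-to-peak signal over rms quantization error, in dB.

    With step q, the rms error of a uniform quantizer is q / sqrt(12) and
    the peak-to-peak range is (2 ** bits) * q, so the ratio is
    (2 ** bits) * sqrt(12).
    """
    return 20.0 * math.log10((2 ** bits) * math.sqrt(12.0))


ser_8 = quantization_ser_db(8)                          # about 59 dB
per_bit = quantization_ser_db(8) - quantization_ser_db(7)  # about 6 dB
```

Subtracting the roughly 4 dB loss quoted for a composite color signal from the 8-bit value gives the 55 dB stated on the slide.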
Digital Television • A bit error rate of 10^-8 is practically undetectable • From the figure above, this requires a SER of only 21 dB • If we use the rate 10^-5 with error correction bits added, a SER of 18 dB may be sufficient • This requires less than 1 bit/pixel • The essential problem in digital TV coding is therefore to reduce the picture bandwidth at the expense of the bit error rate and retain acceptable picture quality
Aspect Ratios • Each of the systems listed above has an aspect ratio (ratio of width to height) of 4:3 • Cinematic films and high-definition television (HDTV) systems have aspect ratios of approximately 16:9
Aspect Ratios • 4:3 aspect ratio • 16:9 aspect ratio
Compatible Color TV • In order to permit compatibility of color TV transmission with preexisting black and white receivers, the RGB image generated by a television camera is converted to a YIQ image by using the transform
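The transform matrix was shown as an image on the slide and is not preserved in the text. A minimal sketch using the commonly quoted NTSC coefficients, rounded to three decimals, is given here; treat the exact values as an assumption, since the slide's own matrix may differ slightly. The names `RGB_TO_YIQ` and `rgb_to_yiq` are hypothetical.

```python
# Commonly quoted NTSC RGB-to-YIQ coefficients (three decimals).
# Assumption: the slide's own matrix was not preserved and may differ slightly.
RGB_TO_YIQ = (
    (0.299, 0.587, 0.114),    # Y: luminance, what a black and white set displays
    (0.596, -0.274, -0.322),  # I: in-phase chrominance
    (0.211, -0.523, 0.312),   # Q: quadrature chrominance
)


def rgb_to_yiq(r, g, b):
    """Convert one RGB triple (components in 0..1) to a (Y, I, Q) tuple."""
    return tuple(row[0] * r + row[1] * g + row[2] * b for row in RGB_TO_YIQ)
```

Each chrominance row sums to zero, so any gray input (R = G = B) produces I = Q = 0 and only the Y signal carries information, which is what makes the signal usable by a black and white receiver.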
Compatible Color TV • See book and notes for more info on YIQ • The bandwidth allocated to a black and white television signal is illustrated on the next slide
Compatible Color TV • In order to maintain compatibility, the color TV signal has to fit in the same bandwidth • This is accomplished by first combining the I and Q signals using a method called quadrature modulation • The two signals are multiplied by a sine and cosine function, respectively, added and become a single composite signal • The second idea is to choose the color subcarrier to be an odd multiple of one half the line frequency • The resulting bandwidth allocation is illustrated on the next slide
Compatible Color TV • At the receiver, the inverse transformation given on the next slide reforms R, G, B from the received Y’, I’, Q’ signals
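The inverse matrix was also shown only as an image. A sketch with the commonly quoted inverse coefficients follows; these are rounded values and an assumption about what the slide showed, so a round trip through a three-decimal forward matrix is accurate only to a few parts in a thousand. The names `YIQ_TO_RGB` and `yiq_to_rgb` are hypothetical.

```python
# Commonly quoted YIQ-to-RGB coefficients (three decimals).
# Assumption: the slide's own inverse matrix was not preserved.
YIQ_TO_RGB = (
    (1.0, 0.956, 0.621),    # R
    (1.0, -0.272, -0.647),  # G
    (1.0, -1.106, 1.703),   # B
)


def yiq_to_rgb(y, i, q):
    """Convert one (Y, I, Q) triple back to an (R, G, B) tuple."""
    return tuple(row[0] * y + row[1] * i + row[2] * q for row in YIQ_TO_RGB)
```

With I = Q = 0 the three outputs all equal Y, so a luminance-only signal decodes to a gray level.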
Pixel Aspect Ratio • When we are displaying a digital video stream on a standard television receiver, we have another parameter to consider - the pixel aspect ratio • This is related to the aspect ratio of the television screen and to the sampling rate
Pixel Aspect Ratio • If we have a screen with an aspect ratio of 4:3 and we have a digital image of size 711x487, then in order to maintain the 4:3 aspect ratio we must have a pixel aspect ratio p, where p can be found as follows: • 3/4 = 487/711 * p • p = 0.75 * 711/487 = 1.0950 • Computer monitors generally have pixel aspect ratios of 1.0 (square pixels) • DTV standards may have several non-square pixel aspect ratios
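The calculation above generalizes to any screen and image size. In this sketch the function name and parameters are hypothetical, and p follows the slide's convention (the factor that satisfies screen height / screen width = rows / columns * p).

```python
def pixel_aspect_ratio(screen_width, screen_height, columns, rows):
    """Solve screen_height / screen_width = (rows / columns) * p for p."""
    return (screen_height / screen_width) * (columns / rows)


p = pixel_aspect_ratio(4, 3, 711, 487)  # about 1.0950, as on the slide
square = pixel_aspect_ratio(4, 3, 640, 480)  # 1.0, square pixels
```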
Aspect Ratio Conversion • We are given a video sequence with an aspect ratio of 16:9 and we want to display the sequence on a device with an aspect ratio of 4:3 • For example, the source image may be 640x360 and the display device may have a resolution of 480x360 • We have several alternatives
Aspect Ratio Conversion • In the letterbox technique, we scale the source image by the same amount in both the vertical and horizontal directions: 480/640 = 0.75 • First, we create a 480x270 image: • S2[i,j] = S1[INT(4/3*i),INT(4/3*j)] • where S1 is the original image and INT is a function which rounds a floating point value to the nearest whole number
Aspect Ratio Conversion • We now use this image to form the target 480x360 image S3 as follows: • S3[i,j] = S2[i,j-45] for 45≤j≤314 • S3[i,j] = 0 for 0≤j≤44, 315≤j≤359 • The second line gives the black bars of the letterbox format • In horizontal compression, we map the source image so that it exactly fills the screen of the target device • This results in a stretching of the source image in the vertical direction • The transformation is: • S2[i,j] = S1[INT(4/3*i),j]
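Both transformations can be sketched as follows. The images are stored as lists of rows, so the slide's S[i,j] (i horizontal, j vertical) becomes `img[j][i]` here; the function names and the clamping of source indices to the image bounds are assumptions added for illustration.

```python
def nearest(x):
    """INT from the slide: round a non-negative value to the nearest integer."""
    return int(x + 0.5)


def letterbox(src, dst_w=480, dst_h=360):
    """Scale src uniformly to dst_w columns and centre it between black bars."""
    src_h, src_w = len(src), len(src[0])
    scale = src_w / dst_w                       # 640 / 480 = 4/3
    scaled_h = round(src_h * dst_w / src_w)     # 270 rows for a 640x360 source
    top = (dst_h - scaled_h) // 2               # 45 black rows above the picture
    out = [[0] * dst_w for _ in range(dst_h)]   # 0 = black
    for j in range(scaled_h):
        sj = min(nearest(scale * j), src_h - 1)
        for i in range(dst_w):
            si = min(nearest(scale * i), src_w - 1)
            out[top + j][i] = src[sj][si]
    return out


def horizontal_compress(src, dst_w=480):
    """Squeeze src horizontally to dst_w columns, keeping every row."""
    src_w = len(src[0])
    scale = src_w / dst_w
    columns = [min(nearest(scale * i), src_w - 1) for i in range(dst_w)]
    return [[row[si] for si in columns] for row in src]
```

Letterboxing preserves the picture's proportions and gives up 90 of the 360 display rows; horizontal compression uses the whole screen but makes objects look taller and thinner.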
Aspect Ratio Conversion • The crop and pan transformation is different from the previous two in that the transformation varies over time, depending on the contents of the source image • We will basically be showing one of the three transformed images: • S2[i,j] = S1[i+80,j] • S3[i,j] = S1[i,j] • S4[i,j] = S1[i+160,j]
Aspect Ratio Conversion • S2 is the center 3/4 of the source image, S3 is the left 3/4 of the source image and S4 is the right 3/4 of the source image • If there is an object of interest in the leftmost 1/4 of the source image, we use the transformation S3 • If there is an object of interest in the rightmost 1/4 of the source image, we use S4 • Otherwise, we use S2 • We view the original video sequence to determine when we need to focus on the left or right edge of the source image, and we define the transformation accordingly
Aspect Ratio Conversion • The use of only these three transformations will lead to jerkiness when we shift from one transform to another • We can get the effect of a camera panning by making a time-dependent transformation • For example, imagine we want to pan from the center view to the right view over some period • We can define the transformation as follows: • S5[i,j,t] = S1[i+80+INT(80*(t-t1)/(t2-t1)),j,t], t1≤t≤t2 assuming that there is no parallel change of frame rate
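The fixed views and the time-dependent pan can be sketched as a single crop with a moving horizontal offset. The names `pan_offset` and `crop` are hypothetical, frames are stored as lists of rows, and the offsets 0, 80 and 160 are the ones used for the 640x360 source in the slides.

```python
LEFT, CENTER, RIGHT = 0, 80, 160  # offsets of the S3, S2 and S4 views


def pan_offset(t, t1, t2, start=CENTER, end=RIGHT):
    """Horizontal offset at time t for a linear pan from start to end over [t1, t2]."""
    if t <= t1:
        return start
    if t >= t2:
        return end
    # Same form as the slide's 80 + INT(80 * (t - t1) / (t2 - t1)).
    return start + int((end - start) * (t - t1) / (t2 - t1) + 0.5)


def crop(frame, offset, dst_w=480):
    """Return the dst_w-column window of frame that starts at column offset."""
    return [row[offset:offset + dst_w] for row in frame]
```

For example, `crop(frame, pan_offset(t, t1, t2))` applied to each frame of the sequence gives the panned view; a pan toward the left edge would use `end=LEFT`.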
Sample rate conversion • At times, it will be necessary to convert the sampling rate in a source signal to some other sampling rate • Consider converting from a CCIR 601 signal (a digital video standard) to an MPEG SIF signal • (MPEG is a compression standard for video • SIF is the source input format for the compression) • CCIR 601 is an interlaced signal with a 50 Hz field rate • The signal consists of three components: Y, U and V
Sample rate conversion • The Y component (luminance) is sampled at a resolution of 720x480 • The U and V components (chrominance) are sampled at a resolution of 360x480 • The sampling pattern is shown on the next slide
Sample rate conversion • MPEG SIF samples luminance at a resolution of 360x240 and chrominance at a resolution of 180x120 • The sampling pattern is different as well, as shown on the next slide
Sample rate conversion • The conversion to the lower sampling resolution begins with the discarding of one of the interlaced fields • This reduces the picture rate to 25 Hz (non-interlaced) and reduces the vertical resolution by one half
Sample rate conversion • Now, the luminance is decimated by one half in the horizontal direction • One possibility is simply to subsample the values; however, better results are obtained by applying an FIR filter before subsampling • One filter which has been found to give good results in decimating the luminance is shown on the following slide
Sample rate conversion • The sum of the products of the filter weights and the original values is divided by 256 • Use of a power of two allows a simple hardware implementation • An example of the use of this filter followed by subsampling is shown on the next slide • At the ends of lines, some special technique such as renormalizing the filter or replicating the last pixel must be used • In the example below, the data in the line is reflected at each end
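Example of filtering and subsampling of a line of pixels • Original line: 10 12 20 30 35 15 19 11 11 19 26 45 80 90 92 90 • Filtered and subsampled line: 32 32 23 9 12 49 95 95 • Worked value (fourth output): 35*(-29) + 15*0 + 19*88 + 11*138 + 11*88 + 19*0 + 26*(-29) = -1015 + 0 + 1672 + 1518 + 968 + 0 - 754 = 2389 • 2389/256 ≈ 9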
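The filter-then-subsample step can be sketched as follows. The seven weights are read off the worked value above and sum to 256. Several details are assumptions: the output is centered on every second pixel starting at the second one (which matches the worked value), out-of-range samples are mirrored without repeating the edge pixel, and the division rounds to the nearest integer. Values near the line ends depend on the reflection convention, so they may not match the slide's figure.

```python
WEIGHTS = (-29, 0, 88, 138, 88, 0, -29)  # from the worked example; sum is 256


def reflect(index, length):
    """Mirror an out-of-range index back into 0..length-1 (edge sample not repeated)."""
    if index < 0:
        return -index
    if index >= length:
        return 2 * (length - 1) - index
    return index


def decimate_line(line, phase=1):
    """Filter a line with WEIGHTS and keep every second result, starting at index phase."""
    half = len(WEIGHTS) // 2
    out = []
    for center in range(phase, len(line), 2):
        acc = 0
        for k, weight in enumerate(WEIGHTS):
            acc += weight * line[reflect(center + k - half, len(line))]
        out.append((acc + 128) >> 8)  # divide by 256, rounding to nearest
    return out


line = [10, 12, 20, 30, 35, 15, 19, 11, 11, 19, 26, 45, 80, 90, 92, 90]
halved = decimate_line(line)  # interior values: 32, 23, 9, 12, 49
```

Because the weights sum to a power of two, the normalization is a single shift, which is the hardware simplification the slides refer to.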