1. CS 414 Multimedia Systems Design
Lecture 5: Digital Video Representation
Klara Nahrstedt
Spring 2008
2. Administrative Group Directories will be established by Friday, 1/25
MP1 will be out on 1/25
3. Color and Visual System Color refers to how we perceive a narrow band of electromagnetic energy
source, object, observer
Visual system transforms light energy into sensory experience of sight
4. Human Visual System Eyes, optic nerve, parts of the brain
Transforms electromagnetic energy into the sensory experience of sight
5. Human Visual System Image Formation
cornea, sclera, pupil, iris, lens, retina, fovea
Transduction
retina, rods, and cones
Processing
optic nerve, brain
6. Retina and Fovea Retina has photosensitive receptors at back of eye
Fovea is small, dense region of receptors
only cones (no rods)
gives visual acuity
Outside fovea
fewer receptors overall
larger proportion of rods
7. Transduction (Retina) Transform light to neural impulses
Receptors signal bipolar cells
Bipolar cells signal ganglion cells
Axons of the ganglion cells form the optic nerve
8. Rods vs Cones
Rods:
Contain photo-pigment
Respond to low energy; enhance sensitivity
Concentrated in the retina, outside the fovea
One type, sensitive to grayscale changes
Cones:
Contain photo-pigment
Respond to high energy; enhance perception
Concentrated in the fovea, sparse elsewhere in the retina
Three types, sensitive to different wavelengths
9. Tri-stimulus Theory 3 types of cones (6 to 7 million of them)
Red (64%), Green (32%), Blue (2%)
Each type most responsive to a narrow band
red and green absorb most energy, blue the least
Light stimulates each set of cones differently, and the ratios produce sensation of color
10. Visual System Facts Distinguish hundreds of thousands of colors
more sensitive to brightness
Can distinguish about 128 fully saturated hues
less sensitive to hue changes in less saturated colors
Can distinguish about 23 levels of saturation for fixed hue and lightness
10 times less sensitive to blue than to red or green
the eye absorbs less energy in the blue range
11. Color Perception Hue
distinguishes named colors, e.g., red, green, blue
dominant wavelength of the light
Saturation
how far color is from a gray of equal intensity
Brightness (lightness)
perceived intensity
12. Visual Perception: Resolution and Brightness
Spatial resolution depends on:
Image size
Viewing distance
Brightness
Brightness detail is perceived more sharply than color detail
The primary colors are perceived with different brightness
Relative brightness: green : red : blue = 59% : 30% : 11%
B/W vs. Color
13. Visual Perception: Temporal Resolution Effects caused by the inertia of the human eye
About 16 frames/second are perceived as a continuous sequence
Special Effect: Flicker
14. Temporal Resolution Flicker
Perceived if the frame rate or screen refresh rate is too low (< 50 Hz)
Especially in large bright areas
Higher refresh rate requires:
Higher scanning frequency
Higher bandwidth
15. Influences on Visual Perception
Viewing distance
Display ratio (width/height, 4:3 for conventional TV)
Number of details still visible
Intensity (luminance)
16. Television History In 1927, Herbert Hoover gave a speech in Washington while viewers in New York could see and hear him
AT&T Bell Labs had the first television
18 fps, 2 x 3 inch screen, 2500 pixels
17. Television Concepts Production (capture)
2D array of light energy converted to electrical signals
signals must adhere to known, structured formats
Representation and Transmission
popular formats include NTSC, PAL, SECAM
Re-construction
CRT technology and raster scanning
display issues (refresh rates, temporal resolution)
relies on principles of human visual system
18. Video Representations Composite
NTSC - 6MHz (4.2MHz video), 29.97 fps
PAL - 6-8MHz (4.2-6MHz video), 25 fps
Component
Maintain separate signals for color
Color spaces
RGB, YUV, YCbCr, YIQ
19. Color Coding: YUV PAL video standard
Based on CIE model
Y is luminance
UV are chrominance
YUV from RGB
Y = 0.299R + 0.587G + 0.114B
U = 0.492 (B - Y)
V = 0.877 (R - Y)
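As a quick sketch, the YUV conversion above can be written out directly (R, G, B assumed normalized to the 0..1 range; the function name is illustrative):

```python
def rgb_to_yuv(r, g, b):
    """RGB -> YUV using the formulas above; inputs in [0, 1]."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # blue-difference chrominance
    v = 0.877 * (r - y)                    # red-difference chrominance
    return y, u, v
```

White (1, 1, 1) yields Y = 1 with zero chrominance, as expected for a colorless signal.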
20. YCrCb Subset of YUV that scales and shifts the chrominance values into range 0..1
Y = 0.299R + 0.587G + 0.114B
Cb = ((B - Y) / 2) + 0.5
Cr = ((R - Y) / 1.6) + 0.5
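A minimal sketch of this conversion, using the scale factors from the formulas above (the /2 and /1.6 divisors keep the shifted chrominance inside 0..1; the function name is illustrative):

```python
def rgb_to_ycbcr(r, g, b):
    """RGB -> YCbCr per the formulas above; inputs in [0, 1]."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) / 2.0 + 0.5   # blue-difference chroma, shifted into [0, 1]
    cr = (r - y) / 1.6 + 0.5   # red-difference chroma, shifted into [0, 1]
    return y, cb, cr
```

Any neutral gray maps to (gray, 0.5, 0.5), i.e., both chrominance channels sit at the midpoint of their range.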
21. YCC Example
22. YIQ NTSC standard
Similar to YUV, but rotated 33 degrees
YIQ from RGB
Y = 0.299R + 0.587G + 0.114B
I = 0.74 (R - Y) - 0.27 (B - Y)
Q = 0.48 (R - Y) + 0.41 (B - Y)
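The YIQ formulas translate the same way (a sketch; inputs again assumed in 0..1):

```python
def rgb_to_yiq(r, g, b):
    """RGB -> YIQ using the formulas above; inputs in [0, 1]."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.74 * (r - y) - 0.27 * (b - y)  # in-phase chrominance component
    q = 0.48 * (r - y) + 0.41 * (b - y)  # quadrature chrominance component
    return y, i, q
```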
23. YIQ 4:2:2
24. YIQ 4:1:1
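The 4:2:2 and 4:1:1 labels describe how strongly chrominance is subsampled horizontally relative to luminance: half the samples for 4:2:2, a quarter for 4:1:1. A minimal sketch of the idea (keeping every n-th chroma sample; real codecs typically average neighboring samples instead):

```python
def subsample_chroma(row, factor):
    """Keep every `factor`-th chroma sample in a scan line:
    factor 2 models 4:2:2, factor 4 models 4:1:1."""
    return row[::factor]

row = [10, 12, 14, 16, 18, 20, 22, 24]
assert subsample_chroma(row, 2) == [10, 14, 18, 22]  # 4:2:2 - half the samples
assert subsample_chroma(row, 4) == [10, 18]          # 4:1:1 - a quarter
```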
25. NTSC Video 525 scan lines per frame; 29.97 fps
33.37 msec/frame (1 second / 29.97 frames)
scan line lasts 63.6 usec (33.37 msec / 525)
aspect ratio of 4/3, gives 700 horizontal pixels
20 lines reserved for control information at the beginning of each field
so only 485 lines of visible data
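The timing numbers above follow directly from the 29.97 fps rate; a quick check:

```python
fps = 30 * 1000 / 1001           # exact NTSC rate, ~29.97 frames/s
frame_ms = 1000 / fps            # one frame lasts ~33.37 ms
line_us = frame_ms * 1000 / 525  # each of the 525 scan lines lasts ~63.6 us
print(round(frame_ms, 2), round(line_us, 1))  # prints: 33.37 63.6
```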
26. NTSC Video Interlaced scan lines divide each frame into 2 fields, each of which is 262.5 lines
phosphors in early TVs did not maintain luminance long enough (caused flicker)
scanning also interlaced; can cause visual artifacts for high motion scenes
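Interlacing can be sketched as splitting a frame's scan lines into two fields, one holding the even-numbered lines and one the odd-numbered lines:

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its two interlaced fields."""
    return frame[0::2], frame[1::2]  # even-numbered lines, odd-numbered lines

lines = ["line0", "line1", "line2", "line3"]
top, bottom = split_fields(lines)
assert top == ["line0", "line2"]
assert bottom == ["line1", "line3"]
```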
27. HDTV Digital Television Broadcast (DTB) System
Twice as many lines and columns as conventional TV
Resolutions:
1920x1080 (1080p) Standard HDTV
Frame rate: options 50 or 60 frames per second
28. Pixel Aspect Ratio
Pixel aspect ratio describes the layout of pixels in a digitized image. Most digital imaging systems use a square grid of pixels; that is, they sample an image at the same resolution horizontally and vertically. But some devices do not (most notably some common standard-definition formats in digital television and DVD-Video), so a digital image scanned at a vertical resolution twice its horizontal resolution (i.e., the pixels are twice as close together vertically as horizontally) might be described as being sampled at a 2:1 pixel aspect ratio, regardless of the size or shape of the image as a whole.
Increasing the pixel aspect ratio of an image makes its use of pixels less efficient: the resulting image has lower perceived detail than an image with an equal number of pixels arranged at equal horizontal and vertical resolution. Beyond about a 2:1 pixel aspect ratio, further increases in the already-sharper direction have no visible effect, no matter how many more pixels are added. Hence an NTSC picture (480i) with 1000 lines of horizontal resolution is possible, but would look no sharper than a DVD. The exception is where pixels serve a purpose other than resolution, for example a printer that uses dithering to simulate gray shades from black-or-white pixels, or analog videotape that loses high frequencies when dubbed.
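The relationship described above can be sketched as: displayed aspect ratio = storage aspect ratio (in pixel counts) times pixel aspect ratio.

```python
def display_aspect(width_px, height_px, pixel_aspect):
    """Displayed width/height ratio of an image with non-square pixels."""
    return (width_px / height_px) * pixel_aspect

# The text's 2:1 example: sampling twice as finely vertically gives a
# 100 x 200 pixel image that still displays as a square (ratio 1.0).
assert display_aspect(100, 200, 2.0) == 1.0
```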
30. HDTV Interlaced and/or progressive formats
Conventional TVs use interlaced formats
Computer displays (LCDs) use progressive scanning
MPEG-2 compressed streams
In Europe (e.g., Germany), MPEG-4 compressed streams
31. Aspect Ratio and Refresh Rate Aspect ratio
Conventional TV is 4:3 (1.33)
HDTV is 16:9 (1.78)
Cinema uses 1.85:1 or 2.35:1
Frame Rate
NTSC is 60Hz interlaced (actually 59.94Hz)
PAL/SECAM is 50Hz interlaced
Cinema is 24Hz non-interlaced
32. SMPTE Time Codes Society of Motion Picture and Television Engineers defines time codes for video
HH:MM:SS:FF
For NTSC, SMPTE uses a 30 fps drop-frame code
increment as if using 30 fps, when the true rate is 29.97
defines rules to remove the difference error
Let's think about the error
lose 0.03 frames every second (30 - 29.97)
lose 108 frames every hour (0.03 * 3600 sec/hour)
SMPTE timecode is a set of cooperating standards to label individual frames of video or film with a timecode, defined by the Society of Motion Picture and Television Engineers in the SMPTE 12M specification.
Timecodes are added to film, video or audio material, and have also been adapted to synchronize music. They provide a time reference for editing, synchronisation and identification. Timecode is a form of media metadata. The invention of timecode made modern videotape editing possible, and led eventually to the creation of non-linear editing systems.
SMPTE (pronounced "sim-tee") timecode contains binary-coded-decimal hour:minute:second:frame identification and 32 bits for use by users. There are also drop-frame and colour-framing flags and three extra 'binary group flag' bits used for defining the use of the user bits. The formats of other forms of SMPTE timecode are derived from that of the longitudinal timecode.
Time code can have any of a number of frame rates: common ones are
24 frame/s (film)
25 frame/s (PAL colour television)
29.97 (30*1.000/1.001) frame/s (NTSC color television)
30 frame/s (American black-and-white television) (virtually obsolete)
In general, SMPTE timecode frame rate information is implicit, known from the rate of arrival of the timecode from the medium, or other metadata encoded in the medium. The interpretation of several bits, including the "colour framing" and "drop frame" bits, depends on the underlying data rate. In particular, the drop frame bit is only valid for a nominal frame rate of 30 frame/s: see below for details.
More complex timecodes such as Vertical interval timecode can also include extra information in a variety of encodings.
SMPTE time code is a digital signal whose ones and zeroes assign a number to every frame of video, representing hours, minutes, seconds, frames, and some additional user-specified information such as tape number. For instance, the time code 01:12:59:16 represents a picture 1 hour, 12 minutes, 59 seconds, and 16 frames into the tape.
33. Rules to Compensate Every minute, drop two frames
after 60 minutes, lose 120 frames
but this is 12 too many
To compensate, every ten minutes (0, 10, 20, 30, 40, 50), do not drop the two frames
saves 12 frames every 60 minutes
Results in a code that is easier to work with and amenable to computation
Drop-frame timecode dates to a compromise made when color NTSC video was invented. The NTSC re-designers wanted to retain compatibility with existing monochrome TVs. However, the 3.58 MHz (actually 315/88 MHz = 3.57954545 MHz) color subcarrier would absorb common-phase noise from the harmonics of the line-scan frequency. Rather than adjusting the audio or chroma subcarriers, they adjusted everything else, including the frame rate, which was set to 30 * 1000/1001 Hz.
This meant that an "hour of timecode" at a nominal frame rate of 30 frame/s was longer than an hour of wall-clock time by 3.59 seconds, leading to an error of almost a minute and a half over a day. This caused people to make unnecessary mistakes in the studio.
To correct this, drop-frame SMPTE timecode, which needs to drop 1 frame number in every thousand, drops frame numbers 0 and 1 of the first second of every minute, and includes them when the number of minutes is divisible by ten. This achieves an easy-to-track drop rate of 18 frame numbers each ten minutes (18,000 frames @ 30 fps) and almost perfectly compensates for the difference in rate, leaving a residual timing error of roughly 86.4 milliseconds per day, an error of only 1.0 ppm. Note: only timecode frame numbers are dropped; video frames continue in sequence. That is, drop-frame timecode drops two frame numbers every minute, except every tenth minute, achieving 29.97 fps.
Drop-frame timecode is used only in systems running at a frame rate of 30 * 1000/1001 Hz.
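The rules above can be sketched as a frame-count-to-timecode conversion (an illustrative implementation; the constants 17982 and 1798 are the real frame counts in a ten-minute block and in a dropped minute, respectively):

```python
def frames_to_dropframe(n):
    """Map a real frame count at 29.97 fps to a drop-frame HH:MM:SS:FF label."""
    d, m = divmod(n, 17982)          # 17982 real frames per 10-minute block
    if m < 2:
        n += 18 * d                  # 18 numbers dropped per full block
    else:
        n += 18 * d + 2 * ((m - 2) // 1798)  # plus 2 per dropped minute
    ff = n % 30
    ss = (n // 30) % 60
    mm = (n // 1800) % 60
    hh = n // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

assert frames_to_dropframe(1800) == "00:01:00:02"    # frames 00, 01 dropped
assert frames_to_dropframe(107892) == "01:00:00:00"  # 108 numbers regained/hour
```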
34. Take Home Exercise Given a SMPTE time stamp, convert it back to the original frame number
e.g., 00:01:00:10
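One way to tackle the exercise (a sketch under the drop-frame rules above; it assumes the label is one that actually occurs, i.e., non-tenth minute boundaries never show frames 00 or 01):

```python
def dropframe_to_frames(hh, mm, ss, ff):
    """Invert a drop-frame HH:MM:SS:FF label to the real frame count."""
    minutes = 60 * hh + mm
    dropped = 2 * (minutes - minutes // 10)  # 2 numbers per non-tenth minute
    return (minutes * 60 + ss) * 30 + ff - dropped
```

For the example 00:01:00:10 this gives frame 1808: a nominal count of 1810 minus the two frame numbers dropped at minute 1.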
35. Summary Digitization of Video Signals
Composite Coding
Component Coding
Digital Television (DTV)
DVB (Digital Video Broadcast)
Satellite connections and CATV networks are best suited for DTV
DVB-S for satellites (also DVB-S2)
DVB-C for CATV