ONR Presentation
August 4, 2005

John Villasenor, villa@icsl.ucla.edu
David Choi, Hyungjin Kim, Dong-U Lee
dschoi@icsl.ucla.edu, hjkimnov@ee.ucla.edu, dongu@icsl.ucla.edu
Overview
• Focus of work to date:
  • Robust imagery
  • Automatic adaptation to network parameters
  • Region of interest (ROI) coding, including integration of target tracking information
  • Improved imagery using system-wide optimizations including channel coding, image coding, and receiver signal detection
  • Several generations of embedded hardware platform; integration and deployment on helicopter platforms
Recent Efforts
• Multi-layered video streams with a base and enhancement layer
• Region of interest coding using a base/enhancement layer system
• Reduced-complexity image representation for environments with severe power constraints
• Inherently secure encoding
• System-level optimizations (current focus: timing recovery) to improve end-to-end imaging capabilities
Enhancement Layer Video
• Concept: in networks with communication links of varying bandwidths and capacities,
  • send base layer video to all clients using a legacy standards-based video codec implementation
  • send enhancement layer video selectively to clients that can support the additional bandwidth
[Diagram: camera → video encoder; the base layer is distributed to all clients, while the enhancement layer is forwarded only to high-bandwidth clients]
*in collaboration with Innovative Concepts
Enhancement Layer Encoding/Decoding
• Encoder: leverages a standards-based encoding platform
  • The original frame is encoded to produce the compressed base layer
  • The base layer is decoded, and the decoded base layer frame is subtracted from the original frame to form a difference frame
  • The difference frame is encoded to produce the compressed enhancement layer
• Decoder:
  • The compressed base layer is decoded to give the base layer frame
  • The compressed enhancement layer is decoded to give the enhancement layer frame, which is added to the base layer frame to give the recovered frame
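To make the base/enhancement structure concrete, here is a minimal Python sketch. The coarse quantizer used as `toy_encode`/`toy_decode` is only a stand-in for the standards-based codec; the function names, step sizes, and frame dimensions are illustrative assumptions, not the deployed implementation.

```python
import numpy as np

# Toy stand-in for a standards-based codec: coarse uniform quantization.
# A real system would use a legacy encoder (e.g., an H.263-class codec).
def toy_encode(frame, step):
    return np.round(frame / step).astype(np.int16)   # "bitstream" = quantized values

def toy_decode(stream, step):
    return stream.astype(np.float64) * step

def encode_layers(frame, base_step=32, enh_step=8):
    base = toy_encode(frame, base_step)              # base layer, sent to all clients
    residual = frame - toy_decode(base, base_step)   # detail the base layer lost
    enh = toy_encode(residual, enh_step)             # enhancement layer
    return base, enh

def decode_layers(base, enh=None, base_step=32, enh_step=8):
    frame = toy_decode(base, base_step)              # low-bandwidth client stops here
    if enh is not None:
        frame += toy_decode(enh, enh_step)           # high-bandwidth client adds detail
    return frame

frame = np.random.uniform(0, 255, (240, 320))
base, enh = encode_layers(frame)
print(np.abs(frame - decode_layers(base)).mean())        # base-only error
print(np.abs(frame - decode_layers(base, enh)).mean())   # smaller base+enhancement error
```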
Enhancement Coding Example
• Enhancement layer coding can provide baseline imagery to low-bandwidth clients and higher quality imagery to high-bandwidth clients
[Figure: base layer frame; enhancement layer frame; combined base-plus-enhancement frame]
Improvement in Video Quality through Enhancement Layer Video
[Figure: PSNR vs. base + enhancement bitrate]
• Points along a curve show the improvement gained through the enhancement layer
• Different curves represent different starting base layer bitrates
• Simulation performed with 320x240 video
PSNR Difference Plots
[Figure: difference plots]
• Plots show the PSNR improvement of enhancement layer video over video re-encoded at the same total rate, as a function of the enhancement layer bitrate
Enhancement Layer Region of Interest
• Application of the enhancement layer coding concept to region of interest coding
• High-quality information for a region of interest can be sent when bandwidth becomes available on certain network connections
Image Representation Using Reduced Energy Processing
• Use edge information to convey scene and location context
• Provides significant scene information while dramatically reducing energy consumption and memory utilization
• Simple, efficient compression algorithms can be applied
[Figure: original image and edge-detected image]
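A minimal sketch of the kind of shift-and-add edge detector this representation relies on. The slide does not name the operator; a Sobel gradient with a |gx| + |gy| magnitude is assumed here (its per-pixel cost of 4 shifts, 11 adds, and 2 abs() happens to match the counts on the complexity slide later in this deck). The threshold value is illustrative.

```python
import numpy as np

def edge_map(img, threshold=128):
    """Sobel-style gradient plus thresholding: only adds, shifts, and abs()."""
    p = img.astype(np.int32)
    # Horizontal and vertical gradients; the x2 weights are shifts in hardware.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    mag = np.abs(gx) + np.abs(gy)                # no multiplies needed
    return (mag > threshold).astype(np.uint8)    # 1-bit edge map, cheap to compress
```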
Generalized Gaussian Source
• pdf: $p(x) = C_1 \exp\left(-(C_2\,|x|)^{\nu}\right)$, where $C_2 = \frac{1}{\sigma}\sqrt{\Gamma(3/\nu)/\Gamma(1/\nu)}$ and $C_1 = \frac{\nu C_2}{2\,\Gamma(1/\nu)}$
• $\nu$ is the shape parameter: $\nu = 2$ gives the Gaussian and $\nu = 1$ the Laplacian
Generalized Gaussian Source
• Choosing the shape parameter ν allows representation of a wide range of pdfs
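For reference, a short Python sketch evaluating the pdf defined above for several shape parameters; the function name and test values are illustrative.

```python
import numpy as np
from scipy.special import gamma

def gg_pdf(x, sigma=1.0, nu=2.0):
    """Generalized Gaussian pdf p(x) = C1 * exp(-(C2*|x|)^nu),
    using the constants C1, C2 defined on the previous slide."""
    c2 = np.sqrt(gamma(3.0 / nu) / gamma(1.0 / nu)) / sigma
    c1 = nu * c2 / (2.0 * gamma(1.0 / nu))
    return c1 * np.exp(-(c2 * np.abs(x)) ** nu)

x = np.linspace(-4, 4, 9)
print(gg_pdf(x, nu=2.0))   # Gaussian shape
print(gg_pdf(x, nu=1.0))   # Laplacian shape
print(gg_pdf(x, nu=0.5))   # sharper peak, heavier tails
```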
Golomb-Rice (GR) and Exponential Golomb (EG) Codes
• GR and EG codes are classes of Huffman codes that have a highly regular structure
• There is no need for an explicit codebook; the codebook is implicit in the choice of code
• EG codes are particularly well suited to coding image data that has been processed and then quantized or thresholded
Structure of Golomb-Rice Codes
• Code trees (prefix · suffix):

  Index   k=1        k=2       k=3
  0       1 0        1 00      1 000
  1       1 1        1 01      1 001
  2       01 0       1 10      1 010
  3       01 1       1 11      1 011
  4       001 0      01 00     1 100
  5       001 1      01 01     1 101
  6       0001 0     01 10     1 110
  7       0001 1     01 11     1 111
  8       00001 0    001 00    01 000
  9       00001 1    001 01    01 001

• Prefix: the number of zeros gives the "depth" in the tree
• Suffix: selects the branch at that depth
Structure of Exponential Golomb Codes
• Code trees (prefix · suffix):

  Index   k=0        k=1        k=2
  0       1          01 0       001 00
  1       01 0       01 1       001 01
  2       01 1       001 00     001 10
  3       001 00     001 01     001 11
  4       001 01     001 10     0001 000
  5       001 10     001 11     0001 001
  6       001 11     0001 000   0001 010
  7       0001 000   0001 001   0001 011
  8       0001 001   0001 010   0001 100
  9       0001 010   0001 011   0001 101
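A minimal sketch of the two encoders, matching the prefix/suffix conventions of the trees above. `golomb_rice` reproduces the GR tables for any k; `exp_golomb0` reproduces the EG k = 0 column (under the convention of the table above, the k = 1 and k = 2 columns appear to correspond to `exp_golomb0(n + 2**k - 1)`).

```python
def golomb_rice(n, k):
    """Golomb-Rice codeword for index n: a unary prefix (q zeros then a 1)
    gives the depth in the tree, and a k-bit suffix selects the branch."""
    q, r = n >> k, n & ((1 << k) - 1)
    suffix = format(r, "b").zfill(k) if k > 0 else ""
    return "0" * q + "1" + suffix

def exp_golomb0(n):
    """Zeroth-order exp-Golomb: the suffix widens by one bit per tree level."""
    b = format(n + 1, "b")           # binary representation of n + 1
    return "0" * (len(b) - 1) + b    # (len - 1) zeros, then that value

for n in range(6):
    print(n, golomb_rice(n, 1), golomb_rice(n, 2), exp_golomb0(n))
```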
Coding Efficiency for Generalized Gaussian Source
[Figure: efficiency of Golomb-Rice and exp-Golomb codes for positive discrete sources derived from a generalized Gaussian, shown for ν = 0.3 (Figure a) and ν = 0.7 (Figure b), illustrating the effect of code choice and code parameter k]
• The ratio δ/σ conveys the effect when samples from a generalized Gaussian source with standard deviation σ are quantized using step size δ
Histogram of Data from Edge-Detected Image
[Figure: histogram]
• Coding efficiency = 92%
Complexity of Edge + EG Coding
• The complexity of edge detection combined with EG coding is significantly less than that of video coding using the DCT and motion compensation
• Edge detection algorithm: 4 shifts, 11 adds, and 2 abs() per pixel
• Exp-Golomb code:
  • Codewords can be generated easily with a state machine
  • The number of operations depends on the run-length statistics: each additional bit in the prefix requires an additional 2 shifts and 2 additions
  • In the typical images we have observed, coding takes under 0.2 adds and 0.2 shifts per pixel
• Total: dominated by the edge processing, i.e., 4 shifts, 11 adds, and 2 abs() per pixel
Complexity of Video Coding
• The two most computationally expensive steps of video coding are the DCT and motion compensation calculations
• DCT:
  • For each 8x8 block, a well-optimized implementation uses 536 additions and 192 multiplies
  • For 640x480 there are 80x60 = 4,800 blocks
  • Total of 2,572,800 additions and 921,600 multiplies per frame
  • Average of 8.4 additions and 3 multiplies per pixel, plus associated memory fetches/stores
Complexity Analysis of Video Coding
• Motion search: for each MxM block with a search offset of ±L, we need
  • (2L+1)^2 candidate offsets
  • 2xMxM addition operations per offset
  • MxM memory fetches per offset
• E.g., a 16x16 block with an offset of ±16:
  • 1089 x 2 x 16 x 16 = 557,568 additions
  • 1089 x 16 x 16 = 278,784 memory fetches
  • For 640x480 there are 1,200 blocks
  • Total of 669,081,600 additions per frame
  • Average of 2,178 additions and 1,089 memory fetches per pixel
  • Reducing the search range to an offset of ±8 gives 578 additions and 289 memory fetches per pixel
• Entropy coder:
  • The complexity of H.263's entropy coder depends on the specific image, etc.
  • Huffman coding of DCT coefficients requires additional memory overhead for lookup tables
  • H.264 Main Profile (CABAC) would incur significant extra cost due to arithmetic coding
Comparison
• The energy cost of multiplies relative to adds varies with processor, implementation, etc., but an order of magnitude is a reasonable approximation
• Edge + EG requires 11 adds/pixel (shifts and abs() are disregarded as nearly free)
• Video encoding requires roughly 500 to 2,000 adds/pixel, dominated by motion compensation
• Difference of approximately 2-3 orders of magnitude in energy cost
• The video coding burden can be reduced with sub-optimal fast motion searches, etc., but such reductions still leave edge + EG with an energy advantage of several orders of magnitude (see the sketch below)
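The per-pixel figures above can be reproduced with a few lines of arithmetic; the 10x multiply-to-add energy weighting is the rough approximation stated in the first bullet.

```python
# Reproduce the per-pixel operation counts from the preceding slides.
W, H = 640, 480
pixels = W * H

# DCT: 536 adds and 192 multiplies per 8x8 block.
blocks_dct = (W // 8) * (H // 8)                     # 4,800 blocks
dct_adds = 536 * blocks_dct / pixels                 # ~8.4 adds/pixel
dct_muls = 192 * blocks_dct / pixels                 # 3 multiplies/pixel

# Motion search: M=16 blocks, offset L=16 -> (2L+1)^2 candidate offsets.
M, L = 16, 16
offsets = (2 * L + 1) ** 2                           # 1,089
blocks_ms = (W // M) * (H // M)                      # 1,200 blocks
ms_adds = 2 * M * M * offsets * blocks_ms / pixels   # 2,178 adds/pixel
ms_fetch = M * M * offsets * blocks_ms / pixels      # 1,089 fetches/pixel

# Weight multiplies as ~10x adds (rough, processor-dependent).
video_cost = dct_adds + 10 * dct_muls + ms_adds
edge_cost = 11                                       # edge + EG: ~11 adds/pixel
print(video_cost / edge_cost)                        # roughly 2 orders of magnitude
```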
Hybrid System Representing Object Edges, with Texture in an ROI
• Energy-reducing benefits of the edge-based representation
• Image-quality benefits of traditional video coding within a region of interest
• Example: total frame size 640x480 with a 200x160 ROI
  • Reduces overall energy consumption by approximately an order of magnitude
Secure Arithmetic Coding
• Arithmetic coding is used in many coding algorithms, including JPEG2000 (still image) and H.264 (video)
• Associates a sequence of symbols with a position in the range [0,1)
• Enables high coding efficiency
• Not secure as traditionally implemented
• Recursive partitioning is applied prior to encoding each new symbol
• Every position on [0,1) is associated with a unique symbol string
[Figure: binary partition tree — one symbol: A, B; two symbols: AA, AB, BA, BB; three symbols: AAA, AAB, ABA, ABB, BAA, BAB, BBA, BBB]
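A minimal sketch of the recursive partitioning step for a binary alphabet, mapping a symbol string to its unique subinterval of [0,1); the symbol probability `p_a` is an illustrative parameter.

```python
def ac_interval(symbols, p_a=0.5):
    """Map a string over {A, B} to its subinterval of [0, 1)
    by recursive partitioning, as in conventional arithmetic coding."""
    low, high = 0.0, 1.0
    for s in symbols:
        split = low + p_a * (high - low)
        if s == "A":
            high = split     # A takes the left part of the current interval
        else:
            low = split      # B takes the right part
    return low, high

print(ac_interval("AAB"))    # -> (0.125, 0.25), the AAB leaf of the binary tree
```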
Interval Splitting with One Symbol
• Key k0 identifies where the A interval is to be split
• The portion of the A interval to the right of the key is moved to the right of the B interval; B itself is unchanged
• A representation in the B interval has the same codeword length as in traditional AC
• A now has two subintervals: at most one more bit is needed relative to traditional AC, though in some cases there is no increase in length
[Figure: interval layout before and after splitting with one symbol]
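A sketch of the one-symbol splitting step described above, assuming the conventional layout A = [0, p_a) and B = [p_a, 1) before splitting; the function name and example values are illustrative.

```python
def split_intervals(p_a, k0):
    """One step of key-based interval splitting: the A interval [0, p_a)
    is cut at key k0 < p_a, the right-hand piece of A is relocated to the
    right of B, and B keeps its width (it merely slides left)."""
    assert 0.0 < k0 < p_a < 1.0
    a_left  = (0.0, k0)                    # first piece of A
    b       = (k0, k0 + (1.0 - p_a))       # B, same width as before
    a_right = (k0 + (1.0 - p_a), 1.0)      # second piece of A
    return a_left, b, a_right

print(split_intervals(p_a=0.6, k0=0.45))
# -> ((0.0, 0.45), (0.45, 0.85), (0.85, 1.0)); A's total width is still 0.6
```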
Interval Ordering Diversity
[Figure: resulting interval layouts for three example keys — key = [0.45 0.23], key = [0.85 0.40], key = [0.25 0.78]]
End-to-End Quality Optimization: High-Level Block Diagram
• Transmitter: image input → image coder → channel coder → transmission
• Receiver: timing recovery → channel decoder → image decoder
Timing Recovery
• Symbol timing errors occur at receivers due to clock differences, Doppler effects, etc.
• Traditional method: simple PLL-based circuits (a toy version is sketched below)
• Our approach: data-aided iterative timing recovery
  • Information from the LDPC decoder at each iteration is used to assist synchronization
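For contrast with the data-aided approach, here is a toy first-order early-late timing loop of the "traditional" PLL-based kind mentioned above. The pulse shape, loop gain, and signal model are illustrative assumptions, not the project's receiver design.

```python
import numpy as np

rng = np.random.default_rng(0)
sps = 8                                          # samples per symbol
symbols = rng.integers(0, 2, 4000) * 2.0 - 1.0   # random BPSK symbols
pulse = 1.0 - np.abs(np.arange(sps) - sps / 2) / (sps / 2)  # triangular pulse

up = np.zeros(len(symbols) * sps)
up[::sps] = symbols
tx = np.convolve(up, pulse)                      # shaped transmit waveform
true_offset = 3                                  # unknown timing offset (samples)
rx = np.concatenate([np.zeros(true_offset), tx])
rx += 0.05 * rng.standard_normal(rx.size)        # mild AWGN

tau, gain = 0.0, 0.02
for k in range(1, len(symbols) - 2):
    center = k * sps + sps // 2 + int(round(tau))   # current sampling instant
    early, late = rx[center - 1], rx[center + 1]
    err = (late - early) * np.sign(rx[center])      # early-late timing error
    tau += gain * err                               # first-order loop update
print(round(tau, 2))   # settles near the true offset of 3 samples
```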
Block Diagram
[Figure: timing recovery loop]
Bit Error Rate
[Figure: BER curves without LDPC feedback and with LDPC feedback]
Timing Recovery: Demonstration
• AWGN channel at Eb/N0 = 2 dB with random phase offsets
• Without LDPC feedback: BER = 10^-0.7
• With LDPC feedback: BER = 10^-1.5
Conclusions
• The traditional assumption that "efficient" representation of imagery means maximizing compression needs to be re-examined: maximizing energy efficiency can be more important than maximizing compression efficiency
• Methods are needed to convey scene content that reflect the power, bandwidth, memory, and transmission-reliability characteristics, limitations, and statistics of a given platform and environment
• Alternative scene/object representation methods, aimed specifically at low energy with no attempt at aesthetic quality, hold promise; contrast with (mostly failed) previous attempts at "object-based" coding
Conclusions (continued)
• Low-power imaging sensors, and networks of such sensors, are likely to be critical
• Local processing is critical; realistic collaboration also has potential
• Low-power imaging event detection strategies are needed; the system cannot simply run high-energy image processing continuously while waiting for events that may occur rarely
• An additional challenge is appropriately determining which information to convey, when, to whom, and how to convey it
• Need the proper balance of autonomous and human management of imaging networks, and the proper balance of video vs. still imagery, resolution vs. rate, etc.
• Approximately $20K is forecast to remain as of October 2005