ECE 408 Final Project Fall 2013
Parameters • Groups of 3 preferred • Groups of 1-2 possible w/ prior approval • Look for a group on Piazza • 2 Project options • HEVC Intraframe Prediction Competition • A topic from your own research
Competition • Intraframe prediction for HEVC video encoder • http://x265.org/ • Fixed task, groups compete to see who can build the fastest implementation • Evaluation metric will be a weighted mixture of PCIe I/O time and total time • Winning team gets iPads • Sponsored by MultiCoreWare
Intra(frame) Prediction • Part of the H.265 (aka HEVC) video format • Successor to H.264, the most popular current format • Achieves higher PSNR at a lower bitrate by using more computationally expensive methods • Idea: Real video frames exhibit structure • A pixel’s color can be predicted from the color of its neighbors within the same frame (intraframe) or from recent frames (interframe) • Encode a block of pixels as a prediction mode + a residual, or delta, from that prediction • The residual should be smaller than coding pixel values directly (compression)
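A minimal sketch of the "prediction + residual" idea above, using made-up toy pixel values. The function names `residual` and `reconstruct` are hypothetical and are not part of x265 or the course skeleton.

```python
def residual(block, prediction):
    """Per-pixel difference between the actual block and its prediction."""
    return [[a - p for a, p in zip(arow, prow)]
            for arow, prow in zip(block, prediction)]

def reconstruct(prediction, resid):
    """A decoder recovers the block by adding the residual back."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, resid)]

# Toy 2x2 example: a flat prediction of 101 leaves only small deltas to code.
block = [[100, 101], [102, 103]]
prediction = [[101, 101], [101, 101]]
resid = residual(block, prediction)
```

The residual values are close to zero when the prediction is good, which is what makes them cheaper to code than the raw pixels.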
HEVC Intra Prediction Modes • Frames are processed in blocks of 4x4 to 64x64 pixels in (mostly) top-left to bottom-right order • We can use the (previously processed) upper and left neighboring pixels to estimate (predict) the current block of pixels • Video consists of 1 luma and 2 chroma channels (YCbCr color space) • 4:2:0 subsampling means luma has 2x the chroma resolution in both x and y • Prediction is done separately for all 3 channels • Three patterns that are seen a lot in video are flat regions, smooth gradients, and straight edges • We can predict a block of pixels as: • The average of its neighbors (DC) • A smooth gradient based on its neighbors (Planar) • A linear extension of its neighbors in one of 33 directions (Angular) • 35 total modes (up from 9 in H.264: DC + 8 Angular)
DC Mode • Predict that all pixels in the block are the average of the edge pixels of the top and left neighbor blocks • Good at compressing flat regions (one color) • [Figure: Current Block with its Top Neighbor and Left Neighbor; the remaining surrounding blocks are marked Don’t Care]
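The DC rule above can be sketched as follows. This is a simplified illustration, assuming the neighbors arrive as two lists of N pixels and that N is a power of two; it uses integer rounding and omits the edge smoothing that, as far as we know, HEVC applies to small luma blocks. The name `dc_predict` is hypothetical.

```python
def dc_predict(top, left):
    """Fill an NxN block with the rounded average of 2N neighbor pixels.

    top, left: lists of N reference pixels (N assumed to be a power of two).
    """
    n = len(top)
    if len(left) != n:
        raise ValueError("top and left must have the same length")
    shift = n.bit_length()  # equals log2(n) + 1, i.e. a divide by 2n
    dc = (sum(top) + sum(left) + n) >> shift  # +n rounds to nearest
    return [[dc] * n for _ in range(n)]

pred = dc_predict([10, 10, 10, 10], [20, 20, 20, 20])
```

Because every pixel in the block gets the same value, a GPU version only needs one reduction over the neighbors per block.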
Planar Mode • Predict that the block forms a smooth gradient defined by its top and left neighbors • Computed as the average of two linear interpolations (less expensive than bilinear interpolation) • Good at compressing smoothly varying regions • [Figure: Current Block with its Top Neighbor and Left Neighbor; the remaining surrounding blocks are marked Don’t Care]
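A sketch of the "average of two linear interpolations" described above. The formula is written from our recollection of the HEVC planar definition and should be checked against the x265 reference code; in particular, the use of one extra pixel past the top row (`top_right`) and one past the left column (`bottom_left`) as interpolation endpoints is an assumption here. The name `planar_predict` is hypothetical.

```python
def planar_predict(top, left, top_right, bottom_left):
    """Planar prediction for an NxN block (N assumed to be a power of two).

    top, left: N reference pixels each.
    top_right, bottom_left: single reference pixels used as the far
    endpoints of the horizontal and vertical interpolations.
    """
    n = len(top)
    shift = n.bit_length()  # log2(n) + 1: averages the two interpolations
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            horiz = (n - 1 - x) * left[y] + (x + 1) * top_right
            vert = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            pred[y][x] = (horiz + vert + n) >> shift
    return pred

flat = planar_predict([50] * 4, [50] * 4, 50, 50)
ramp = planar_predict([0] * 4, [0] * 4, 80, 80)
```

Each output pixel depends only on its own row and column references, so every pixel can be computed independently.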
Angular Modes • 33 directions • More coverage close to horizontal and vertical • Those directions are more common in real video
Angular Modes • Extend neighbor pixels into current block at specific angle • Good at compressing areas with straight edges • Often need to linearly interpolate between 2 neighbor pixels • Formulated such that it can be done in integer arithmetic
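The integer interpolation mentioned above can be sketched for the vertical-ish modes with non-negative angles (26 to 34). The angle table is quoted from memory of the HEVC specification and should be verified; the modes with negative angles, the horizontal modes, reference-sample filtering, and boundary smoothing are all omitted. The layout of `ref` and the name `angular_predict_vertical` are assumptions made for this illustration.

```python
# Displacement per row in 1/32-pixel units (assumed from the HEVC angle table).
INTRA_PRED_ANGLE = {26: 0, 27: 2, 28: 5, 29: 9, 30: 13,
                    31: 17, 32: 21, 33: 26, 34: 32}

def angular_predict_vertical(ref, n, mode):
    """Predict an NxN block by projecting the top reference row downward.

    ref: 2n + 1 pixels; ref[0] is the top-left corner and ref[1..2n]
    are the top and top-right neighbors.
    """
    angle = INTRA_PRED_ANGLE[mode]
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        offset = (y + 1) * angle
        idx, frac = offset >> 5, offset & 31  # whole and fractional shift
        for x in range(n):
            a = ref[x + idx + 1]
            if frac == 0:
                pred[y][x] = a  # lands exactly on a reference pixel
            else:
                b = ref[x + idx + 2]
                # Two-tap linear interpolation in integer arithmetic.
                pred[y][x] = ((32 - frac) * a + frac * b + 16) >> 5
    return pred

ref = [0, 10, 20, 30, 40, 50, 60, 70, 80]
vertical = angular_predict_vertical(ref, 4, 26)
diagonal = angular_predict_vertical(ref, 4, 34)
slanted = angular_predict_vertical(ref, 4, 30)
```

Mode 26 copies the top row straight down, while mode 34 shifts it by one pixel per row; the modes in between need the weighted two-pixel blend.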
Angular Modes • [Figure: H.264 vs. HEVC comparison, labeled "11% Lower Bitrate"]
SATD • Sum of Absolute Differences (SAD) is a simple way of measuring the disparity between two blocks of pixels • Sum of Absolute Transformed Differences (SATD) applies a Hadamard transform to the differences before summing • More computationally complex than SAD • Correlates better with subjective and objective (PSNR) metrics • SATD on an 8x8 block is commonly called SA8D
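To make the difference between SAD and SATD concrete, here is a small 4x4 sketch. It applies an unnormalized Hadamard transform to the rows and then the columns of the difference block; real encoders usually scale the result (for example by halving it), and that normalization is deliberately left out here. The function names are hypothetical.

```python
def hadamard4(v):
    """Unnormalized 4-point Hadamard transform using two butterfly stages."""
    a, b, c, d = v
    s0, s1, d0, d1 = a + b, c + d, a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]

def sad4x4(block, pred):
    """Sum of absolute pixel differences."""
    return sum(abs(block[y][x] - pred[y][x])
               for y in range(4) for x in range(4))

def satd4x4(block, pred):
    """Sum of absolute Hadamard-transformed differences (no scaling)."""
    diff = [[block[y][x] - pred[y][x] for x in range(4)] for y in range(4)]
    rows = [hadamard4(r) for r in diff]
    cols = [hadamard4([rows[y][x] for y in range(4)]) for x in range(4)]
    return sum(abs(v) for col in cols for v in col)

flat_a = [[5] * 4 for _ in range(4)]
flat_b = [[3] * 4 for _ in range(4)]
spike = [[3] * 4 for _ in range(4)]
spike[0][0] = 4  # a single-pixel error
```

A uniform error concentrates in one transform coefficient, while a single-pixel spike spreads across all sixteen, so SATD penalizes the spike more heavily relative to SAD.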
Your Task • For 4x4, 8x8, 16x16, 32x32, and 64x64 prediction blocks: • Assume the entire frame is a regular grid • For each luma and chroma block: • For each of the 35 prediction modes: • Use reference pixels directly for neighbors (no reconstruction) • Compute predicted pixel values • Compute SATD between prediction and reference pixels • Return a list of <mode, SATD> tuples sorted by SATD (best to worst prediction) • Your kernel may operate on one or multiple frames
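The final ranking step of the task can be sketched on the host side as below. This is only an illustration of the expected output shape: the cost function is passed in as a parameter (a toy SAD is used here in place of SATD to keep the example short), the tie-break by mode number is our assumption, and `rank_modes` is a hypothetical name that does not come from the provided skeleton.

```python
def rank_modes(block, predictions, cost):
    """Return [(mode, cost)] sorted from best to worst prediction.

    predictions: dict mapping mode number -> predicted block.
    cost: function (block, prediction) -> number, e.g. an SATD routine.
    Ties are broken by mode number (an assumption, not a stated rule).
    """
    scored = [(mode, cost(block, pred)) for mode, pred in predictions.items()]
    return sorted(scored, key=lambda item: (item[1], item[0]))

def toy_sad(block, pred):
    """Stand-in cost for the example; the task itself asks for SATD."""
    return sum(abs(a - p) for arow, prow in zip(block, pred)
               for a, p in zip(arow, prow))

block = [[4, 5], [5, 6]]
predictions = {
    0: [[0, 0], [0, 0]],
    1: [[5, 5], [5, 5]],
    26: [[4, 5], [5, 6]],
}
ranked = rank_modes(block, predictions, toy_sad)
```

With only 35 entries per block, the sort itself is cheap; the prediction and SATD computation are where the parallel work lies.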
Infrastructure • We will provide a code skeleton and test harness, as with the labs • We will link to resources with high-level and low-level explanations of intra prediction • The existing serial and vectorized x265 code is also a good reference • Your code should compile cleanly and run on the GEM cluster’s C2050s • We may get a newer (Kepler) evaluation machine
Evaluation • We will measure total prediction time and time for memcpy()s to and from the GPU • Final metric will be a weighted average of total time and I/O time (exact weights TBA) • Each member of the winning team by this metric will receive an iPad
Additional Challenge • Two related challenges, not counted towards the competition or the course grade, are also available: • DCT Primitives • Loop Filters • Teams can win iPads for one of these two challenges if they: • Meet performance standards (TBA) • Perform better than any other team • Meet code quality standards • Contribute code to the open source repository
DCT Primitives • List of Primitives: • Discrete Cosine Transform • Quantization • Dequantization • Inverse Discrete Cosine Transform
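The four primitives listed above form a round trip, which the sketch below illustrates in one dimension. Note the swap: this uses the textbook floating-point orthonormal DCT-II and a plain uniform quantizer, whereas HEVC itself uses integer transform approximations and its own quantization rules. All function names and the step size are assumptions made for illustration.

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II (textbook floating-point form)."""
    n = len(x)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
            for k in range(n)]

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]

def quantize(coeffs, step):
    """Uniform quantization: map each coefficient to an integer level."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    """Scale integer levels back to approximate coefficients."""
    return [level * step for level in levels]

samples = [10, 10, 10, 10]
levels = quantize(dct(samples), step=4)
restored = idct(dequantize(levels, step=4))
```

For a flat input, all energy lands in the first coefficient, so only one nonzero level needs to be coded.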
Loop Filters • Deblocking Filter: • Block coding results in sharp edges in image • DBF removes edges between blocks • Sample Adaptive Offset (SAO) Filter: • Reconstruct original amplitudes using offsets • Band filter: categorize samples into 32 bands • Edge filter: add offsets depending on neighbors
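The two SAO classification steps above can be sketched per sample as follows. The band index takes the top 5 bits of the sample, which gives 32 bands; the edge categories compare a sample with its two neighbors along a chosen direction. The category numbering follows our recollection of the HEVC convention and should be treated as an assumption, as should every function name here.

```python
def sao_band(sample, bit_depth=8):
    """Band filter: classify a sample into one of 32 equal-width bands."""
    return sample >> (bit_depth - 5)

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def sao_edge_category(a, c, b):
    """Edge filter: classify sample c against its neighbors a and b.

    1 = local minimum, 2 = concave corner, 3 = convex corner,
    4 = local maximum, 0 = none (no offset applied).
    """
    s = sign(c - a) + sign(c - b)
    return {-2: 1, -1: 2, 0: 0, 1: 3, 2: 4}[s]

def apply_edge_offset(a, c, b, offsets, bit_depth=8):
    """Add the offset for c's category and clip to the valid range."""
    value = c + offsets[sao_edge_category(a, c, b)]
    return max(0, min((1 << bit_depth) - 1, value))

# Hypothetical offsets: raise local minima, lower local maxima.
offsets = [0, 2, 1, -1, -2]
smoothed = apply_edge_offset(10, 5, 10, offsets)
```

Both classifications are independent per sample, which makes them straightforward to map to one thread per pixel.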
Infrastructure • Infrastructure similar to the competition’s will be provided • Less support than for the competition
Dates • October 31: Project Proposals due • Only for students not doing the competition • Oral in class (5 slides / 10 min) • Week of November 18: Progress Reports • Appointment with course staff (15 min) • December 16: Final Project Presentations • December 18: Final Project Report due