
Multimedia


ron

Presentation Transcript


  1. Multimedia

  2. New Architecture Direction
     • “… media processing will become the dominant force in computer architecture and microprocessor design”
     • “… new media-rich applications … involve significant real-time processing of continuous media streams and make heavy use of vectors of packed 8-, 16-, and 32-bit integer and f.p.”
     • “How Multimedia Workloads Will Change Processor Design,” Diefendorff & Dubey, IEEE Computer (9/97)
     • Needs include high memory bandwidth, high network bandwidth, continuous media data types, real-time response, fine-grain parallelism
     • Also significant focus on system bus performance
       • Common bridge to the memory system and I/O
       • Critical performance component for SMP server platforms

  3. Multimedia Workloads
     • Multimedia
       • Video conferencing
       • Video authoring
       • Animation
       • Games
     • Algorithms
       • Image compression (JPEG)
       • Video compression (MPEG)
       • 3-D graphics
       • Encryption

  4. Multimedia Characteristics
     • Real-time response
       • Video, audio
     • Continuous media data types
       • 8-16 bits sufficient for many applications
     • Data parallelism
       • E.g. apply the same operation to the whole image
       • Vector or SIMD work well here
     • Coarse-grained parallelism
       • E.g. video encoding/decoding, audio encoding/decoding
     • Small loops
       • Most time spent in kernels
       • Amenable to hand-optimization
     • High memory bandwidth
       • Video, 3-D graphics
       • Caches not large enough

  5. Multimedia ISA Extensions
     • HP PA-RISC
       • MAX-2
     • Sun SPARC
       • VIS
     • Intel x86
       • MMX
     • MIPS
       • MDMX
     • PowerPC
       • AltiVec

  6. MMX
     • “MMX Technology Extension to the Intel Architecture,” Alex Peleg and Uri Weiser, IEEE Micro, August 1996
     • Goals
       • Improve performance of multimedia applications
         • Graphics, MPEG video
         • Image processing, speech recognition
       • Remain completely compatible with Intel x86 ISA
       • Minimize cost
     • Approach
       • Use packed data types
       • Exploit SIMD parallelism
       • Make use of existing wide data paths

  7. Data Types and Operands
     • Three fixed-point integer types packed into a 64-bit quadword
       • Packed byte: 8 8-bit bytes
       • Packed word: 4 16-bit words
       • Packed doubleword: 2 32-bit doublewords
     • User-controlled fixed point
     • Eight 64-bit GP registers (mm0-mm7)
       • MMX registers are aliased onto the FPU registers
       • Can’t do FP and MMX at the same time
     • Random access
       • Lesson learned from the FP unit design
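The three packed views of one quadword can be sketched in portable C. This is a minimal illustration, not MMX code: the helper names `lane_byte`, `lane_word`, and `lane_dword` are hypothetical, and lane 0 is assumed to be the least-significant element.

```c
#include <stdint.h>

/* Hypothetical helpers: view one 64-bit quadword as packed lanes.
   Lane 0 is taken to be the least-significant element. */

/* Packed byte view: 8 lanes of 8 bits. */
static uint8_t lane_byte(uint64_t q, int i) {
    return (uint8_t)(q >> (8 * i));
}

/* Packed word view: 4 lanes of 16 bits. */
static uint16_t lane_word(uint64_t q, int i) {
    return (uint16_t)(q >> (16 * i));
}

/* Packed doubleword view: 2 lanes of 32 bits. */
static uint32_t lane_dword(uint64_t q, int i) {
    return (uint32_t)(q >> (32 * i));
}
```

The same 64 bits are reinterpreted by each view; which view applies is decided by the instruction, not by the register.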

  8. MMX Operations
     • 57 MMX instructions work on all data types
     • Support for saturation arithmetic
       • Simplifies handling of underflow and overflow
       • Matches physical behavior
     • Packed operations
       • Addition/subtraction, multiplication, compares, shifts
     • Conversion operations
       • Pack/unpack
     • Performance improvement
       • Fewer loads and stores
       • Fewer arithmetic operations, but more conversion
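To make the saturation point concrete, here is a small portable C sketch that contrasts wraparound and unsigned saturating addition on eight packed bytes. It emulates the behavior lane by lane instead of using MMX; the function names are hypothetical.

```c
#include <stdint.h>

/* Wraparound add on eight packed unsigned bytes:
   each lane keeps only the low 8 bits of its sum. */
static uint64_t add_bytes_wrap(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        unsigned x = (uint8_t)(a >> (8 * i));
        unsigned y = (uint8_t)(b >> (8 * i));
        r |= (uint64_t)((x + y) & 0xFFu) << (8 * i);
    }
    return r;
}

/* Unsigned saturating add: a lane that would overflow
   is clamped to 255 instead of wrapping. */
static uint64_t add_bytes_sat(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        unsigned x = (uint8_t)(a >> (8 * i));
        unsigned y = (uint8_t)(b >> (8 * i));
        unsigned s = x + y;
        if (s > 255u) s = 255u;
        r |= (uint64_t)s << (8 * i);
    }
    return r;
}
```

With pixel values 200 and 100 in one lane, the wraparound version yields 44 (a bright pixel turns dark), while the saturating version yields 255, which is the sense in which saturation matches physical behavior.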

  9. MMX Operations
     • Packed multiply-add, word to doubleword
       • Inputs: A3 A2 A1 A0 and B3 B2 B1 B0
       • Products: A3×B3, A2×B2, A1×B1, A0×B0
       • Result: A3×B3 + A2×B2 | A1×B1 + A0×B0
     • Packed compare greater-than, word
       • Inputs: 51 3 5 23 > 73 2 5 6
       • Result: 00…0 11…1 00…0 11…1
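The two operations on this slide can be sketched as lane-by-lane emulations in portable C. This is an illustration under assumptions, not MMX code: the function names are hypothetical, and array index 0 is simply the first element listed, with no claim about register bit order.

```c
#include <stdint.h>

/* Packed multiply-add, word to doubleword: four signed 16-bit products,
   with adjacent pairs summed into two 32-bit results. The sum is formed
   in 64 bits so the one overflowing corner case is not undefined in C. */
static void packed_madd(const int16_t a[4], const int16_t b[4], int32_t out[2]) {
    for (int i = 0; i < 2; i++) {
        int64_t p0 = (int64_t)a[2 * i] * b[2 * i];
        int64_t p1 = (int64_t)a[2 * i + 1] * b[2 * i + 1];
        out[i] = (int32_t)(uint32_t)(p0 + p1);
    }
}

/* Packed compare greater-than, word: each lane becomes all ones
   where a > b and all zeros otherwise. */
static void packed_cmpgt(const int16_t a[4], const int16_t b[4], uint16_t mask[4]) {
    for (int i = 0; i < 4; i++) {
        mask[i] = (a[i] > b[i]) ? 0xFFFFu : 0x0000u;
    }
}
```

Calling `packed_cmpgt` with the slide's values {51, 3, 5, 23} and {73, 2, 5, 6} gives the masks 0x0000, 0xFFFF, 0x0000, 0xFFFF, matching the 00…0 11…1 00…0 11…1 pattern above.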

  10. Using MMX
     • Assembly language coding
     • Use of libraries
       • E.g. IDCT, DCT, matrix multiply…
     • Use of C macros (“intrinsics”)
       • Generate optimized assembly code
       • Performs register allocation and instruction scheduling
       • MMX64 t0, t1; t0 = padd(t0, t1);
     • Requires intimate knowledge of MMX
     • Could a compiler generate MMX code?

  11. Chroma Keying
     • Weatherman example
     • for (i = 0; i < imagesize; i++) new_image[i] = (x[i] == blue) ? y[i] : x[i];
     • MMX version (mm1 initially holds blue in every byte):
         movq    mm3, mem1   ; load 8 pixels from weatherman
         movq    mm4, mem2   ; load 8 pixels from map
         pcmpeqb mm1, mm3    ; generate select mask
         pand    mm4, mm1    ; AND map with mask
         pandn   mm1, mm3    ; AND weatherman with inverse mask
         por     mm4, mm1    ; OR masked images together
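The compare/AND/AND-NOT/OR pattern of the chroma-key sequence can be sketched in portable C, one pixel per iteration, next to the branching loop it replaces. This is a minimal sketch assuming 8-bit pixels; the function names and the `key` parameter (standing in for blue) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference version: select the map pixel wherever the
   weatherman pixel equals the key color. */
static void chroma_key_scalar(const uint8_t *x, const uint8_t *y,
                              uint8_t *out, size_t n, uint8_t key) {
    for (size_t i = 0; i < n; i++) {
        out[i] = (x[i] == key) ? y[i] : x[i];
    }
}

/* Mask version: same result, written as the compare, AND,
   AND-NOT, OR steps that the MMX sequence applies to 8 pixels at once. */
static void chroma_key_masked(const uint8_t *x, const uint8_t *y,
                              uint8_t *out, size_t n, uint8_t key) {
    for (size_t i = 0; i < n; i++) {
        uint8_t mask = (x[i] == key) ? 0xFF : 0x00;    /* compare for equality */
        uint8_t from_map = (uint8_t)(y[i] & mask);     /* AND map with mask */
        uint8_t from_man = (uint8_t)(x[i] & (uint8_t)~mask); /* AND-NOT */
        out[i] = (uint8_t)(from_map | from_man);       /* OR the two together */
    }
}
```

The mask form has no data-dependent branch in the select step, which is what lets a packed implementation process all lanes of a quadword with the same four instructions.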
