6. Graphics Adapters • Structure of a Graphics Adapter • Color Representation • Video Memory • Graphics Accelerators • 3D Accelerators • Graphics Processing Units • Digital Interfaces for Monitors Input/Output Systems and Peripheral Devices (06-2)
Graphics Processing Units • Graphics Processing Units • Overview • GPGPU Computing • The CUDA Architecture • The Fermi GPU Architecture
Overview (1) • GPU – Graphics Processing Unit • Dedicated graphics processors for PCs, workstations, and game consoles • Initially used to accelerate the rendering stage for 3D graphics (e.g., texture mapping) • Later also used to accelerate the geometric computations (rotation, translation) • GPUs contain shader units and modules for texture mapping, anti-aliasing, etc.
Overview (2) • Vertex shader units • Transform the 3D position of each vertex to the 2D coordinates on the screen and to the depth value for the z-buffer • Modify the attributes of vertices: position, color, texture coordinates • Geometry shader units • Generate geometric figures or add volumetric details to objects
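The position transform performed by a vertex shader unit can be sketched in Python. This is only an illustration under stated assumptions: a simple pinhole perspective projection, and a hypothetical function name `project_vertex`, focal length, and screen size that do not come from the slides. Real GPUs use 4x4 matrices and a normalized depth range.

```python
def project_vertex(x, y, z, focal=1.0, width=640, height=480):
    """Map a camera-space vertex (z > 0) to pixel coordinates plus a depth value.

    Assumption: pinhole projection; focal, width, height are illustrative.
    """
    if z <= 0:
        raise ValueError("vertex must be in front of the camera (z > 0)")
    # Perspective divide: farther points move toward the screen center.
    ndc_x = focal * x / z
    ndc_y = focal * y / z
    # Map normalized coordinates in [-1, 1] to pixel coordinates.
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height  # screen y grows downward
    # z is kept unchanged as the depth value for the z-buffer.
    return screen_x, screen_y, z


center = project_vertex(0.0, 0.0, 2.0)   # a vertex on the optical axis
corner = project_vertex(1.0, 1.0, 1.0)   # a vertex at the top-right edge
print(center, corner)
```

A vertex on the optical axis lands at the center of the assumed 640 x 480 screen, and its z value is what the z-buffer would later compare against.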
Overview (3) • Pixel/fragment shader units • Determine the color, z depth, and alpha value for each pixel or fragment • Unified shader units • Programmable units • Able to perform various shading operations (vertex, geometry, pixel) • GPUs contain an array of computing units and a unit that distributes the operations to be performed
Overview (4) • The architecture with programmable units allows a more flexible use of the hardware resources • The programmable units can also be used for other types of computations • A flexible parallel architecture is obtained • GPUs also include modules for 2D acceleration, MPEG compression, high-definition video decoding
Overview (5) • GPUs can be dedicated or integrated • Dedicated GPUs • Used in graphics cards interfaced with the motherboard via a PCI Express bus or AGP (Accelerated Graphics Port) interface • Have dedicated memory for the card's own use • Examples • AMD Radeon HD 8xxx (e.g., 8970) • NVIDIA GeForce GTX 7xx (e.g., 780)
Overview (6) • Integrated GPUs • Are integrated into a chipset or processor • Use a portion of the system memory • Have lower performance compared to dedicated GPUs • Examples • Intel HD Graphics (e.g., HD Graphics 4600) • AMD Radeon HD 8xxx in APU (Accelerated Processing Unit) processors • NVIDIA in Tegra 4 and Tegra 4i processors
Overview (7) • The design of GPUs was influenced by the 2D and 3D programming interfaces • Implement API functions in hardware • OpenGL (Open Graphics Library) • For various platforms and languages • Functions to draw 3D scenes from primitives • Direct3D (component of DirectX) • Only for Microsoft operating systems • Low-level interface to the 3D hardware functions
Overview (8) • Technologies for connecting multiple GPUs on different graphics cards • NVIDIA: SLI (Scalable Link Interface) • 2 to 4 identical graphics cards are connected via a motherboard (PCIe x16) • AMD: CrossFireX • Up to 4 graphics cards can be connected • The graphics cards do not have to be identical • The cards have external connectors
GPGPU Computing (1) • GPGPU (General Purpose computing on GPU) • The GPU processing cores provide massive floating-point (FP) computational power • Example: a single NVIDIA Tesla K40 GPU (2,880 cores) achieves a peak of 4.29 TFLOPS in single precision • The graphics pipeline can also be used for general-purpose applications • The performance can be orders of magnitude higher than that of conventional CPUs
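The peak figure quoted for the Tesla K40 can be related to the core count with a short calculation. The sketch below assumes that each core completes one fused multiply-add per clock cycle, counted as 2 floating-point operations; that factor and the function name `implied_clock_mhz` are assumptions, not values from the slides.

```python
CORES = 2880                    # from the slide
PEAK_TFLOPS = 4.29              # from the slide
FLOPS_PER_CORE_PER_CYCLE = 2    # assumption: one fused multiply-add = 2 FLOPs


def implied_clock_mhz(peak_tflops, cores, flops_per_cycle):
    """Core clock (in MHz) that would yield the given peak throughput."""
    flops = peak_tflops * 1e12
    return flops / (cores * flops_per_cycle) / 1e6


clock = implied_clock_mhz(PEAK_TFLOPS, CORES, FLOPS_PER_CORE_PER_CYCLE)
print(f"implied core clock: {clock:.0f} MHz")
```

Under this assumption the implied core clock is roughly 745 MHz, which shows that the peak number comes from the very large core count rather than from a high clock frequency.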
GPGPU Computing (2) • GPUs can process independent vertices and pixels/fragments, so they can act as stream processors • Stream: set of records that require similar computation • Kernel function: applied to each element in the stream • Shared memories cannot be used • Ideal GPGPU applications: large data sets, high parallelism, reduced dependencies
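The stream/kernel model can be sketched in a few lines of Python. The kernel `saxpy_kernel` and the helper `run_kernel` are hypothetical names chosen for illustration; a GPU would apply the kernel to many records at the same time, whereas this sketch does it sequentially.

```python
def saxpy_kernel(record, a=2.0):
    """Kernel function: computes a*x + y for one record of the stream."""
    x, y = record
    return a * x + y


def run_kernel(kernel, stream):
    """Apply the kernel to every record.

    No record depends on another one, so the iterations could run
    in parallel on a stream processor.
    """
    return [kernel(record) for record in stream]


stream = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
result = run_kernel(saxpy_kernel, stream)
print(result)
```

Because the kernel reads only its own record and writes only its own output, no shared state is needed, which matches the "reduced dependencies" requirement listed above.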
GPGPU Computing (3) • Disadvantages of GPGPU computing: • The programmer needs to be familiar with the graphics APIs and the GPU architecture • Problems need to be expressed in terms of coordinates, textures, shader functions • Graphics APIs and shading languages must be used: OpenGL, DirectX, Cg • API extensions for running some program functions on the GPU's processors: CUDA (NVIDIA), OpenCL (Khronos Group)
The CUDA Architecture (1) • CUDA (Compute Unified Device Architecture) • Software and hardware architecture • Enables GPUs to execute programs written in C, C++, Fortran, and OpenCL • Allows the use of Microsoft's DirectCompute API • Allows direct access to the GPU resources for general-purpose computing • Exploits the GPU's capability to operate on large matrices in parallel
The CUDA Architecture (2) • A CUDA program calls kernel functions executed by threads • Threads are organized into blocks and groups of blocks (grids) • Thread block: • Set of concurrent threads • Communicate via a shared memory • Each thread has an identifier, registers, private memory, inputs, outputs
The CUDA Architecture (3) • Grid of blocks: • Group (array) of thread blocks • The blocks execute the same kernel function • Ensures synchronization between dependent kernel functions • Results are shared in a global memory allocated to an application, after global synchronization
The CUDA Architecture (4) • [Figure]
The CUDA Architecture (5) • The hierarchy of threads is executed on a hierarchy of processors on the GPU • Threads: executed by CUDA cores and other execution units • Thread blocks: executed by a streaming multiprocessor (SM) • Group of 32 threads: warp • Grids of blocks: executed by the GPU
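The thread, block, and grid hierarchy can be emulated sequentially in Python to show how each thread finds its own data element. This is a sketch, not CUDA code: the helper names `launch`, `global_thread_id`, and `warps_per_block`, as well as the block size of 256, are assumptions made for illustration. The index formula mirrors the usual CUDA idiom `blockIdx.x * blockDim.x + threadIdx.x` for a one-dimensional grid.

```python
WARP_SIZE = 32  # from the slide: a warp is a group of 32 threads


def global_thread_id(block_idx, block_dim, thread_idx):
    """Unique index of a thread within a one-dimensional grid."""
    return block_idx * block_dim + thread_idx


def warps_per_block(block_dim):
    """Number of warps needed to hold all threads of one block."""
    return (block_dim + WARP_SIZE - 1) // WARP_SIZE


def launch(kernel, grid_dim, block_dim, *args):
    """Sequentially emulate a kernel launch.

    On a real GPU the blocks are distributed to the SMs and the
    threads of a block run concurrently.
    """
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(global_thread_id(block_idx, block_dim, thread_idx), *args)


def vector_add(i, a, b, out):
    """Kernel function: each thread adds one pair of elements."""
    if i < len(out):  # guard: the grid may contain more threads than elements
        out[i] = a[i] + b[i]


n = 1000
a = list(range(n))
b = [2 * v for v in a]
out = [0] * n
block_dim = 256                                # assumed block size
grid_dim = (n + block_dim - 1) // block_dim    # enough blocks to cover n
launch(vector_add, grid_dim, block_dim, a, b, out)
print(grid_dim, warps_per_block(block_dim), out[:4])
```

The grid is rounded up to a whole number of blocks, which is why the kernel needs the bounds check before writing its result.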
The Fermi GPU Architecture (1) • Used by NVIDIA's graphics processing units • GeForce 400, 500, 600 series: for desktop computers • Quadro 4000, 5000, 6000 series: for workstations • Tesla C2050, C2070, C2075 series: for high-performance computers • Tesla S2050, M2050, M2070, M2090 series: for supercomputers
The Fermi GPU Architecture (2) • Contains up to 512 CUDA cores • 16 streaming multiprocessors x 32 cores • Each core executes one integer or floating-point instruction per clock cycle • Six 64-bit memory partitions • 384-bit interface • Up to 6 GB of GDDR5 DRAM memory • PCI Express interface to the CPU • GigaThread global scheduler
The Fermi GPU Architecture (3) • [Figure]
The Fermi GPU Architecture (4) • Each CUDA core contains: • Integer arithmetic and logic unit • Floating-point unit compliant with IEEE 754-2008 • Fused multiply-add instruction: more accurate than performing the operations separately • Can perform double-precision operations • Each SM contains 16 Load/Store units • Each SM contains 4 special-function units (SFUs) for transcendental functions
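The accuracy benefit of a fused multiply-add comes from rounding once instead of twice. The Python sketch below emulates this in double precision: the fused version computes the exact result with `fractions.Fraction` and rounds a single time when converting back to `float`. The function names and the test values are chosen for illustration and are not taken from the slides; the hardware performs the same idea directly in its floating-point unit.

```python
from fractions import Fraction


def separate_multiply_add(a, b, c):
    """a*b is rounded to a double, then the sum is rounded again."""
    return a * b + c


def fused_multiply_add(a, b, c):
    """Emulated FMA: exact product and sum, followed by a single rounding."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# a*b equals 1 - 2**-60 exactly, which is too close to 1.0 for a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

separate = separate_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)
print(separate, fused)
```

With these inputs the separately rounded product becomes exactly 1.0, so the small term is lost before the addition, while the fused version preserves it.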
The Fermi GPU Architecture (5) • [Figure]
Digital Interfaces for Monitors • Digital Interfaces for Monitors • DVI • HDMI • DisplayPort
DVI (1) • DVI – Digital Visual Interface • Developed by DDWG (Digital Display Working Group) • Intended for liquid crystal monitors and digital projectors • Based on the PanelLink technology of Silicon Image: a serial interface for uncompressed digital video data • Partially compatible with HDMI (digital mode) and VGA (analog mode) interfaces
DVI (2) • Contains signals for a DDC (Display Data Channel) between the monitor and computer • Implemented with the ACCESS.bus serial bus (based on I2C) • DDC2 provides bidirectional communication between the monitor and computer • Allows for automatic system configuration • The format of the configuration data is defined by the EDID (Extended Display Identification Data) standard; the data is stored in an EDID EPROM
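A host that reads configuration data over the DDC typically checks it before use. The sketch below validates one 128-byte EDID base block using two well-known properties of the format: a fixed 8-byte header and a final checksum byte that makes the sum of all 128 bytes a multiple of 256. The function names and the synthetic block are illustrative assumptions; the block is not data from a real monitor, and reading it over I2C is outside the scope of this sketch.

```python
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
EDID_BLOCK_SIZE = 128


def is_valid_edid_block(block):
    """Check the length, the fixed header, and the checksum of a base block."""
    return (
        len(block) == EDID_BLOCK_SIZE
        and block[:8] == EDID_HEADER
        and sum(block) % 256 == 0
    )


def make_synthetic_block():
    """Build an otherwise empty block with a correct header and checksum."""
    block = bytearray(EDID_BLOCK_SIZE)
    block[:8] = EDID_HEADER
    block[127] = (-sum(block[:127])) % 256  # checksum byte
    return bytes(block)


good = make_synthetic_block()
bad = bytearray(good)
bad[20] ^= 0x01  # flip one bit to simulate a transmission error
print(is_valid_edid_block(good), is_valid_edid_block(bytes(bad)))
```

A single flipped bit changes the byte sum, so the corrupted copy is rejected even though its header is still intact.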
DVI (3) • The TMDS Protocol • Transition Minimized Differential Signaling • Developed by Silicon Image • Differential signaling is used • Minimizes the number of signal transitions from 1 to 0 and conversely by using 8b/10b encoding • A TMDS link consists of a TMDS transmitter and a TMDS receiver
DVI (4) • [Figure]
DVI (5) • The TMDS transmitter contains three identical encoders • The inputs of each encoder are 8 bits for pixel data and 2 control bits • In each clock cycle, the encoder generates a 10-bit character: • From the 8 data bits, or • From the 2 control bits • The output of each encoder is a continuous stream of serialized TMDS characters
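How an encoder reduces transitions in the 8 data bits can be sketched in Python. The code follows the commonly described first stage of the TMDS encoder, in which each output bit is the XOR or XNOR of the previous output bit and the next input bit, and a ninth bit records which operation was used. The second stage, which adds the tenth bit for DC balancing, is deliberately omitted, and the control-character path is not shown. All function names are hypothetical.

```python
def tmds_minimize_transitions(byte):
    """Return the 9-bit intermediate word (list of bits, index 0 = LSB)."""
    d = [(byte >> i) & 1 for i in range(8)]
    ones = sum(d)
    # XNOR is chosen when the byte has many ones, XOR otherwise.
    use_xnor = ones > 4 or (ones == 4 and d[0] == 0)
    q = [d[0]]
    for i in range(1, 8):
        bit = q[i - 1] ^ d[i]
        q.append(bit ^ 1 if use_xnor else bit)
    q.append(0 if use_xnor else 1)  # bit 8 tells the receiver which was used
    return q


def tmds_restore(q):
    """Inverse of the first stage, as a receiver would apply it."""
    d = [q[0]]
    for i in range(1, 8):
        bit = q[i] ^ q[i - 1]
        d.append(bit ^ 1 if q[8] == 0 else bit)
    return sum(bit << i for i, bit in enumerate(d))


def transitions(bits):
    """Count 0-to-1 and 1-to-0 changes between neighboring bits."""
    return sum(bits[i] != bits[i + 1] for i in range(len(bits) - 1))


raw = 0b01010101
raw_bits = [(raw >> i) & 1 for i in range(8)]
encoded = tmds_minimize_transitions(raw)
print(transitions(raw_bits), transitions(encoded[:8]))
```

The alternating input pattern has 7 transitions in its raw form, while the first 8 bits of the encoded word have only 3, and `tmds_restore` recovers the original byte.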
DVI (6) • Maximum pixel clock frequency: 165 MHz • The binary data rate of a TMDS channel: 10 x pixel clock frequency • For a TMDS link: 3 x 1.65 = 4.95 Gbits/s • Maximum pixel rate: 165 megapixels/s, i.e., 2.75 megapixels/frame at 60 Hz • Maximum resolution: 1920×1440 (4:3) or 2048×1152 (16:9) at 60 Hz • Increasing the resolution: dual TMDS link • The connector contains pins for two links
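The single-link figures above follow from the pixel clock by simple arithmetic, worked through here in Python. The constant names are illustrative; the values are the ones stated on the slide, and the per-frame pixel budget ignores blanking intervals.

```python
PIXEL_CLOCK_HZ = 165_000_000      # maximum pixel clock, from the slide
BITS_PER_TMDS_CHARACTER = 10      # each 8-bit value is sent as 10 bits
CHANNELS_PER_LINK = 3             # three TMDS data channels per link
REFRESH_HZ = 60

# Serial bit rate of one TMDS channel and of a complete link.
channel_rate = PIXEL_CLOCK_HZ * BITS_PER_TMDS_CHARACTER
link_rate = channel_rate * CHANNELS_PER_LINK

# Pixel budget per frame at 60 Hz (blanking intervals ignored).
pixels_per_frame = PIXEL_CLOCK_HZ // REFRESH_HZ

print(f"channel: {channel_rate / 1e9:.2f} Gbit/s")
print(f"link:    {link_rate / 1e9:.2f} Gbit/s")
print(f"budget:  {pixels_per_frame / 1e6:.2f} megapixels/frame")
```

Of the 4.95 Gbits/s carried by a link, only 8 of every 10 bits are pixel data, so the usable payload is lower than the raw link rate.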
DVI (7) • [Table: Maximum resolutions supported by DVI]
DVI (8) • Types of connectors • DVI-I (DVI-Integrated): contains the digital signals for a single- or dual-link and the analog signals (a) • DVI-D (DVI-Digital-only): contains only the digital signals (b)
HDMI (1) • HDMI – High-Definition Multimedia Interface • Audio/video interface for uncompressed digital data • For connecting A/V sources to computer monitors, digital TVs, digital audio devices • Enables sending over a single cable: • Various TV and PC video formats • Up to 8 digital audio data streams • Auxiliary data and control information
HDMI (2) • Uses the TMDS protocol • HDMI signals are electrically compatible with the DVI signals, so a passive adapter is sufficient • Video period: for the pixels of an active video line (8b/10b); includes horizontal and vertical blanking intervals • Data period: for audio and auxiliary data packets (4b/10b), e.g., audio mute, color depth, color space • Control period: between video and data periods
HDMI (3) • Version 1.0 (2002) • Maximum bandwidth of 4.95 Gbits/s (165 MHz), allowing a resolution of 1920×1200 (WUXGA) at 60 Hz • Version 1.1 (2004) • Supports the DVD Audio format • Version 1.2 (2005) • Supports the SACD (Super Audio CD) format • Allows PC applications to only support the RGB color space
HDMI (4) • Version 1.3 (2006) • Bandwidth of 10.2 Gbits/s (340 MHz), allowing a resolution of 2560×1600 (WQXGA) at 60 Hz • Supports video images with more colors: 30, 36, or 48 bits/pixel (Deep Color, optional) • Supports the Dolby TrueHD and DTS-HD Master Audio formats (optional) • Two types of cables: • Category 1: up to 74.25 MHz (720p or 1080i) • Category 2: up to 340 MHz (1080p or more) • A smaller connector: type C
HDMI (5) • Version 1.4 (2009) • The same bandwidth • Resolutions of 4K2K: 3840×2160p (Quad HD) at 24, 25, or 30 Hz; 4096×2160p at 24 Hz • HDMI Ethernet channel (100 Mbits/s) • Audio return channel (ARC) • Stereoscopic 3D formats • Micro HDMI connector (type D) • Automotive connection system
HDMI (6) • Version 1.4a (2010) • Specifies two new mandatory 3D formats • Version 1.4b (2011) • Support for a resolution of 1920×1080p at 120 Hz • The HDMI Forum (www.hdmiforum.org) was created in 2011 • Version 2.0 (2013) • Bandwidth increased to 18 Gbits/s, allowing 4K2K resolutions at 60 Hz
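A rough calculation shows why 4K2K at 60 Hz had to wait for the higher bandwidth of version 2.0. The sketch below compares the active pixel rate of 3840×2160 with the maximum TMDS clock of each version. It assumes three TMDS channels with 10 bits per character, as described for DVI, and it ignores blanking intervals, so real pixel clocks are somewhat higher than the values computed here. The function names are illustrative.

```python
BITS_PER_TMDS_CHARACTER = 10
CHANNELS_PER_LINK = 3


def max_tmds_clock_hz(bandwidth_bits_per_s):
    """TMDS clock implied by a total link bandwidth."""
    return bandwidth_bits_per_s // (BITS_PER_TMDS_CHARACTER * CHANNELS_PER_LINK)


def active_pixel_rate(width, height, refresh_hz):
    """Pixels per second for the visible area only (no blanking)."""
    return width * height * refresh_hz


clock_v1_4 = max_tmds_clock_hz(10_200_000_000)   # versions 1.3 and 1.4
clock_v2_0 = max_tmds_clock_hz(18_000_000_000)   # version 2.0

rate_30 = active_pixel_rate(3840, 2160, 30)
rate_60 = active_pixel_rate(3840, 2160, 60)

print(f"1.4 clock: {clock_v1_4 / 1e6:.0f} MHz, 2.0 clock: {clock_v2_0 / 1e6:.0f} MHz")
print(f"4K at 30 Hz: {rate_30 / 1e6:.1f} Mpixels/s")
print(f"4K at 60 Hz: {rate_60 / 1e6:.1f} Mpixels/s")
```

The 30 Hz rate fits below the 340 MHz limit of versions 1.3 and 1.4, while the 60 Hz rate exceeds it and fits only under the clock implied by 18 Gbits/s.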
HDMI (7) • HDMI connections • Single-link: pixel rate of 25 MHz to 340 MHz • Dual-link: pixel rate of 25 MHz to 680 MHz • Audio formats • Uncompressed audio: PCM (Pulse Code Modulation) • Sampling rates: 32; 44.1; 48; 96; 192 kHz • Sample sizes: 16, 20, or 24 bits • Compressed audio: Dolby Digital, DTS • Lossless compressed audio: Dolby TrueHD, DTS-HD Master Audio
HDMI (8) • Video formats • Color spaces: RGB, YCbCr, xvYCC (optional) • YCbCr: Y is luminance and synchronization; Cb and Cr are chroma (Cb = B - Y, Cr = R - Y) • xvYCC: chroma values may correspond to negative RGB values, giving more saturated colors • Deep Color option: 10 bits, 12 bits, or 16 bits per color component • 12 bits per color component: 68.7 billion colors
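The color counts for the Deep Color option follow directly from the number of bits per component. The short Python sketch below assumes three color components per pixel; the function name `color_count` is illustrative.

```python
def color_count(bits_per_component, components=3):
    """Number of distinct colors for a given depth per color component."""
    return 2 ** (bits_per_component * components)


for bits in (8, 10, 12, 16):
    total_bits = bits * 3
    print(f"{bits} bits/component = {total_bits} bits/pixel: "
          f"{color_count(bits):,} colors")
```

For 12 bits per component this gives 2^36 = 68,719,476,736 colors, which is the 68.7 billion quoted above, compared with about 16.7 million for the usual 8 bits per component.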
HDMI (9) • CEC (Consumer Electronics Control) • One-wire bidirectional serial bus used to transfer remote control commands • One Touch Play, System Standby, Tuner Control • The user can control several devices connected through HDMI with a single remote control • Devices can command each other without user intervention • Alternative names: Anynet+ (Samsung), BRAVIA Link (Sony), EasyLink (Philips)
HDMI (10) • Connectors • Type A: 19 pins, single-link connection • Type B: 29 pins, dual-link connection • Type C: mini-connector, 19 pins; can be connected to a Type A connector • Type D: micro-connector, 19 pins (similar to micro-USB) • Type E: for automobiles
DisplayPort • DisplayPort • Overview • DisplayPort Architecture • Embedded DisplayPort (eDP)
Overview (1) • Developed by VESA (Video Electronics Standards Association) • Intended to replace the DVI and VGA interfaces, and the LVDS (Low-Voltage Differential Signaling) protocol • DisplayPort and HDMI interfaces may coexist in consumer electronics devices • Versions of DisplayPort specifications • Version 1.0: published in 2006 • Versions 1.1 and 1.1a: published in 2007 • Version 1.2: published in 2009