
Cg: A System for Programming Graphics Hardware in a C-like Language

William R. Mark (The University of Texas at Austin), R. Steven Glanville (NVIDIA Corporation), Kurt Akeley (NVIDIA Corporation), Mark J. Kilgard (NVIDIA Corporation).


Presentation Transcript


  1. Cg: A System for Programming Graphics Hardware in a C-like Language • William R. Mark (The University of Texas at Austin) • R. Steven Glanville (NVIDIA Corporation) • Kurt Akeley (NVIDIA Corporation) • Mark J. Kilgard (NVIDIA Corporation) • SIGGRAPH 2003

  2. Cg’s Model of the GPU [Cg Toolkit]

  3. The Graphics Pipeline [Programming Graphics Hardware]

  4. Outline • Introduction • Background • Design Goals • Key Design Decisions • Cg Language Summary • Design Issues • CgFX • System Experiences • Conclusion

  5. Introduction • Graphics architectures are now highly programmable, and support application-specified assembly programs for both vertex processing and fragment processing • The most effective tool for programming these architectures is a high-level language • program portability, improved programmer productivity, easier incremental and interactive program development • particularly valuable for shader programs

  6. Introduction • Cg is a system for programming graphics hardware that supports programs written in a new C-like language, also named Cg

  7. Outline • Introduction • Background • Design Goals • Key Design Decisions • Cg Language Summary • Design Issues • CgFX • System Experiences • Conclusion

  8. The Evolution of GPU Programming Languages [NVIDIA] • C (AT&T, 1970s) • C++ (AT&T, 1980s) • Java (Sun, 1990s) • IRIS GL (SGI, 1982) • RenderMan (Pixar, 1988) • OpenGL (ARB, 1992) • Reality Lab (RenderMorphics, 1994) • Direct3D (Microsoft, 1995) • PixelFlow Shading Language (UNC, 1998) • Real-Time Shading Language (Stanford, 2001) • HLSL (Microsoft, 2002) • Cg (NVIDIA, 2002) • GLSL (ARB, 2003)

  9. Background • In real-time rendering systems, support for user programmability has evolved with the underlying graphics hardware • For many years, mainstream commercial graphics hardware was configurable, but not user programmable • shading systems for this hardware relied on multipass rendering techniques: SGI’s OpenGL shader system [2000] and Quake III’s shading language [1999]

  10. Background • In response to this trend, graphics architects began to incorporate programmable processors into both the vertex-processing and fragment-processing stages of single-chip graphics architectures [2001] • The most recent generation of PC graphics hardware (DirectX 9 or DX9 hardware [2002]) continues the trend of adding programmable functionality to both the fragment and the vertex processors

  11. DX9-class Architectures • Vertex processor • adds conditional branching functionality • Fragment processor • adds flexible support for floating-point arithmetic and computed texture coordinates

  12. Outline • Introduction • Background • Design Goals • Key Design Decisions • Cg Language Summary • Design Issues • CgFX • System Experiences • Conclusion

  13. Design Goals • Ease of programming • programming in assembly language is slow and painful • easy reuse of code • Portability across • hardware from different companies • hardware generations (DX8-class hardware or better) • operating systems (Windows, Linux, MacOS) • major 3D APIs (OpenGL, DirectX)

  14. Design Goals • Complete support for hardware functionality • Performance • Minimal interference with application data • Ease of adoption • Extensibility for future hardware • Support for non-shading uses of the GPU • (some of these goals are in partial conflict with each other)

  15. Outline • Introduction • Background • Design Goals • Key Design Decisions • Cg Language Summary • Design Issues • CgFX • System Experiences • Conclusion

  16. Key Design Decisions • A “general-purpose language”, not a domain-specific “shading language” • A program for each pipeline stage • Permit subsetting of language • Modular system architecture

  17. Domain-specific vs. General-purpose Language • Domain-specific languages • designed around shading computations • General-purpose languages • expose the fundamental capabilities of programmable graphics architectures

  18. “General-purpose Language” • Weighing these options against our design goals led us to develop a hardware-focused general-purpose language • high performance • minimal management of application data • support for non-shading uses of GPUs

  19. Cg Follows C's Philosophy • C succeeded in achieving goals for performance, portability, and generality of CPU programs that were very similar to our goals for a GPU language • Extend and modify C to support GPU architectures effectively → Cg • language follows the syntax and philosophy of C • reserves all C and C++ keywords • selectively uses ideas from C++, Java, RenderMan, RTSL • contemporary C-like shading languages: 3Dlabs and OpenGL ARB (GLSL), Microsoft (HLSL)

  20. Programming Model • Choosing a programming model to layer on top of the stream-processing architecture • RTSL, RenderMan: single program • OpenGL, Direct3D: two separate programs • each program consumes an element of data from one stream, and writes an element of data to another stream • The single-program model is not a natural match for the underlying dual-processor architecture

  21. A Program for Each Pipeline Stage • The user-programmable processors in today's graphics architectures use a stream-processing model • Vertex program: executed once per vertex • Fragment program: executed once per fragment [Programming Graphics Hardware]

  22. A Language for Expressing Stream Kernels • A single language specification for writing a stream kernel (i.e. a vertex program or fragment program) • simplifies and generalizes the language by eliminating most of the distinctions between vertex and fragment programs • Particular processors are then allowed to omit support for some capabilities of the language • e.g. texture lookups: today’s vertex processors don’t support them

  23. A Language for Expressing Stream Kernels • The current Cg system can be thought of as a specialized stream-processing system • the Cg system relies on the established graphics pipeline dataflow of GPUs • it provides no general mechanism for connecting stream-processing kernels together • Cg’s focus is on kernel programming • specialized for stream-kernel programming • could be extended to support other parallel programming models
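
As a rough CPU-side illustration of the stream-kernel model in slides 20 to 23, the Python sketch below applies one kernel per pipeline stage to every element of a stream. It is not Cg and does not describe how a GPU is implemented: `run_kernel`, `vertex_kernel`, `fragment_kernel`, and the one-fragment-per-vertex `rasterize` stand-in are hypothetical names and simplifications chosen for this example.

```python
def run_kernel(kernel, stream):
    """Apply a kernel independently to every element of an input stream."""
    return [kernel(element) for element in stream]

def vertex_kernel(vertex):
    # Executed once per vertex: here, just shift x by 1.
    x, y = vertex
    return (x + 1.0, y)

def rasterize(vertices):
    # Stand-in for the fixed pipeline stage between the two kernels.
    # Assumption: emit exactly one fragment per vertex, unchanged.
    return list(vertices)

def fragment_kernel(fragment):
    # Executed once per fragment: turn a position into a clamped grey level.
    x, y = fragment
    return min(1.0, max(0.0, x * y))

vertices = [(0.0, 0.5), (1.0, 0.25)]
fragments = rasterize(run_kernel(vertex_kernel, vertices))
colors = run_kernel(fragment_kernel, fragments)
print(colors)
```

The kernels never see each other: the only connection between them is the fixed dataflow in the last three lines, which mirrors the point that Cg programs describe kernels while the pipeline supplies the wiring.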

  24. A Data-flow Interface for Program Inputs and Outputs • Should the system allow any vertex program to communicate with any fragment program? • via the rasterizer / interpolator • How should the vertex program outputs and fragment program inputs be defined to ensure compatibility?

  25. A Data-flow Interface for Program Inputs and Outputs • When programming GPUs at the assembly level • the interface between fragment programs and vertex programs is established at the register level • For example: the user can establish a convention for the use of the TEXCOORD3 I/O register • The binding names must be chosen from a predefined namespace with predefined data types

  26. A Data-flow Interface for Program Inputs and Outputs • Cg and HLSL: modified bind-by-name scheme • a predefined namespace is used instead of the user-defined identifier name • provides maximum control over the generated code • Cg also supports a bind-by-position scheme • requires that data be organized in an ordered list • a function-parameter list or a list of structure members • GLSL: pure bind-by-name • not supported by either Cg or HLSL
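
To make the modified bind-by-name scheme concrete, here is a small Python sketch (an emulation, not part of any Cg toolchain) that connects fragment-program inputs to vertex-program outputs through predefined binding names rather than through the identifiers the programmer chose. The function `connect` and the identifiers `tint` and `uv` are hypothetical; the binding names are the ones that appear in these slides.

```python
# Predefined binding names (a small subset, taken from names used in the slides).
PREDEFINED = {"POSITION", "COLOR", "TEXCOORD0", "TEXCOORD3"}

def connect(vertex_outputs, fragment_inputs):
    """Match fragment inputs to vertex outputs by predefined binding name.

    Both arguments map a user-chosen identifier to a binding name.
    Returns {fragment identifier: vertex identifier}.
    """
    by_binding = {}
    for ident, binding in vertex_outputs.items():
        if binding not in PREDEFINED:
            raise ValueError(f"unknown binding name: {binding}")
        by_binding[binding] = ident
    links = {}
    for ident, binding in fragment_inputs.items():
        if binding not in by_binding:
            raise KeyError(f"no vertex output bound to {binding}")
        links[ident] = by_binding[binding]
    return links

vertex_outputs = {"oColor": "COLOR", "oDecalCoord": "TEXCOORD0"}
fragment_inputs = {"tint": "COLOR", "uv": "TEXCOORD0"}
links = connect(vertex_outputs, fragment_inputs)
print(links)
```

Note that `tint` is linked to `oColor` even though the two identifiers differ; under a pure bind-by-name scheme the identifiers themselves would have to match.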

  27. Permit Subsetting of Language • Conflicting goals: portability and comprehensive hardware support • Major differences in functionality between the different graphics architectures that Cg supports • e.g. DX9: floating-point fragment arithmetic • Considered a variety of possible approaches to hiding or exposing these differences • minor architectural differences can be efficiently hidden by the compiler, and Cg does so • major architectural differences cannot be hidden by a compiler → Performance

  28. Permit Subsetting of Language • Cg wanted to support both • the existing installed base of DX8-class hardware • access to the capabilities of the latest hardware • Cg: • exposes major architectural differences as differences in language capabilities • to minimize the impact on portability, Cg exposes the differences using a subsetting mechanism • each processor is defined by a profile • the profile specifies which subset of the full Cg specification is supported on that processor
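
One way to picture the subsetting mechanism is to treat a profile as the set of language capabilities a processor supports and to check a program's needs against it. The Python sketch below does exactly that; the profile names and capability names are invented for illustration and do not correspond to real Cg profile definitions.

```python
# Hypothetical profiles: each lists the language capabilities it supports.
PROFILES = {
    "vertex_example": {"arithmetic"},
    "dx8_fragment_example": {"arithmetic", "texture_lookup"},
    "dx9_fragment_example": {"arithmetic", "texture_lookup",
                             "float_fragment_arithmetic"},
}

def unsupported(required, profile):
    """Return the capabilities a program needs that the profile lacks."""
    return sorted(set(required) - PROFILES[profile])

program_needs = {"arithmetic", "texture_lookup"}
for name in PROFILES:
    missing = unsupported(program_needs, name)
    status = "ok" if not missing else "rejected, missing " + ", ".join(missing)
    print(name, "->", status)
```

The same source program is accepted by some profiles and rejected by others, which is the trade the slides describe: differences stay visible to the programmer instead of being hidden at a performance cost.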

  29. Modular System Architecture

  30. No Mandatory Virtualization • Whether or not to automatically virtualize hardware resources using software-based multi-pass techniques? • Virtualization is not required by the Cg language specification (and is not supported in the current release of Cg) • effective virtualization of this hardware is impossible • it would be too slow to be useful in a real-time application • it conflicts with our design goals (virtualization on current hardware requires global management of application data and hardware resources)

  31. Layered Above an Assembly Language Interface • Whether or not to expose machine / assembly language as an additional interface for system users? • By providing access to the assembly code, the system allows users to • tune their code by studying the compiler output • manually edit the compiler output • even write programs entirely in assembly language • maximize performance

  32. Explicit Program Parameters • All input parameters to a Cg program must be explicitly declared, either • using non-static global variables, or • by including the parameters on the entry function’s parameter list • Cg also provides a set of runtime API routines that allow parameters to be passed using their true names and types

  33. Explicit Program Parameters • The Cg compiler prepends a header to its assembly code output to describe the mapping between program parameters and registers
      #profile arbvp1
      #program simpleTransform
      #semantic simpleTransform.brightness
      #semantic simpleTransform.modelViewProjection
      #var float4 objectPosition : $vin.POSITION : POSITION : 0 : 1
      #var float4 color : $vin.COLOR : COLOR : 1 : 1
      ….
      #var float brightness :: c[0] : 8 : 1
      #var float4x4 modelViewProjection :: c[1], 4 : 9 : 1
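
Because the header is plain text, an application or tool can recover the parameter-to-register mapping by reading the `#var` lines. The Python sketch below shows one minimal way to do that for a few of the lines quoted in slide 33. The slide does not spell out what each colon-separated field means, so the sketch keeps them as an uninterpreted list; `parse_var_lines` is a hypothetical helper, not a Cg runtime routine.

```python
HEADER = """\
#profile arbvp1
#program simpleTransform
#var float4 objectPosition : $vin.POSITION : POSITION : 0 : 1
#var float brightness :: c[0] : 8 : 1
#var float4x4 modelViewProjection :: c[1], 4 : 9 : 1
"""

def parse_var_lines(header):
    """Return {parameter name: (type, remaining colon-separated fields)}."""
    params = {}
    for line in header.splitlines():
        if not line.startswith("#var "):
            continue  # skip #profile, #program, and any other header lines
        fields = [f.strip() for f in line[len("#var "):].split(":")]
        type_name, name = fields[0].split()
        params[name] = (type_name, fields[1:])
    return params

params = parse_var_lines(HEADER)
for name, (type_name, fields) in params.items():
    print(name, type_name, fields)
```

In the quoted lines, the uniform parameters carry constant-register names such as `c[0]` in one of these fields, while the varying input carries `POSITION`.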

  34. Outline • Introduction • Background • Design Goals • Key Design Decisions • Cg Language Summary • Design Issues • CgFX • System Experiences • Conclusion

  35. Example Program • Example Cg program for the vertex processor (float4 is a vector of four floats)
      void simpleTransform(float4 objectPosition : POSITION,
                           float4 color          : COLOR,
                           float4 decalCoord     : TEXCOORD0,
                           out float4 clipPosition : POSITION,
                           out float4 oColor       : COLOR,
                           out float4 oDecalCoord  : TEXCOORD0,
                           uniform float brightness,
                           uniform float4x4 modelViewProjection)
      {
          // Transform the object-space position into clip space.
          clipPosition = mul(modelViewProjection, objectPosition);
          // Scale the vertex color by the uniform brightness.
          oColor = brightness * color;
          // Pass the texture coordinate through unchanged.
          oDecalCoord = decalCoord;
      }
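
For readers without a Cg toolchain, the following Python sketch emulates on the CPU what `simpleTransform` computes for a single vertex. It is only an approximation of the semantics: vectors are plain lists, the matrix is a list of rows, the `out` parameters become return values, and the helper `mul` is written here for the example rather than taken from any library.

```python
def mul(matrix, vector):
    """Matrix-vector product for a 4x4 matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def simple_transform(object_position, color, decal_coord,
                     brightness, model_view_projection):
    # Varying inputs: object_position, color, decal_coord (one set per vertex).
    # Uniform inputs: brightness, model_view_projection (shared by all vertices).
    clip_position = mul(model_view_projection, object_position)
    o_color = [brightness * c for c in color]
    o_decal_coord = list(decal_coord)
    return clip_position, o_color, o_decal_coord

identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
clip, col, uv = simple_transform([1.0, 2.0, 3.0, 1.0],
                                 [1.0, 0.5, 0.25, 1.0],
                                 [0.5, 0.5, 0.0, 1.0],
                                 0.5, identity)
print(clip, col, uv)
```

With an identity matrix the position passes through unchanged, which makes the effect of `brightness` on the color easy to check by eye.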

  36. Other Cg Functionality • Provides structures and arrays, arithmetic operators (+, *, /, etc.), a boolean type and boolean operators (||, &&, !, etc.), increment/decrement (++/--), the conditional operator (?:), and assignment expressions (+=, etc.) • Supports programmer-defined functions (recursive functions are not allowed) • Provides only a subset of C’s control flow constructs: do, while, for, if, break, and continue are supported; goto and switch are not • Doesn’t support pointers or bitwise operations • Supports #include, #define, #ifdef, etc. (matching the C preprocessor)

  37. Outline • Introduction • Background • Design Goals • Key Design Decisions • Cg Language Summary • Design Issues • CgFX • System Experiences • Conclusion

  38. Design Issues • Support for hardware • User-defined interfaces between modules • Other language design decisions • Runtime API

  39. Support for Hardware • The discussion below is organized around the characteristics of GPU hardware • Stream processor • Data types • Indirect addressing • Interaction with the rest of the graphics pipeline • Shading-specific hardware functionality

  40. Stream Processor • A GPU program is executed many times: once for each vertex or fragment • for efficiency, inputs that change with each execution are distinguished from inputs that stay unchanged (the two kinds reside in different register sets) • A GPU language compiler must know the category to which an input belongs before it can generate assembly code

  41. Stream Processor • Terminology for the two kinds of input • varying input • uniform input • Cg uses the uniform qualifier to mark uniform inputs • Computations that depend only on uniform parameters • do not need to be redone for every vertex or fragment
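
The saving described in slide 41 can be sketched in a few lines of Python: a value derived only from a uniform input is computed once, while work that depends on a varying input runs once per element. The names `shade_all` and `light_angle` are hypothetical and chosen only for this illustration.

```python
import math

def shade_all(varying_inputs, light_angle):
    # Depends only on the uniform input: computed once, outside the loop.
    light_scale = math.cos(light_angle)
    # Depends on the varying input: computed once per element.
    return [light_scale * value for value in varying_inputs]

result = shade_all([1.0, 2.0, 4.0], 0.0)
print(result)
```

A compiler can only perform this kind of hoisting if it knows which inputs are uniform, which is why the language needs the distinction declared up front.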

  42. Data Types • Multiple numeric data types • float (32-bit), half (16-bit), fixed (12-bit) • Vector data types and operators • Matrix data types and operations • Integer data types are not supported • Adds a bool data type for conditional operations

  43. Indirect Addressing • Current graphics processors have very limited indirect addressing capability (uniform, sampler) • An array assignment in Cg performs a copy of the entire array • Cg currently forbids the use of pointers • Cg currently forbids recursive function calls • Supports call-by-value-result semantics • expressed with the in and out parameter modifiers
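
Call-by-value-result means that arguments are copied in when a function is called and results are copied back out when it returns, so no pointer or reference to the caller's storage is ever needed. The Python sketch below imitates that behaviour with explicit copies; `call_value_result` and `scale_in_place` are hypothetical helpers, and Python itself does not pass arguments this way.

```python
import copy

def call_value_result(func, in_names, out_names, caller_vars):
    """Call func with copies of the inputs, then copy the outputs back."""
    copies = [copy.deepcopy(caller_vars[name]) for name in in_names]
    results = func(*copies)
    for name, value in zip(out_names, results):
        caller_vars[name] = copy.deepcopy(value)

def scale_in_place(values):
    # Mutates its local copy, then hands it back as an "out" value.
    for i in range(len(values)):
        values[i] *= 2.0
    return (values,)

env = {"src": [1.0, 2.0], "dst": None}
call_value_result(scale_in_place, ["src"], ["dst"], env)
print(env)
```

The caller's `src` array is untouched even though the callee modified its argument, which matches the whole-array copy behaviour the slide describes for array assignment.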

  44. Interaction with the Rest of the Graphics Pipeline • Some of the I/O registers are used to control the non-programmable parts of the graphics pipeline, rather than to pass general-purpose data • The Cg specification mandates that certain register identifiers (e.g. POSITION) be supported as an output by all vertex profiles, and that certain other identifiers be supported by all fragment profiles

  45. Shading-specific Hardware Functionality • The latest generation of graphics hardware includes a variety of capabilities specialized for shading • Chose to expose these capabilities via Cg’s standard library functions • maintains the general-purpose nature of the language • The Cg standard library supports a variety of mathematical, geometric, and specialized functions

  46. User-defined Interfaces Between Modules • The general-purpose solution we chose is adopted from Java and C# • The programmer may define an interface, which specifies one or more function prototypes • The programmer implements the interface by defining a struct (i.e. class) that contains definitions for the interface’s functions
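
The interface mechanism in slide 46 follows the familiar Java and C# pattern, so it can be approximated in Python with an abstract base class. In the sketch below, `Light`, `PointLight`, `AmbientLight`, and `total_light` are all hypothetical names invented for the example; Cg's own syntax for interfaces and structs differs.

```python
from abc import ABC, abstractmethod

class Light(ABC):
    """Interface: function prototypes only (hypothetical example)."""
    @abstractmethod
    def illuminate(self, distance):
        ...

class PointLight(Light):
    """Plays the role of a struct that implements the interface."""
    def __init__(self, intensity):
        self.intensity = intensity
    def illuminate(self, distance):
        # Inverse-square falloff with distance.
        return self.intensity / (distance * distance)

class AmbientLight(Light):
    def __init__(self, intensity):
        self.intensity = intensity
    def illuminate(self, distance):
        # Constant contribution, independent of distance.
        return self.intensity

def total_light(lights, distance):
    # Written against the interface, so any implementation can be plugged in.
    return sum(light.illuminate(distance) for light in lights)

print(total_light([PointLight(8.0), AmbientLight(0.5)], 2.0))
```

The function `total_light` never names a concrete light type, which is the modularity benefit the slide is pointing at.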

  47. Other Language Design Decisions • Function overloading by types and by profiles • Constants are typeless • No type checking for textures

  48. Function Overloading by Types and by Profile • Supports function overloading by data type • the mechanism is similar to C++ (but less complex) • Also permits overloading by profile • it is possible to write multiple versions of a function that are optimized for different architectures • the compiler automatically chooses the version for the current profile

  49. Overloading • Function overloading by hardware profile
      // For ps_1_1 profile, use cubemap to normalize
      ps_1_1 float3 mynormalize(float3 v)
      {
          return texCUBE(norm_cubemap, v.xyz).xyz;
      }
      // For ps_2_0 profile, use stdlib routine to normalize
      ps_2_0 float3 mynormalize(float3 v)
      {
          return normalize(v);
      }
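
Profile overloading amounts to keeping several versions of a function under one name and selecting among them by the active profile. The Python sketch below emulates that selection with a lookup table. The decorator `for_profile` and the function `resolve` are hypothetical, and the `ps_1_1` body is only a coarse stand-in for the cube-map lookup, which cannot be reproduced on the CPU here.

```python
import math

_OVERLOADS = {}

def for_profile(profile):
    """Register the decorated function as the version for `profile`."""
    def register(func):
        _OVERLOADS[(func.__name__, profile)] = func
        return func
    return register

@for_profile("ps_1_1")
def mynormalize(v):
    # Stand-in for the cube-map lookup: a coarse result rounded to 2 digits.
    length = math.sqrt(sum(c * c for c in v))
    return [round(c / length, 2) for c in v]

@for_profile("ps_2_0")
def mynormalize(v):
    # Stand-in for the standard-library normalize routine.
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]

def resolve(name, profile):
    """Pick the version of `name` registered for the given profile."""
    return _OVERLOADS[(name, profile)]

print(resolve("mynormalize", "ps_2_0")([3.0, 0.0, 4.0]))
```

In Cg this choice is made by the compiler at compile time for the selected profile; the runtime dictionary here only imitates the outcome.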

  50. Constants are Typeless • Changes the type promotion rules for constants • C: float x; 2.0*x → double precision • Cg: half y; 2.0*y → half precision • Internally, the new constant promotion rules are implemented by assigning a different type (cfloat or cint) to constants that do not have an explicit type suffix
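
The effect of the typeless-constant rule can be imitated in Python, where the `struct` module's `"e"` format rounds a value to 16-bit half precision. In the sketch below a bare constant multiplied by a `Half` value adopts half precision instead of promoting the result to double. The class `Half` and the helper `to_half` are hypothetical names for this example and say nothing about how the Cg compiler implements `cfloat`.

```python
import struct

def to_half(x):
    """Round a Python float (a double) to the nearest 16-bit half value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

class Half:
    """A value whose arithmetic results are rounded to half precision."""
    def __init__(self, value):
        self.value = to_half(value)
    def __rmul__(self, constant):
        # A bare constant takes on the type of the other operand.
        return Half(to_half(constant) * self.value)

y = Half(0.1)
product = 2.0 * y          # Cg-style rule: the result stays half precision
print(type(product).__name__, product.value)
print(2.0 * 0.1)           # C-style rule: computed in double precision
```

The two printed products differ slightly, because the half-precision path rounds 0.1 to the nearest representable 16-bit value before multiplying.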
