Async Workgroup Update Barthold Lichtenbelt
Goals
• Provide a synchronization framework for OpenGL
• Provide base functionality as defined in NV_fence and GL2_async_core
• Build a framework for future, more complex functionality, some of which is discussed in GL2_async_core
• Initially support CPU <-> GPU synchronization
• Support synchronization across multiple OpenGL contexts
• Resulted in GL_ARB_sync spec
  • Finished April 2006
  • Posted draft to opengl.org for feedback
  • Not quite an official ARB extension yet
Functionality overview
• ARB_sync provides synchronization primitives
  • Can be tested, set and waited upon
  • Specifically, a “Fence Synchronization Object” and corresponding Fence command
• Fence completion allows for partial glFinish
  • All commands prior to the fence are forced to complete before control is returned to the caller
• Fence Sync Objects can be shared across contexts
  • Allows for synchronization of OpenGL command streams across contexts
• New data type: GLtime represents intervals in nanoseconds
  • 64-bit integer, same encoding as UST counter in OpenML
  • Accuracy is implementation dependent; precision is in nanoseconds
If you have used the Windows Event model, this will feel familiar.
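To make the GLtime bullet concrete, here is a minimal sketch of a 64-bit nanosecond interval type. The names `sketch_time_ns`, `sketch_time_from_ms` and `sketch_time_to_ms` are hypothetical and not part of ARB_sync; treating the type as signed is an assumption as well.

```c
#include <stdint.h>

/* Hypothetical stand-in for the GLtime type described on the slide:
   a 64-bit integer counting nanoseconds. Signedness is an assumption. */
typedef int64_t sketch_time_ns;

#define SKETCH_NS_PER_MS INT64_C(1000000)

/* Convert whole milliseconds to a nanosecond interval. */
static sketch_time_ns sketch_time_from_ms(int64_t ms)
{
    return ms * SKETCH_NS_PER_MS;
}

/* Convert a nanosecond interval back to whole milliseconds (truncating). */
static int64_t sketch_time_to_ms(sketch_time_ns t)
{
    return t / SKETCH_NS_PER_MS;
}
```

A signed 64-bit nanosecond count spans roughly 292 years, so overflow is not a practical concern for timeouts expressed this way.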
Synchronization model in ARB_sync 1/2
• A “sync object” is a primitive used for synchronization between CPU and GPU, CPU, or ‘something else’.
  • Sync object has state: type, condition, status
  • A sync object’s status can be signaled or non-signaled
  • When created, status is signaled unless a flag is set, in which case it is non-signaled
• A “fence sync object” is a specific type of sync object
  • Provides partial finish semantics
  • Only type of sync object currently defined
• A “fence” is a token inserted in the GL command stream
  • A sync object is not inserted into the command stream
  • Fence has no state
• A fence is associated with a fence sync object.
  • Multiple fences can be associated with the same sync object
  • When a fence is inserted in the command stream, the status of its sync object is set to non-signaled
  • A fence, once completed, will set the status of its sync object to signaled
Synchronization model in ARB_sync 2/2
• A wait function waits on a sync object, not on a fence
• A poll function polls a sync object, not a fence
• A wait function called on a sync object in the non-signaled state will block. It unblocks when the sync object transitions to the signaled state.
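The model on these two slides can be sketched as a small single-threaded simulation. Everything here (`model_sync`, `model_stream`, `model_fence`, and so on) is a hypothetical illustration, not the ARB_sync API: the command stream is a fixed-size array, and the "GPU" executes one command per call to `model_gpu_step`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names come from ARB_sync. */
typedef enum { MODEL_UNSIGNALED = 0, MODEL_SIGNALED = 1 } model_status;

typedef struct {
    model_status status;        /* the only state that changes in this model */
} model_sync;

#define MODEL_QUEUE_MAX 16

/* A command stream entry: either ordinary work or a fence token.
   The fence carries no state of its own, only a link to its sync object. */
typedef struct {
    model_sync *fence_target;   /* NULL for ordinary rendering commands */
} model_command;

typedef struct {
    model_command items[MODEL_QUEUE_MAX];
    size_t head;                /* next command the simulated GPU executes */
    size_t tail;                /* next free slot */
} model_stream;

/* Create a sync object: signaled unless the caller asks otherwise. */
static model_sync model_create_sync(bool start_unsignaled)
{
    model_sync s = { start_unsignaled ? MODEL_UNSIGNALED : MODEL_SIGNALED };
    return s;
}

/* Queue an ordinary command. Returns false if the stream is full. */
static bool model_draw(model_stream *q)
{
    if (q->tail == MODEL_QUEUE_MAX) return false;
    q->items[q->tail++].fence_target = NULL;
    return true;
}

/* Insert a fence: its sync object becomes unsignaled immediately. */
static bool model_fence(model_stream *q, model_sync *s)
{
    if (q->tail == MODEL_QUEUE_MAX) return false;
    q->items[q->tail++].fence_target = s;
    s->status = MODEL_UNSIGNALED;
    return true;
}

/* Execute one command; a completed fence signals its sync object.
   Returns false when the stream is empty. */
static bool model_gpu_step(model_stream *q)
{
    if (q->head == q->tail) return false;
    model_sync *target = q->items[q->head++].fence_target;
    if (target != NULL) target->status = MODEL_SIGNALED;
    return true;
}

/* Poll the sync object, never the fence. */
static model_status model_poll(const model_sync *s)
{
    return s->status;
}
```

Queuing a draw and then a fence leaves the sync object unsignaled until the second `model_gpu_step` call, which is the partial-finish behavior the slides describe.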
Example – RTT (render to texture) with two contexts
Context A
    sync_objectA = glCreateSync(attrib);
    <render to texture that context B needs>
    glFence(sync_objectA);
    glFlush(); // prevent deadlock
Context B
    glClientWaitSync(sync_objectA, 0, GL_FOREVER);
    glBindTexture(….); // Just rendered
    <render using texture>
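The "prevent deadlock" comment presumably refers to the fence sitting in context A's unflushed command buffer, where it never reaches the GPU while context B waits forever. A minimal sketch of that situation, assuming a simple buffered/submitted split; the `flush_model` type and its functions are hypothetical, and the wait is bounded only so the example terminates.

```c
#include <stdbool.h>

/* Hypothetical model of one context's command path. */
typedef struct {
    int  buffered_fences;   /* fences still in the unflushed client buffer */
    int  submitted_fences;  /* fences the simulated GPU can see */
    bool signaled;          /* status of the shared sync object */
} flush_model;

/* Context A inserts a fence: it lands in the client-side buffer first. */
static void flush_model_fence(flush_model *m)
{
    m->buffered_fences++;
    m->signaled = false;
}

/* Context A flushes: buffered commands become visible to the GPU. */
static void flush_model_flush(flush_model *m)
{
    m->submitted_fences += m->buffered_fences;
    m->buffered_fences = 0;
}

/* The GPU completes everything submitted so far. */
static void flush_model_gpu_run(flush_model *m)
{
    if (m->submitted_fences > 0) {
        m->submitted_fences = 0;
        m->signaled = true;
    }
}

/* Context B waits for at most max_rounds GPU rounds; true if signaled. */
static bool flush_model_wait(flush_model *m, int max_rounds)
{
    for (int i = 0; i < max_rounds && !m->signaled; i++)
        flush_model_gpu_run(m);
    return m->signaled;
}
```

Without the flush, the wait never succeeds no matter how many rounds the simulated GPU runs; after the flush, it succeeds on the first round.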
OS specific functionality
• Convert sync object to the window system native event primitive
  • Allows applications to synchronize all events in a system using one API
  • All operations on <sync> are reflected in OS event and vice-versa
  • Both <sync> and the OS event are valid to use in your code
• On Windows, convert to an Event HANDLE
    wglConvertSyncToEvent(object sync);
  • Need to specify, when sync object is created, that it can be converted to OS event
  • Separate extension: WGL_ARB_sync_event
• On Unix, convert to a file descriptor, X event or semaphore?
  • Still TBD
Possible future functionality
• Add a WaitForMultipleSync(uint *sync_objects, ….) command
  • Synchronize with multiple sync objects at once
• Add a “payload” to a fence
  • For example, the time it completed
• Allow one GPU stream to wait for another GPU stream
  • WaitSync(sync_object);
• A sync object whose status will pulse with every vblank
• A sync object that can signal when data binding has completed
  • As opposed to when rendering has completed using the data
Example – Streaming video processing
• Loop
    Draw frame 1 // To an FBO, for example
    glFence(sync_object1); // Inserts a fence in the command stream
    Draw frame 2
    glFence(sync_object2);
    while (glClientWaitSync(sync_object1, 0, 0) != GL_ALREADY_SIGNALED)
        <Do some useful work> // App uses CPU cycles instead of blocking
    Read back data in frame 1
    while (glClientWaitSync(sync_object2, 0, 0) != GL_ALREADY_SIGNALED)
        <Do some useful work> // App uses CPU cycles instead of blocking
    Read back data in frame 2
Variation with asynchronous read back
• Loop
    Draw frame 1 // To an FBO, for example
    Read back frame 1 into PBO 1 // Asynchronous readback
    glFence(sync_object1); // Inserts a fence in the command stream
    Draw frame 2
    Read back frame 2 into PBO 2
    glFence(sync_object2);
    while (glClientWaitSync(sync_object1, 0, 0) != GL_ALREADY_SIGNALED)
        <Do some useful work> // App uses CPU cycles instead of blocking
    glMapBuffer(…); // Access the data of frame 1 in PBO 1
    while (glClientWaitSync(sync_object2, 0, 0) != GL_ALREADY_SIGNALED)
        <Do some useful work> // App uses CPU cycles instead of blocking
    glMapBuffer(…); // Access the data of frame 2 in PBO 2
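The poll-and-work pattern in these two examples can be sketched without a GPU, assuming each frame is guarded by its own sync object. The `sim_gpu` type and its functions are hypothetical: the fence for each frame completes at a fixed simulated tick, and time advances only while the application does useful work.

```c
#include <stdbool.h>

/* Hypothetical simulated GPU: each frame's fence completes at a fixed tick. */
typedef struct {
    int ticks;             /* simulated time elapsed */
    int frame_done_at[2];  /* tick at which each frame's fence completes */
} sim_gpu;

/* Non-blocking poll, standing in for a wait call with a zero timeout. */
static bool sim_fence_signaled(const sim_gpu *g, int frame)
{
    return g->ticks >= g->frame_done_at[frame];
}

/* One unit of CPU work; simulated time advances while the app is busy. */
static void sim_do_useful_work(sim_gpu *g, int *work_units)
{
    g->ticks++;
    (*work_units)++;
}

/* Poll each frame's own fence in turn, working instead of blocking.
   Returns the number of work units completed while waiting. */
static int sim_process_two_frames(sim_gpu *g)
{
    int work_units = 0;
    for (int frame = 0; frame < 2; frame++) {
        while (!sim_fence_signaled(g, frame))
            sim_do_useful_work(g, &work_units);
        /* This frame's data is now safe to read back or map. */
    }
    return work_units;
}
```

With fences completing at ticks 3 and 5, the application gets five units of work done instead of idling in a blocking wait.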
Differences with GL_NV_fence
• No separation of sync objects and fences in NV_fence
  • NV version only has fence objects
  • Fence object has state
• Creation of sync object and inserting a fence in one command
  • SetFenceNV creates and inserts a fence (old object model)
• NV fence objects are not shared across contexts
API Overview 1/2
• Create a sync attribute object
    object CreateSyncAttrib();
  • SYNC_TYPE has to be FENCE
  • SYNC_CONDITION has to be SYNC_PRIOR_COMMANDS_COMPLETE
  • SYNC_STATUS is SIGNALED or UNSIGNALED
• Create the sync object
    object CreateSync(object attrib);
• Insert a fence, associated with a sync object, into the command stream
    void Fence(object sync);
API Overview 2/2
• Wait or test the status of a fence sync object
    enum ClientWaitSync(object sync, uint flags, time timeout);
  • Blocks until sync is signaled or the timeout expires
  • If timeout == 0, does not block, returns the status of sync
  • If timeout == FOREVER, call does not time out
  • Optionally will flush before blocking
  • Returns one of three values: ALREADY_SIGNALED, TIMEOUT_EXPIRED, CONDITION_SATISFIED
• Signal or unsignal a sync object
    void SignalSync(object sync, enum mode);
  • If status transitions from unsignaled to signaled, ClientWaitSync will unblock
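The slide lists the three return values without saying when each applies. The sketch below assumes the mapping that seems most natural: ALREADY_SIGNALED if the sync object is signaled on entry, CONDITION_SATISFIED if it becomes signaled before the timeout, and TIMEOUT_EXPIRED otherwise. The function `sketch_client_wait` and its simulated-clock parameters are hypothetical, not part of the specification.

```c
#include <stdint.h>

/* Hypothetical result codes mirroring the three values on the slide. */
typedef enum {
    SKETCH_ALREADY_SIGNALED,
    SKETCH_TIMEOUT_EXPIRED,
    SKETCH_CONDITION_SATISFIED
} sketch_wait_result;

/* Hypothetical model: the sync object becomes signaled at signal_time_ns
   on a simulated clock, and the wait call is made at now_ns.
   wait_forever stands in for a FOREVER timeout. */
static sketch_wait_result sketch_client_wait(int64_t now_ns,
                                             int64_t signal_time_ns,
                                             int64_t timeout_ns,
                                             int wait_forever)
{
    if (signal_time_ns <= now_ns)
        return SKETCH_ALREADY_SIGNALED;     /* signaled on entry */
    if (wait_forever || signal_time_ns - now_ns <= timeout_ns)
        return SKETCH_CONDITION_SATISFIED;  /* signaled while waiting */
    return SKETCH_TIMEOUT_EXPIRED;          /* includes zero-timeout polls */
}
```

Under this assumption, a zero-timeout call can only return ALREADY_SIGNALED or TIMEOUT_EXPIRED, which is why the polling loops in the examples compare against ALREADY_SIGNALED.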