DirectX® And Streaming Video Drivers • Jeff Noyle, Development Lead • Gary Sullivan, Software Design Engineer • William Messmer, Software Design Engineer • Eric Rudolph, Software Design Engineer • Microsoft Corporation
Speakers • “DirectX Graphics Drivers,” Jeff Noyle, Lead Developer, DirectDraw®/Direct3D®, Microsoft Corporation • “DirectX VA Video Acceleration Drivers,” Gary Sullivan, Software Design Engineer, DMD Video Services Group, Microsoft Corporation • “Writing AVStream Minidrivers for Windows® XP,” William Messmer, Software Design Engineer, Digital Audio-Video, Microsoft Corporation • “Testing Your WDM Driver with DirectShow®,” Eric Rudolph, SDE, DirectShow Editing Services, Microsoft
DirectX Graphics Drivers • Jeff Noyle, Development Lead, DirectDraw/Direct3D, Microsoft Corporation
Prerequisites • I’m assuming • Basic familiarity with DirectDraw and Direct3D concepts: • System Architecture • Surfaces • Page flipping • The DDK can be hard to read
Agenda • Single-source issues • Windows 9x issues • OS-independent issues • DirectX 7.0 implementation details • Changes in DirectX 8.0 • What can you do next?
Single-Source Issues • Stuff you should know if you want one code-base to support Windows 9x OS versions and Windows NT® OS versions
Allocating System Memory Per-Surface • (Do NOT use this process to allocate surface memory itself...See later) • Normally system memory is charged against a particular process • Can’t free it in some other process (as in ctrl-alt-del mechanism) • Use EngAllocPrivateUserMem and EngFreePrivateUserMem • Uses DirectDraw object to locate proper process context
YUV/FOURCC Surfaces • System memory YUV/FOURCC surfaces on NT systems • DirectDraw kernel mode “pretends” that these surfaces are 8bpp RGB for the purposes of allocating memory • DXTn: • Height: height in 4x4 blocks • Width: width in blocks * sizeof(block) • You must undo these transformations at CreateSurface time
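The DXTn transformation above can be sketched as a pair of helpers. This is a minimal illustration, not DDK code: `SketchDims` and the `sketch_` functions are hypothetical names, and the block sizes (8 bytes for DXT1, 16 bytes for DXT2 through DXT5) are the standard sizes of a 4x4 compressed block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type; not a DDK structure. */
typedef struct {
    uint32_t width;
    uint32_t height;
} SketchDims;

/* Bytes per 4x4 texel block: 8 for DXT1, 16 for DXT2 through DXT5. */
static uint32_t sketch_dxt_block_bytes(int dxt_n)
{
    assert(dxt_n >= 1 && dxt_n <= 5);
    return dxt_n == 1 ? 8u : 16u;
}

/* Texel dimensions -> the "8bpp" dimensions used for allocation:
 * width is bytes per row of blocks, height is the number of block rows. */
static SketchDims sketch_dxt_to_kernel(SketchDims texels, int dxt_n)
{
    SketchDims k;
    k.width  = ((texels.width + 3u) / 4u) * sketch_dxt_block_bytes(dxt_n);
    k.height = (texels.height + 3u) / 4u;
    return k;
}

/* Undo the transformation, as the driver must at CreateSurface time. */
static SketchDims sketch_kernel_to_dxt(SketchDims k, int dxt_n)
{
    SketchDims t;
    t.width  = (k.width / sketch_dxt_block_bytes(dxt_n)) * 4u;
    t.height = k.height * 4u;
    return t;
}
```

Because the forward step rounds up to whole blocks, undoing it recovers the texel dimensions rounded up to a multiple of 4. A 256x128 DXT1 surface, for example, is allocated as 512x32 and maps back to 256x128.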
YUV/FOURCC Surfaces • NT kernel mode doesn’t understand any FOURCC formats, so: • The driver must handle video memory allocation for these types • The driver must handle Lock for these types
Windows 2000 Issue (Fixed In Windows XP) • During allocation of an AGP surface... • If the driver fails to allocate and: • returns DDHAL_DRIVER_HANDLED • AND sets an error code in ddRVal • AND sets the surface’s lpVidMemHeap to non-zero • Then the system will ignore the error • So NULL the lpVidMemHeap on error!
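A minimal sketch of the failure path described above. Every type and constant here is a hypothetical stand-in (a real driver uses the DDK's surface structures, DDHAL_DRIVER_HANDLED, and a real error code); only the field names lpVidMemHeap and ddRVal follow the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DDK structures and return codes. */
typedef struct { void *lpVidMemHeap; } SketchSurface;
typedef struct { SketchSurface *surface; long ddRVal; } SketchCreateData;

enum { SKETCH_DRIVER_HANDLED = 1 };
enum { SKETCH_OK = 0, SKETCH_ERR_OUT_OF_MEMORY = -1 };

/* alloc_ok stands in for the outcome of the driver's own AGP allocator. */
static int sketch_create_agp_surface(SketchCreateData *d, void *heap, int alloc_ok)
{
    assert(d != NULL && d->surface != NULL);
    if (!alloc_ok) {
        /* Clear the heap pointer so the error code is not ignored. */
        d->surface->lpVidMemHeap = NULL;
        d->ddRVal = SKETCH_ERR_OUT_OF_MEMORY;
        return SKETCH_DRIVER_HANDLED;
    }
    d->surface->lpVidMemHeap = heap;
    d->ddRVal = SKETCH_OK;
    return SKETCH_DRIVER_HANDLED;
}
```

The point of the sketch is the ordering on the error branch: the heap pointer is cleared before returning, even if an earlier step had already filled it in.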
Atomic Surface Creation • On Windows 9x, drivers are given a list of surfaces • On Windows NT, drivers are given surfaces one-at-a-time, unless: • Driver reports GUID_NTPrivateDriverCaps • and sets DDHAL_PRIVATECAP_ATOMICSURFACECREATION
Windows NT Extra • You can use the GUID_NTPrivateDriverCaps to request notification of primary surface creation: • Set DDHAL_PRIVATECAP_NOTIFYPRIMARYCREATION
System-To-Video Blts • To speed up some titles, implement system-to-video blts • All you need to implement is SRCCOPY, no stretch • But you should implement sub-rects • DirectDraw assumes your driver requires system memory to be pagelocked during Blt • If this is not true, set DDCAPS2_NOPAGELOCKREQUIRED
HeapVidmemAllocAligned • It’s an “Eng” function in Windows NT versions • It’s a ddraw.dll export in Windows 9x • You can use this to allocate surface memory • You must have passed the heap to DirectDraw previously • You must fill in the fpHeapOffset, fpVidMem and lpVidMemHeap of the surface
Heap Offsets Explained • Return values from HeapVidmemAllocAligned are offsets into the heap • [Diagram: a heap running from fpStart (“0”) to fpEnd, with a surface inside it; the start of the surface is labeled “Return value from HVMAA and fpHeapOffset”] • fpEnd points TO the last byte • Note: fpStart is set to 0x1000 by DirectDraw for AGP heaps
DDSCAPS_VIDEOMEMORY • Remember that this includes AGP unless combined with DDSCAPS_LOCALVIDMEM • At GetAvailDriverMem time, a request that specifies DDSCAPS_VIDEOMEMORY (and not any explicit type: local or non-local) should include both types in the total
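The rule above could be expressed roughly as follows. This is a sketch under assumptions: the `SKETCH_` constants mirror the DDSCAPS bit values as found in ddraw.h (verify against your headers), and the function name and its byte-count parameters are hypothetical rather than part of the GetAvailDriverMem interface.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror DDSCAPS_VIDEOMEMORY, DDSCAPS_LOCALVIDMEM and
 * DDSCAPS_NONLOCALVIDMEM; check ddraw.h before relying on the values. */
#define SKETCH_CAPS_VIDEOMEMORY    0x00004000u
#define SKETCH_CAPS_LOCALVIDMEM    0x10000000u
#define SKETCH_CAPS_NONLOCALVIDMEM 0x20000000u

/* Total memory to report for a request with the given surface caps. */
static uint32_t sketch_total_for_request(uint32_t caps,
                                         uint32_t local_bytes,
                                         uint32_t agp_bytes)
{
    uint32_t total = 0;
    int want_local = (caps & SKETCH_CAPS_LOCALVIDMEM) != 0;
    int want_agp   = (caps & SKETCH_CAPS_NONLOCALVIDMEM) != 0;

    assert(!(want_local && want_agp)); /* sketch assumes one explicit type at most */
    if (!(caps & SKETCH_CAPS_VIDEOMEMORY))
        return 0;                      /* not a video memory request */
    if (!want_local && !want_agp)
        return local_bytes + agp_bytes; /* no explicit type: count both */
    if (want_local)
        total += local_bytes;
    if (want_agp)
        total += agp_bytes;
    return total;
}
```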
GetScanLine • Implement this, if you can! • DirectX 8.0 uses it a lot for presentation-Blt timing • Set DDCAPS_READSCANLINE, so DirectX 8.0 knows
CreateSurfaceEx • More on this later • NEVER fail CreateSurfaceEx for system memory surfaces, even if you don’t understand the pixel format • Just return DDHAL_DRIVER_HANDLED and DD_OK • (Otherwise new system-memory formats used by the reference rasterizer can’t be created)
Alpha-In-The-Primary • If your driver can do this in 32bpp: • Create an A8R8G8B8 render target • Blt that to the primary surface IGNORING the alpha channel • (And stretch/shrink (please)) • Then you should set: • DDHALINFO.vmiData.ddpfDisplay.dwFlags |= DDPF_ALPHAPIXELS • DDHALINFO.vmiData.ddpfDisplay.dwRGBAlphaBitMask = 0xFF000000
Windowed Applications And Blt Queuing • Don’t allow “many” presentation-blts in your queue • That is, don’t allow a large latency between scheduling and retiring a presentation-blt • WHQL enforces low latency for DirectX 8.0 drivers • Check DDBLT_PRESENTATION, and don’t allow more than three • More info in ddraw.h
DDBLT_WAIT And DDBLT_DONOTWAIT • Drivers should never look at these • They are set by the application/DirectDraw runtime • They are handled by the DirectDraw runtime • Sometimes DirectDraw spins, and wants to do that in user-mode • Applies to DDFLIP_WAIT as well
DDBLT_ASYNC • Ignore this flag • Always perform your blts asynchronously, if possible
What Are DDROPS? • We don’t know either • An idea of the original designer of DirectDraw, but never implemented or specified • In short: ignore!
Blt And YUV Surfaces • DirectShow can gain performance benefits if it knows it can use Blt to copy Overlay surfaces • Check to see if you can support DDCAPS2_COPYFOURCC • This means you can SRCCOPY between two FOURCC surfaces of the same type (no sub-rects, no stretch, no overlap)
Update Overlay, Etc. • If multiple overlays are created, but you have hardware for only one: • Succeed all CreateSurface calls • Fail the UpdateOverlay call
Flip Flags • DDFLIP_NOVSYNC • This means: flip immediately; do not wait for vertical blank • The hardware must be capable of re-latching the new primary surface address immediately, or at least on the next scanline • In other words, don’t allow the remaining raster scans to read from the old back buffer
Flip Flags • DDFLIP_INTERVALn • Please don’t implement by busy-waiting in the driver • But please do implement if your hardware can defer flips for n frames
Gamma Ramps • DirectDraw and Direct3D’s gamma ramps are passed through the GDI DDI call SetDeviceGammaRamp • This call is poorly prototyped • This is the struct you will be passed: struct { WORD red[256]; /* WORDs, not BYTEs */ WORD green[256]; WORD blue[256]; };
Overview Of DirectX 7.0 Model • Direct3D refers to surfaces via “handles” • Driver keeps a look-up table indexed by handle • Driver keeps everything it needs to know about a surface in this table
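A handle-indexed look-up table could be sketched as a growable array. Everything here is hypothetical: the fields of `SketchSurfaceInfo` are examples of what a driver might record, the handle limit is arbitrary, and a real kernel-mode driver would use its own allocator rather than `realloc`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_HANDLE 65535u   /* arbitrary cap for this sketch */

/* Example per-surface record; the driver decides what it needs. */
typedef struct {
    int      in_use;
    uint32_t width, height, format;
    uint32_t vidmem_offset;
} SketchSurfaceInfo;

typedef struct {
    SketchSurfaceInfo *entries;
    uint32_t capacity;
} SketchHandleTable;

/* Store a copy of *info under handle, growing the table if needed.
 * Returns 1 on success, 0 on failure. */
static int sketch_table_set(SketchHandleTable *t, uint32_t handle,
                            const SketchSurfaceInfo *info)
{
    assert(t != NULL && info != NULL);
    if (handle > SKETCH_MAX_HANDLE)
        return 0;
    if (handle >= t->capacity) {
        uint32_t new_cap = t->capacity ? t->capacity : 16u;
        while (new_cap <= handle)
            new_cap *= 2u;
        SketchSurfaceInfo *p = realloc(t->entries, (size_t)new_cap * sizeof *p);
        if (p == NULL)
            return 0;
        /* Zero the new tail so unused slots read as not in use. */
        memset(p + t->capacity, 0, (size_t)(new_cap - t->capacity) * sizeof *p);
        t->entries = p;
        t->capacity = new_cap;
    }
    t->entries[handle] = *info;
    t->entries[handle].in_use = 1;
    return 1;
}

/* Look up by handle; NULL if the handle was never assigned. */
static SketchSurfaceInfo *sketch_table_get(SketchHandleTable *t, uint32_t handle)
{
    assert(t != NULL);
    if (handle >= t->capacity || !t->entries[handle].in_use)
        return NULL;
    return &t->entries[handle];
}

static void sketch_table_free(SketchHandleTable *t)
{
    assert(t != NULL);
    free(t->entries);
    t->entries = NULL;
    t->capacity = 0;
}
```

Note that pointers returned by `sketch_table_get` become stale when the table grows, so callers should re-resolve the handle rather than cache the pointer.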
CreateSurfaceEx • Called after CreateSurface • Assigns a Direct3D-allocated handle to the surface(s) • Driver runs attachment lists, creates internal structures for each surface in list
CreateSurfaceEx Is Hard • Driver has to run surface attachment list • Z buffer might be attached, or separate surface • Cubic Environment Maps are the hardest...
Cubemap Attachments (Abstract View) • [Diagram: the Positive X, Negative X, and Positive Y faces, each followed by its mip sub-levels; remaining faces and levels shown as “...”]
Cubemaps (Struct View) • [Diagram: the Positive X, Positive Y, and Negative X face surfaces each have an lpAttachList; the list nodes are chained through lpLink, and each node’s lpAtt.. field points to an attached surface such as the +X Mip or -X Mip, which in turn has its own lpAttachList]
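Running an attachment list like the one in the struct view can be sketched as a recursive walk. The types below are simplified, hypothetical mock-ups that borrow only the field names lpAttachList and lpLink from the diagram (`lpAttached` is assumed to be the field the diagram abbreviates); the visited list and its fixed cap are additions so the walk terminates even if attachments form a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified mock-ups, not the real DirectDraw structures. */
typedef struct SketchSurface SketchSurface;
typedef struct SketchAttach {
    struct SketchAttach *lpLink;      /* next node in this surface's list */
    SketchSurface       *lpAttached;  /* surface this node points to */
} SketchAttach;
struct SketchSurface {
    uint32_t      handle;
    SketchAttach *lpAttachList;
};

enum { SKETCH_MAX_SURFACES = 64 };    /* arbitrary cap for the sketch */

typedef struct {
    const SketchSurface *seen[SKETCH_MAX_SURFACES];
    size_t count;
} SketchWalk;

/* Visit s and everything reachable through its attachment lists once. */
static void sketch_walk(const SketchSurface *s, SketchWalk *w)
{
    assert(w != NULL);
    if (s == NULL)
        return;
    for (size_t i = 0; i < w->count; ++i)
        if (w->seen[i] == s)
            return;                   /* already visited */
    if (w->count == SKETCH_MAX_SURFACES)
        return;                       /* cap reached; stop quietly */
    w->seen[w->count++] = s;          /* a driver would create its record here */
    for (const SketchAttach *a = s->lpAttachList; a != NULL; a = a->lpLink)
        sketch_walk(a->lpAttached, w);
}
```

Starting the walk at the top-level face reaches the other faces and every mip sub-level, so the driver can create one internal record per surface.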
Drivers Cannot • Keep pointers to DirectDraw’s surface structures in their own structures • Flip confusion (explained later) • Overhead • Under DirectX 8.0, we don’t keep the DirectDraw structure • ...So DirectX 8.0 drivers CAN’T store pointers – they will crash
Flip Confusion Explained • Before flip: • User-mode front buffer: Handle A • User-mode back buffer: Handle B • Driver Surface A • Driver Surface B
After Flip • User-mode back buffer: Handle A • User-mode front buffer: Handle B • Driver Surface A • Driver Surface B • The user-mode structures now refer to different pieces of memory => You cannot store pointers to the user-mode structs in the driver structs
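The effect of a flip can be modeled in a few lines. This is a toy simulation with hypothetical types, not the runtime's actual flip code: the user-mode structures exchange handles, the driver's own surface records stay put, and the driver finds the right record by resolving the handle each time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock-ups for the two sides of the picture. */
typedef struct { uint32_t handle; } SketchUserSurface;          /* runtime-owned */
typedef struct { uint32_t vidmem_offset; } SketchDriverSurface; /* driver-owned */

/* Model of a flip: the user-mode structs swap handles. */
static void sketch_flip(SketchUserSurface *front, SketchUserSurface *back)
{
    assert(front != NULL && back != NULL);
    uint32_t tmp = front->handle;
    front->handle = back->handle;
    back->handle = tmp;
}

/* Resolve through the handle on every use; never cache the user-mode pointer. */
static const SketchDriverSurface *
sketch_resolve(const SketchDriverSurface *table, size_t n,
               const SketchUserSurface *u)
{
    assert(table != NULL && u != NULL);
    return u->handle < n ? &table[u->handle] : NULL;
}
```

A driver record that had stored a pointer to the front buffer's user-mode struct before the flip would, afterwards, be looking at a struct that describes the other driver surface.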
Aliasing: What It Is • Video memory is a shared resource • On mode switch, all must be given up • But the application may be writing directly to video memory • We re-map the application’s view of video memory to a dummy page, then allow the mode switch to proceed • Only done at app’s request: DDLOCK_NOSYSLOCK
Aliasing: How It’s Done • When the driver returns a pointer to video memory at CreateSurface time: • The offset into the frame buffer is calculated, and then an equivalent aliased pointer is returned to the application • If the pointer lies outside of video memory, no aliasing is done (we don’t know enough to do so)
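The offset calculation described above amounts to the following. This is a sketch with hypothetical names and parameters, treating addresses as plain integers; how the runtime actually tracks the frame buffer range and the alias mapping is not covered by the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a driver-returned pointer to the equivalent pointer in the
 * application's aliased view. Pointers outside the frame buffer are
 * returned unchanged, since there is no offset to translate. */
static uintptr_t sketch_alias_pointer(uintptr_t ptr,
                                      uintptr_t fb_base, size_t fb_size,
                                      uintptr_t alias_base)
{
    assert(fb_size > 0);
    if (ptr < fb_base || ptr - fb_base >= fb_size)
        return ptr;                       /* outside video memory: no aliasing */
    return alias_base + (ptr - fb_base);  /* same offset in the aliased view */
}
```

The unchanged-pointer branch is the case that cannot be redirected to a dummy page on a mode switch.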
Aliasing: How To Break It • On Windows NT systems, the driver must NOT return a pointer outside of video memory at Lock time • This pointer will not be aliased • The application will crash if a mode switch happens • Drivers should allocate system memory at CreateSurface time (PLEASE_ALLOC_USERMEM)
Driver Capabilities Are Constant Across Modes • This means everything in D3DCAPS8 • The caps are allowed to be “nothing” in some modes, e.g., 24bpp • You are allowed to support different back buffer formats • That is, the one that matches the front buffer
Pixel Formats In DirectX 8.0 • Goodbye DDPIXELFORMAT • Hello D3DFORMAT • All FOURCCs are D3DFORMATs • D3DFMT has this form: • Byte 3, Byte 2: Vendor ID (0 = Microsoft; use your PCI Vendor ID) • Byte 1, Byte 0: Format number • Nonzero => FOURCC
D3DFORMAT Examples • D3DFMT_A1R5G5B5 • 0x00000019 • IHV-defined Format • 0xACAT0001 • (PCI ID 0xACAT, not FOURCC, format 1) • FOURCC “UYVY” • 0x59565955 (the first character is stored in Byte 0) • (Byte 2 is non-zero)
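The examples above can be reproduced with two small helpers. `SKETCH_MAKEFOURCC` follows the usual MAKEFOURCC packing, which puts the first character in the least significant byte, so "UYVY" comes out as 0x59565955. The IHV helper follows the layout on the slide (vendor ID in the upper 16 bits, format number in the lower 16); its name and the vendor ID used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same packing as the standard MAKEFOURCC macro: first character in Byte 0. */
#define SKETCH_MAKEFOURCC(a, b, c, d)                          \
    ((uint32_t)(uint8_t)(a)         |                          \
     ((uint32_t)(uint8_t)(b) << 8)  |                          \
     ((uint32_t)(uint8_t)(c) << 16) |                          \
     ((uint32_t)(uint8_t)(d) << 24))

/* IHV-defined format: PCI vendor ID in Bytes 3-2, format number in Bytes 1-0. */
static uint32_t sketch_ihv_format(uint16_t pci_vendor_id, uint16_t number)
{
    assert(pci_vendor_id != 0);   /* 0 is reserved for Microsoft formats */
    return ((uint32_t)pci_vendor_id << 16) | number;
}

/* Upper 16 bits of a format value; 0 for Microsoft-defined formats. */
static uint16_t sketch_format_upper(uint32_t format)
{
    return (uint16_t)(format >> 16);
}
```

With a made-up vendor ID of 0x1234, `sketch_ihv_format(0x1234, 1)` gives 0x12340001, while `sketch_format_upper(0x00000019)` is 0 for D3DFMT_A1R5G5B5.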
IHV-Def’d Texture Formats • Since Direct3D doesn’t understand them: • These formats cannot be “managed” • Applications can lock these surfaces directly • (In fact this is the only way to fill such surfaces with data)
DirectX 8.0 Format Op-list • The format op-list tells DirectX 8.0 everything about capabilities that vary with surface format • For each format, the driver sets bits that indicate: • Can Texture from this format • Render to this format • Switch display mode to this format • Has caps in modes of this format
Format Op-List Tricks • The runtime searches for the first entry that has all required capabilities • Example: Application wishes to render to 565 texture • Runtime will search for an Op-List entry with: • D3DFORMAT_OP_TEXTURE | D3DFORMAT_OP_OFFSCREEN_RENDERTARGET
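The first-match search described above might look like this. It is a sketch under assumptions: the entry type and function are hypothetical, the operation bits are illustrative stand-ins for the D3DFORMAT_OP_ flags, and matching on the format as well as the operation bits is an assumption about how the runtime narrows the search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for D3DFORMAT_OP_TEXTURE and
 * D3DFORMAT_OP_OFFSCREEN_RENDERTARGET; check the DDK headers for real values. */
#define SKETCH_OP_TEXTURE                0x00000001u
#define SKETCH_OP_OFFSCREEN_RENDERTARGET 0x00000008u

typedef struct {
    uint32_t format;   /* a D3DFORMAT value */
    uint32_t ops;      /* OR of the operations supported for that format */
} SketchFormatOp;

/* Return the first entry for `format` whose ops include every required bit,
 * or NULL if no entry qualifies. */
static const SketchFormatOp *
sketch_find_format_op(const SketchFormatOp *list, size_t count,
                      uint32_t format, uint32_t required)
{
    assert(list != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (list[i].format == format &&
            (list[i].ops & required) == required)
            return &list[i];
    }
    return NULL;
}
```

Because only the first qualifying entry is used, a driver that lists the same format more than once needs at least one entry carrying the full combination an application may ask for, such as texture plus offscreen render target for 565.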