"The evolution and future of Deferred Rendering methods"
Przemysław Witkowski, Drago Entertainment
Outline
• Introduction
• Forward shading
• Deferred shading
• Deferred lighting / light pre-pass
• Inferred lighting
• Light-indexed deferred shading
• Decoupled deferred shading
• Prelighting
• Afterlights
• Tile-based deferred shading
• Tile-based forward shading
• Forward+
• Summary
Forward shading
• The first technique used in 3D graphics, and the only one that could be implemented in real time without a programmable pipeline
• Sometimes called "traditional" or "immediate" shading
• Each object is shaded independently
• Each light-object interaction is handled independently
Pros:
• Simple and intuitive
• Hardware antialiasing is supported
• Easy to implement multiple materials and lighting models
Cons:
• Poor performance with multiple lights
• Performance is lost on shading unnecessary pixels
• Shaders that handle all combinations of geometry types and light types are very big, complex and hard to maintain
• Shading small triangles is inefficient
Two ways to handle multiple lights
SPML (single pass, multiple lights):
  for (each visible object)
    for (each light)
      calculate_shading()
MPML (multiple passes, multiple lights):
  for (each light)
    for (each visible object)
      calculate_shading()
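The two loop orders above can be sketched in Python as follows. This is a minimal illustration, not a renderer: the `lambert` term and the dictionary fields `intensity`, `albedo` and `name` are assumptions made for the example. It only shows that both orders accumulate the same result, and that MPML needs an additive accumulation target.

```python
def lambert(light, obj):
    """Toy shading term (assumed for illustration): intensity scaled by albedo."""
    return light["intensity"] * obj["albedo"]


def shade_spml(objects, lights):
    """Single pass, multiple lights: each object loops over all lights once."""
    result = {}
    for obj in objects:
        total = 0.0
        for light in lights:
            total += lambert(light, obj)
        result[obj["name"]] = total
    return result


def shade_mpml(objects, lights):
    """Multiple passes: every light triggers an additive pass over all objects."""
    result = {obj["name"]: 0.0 for obj in objects}
    for light in lights:
        for obj in objects:
            result[obj["name"]] += lambert(light, obj)
    return result
```

In a real forward renderer SPML usually means one large shader per object, while MPML means redrawing geometry once per light with additive blending.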
Z pre-pass (early Z culling)
• Used to reduce problems with overdraw
• Rejects pixels before they get shaded
• A must-have for modern forward renderers
• Less important in deferred shading and deferred lighting
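To make the overdraw saving concrete, the following sketch counts how many fragments get shaded with and without a depth-only pre-pass. It is a simplified model: fragments are hypothetical `(pixel, depth)` pairs, smaller depth means closer, and the depth test is done in software.

```python
def shaded_without_prepass(fragments):
    """Depth-test against what has been drawn so far; draw order decides overdraw."""
    depth = {}
    shaded = 0
    for pixel, z in fragments:
        if z < depth.get(pixel, float("inf")):
            depth[pixel] = z
            shaded += 1  # this fragment is shaded, even if it is overwritten later
    return shaded


def shaded_with_prepass(fragments):
    """Pass 1 writes depth only; pass 2 shades only the fragments that survive."""
    depth = {}
    for pixel, z in fragments:
        depth[pixel] = min(z, depth.get(pixel, float("inf")))
    return sum(1 for pixel, z in fragments if z == depth[pixel])
```

With three fragments drawn back to front on the same pixel, the first function shades all three and the second shades only the nearest one.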
Deferred shading
• Decouple geometry calculations from shading
• Defer shading calculations and perform them in screen space
for (each object) { save the material and geometry properties in textures }
for (each light) { framebuffer += shade(light, g_buffer) }
Geometry phase
- Render the scene without lighting to the G-buffer (geometry buffer)
- Each sample holds everything we need to compute lighting: position (depth is enough), normal, material properties (color, specular exponent, ...)
- Use MRT (multiple render targets) to render to multiple textures at the same time
Lighting phase
- Performed in screen space
- Fetch data from the G-buffer and apply the selected lighting model
- For each light, accumulate the calculated value in the light texture using additive blending
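The two phases can be sketched on the CPU as below. This is a toy model under stated assumptions: fragments are dictionaries with hypothetical fields `pixel`, `depth`, `normal` and `albedo`, lights are directional with a `direction` and a scalar `intensity`, and the lighting model is plain N.L diffuse.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def geometry_pass(fragments, width, height):
    """Keep the nearest fragment per pixel and store only surface attributes."""
    gbuffer = [[None] * width for _ in range(height)]
    for frag in fragments:
        x, y = frag["pixel"]
        stored = gbuffer[y][x]
        if stored is None or frag["depth"] < stored["depth"]:
            gbuffer[y][x] = {
                "depth": frag["depth"],
                "normal": frag["normal"],
                "albedo": frag["albedo"],
            }
    return gbuffer


def lighting_pass(gbuffer, lights):
    """For each light, additively blend its contribution into the frame buffer."""
    height, width = len(gbuffer), len(gbuffer[0])
    frame = [[0.0] * width for _ in range(height)]
    for light in lights:
        for y in range(height):
            for x in range(width):
                sample = gbuffer[y][x]
                if sample is None:
                    continue  # nothing was rendered at this pixel
                n_dot_l = max(0.0, dot(sample["normal"], light["direction"]))
                frame[y][x] += sample["albedo"] * n_dot_l * light["intensity"]
    return frame
```

Note that the lighting pass never touches scene geometry: its cost depends only on the number of pixels and lights.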
Render targets: diffuse, normal, depth
Rendering lights
• Directional light: draw a quad over the whole screen
• Point lights and spot lights: draw light bounding-volume geometry (a sphere for a point light, a cone for a spot light) or screen-aligned quads
• Draw the back faces of the light volume if the camera is inside it
• We can use the stencil buffer to mark pixels that are affected by a light
• We can use instancing to render all light geometry at once
• With screen-aligned quads we have lots of wasted computations, but position reconstruction is cheaper
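Position reconstruction from depth, mentioned above, could look roughly like this. The sketch assumes a symmetric perspective projection, a camera looking down the negative Z axis, and a G-buffer that stores linear view-space depth. None of these conventions come from the slides, and the function name and parameters are illustrative.

```python
import math


def reconstruct_view_position(px, py, width, height, linear_depth, fov_y_deg):
    """Rebuild a view-space position from a pixel centre and its linear depth."""
    aspect = width / height
    tan_half = math.tan(math.radians(fov_y_deg) / 2.0)
    # Pixel centre mapped to normalized device coordinates in [-1, 1], Y up.
    ndc_x = (px + 0.5) / width * 2.0 - 1.0
    ndc_y = 1.0 - (py + 0.5) / height * 2.0
    # View ray through the pixel at unit distance along -Z.
    ray = (ndc_x * tan_half * aspect, ndc_y * tan_half, -1.0)
    return tuple(component * linear_depth for component in ray)
```

With a full-screen quad the view ray can be interpolated from the quad corners, which is part of why reconstruction is cheaper there than with arbitrary light-volume geometry.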
Deferred rendering pros
• Shading calculations and shading performance are independent of scene geometry
• Material and lighting code are decoupled
• Simpler management and simplification of the rendering pipeline
• The G-buffer can be used for post-processing
• The pixel shader gets executed only on visible parts
• No waste on shading small triangles
Deferred rendering cons
• No hardware antialiasing support in DX9
• Incompatible with lit translucent objects
• It is difficult to implement various materials and lighting models
• Bandwidth overhead when lights overlap
• Still limited in the number of lights that can cast shadows
• Light bleeding problems
• Light groups are hard to implement
Deferred lighting / light pre-pass
• Developed to overcome deferred shading problems, mainly to reduce G-buffer overhead (bandwidth) and material variety limitations
• Used in Crysis 2
• Render only normals and depth in the geometry stage
• Store light properties in the light buffer (N.L, R.V, color, attenuation)
• Render the scene again (forward pass), this time with materials and the calculated lighting
• Each pixel of the light buffer represents the specular term (intensity) of all light sources (information about each light's contribution is lost)
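A per-pixel sketch of the light pass and the second geometry pass follows. It is deliberately reduced to the diffuse term; the specular (R.V) channel is omitted, and the field names `direction`, `color` and `attenuation` are assumptions for the example.

```python
def accumulate_light_buffer(normal, lights):
    """Light pass: sum N.L * color * attenuation over all lights for one pixel."""
    diffuse = [0.0, 0.0, 0.0]
    for light in lights:
        n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light["direction"])))
        for channel in range(3):
            diffuse[channel] += n_dot_l * light["color"][channel] * light["attenuation"]
    return diffuse


def material_pass(albedo, light_rgb):
    """Second geometry pass: modulate the material albedo by the accumulated light."""
    return [a * l for a, l in zip(albedo, light_rgb)]
```

Once the light buffer holds only the sum, the material pass can no longer tell which light contributed what, which is the loss of per-light information noted above.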
Deferred lighting / light pre-pass
Pros:
• Requires less memory and less bandwidth than deferred shading
• Works slightly better with hardware AA
• Works slightly better with multiple materials
• Can be implemented without MRTs
Cons:
• We need to render the scene twice
• Advanced materials need to know the angle of incidence for every light, which is lost in the lighting stage
• Specular contributions are blended (implementations are usually limited to monochromatic specular)
• Lighting artifacts on multisampled edges
Inferred lighting
• Similar to light pre-pass, but developed independently at Volition
• Used in "Red Faction: Armageddon" and "Saints Row: The Third"
• Uses mixed-resolution rendering
• G-buffer and lighting are done at a lower resolution
• Uses a custom DSF filter for upsampling
Inferred lighting (2)
Three passes:
• Geometry pass: render normal, depth and DSF data
• Light pass: calculate and store lighting in the light buffer (4 x 16-bit channels, diffuse lighting in RGB, accumulated specular intensity in alpha)
• Material pass: a "forward" pass using full material shaders, but sampling lighting from the light buffer using DSF (no additional pass for translucent objects)
A stipple pattern (2x2) is applied when rendering transparent objects. In the material pass, translucent objects should be sampled with the proper pattern, and missing data should be computed using DSF.
Discontinuity sensitive filtering (DSF)
• Performed in the pixel shader in the material pass
• Uses a 16-bit ID value stored in the geometry pass
• The ID consists of 8 bits of object ID and 8 bits of normal-group ID
• Sample 4 pixels from the light buffer and compare their depth and ID with the currently rendered object
• Discard samples that do not belong to the current surface
• Apply custom bilinear filtering
• Possible artifacts when all four samples are rejected!
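The filtering steps above might be sketched as follows. This is an illustration only: the bit order in `pack_id`, the depth tolerance, and returning `None` when every sample is rejected are assumptions, not details taken from the Volition implementation.

```python
def pack_id(object_id, normal_group_id):
    """Pack two 8-bit IDs into 16 bits (object ID in the high byte is an assumption)."""
    return ((object_id & 0xFF) << 8) | (normal_group_id & 0xFF)


def dsf_sample(samples, weights, surface_id, surface_depth, depth_tolerance=0.1):
    """Blend the four light-buffer samples that belong to the current surface.

    samples: four dicts with 'id', 'depth' and 'light'.
    weights: the four bilinear weights, summing to 1.
    """
    total_weight = 0.0
    total_light = 0.0
    for sample, weight in zip(samples, weights):
        same_id = sample["id"] == surface_id
        close = abs(sample["depth"] - surface_depth) <= depth_tolerance
        if same_id and close:
            total_weight += weight
            total_light += weight * sample["light"]
    if total_weight == 0.0:
        return None  # all four samples rejected: the artifact case
    return total_light / total_weight  # renormalize the surviving weights
```

Renormalizing by the surviving weights keeps lighting from a neighbouring surface from leaking across a discontinuity.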
Inferred lighting (3)
Pros:
• Greater material flexibility than deferred shading
• Compatible with MSAA
• Unified pipeline for processing alpha-blended objects
• Reduced memory bandwidth and pixel shading cost
Cons:
• Transparent objects are lit at an even lower resolution
• Only (?) 3 layers of transparency
• Poor shadow quality on translucent objects
• Lower normal map quality
• Upsampling can be costly
Light-indexed deferred shading
• Introduced by Damian Trebilco
• Stores the light properties at each pixel and uses them in the forward pass
• Each light is assigned a unique index that is stored at each pixel the light hits
Light-indexed deferred shading (2)
Three main passes:
• Depth-only pass
• Render light volumes into the light index texture (with depth writing disabled)
• Render scene geometry using a forward pass, which can access lighting properties by fetching data from the light index table (using textures, or constant buffers in DX10)
Light-indexed deferred shading (3)
Overlapping lights are problematic. Three light index packing schemes were proposed (CPU or GPU based):
• First method: CPU sorting, sort lights based on light volume overlap
• Second method: GPU multi-pass, max blend equation
• Third method: bit shifting, a GPU solution
With these methods we have a lot of different combinations of light overlap count and scene light count
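The idea behind bit-shift packing can be shown with integer arithmetic. This is a simplified sketch and not Trebilco's exact GPU blend setup: it assumes 8-bit light indices, index 0 reserved for "no light", and at most four overlapping lights per pixel stored in one 32-bit value.

```python
def add_light(packed, light_index):
    """Shift stored indices up by 8 bits and put the new index in the low byte.

    With more than four overlapping lights the oldest index is shifted out.
    """
    if not 1 <= light_index <= 255:
        raise ValueError("light index must fit in 8 bits and 0 is reserved")
    return ((packed << 8) | light_index) & 0xFFFFFFFF


def unpack_lights(packed):
    """Return the stored light indices, most recently added first."""
    indices = []
    while packed:
        indices.append(packed & 0xFF)
        packed >>= 8
    return indices
```

The forward pass would then use each unpacked index to look up light properties in the light table. The hard cap on overlaps is visible here as indices falling off the top of the 32-bit value.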
Light-indexed deferred shading (4)
Cons:
• The total number of lights and the total number of overlapping lights are limited
• It is very difficult to integrate the technique with shadows
• Supports only one light type
• Passing light index data can be painful
Pros:
• Efficient middle ground between forward and deferred
• Light groups are easy to implement
• Forward renderers can be easily modified to use LIDR
• No problems with multiple materials
Decoupled deferred shading
• Introduced by Gabor Liktor and Carsten Dachsbacher
• Based on a new data structure called the "compact geometry buffer"
• The main idea is to save computations by decoupling shading samples from visibility samples
• Efficient caching and shading sample reuse can be implemented
• Especially useful for multisampling and stochastic rasterization techniques
Decoupled deferred shading
• In rasterization, visibility samples are typically fragments covered by triangles in screen space
• Each visibility sample is assigned a shading sample; this mapping can be many-to-one, and then shading is reused
• Visibility and shading samples are sampled in different domains
Source: http://cg.ibds.kit.edu/publications/p2012/shadingreuse/shadingreuse_preprint.pdf
Compact geometry buffer
• The same functionality as the G-buffer
• Instead of storing shading information in the frame buffer, each visibility sample stores a reference to a shading sample in a compact linear buffer
• Multiple visibility samples can reference the same shading sample to allow shading reuse
• The size of the geometry buffer does not grow with the supersampling density
Decoupled deferred shading
• Each shading sample must be assigned an index that is unique during the rendering pass
• During rendering, each shading primitive allocates a range of indices; their number is the maximum number of pixels the triangle would cover in the shading domain
• If the shading data is found, we only need to store a pointer to its address
• Otherwise we allocate a new slot in the compact buffer, store the data there and reference it
• On current hardware we need to implement a global cache for fetching shading samples, or split the screen into tiles and use a per-tile cache
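The lookup-or-allocate logic described above can be sketched with a dictionary standing in for the shading cache. This is a simplification: the `(primitive_id, shading_coord)` key is a hypothetical stand-in for the unique shading sample index, and the real technique uses a bounded cache on the GPU rather than an unbounded Python dict.

```python
def build_compact_buffer(visibility_samples):
    """Map visibility samples to slots in a compact buffer, reusing shading samples.

    visibility_samples: list of hashable shading keys, one per visibility sample.
    Returns (compact_buffer, references).
    """
    compact_buffer = []  # one entry per unique shading sample
    cache = {}           # shading key -> slot index in compact_buffer
    references = []      # per visibility sample: index into compact_buffer
    for key in visibility_samples:
        slot = cache.get(key)
        if slot is None:
            # Cache miss: allocate a new slot and store the shading data there.
            slot = len(compact_buffer)
            compact_buffer.append({"key": key})
            cache[key] = slot
        references.append(slot)
    return compact_buffer, references
```

The compact buffer grows with the number of unique shading samples, not with the number of visibility samples, which matches the claim that its size does not grow with supersampling density.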
Prelighting
• Developed at Insomniac Games by Mark Lee
• Very similar to the technique developed at Naughty Dog for Uncharted
• Create a frame buffer that stores the normal per pixel and another for diffuse per pixel; gloss and specular power also need to be stored
• The main idea is to accumulate the calculated lighting from all lights and store it in two HDR buffers: diffuse and specular
• They are used by the regular rendering pass
Prelighting (2)
Three main passes:
• Depth-normal pass: writes into the frame buffer and depth buffer, RGB = view-space normal, alpha = material specular (normals are stored in view space as 3 x 8-bit components)
• Prelighting pass: lighting from point and spot lights is calculated here using screen-space quads; projected textures are rendered into the diffuse light buffer
• Regular geometry pass (forward pass): look up the lighting buffers; diffuse light is modulated with the base map, specular is modulated by the gloss map
Prelighting (3)
Pros:
• Easy to fit into an existing forward renderer
• Lower memory and bandwidth usage than normal deferred shading
Cons:
• Two geometry passes
• Problems with implementing complex materials
• Alpha blending is still problematic
Afterlights
• Developed at Realtime Worlds and used in the Xbox title "Crackdown"
• They were used for street lights, car headlights and in effects like explosions
• Uses a depth-normal pass like the light pre-pass renderer
• While rendering opaque surfaces, alpha is set to a value proportional to the brightness of the surface color (luminance)
• "Afterlights" are applied after the main color pass
• The pixel shader samples the depth buffer and the per-pixel normal texture, and computes the normal and position for the current pixel
Afterlights (2)
• While blending into the frame buffer, lights are modulated by the destination alpha
• A cube is used to render point lights
• A screen-space ellipse with its axis along the headlight direction is used for spot lights
• To avoid fill-rate problems, lights are rendered to a half-size off-screen buffer
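The destination-alpha trick can be sketched per pixel as below. The luminance weights are the Rec. 709 ones and are an assumption, since the slides do not say which weights were used; the `rgb` and `alpha` field names are also illustrative.

```python
def luminance(rgb):
    """Luminance with Rec. 709 weights (assumed, not taken from the slides)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def color_pass(surface_rgb):
    """Main color pass: write the color and keep its luminance in alpha."""
    return {"rgb": list(surface_rgb), "alpha": luminance(surface_rgb)}


def apply_afterlight(pixel, light_rgb):
    """Additive blend of a light, modulated by the destination alpha."""
    alpha = pixel["alpha"]
    pixel["rgb"] = [c + l * alpha for c, l in zip(pixel["rgb"], light_rgb)]
    return pixel
```

Because only a single luminance value is kept, a light can be dimmed by a dark surface but not tinted by its albedo.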
Afterlights (3)
Pros:
• Depth/normal buffers are shared between multiple systems
• They make use of the light buffer alpha channel, which usually goes to waste
• Integrates nicely with hardware blending
• Works well when used for decorations and effects
Cons:
• Storing only the luminance of the albedo means lights can only be darkened
• No specular lighting
• The light buffer has to have an alpha channel
• Poor lighting quality and possible artifacts
Tile-based deferred shading
• Goal: amortize overhead / solve the bandwidth problem
• Bucket lights into screen-space tiles; each tile contains a list of the lights affecting it
• Use the frustum to cull non-intersecting lights
• Read the G-buffer once for each fragment and evaluate all relevant lights
• Common terms in the rendering equation can be factored out
• Light accumulation is done in registers
Tile-based deferred shading
Source: http://www.cse.chalmers.se/~olaolss/images/tiled_shading_shot.jpg
Tile-based deferred shading
Source: http://www.cse.chalmers.se/~olaolss/images/tiled_shading_grid.jpg
Algorithm
• Render opaque geometry into the G-buffer (same as the deferred geometry stage)
• Create a screen-space grid (tiles) on the CPU or GPU
• For each light: calculate which grid cells it affects (by calculating the screen-space extents of the light volume) and add the light's ID to the per-tile lists
• For each pixel: sample the G-buffer, accumulate lighting for all lights in the tile's list, output the accumulated lighting
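The light binning step can be sketched on the CPU as follows. The sketch assumes each light has already been projected to a screen-space circle with hypothetical fields `x`, `y` and `radius` in pixels, and it bins by the circle's bounding rectangle, so it is conservative rather than exact.

```python
def build_tile_lists(lights, width, height, tile_size):
    """Return a 2D grid of per-tile lists holding the IDs of affecting lights."""
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    tile_lists = [[[] for _ in range(tiles_x)] for _ in range(tiles_y)]
    for light_id, light in enumerate(lights):
        cx, cy, r = light["x"], light["y"], light["radius"]
        # Screen-space extents of the light, clamped to the grid.
        x0 = max(0, int((cx - r) // tile_size))
        x1 = min(tiles_x - 1, int((cx + r) // tile_size))
        y0 = max(0, int((cy - r) // tile_size))
        y1 = min(tiles_y - 1, int((cy + r) // tile_size))
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                tile_lists[ty][tx].append(light_id)
    return tile_lists


def lights_for_pixel(tile_lists, px, py, tile_size):
    """Per-pixel step: fetch the light list of the tile containing the pixel."""
    return tile_lists[py // tile_size][px // tile_size]
```

A pixel shader or compute kernel would then read the G-buffer once and loop only over the lights returned for its tile.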
Tile-based deferred shading
Cons:
• Still a huge frame buffer size with MSAA
• Shadow maps must be built before shading (more G-buffer sampling = worse shadow rendering performance)
• Growing light overdraw has a big impact on performance
• An efficient implementation needs CUDA or a Compute Shader
• Same problem with transparent objects
Tile-based forward shading
• Lately it was discovered that the tiled method is not limited to deferred shading :P
• The algorithm for creating the light grid is the same as in tiled deferred shading
• In the pixel shader, the list of lights is fetched from the selected tile
• We need to find a proper data structure. A three-array method is proposed:
  - The global light list holds light properties
  - The tile light index list contains light indices into the global light list
  - The light grid contains an offset and size of the light list for each tile
• The arrays can be stored on the GPU as textures or constant buffers
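The three arrays can be sketched as flat Python lists. The input is assumed to be a 2D grid of per-tile light ID lists, and the function and variable names are illustrative rather than taken from any of the cited implementations.

```python
def flatten_tile_lists(tile_lists):
    """Build the light grid and the tile light index list from per-tile lists.

    Returns (light_grid, tile_light_indices) where light_grid[ty][tx] is an
    (offset, size) pair into tile_light_indices.
    """
    light_grid = []
    tile_light_indices = []
    for row in tile_lists:
        grid_row = []
        for light_ids in row:
            grid_row.append((len(tile_light_indices), len(light_ids)))
            tile_light_indices.extend(light_ids)
        light_grid.append(grid_row)
    return light_grid, tile_light_indices


def lights_for_tile(global_lights, light_grid, tile_light_indices, tx, ty):
    """What the pixel shader does: offset/size lookup, then index the global list."""
    offset, size = light_grid[ty][tx]
    return [global_lights[i] for i in tile_light_indices[offset:offset + size]]
```

Flat arrays with offsets map directly onto textures or constant buffers, which is why this layout is preferred over nested lists on the GPU.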
Tile-based forward shading (2)
Pros:
• Light management is decoupled from geometry
• No problems with antialiasing
• Transparent objects can be shaded the same way as opaque ones
• Light data can be uploaded to the GPU once per scene
• No problems with multiple materials and lighting models
• Integrating into an existing forward renderer is simple
• CUDA or DirectCompute can be used for efficient light culling
• Potentially lower memory footprint
Cons:
• Each fragment may be shaded more than once
• Small-triangle shading problem
• Usually needs a Z pre-pass, so the scene must be rendered twice
• Scales worse with increasing light overdraw than tiled deferred shading
Forward+
• Used in AMD's latest demo, called "Leo"
• Not fully documented
• Looks very similar to light-indexed deferred shading
• Lights are culled by a Compute Shader
• UAVs are used to handle the light lists
• Only lights that contribute to the final pixel are selected and passed to the pixel shader
Summary
• Deferred shading is all about caching and doing some of the calculations in screen space
• The biggest advantage of the deferred shading concept is flexibility: we can decide which calculations should be done immediately and which should be deferred
• As Andrew Lauritzen [5] said: "The point in deferred shading these days is not really for lighting. The real reason to evaluate some terms up front and others later is so that you can reschedule the later computations in various ways"
• Differences between deferred and forward shading become blurry
• A modern forward renderer must use a Z pre-pass, so it is already using caching, which is the main characteristic of deferred shading
• A tiled forward renderer also caches light data
• Mixed solutions are the way to go for now
Summary (2)
• "Deferred shading is the current future, but not the future future"
• Next-gen consoles and next-gen graphics cards can change the course
• "Decoupled sampling" and "stochastic rasterization" sound interesting
• We need to vary lighting models, which is hard or impossible in deferred shading
• We need more detail, and normal maps are not enough (not even close), but with growing vertex counts (+ tessellation) rendering geometry twice becomes overkill
• There is also still the problem with small triangles in forward shading
Summary (3)
• There is no universal solution
• Even on next-gen hardware there will be some serious hardware limitations
• We will probably see even more forward/deferred modifications trying to overcome them
• Designing and implementing renderers will be more fun than ever
Bibliography
DEFERRED
• http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter09.html
• Shawn Hargreaves and Mark Harris. 6800 Leagues Under the Sea - Deferred Shading. 2004. http://download.nvidia.com/developer/presentations/2004/6800_Leagues/6800_Leagues_Deferred_Shading.pdf
• Martin Mittring. "A bit more Deferred - Cry Engine 3". http://www.slideshare.net/guest11b095/a-bit-more-deferred-cry-engine3
• Michal Valient. Deferred Rendering in Killzone 2. 2005. http://www.guerrilla-games.com/publications/dr_kz2_rsx_dev07.pdf
• Johan Andersson. "Parallel Graphics in Frostbite - Current & Future". SIGGRAPH 2009.
• http://diaryofagraphicsprogrammer.blogspot.com/2008/03/light-pre-pass-renderer.html
• W. Engel. Designing a Renderer for Multiple Lights: The Light Pre-Pass Renderer. In ShaderX7: Advanced Rendering Techniques. Charles River Media, Boston, MA, 2009.
• Frank Puig Placeres. "Overcoming Deferred Shading Drawbacks". ShaderX5, pp. 115-130.
• Andrew Lauritzen. Deferred rendering for current and future rendering pipelines. SIGGRAPH Course: Beyond Programmable Shading, 2010.
• Matt Swoboda. Deferred lighting and post processing on PlayStation 3. Game Developers Conference, 2009.
INFERRED
• Kircher, Lawrance. Inferred Lighting: Fast dynamic lighting and shadows for opaque and translucent objects.
• Kircher. Lighting & Simplifying Saints Row: The Third.
• Mike Flavin. Lighting the Apocalypse: Rendering Techniques in Red Faction: Armageddon.
PRELIGHTING
• Mark Lee. Pre-lighting in Resistance 2.
• Mark Lee. Prelighting.
LIGHT INDEXED
• http://mynameismjp.wordpress.com/2012/03/31/light-indexed-deferred-rendering/
• http://code.google.com/p/lightindexed-deferredrender/
DECOUPLED
• Gabor Liktor, Carsten Dachsbacher. "Decoupled Deferred Shading for Hardware Rasterization". 2011.
TILED SHADING
• http://aras-p.info/blog/2012/
• http://www.cse.chalmers.se/~olaolss/main_frame.php?contents=publication&id=tiled_shading
• http://www.pjblewis.com/articles/tile-based-forward-rendering/