Data Management Techniques Sung-Eui Yoon KAIST URL: http://jupiter.kaist.ac.kr/~sungeui/
Data Avalanche (or Data Explosion) There is too much data out there! www.cs.umd.edu/class/spring2001/cmsc838b/Project/Parija_Spacco/images/
Geometric Data Avalanche • Massive geometric data • Due to advances in modeling, simulation, and data capture techniques • Time-varying data (4D data sets)
CAD Model: Double Eagle Oil Tanker 82 million triangles (4 GB)
CAD Model: Boeing 777 Ray tracing of the Boeing 777, 470 million triangles Excerpted from the SIGGRAPH course notes on massive model rendering
Scanned Model: St. Matthew Model 372 million triangles (10 GB) www.cyberware.com
Possible Solutions? • Will hardware improvements address the data avalanche? • Moore’s law: the number of transistors roughly doubles every 18 months
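As a rough worked example of the doubling rule on this slide (an illustrative calculation, not a number taken from the slides), the growth factor implied by one doubling every 18 months can be computed as below. The function name `moore_growth_factor` is hypothetical.

```python
def moore_growth_factor(years, doubling_months=18.0):
    """Growth factor after `years` if a quantity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


# Over one decade the rule implies roughly a 100x increase in transistor count.
print(round(moore_growth_factor(10), 1))
```

The point of the comparison in the talk is that data access speed does not follow the same curve, so extra transistors alone do not remove the bottleneck.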
Current Architecture Trends • Data access time becomes the major computational bottleneck! • Figure: accumulated growth rate during 1999~2009 (log scale); legible curve labels: access speed, disk access speed
Four Orthogonal Approaches • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection
Overview • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection
Cache-Coherent Layouts of Meshes • One dimensional data layout of a mesh • Reduce the number of cache misses • Cache-aware or cache-oblivious layouts • Minimize the number of cache misses for a specific cache parameter (cache-aware) or for various cache parameters (cache-oblivious), e.g., cache block size • Figure: a mesh with vertices va, vb, vc, vd and its one dimensional layout [Yoon et al. SIG05, VIS06, Euro06]
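To make the effect of a one dimensional layout concrete, the sketch below counts how many edges of a small grid mesh connect vertices stored in different cache blocks under two layouts. It uses a Z-order (Morton) curve as a simple stand-in for a cache-coherent layout; this is not the layout algorithm of Yoon et al. or the OpenCCL API, and all function names are hypothetical.

```python
def morton_index(x, y, bits=16):
    """Interleave the bits of x and y to get a Z-order (Morton) index."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i)
        index |= ((y >> i) & 1) << (2 * i + 1)
    return index


def row_major_index(x, y, n):
    """Position of vertex (x, y) when the grid is stored row by row."""
    return y * n + x


def grid_edges(n):
    """Yield the edges of an n x n grid mesh as pairs of (x, y) vertices."""
    for y in range(n):
        for x in range(n):
            if x + 1 < n:
                yield (x, y), (x + 1, y)
            if y + 1 < n:
                yield (x, y), (x, y + 1)


def edges_crossing_blocks(n, layout, block_size):
    """Count edges whose two vertices are stored in different cache blocks."""
    crossing = 0
    for (x0, y0), (x1, y1) in grid_edges(n):
        if layout(x0, y0) // block_size != layout(x1, y1) // block_size:
            crossing += 1
    return crossing


n, block_size = 64, 16
row_major = edges_crossing_blocks(n, lambda x, y: row_major_index(x, y, n), block_size)
z_order = edges_crossing_blocks(n, morton_index, block_size)
print("row-major:", row_major, "Z-order:", z_order)
```

An edge that crosses a block boundary is a potential extra cache miss when a traversal follows that edge, so a lower count suggests a more cache-coherent layout for this block size.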
Block-based I/O Model [Aggarwal and Vitter 88] • CPU or GPU • Fast memory or cache • Slow memory • Disk • Data moves between levels by block transfer • Access time labels in the figure: 1 sec, 10^-6 sec, 10^-4 sec
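The block-based I/O model can be sketched as a small simulation: memory is divided into blocks of `block_size` elements, the fast memory holds `cache_blocks` blocks, and the cost of an algorithm is the number of block transfers. The LRU replacement policy and all names here are assumptions made for illustration.

```python
from collections import OrderedDict


def count_block_transfers(accesses, block_size, cache_blocks):
    """Count block transfers for a sequence of element indices under LRU."""
    cache = OrderedDict()  # block id -> True, ordered from least to most recent
    transfers = 0
    for element in accesses:
        block = element // block_size
        if block in cache:
            cache.move_to_end(block)       # hit: mark as most recently used
        else:
            transfers += 1                 # miss: fetch the whole block
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return transfers


sequential = list(range(64))   # coherent access pattern
strided = [0, 16, 32] * 3      # every access touches a different block
print(count_block_transfers(sequential, block_size=16, cache_blocks=2))
print(count_block_transfers(strided, block_size=16, cache_blocks=2))
```

The sequential scan pays one transfer per block, while the strided pattern misses on every access because the three blocks it cycles through do not fit in a two-block cache.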
Applications • View-dependent meshes • View-dependent rendering • Triangle meshes • Isocontour extractions • Hierarchies • Ray tracing • Collision detection
View-Dependent Rendering using LODs Improving GPU vertex cache Utilization GeForce 6800 (January 2005)
Applications • View-dependent meshes • View-dependent rendering • Triangle meshes • Isocontour extractions • Hierarchies • Ray tracing • Collision detection Puget sound, 134 M triangles Isocontour z(x,y) = 500m • Achieve up to 20X improvement on iso-contouring
Applications • View-dependent meshes • View-dependent rendering • Triangle meshes • Isocontour extractions • Hierarchies • Ray tracing • Collision detection • Achieve 30% ~ 300% performance improvement
Advantages • General • Works well for various applications • Cache-oblivious • Can benefit all levels of the memory hierarchy (e.g., CPU/GPU caches, memory, and disk) • No modification of runtime applications • Only layout computation • Source code is available as a library called OpenCCL
Overview • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection
Random-Accessible Compressed Data • Compression methods for meshes and hierarchies • Reduce the memory requirements • Support random access on meshes and hierarchies • Can be useful to many different applications [Kim et al. Tech. Report 09; Kim et al., TVCG 09; Yoon and Lindstrom, VIS 07]
Hierarchical-Culling oriented Compact Meshes (HCCMeshes) • Consists of two parts: • i-HCCMeshes (in-core representation) • o-HCCMeshes (out-of-core representation)
Data Access Framework • User sends a request to the data pool in main memory • Data pool returns the data
Data Access Framework: Out-of-Core Technique • User requests data by cluster ID • Data pool in main memory keeps cached data • Clusters c0, c1, ..., cn reside on the external drive and are loaded as whole clusters
HCCMeshes Support hierarchical random access! • User requests data by cluster ID • o-HCCMesh: compressed clusters on the external drive • i-HCCMesh: cached data in main memory • Requested clusters are decompressed on demand
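The data access framework on these slides (request by cluster ID, a data pool in main memory, compressed clusters on an external drive) can be sketched as follows. This is only an illustration: `zlib` stands in for the HCCMesh compression scheme, a dictionary stands in for the external drive, the LRU eviction policy is an assumption, and `ClusterPool` is a hypothetical name, not part of OpenRACM.

```python
import zlib
from collections import OrderedDict


class ClusterPool:
    """Data pool that decompresses clusters on demand and caches them (LRU)."""

    def __init__(self, compressed_clusters, max_cached):
        self.store = compressed_clusters  # stands in for the external drive
        self.cache = OrderedDict()        # decompressed clusters in LRU order
        self.max_cached = max_cached
        self.loads = 0                    # number of cluster decompressions

    def request(self, cluster_id, offset):
        """Return one byte of data from the given cluster."""
        if cluster_id in self.cache:
            self.cache.move_to_end(cluster_id)
        else:
            self.loads += 1
            self.cache[cluster_id] = zlib.decompress(self.store[cluster_id])
            if len(self.cache) > self.max_cached:
                self.cache.popitem(last=False)  # evict least recently used
        return self.cache[cluster_id][offset]


# Eight toy clusters of 1 KB each; cluster i is filled with the byte value i.
raw = {cid: bytes([cid]) * 1024 for cid in range(8)}
store = {cid: zlib.compress(data) for cid, data in raw.items()}
pool = ClusterPool(store, max_cached=2)
```

Calling `pool.request(3, 10)` decompresses cluster 3 once; further requests to the same cluster are served from the cache until it is evicted, so only the working set stays decompressed in main memory.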
Main Benefits • Use less memory and a smaller working set • o-HCCMeshes have 20:1 compression ratios • i-HCCMeshes have 6:1 compression ratios • Improve runtime performance
Applications • Whitted-style ray tracing • LOD-based ray tracing • Collision detection • Photon mapping • Non-photorealistic rendering • Source code is available as OpenRACM
Overview • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection
Challenges • Secondary rays show low ray coherence • This results in low cache utilization • When ray tracing massive models, expensive cache misses occur at several levels (e.g., L1/L2 caches, main memory) • Example models: Landscape (> 1000 M), St. Matthew (372 M)
Goal • Design an efficient algorithm that converts incoherent secondary rays into coherent ones • Achieve high cache coherence for these rays • Improve ray tracing performance
Ray Reordering Framework • Components: ray generation, ray reordering, ray processing, ray buffer • Inputs: camera information, scene information, hit points and material information • Memory hierarchy: caches (L1), main memory, disk [Moon et al., under review]
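One simple way to picture ray reordering is to sort the buffered rays so that rays touching nearby geometry are processed together. The sketch below sorts rays by the Z-order code of a quantized hit point. The choice of sort key, the `hit` field, and the function names are assumptions for illustration; the slides only state that hit points are used, and this is not presented as the method of Moon et al.

```python
def morton3(ix, iy, iz, bits=10):
    """Interleave the bits of three integer cell coordinates (3D Z-order)."""
    code = 0
    for i in range(bits):
        code |= ((ix >> i) & 1) << (3 * i)
        code |= ((iy >> i) & 1) << (3 * i + 1)
        code |= ((iz >> i) & 1) << (3 * i + 2)
    return code


def reorder_rays(rays, scene_min, scene_max, bits=10):
    """Return the buffered rays sorted by the Z-order code of their hit points."""
    cells = (1 << bits) - 1

    def key(ray):
        quantized = []
        for p, lo, hi in zip(ray["hit"], scene_min, scene_max):
            t = (p - lo) / (hi - lo)      # normalize to [0, 1] within the scene
            t = min(max(t, 0.0), 1.0)     # clamp points outside the bounds
            quantized.append(int(t * cells))
        return morton3(*quantized, bits=bits)

    return sorted(rays, key=key)


ray_buffer = [
    {"id": 0, "hit": (0.9, 0.9, 0.9)},
    {"id": 1, "hit": (0.1, 0.1, 0.1)},
    {"id": 2, "hit": (0.12, 0.1, 0.1)},
]
ordered = reorder_rays(ray_buffer, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
print([ray["id"] for ray in ordered])
```

After sorting, the two rays that hit nearby points are adjacent in the buffer, so processing them back to back tends to reuse the same cached scene data.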
Applications • Path tracing • Photon mapping
Result – Path Tracing (Video) • 104 M triangles (12.8 GB) • 512 × 512 resolution • 100 paths • 8 area lights
Result – Photon Mapping • 128 M triangles (15.7 GB) • Cache holds 19% of all the data • 4 area lights • 13X speedup
Overview • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection
Collision Detection • Collision detection is used in various fields • Games, movies, scientific simulation, and robotics <Figure from C. Lauterbach > <Figure from PIXAR> <Figure from AION >
Discrete VS Continuous Discrete collision detection (DCD) Time step (i) Time step (i-1)
Discrete VS Continuous Continuous collision detection(CCD) Time step (i) Time step (i-1)
Discrete VS Continuous Discrete collision detection (DCD) ? Time step (i) Time step (i-1)
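The difference between the two approaches can be shown with a one dimensional toy case: a point that moves quickly past a thin wall between time step (i-1) and time step (i). The sketch assumes linear motion between the two steps, and the function names are hypothetical.

```python
def inside(x, wall_min, wall_max):
    """True if position x lies within the wall."""
    return wall_min <= x <= wall_max


def discrete_collision(x_prev, x_curr, wall_min, wall_max):
    """DCD: test only the positions sampled at the two time steps."""
    return inside(x_prev, wall_min, wall_max) or inside(x_curr, wall_min, wall_max)


def continuous_collision(x_prev, x_curr, wall_min, wall_max):
    """CCD: test the whole interval swept between the two time steps."""
    lo, hi = min(x_prev, x_curr), max(x_prev, x_curr)
    return lo <= wall_max and wall_min <= hi


# The point jumps from x = 0 to x = 10; the wall occupies [4, 5].
print(discrete_collision(0.0, 10.0, 4.0, 5.0))
print(continuous_collision(0.0, 10.0, 4.0, 5.0))
```

The discrete test misses the collision because neither sampled position is inside the wall, while the continuous test catches it because the swept interval overlaps the wall.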
Motivation • Continuous collision detection • Accurate, but slow for complex models • Hardware trend • CPUs and GPUs are increasing the number of cores • Heterogeneous architectures • Intel Larrabee architecture • Previous approaches • Utilize either multi-core CPUs or GPUs • Not enough performance for interactive applications
Hybrid Parallel CCD [Kim et al. PG 09] • Takes advantage of both: • Multi-core CPU architectures • GPU architectures • Achieves interactive performance for various deforming models consisting of tens or hundreds of thousands of triangles • Figure: multi-core CPUs and GPUs performing CCD
Results • Performance of HPCCD utilizing both CPUs and GPUs • Source code is available as a library called OpenCCD
Conclusions • Data explosion and slow growth of data access speed • Discussed four different data management techniques • Cache-coherent layouts • Random-accessible compressed data • Cache-oblivious ray reordering • Hybrid continuous collision detection • Applied to rendering and collision detection • Observed meaningful performance improvement
Acknowledgements • Research collaborators • TaeJoon Kim, DukSu Kim, Pio Claudio, BooChang Moon, YongYoung Byun, JaePil Heo, SeungYong Lee, YongJin Kim, JaeHyuk Heo, John Kim, Peter Lindstrom, Valerio Pascucci, Dinesh Manocha • Funding sources • Microsoft Research Asia • KAIST seed grant • Ministry of Knowledge Economy • Samsung • Korea Research Foundation