Data Management Techniques

Data Management Techniques Sung-Eui Yoon KAIST URL: http://jupiter.kaist.ac.kr/~sungeui/

Data Avalanche (or Data Explosion) There is too much data out there! [Image source: www.cs.umd.edu/class/spring2001/cmsc838b/Project/Parija_Spacco/images/]

Geometric Data Avalanche • Massive geometric data • Due to advances in modeling, simulation, and data capture techniques • Time-varying data (4D data sets)

CAD Model: Double Eagle Oil Tanker 82 million triangles (4 gigabyte)

CAD Model: Boeing 777 Ray Tracing Boeing 777, 470 million triangles Excerpted from SIGGRAPH course note on massive model rendering

Scanned Model: St. Matthew Model 372 million triangles (10 GB) www.cyberware.com

Possible Solutions? • Will hardware improvement address the data avalanche? • Moore’s law: the number of transistors roughly doubles every 18 months
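As a rough worked example of the 18-month doubling rule, the growth factor over a ten-year span can be estimated as follows. The symbols N_0 (initial transistor count) and t (elapsed years) are introduced here only for illustration.

```latex
N(t) = N_0 \cdot 2^{t/1.5}
\qquad\Longrightarrow\qquad
\frac{N(10)}{N_0} = 2^{10/1.5} = 2^{6.\overline{6}} \approx 101.6
```

Under this assumption, transistor counts grow by about two orders of magnitude per decade, which is the scale against which data access speed has to be compared.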

Current Architecture Trends Data access time becomes the major computational bottleneck! [Figure: accumulated growth rate during 1999~2009 (log scale); surviving curve labels: "access speed", "disk access speed"]

Four Orthogonal Approaches • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection

Overview • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection

Cache-Coherent Layouts of Meshes • One-dimensional data layout of a mesh • Reduce the number of cache misses • Cache-aware or cache-oblivious layouts • Minimize the number of cache misses for a specific set of cache parameters (cache-aware) or for various cache parameters (cache-oblivious), e.g., cache block size [Figure: a mesh with vertices va, vb, vc, vd and its one-dimensional layout] [Yoon et al. SIG05, VIS06, Euro06]
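A minimal sketch of why the one-dimensional order matters, assuming a simple cost measure: the number of mesh edges whose two endpoints land in different cache blocks. The grid mesh, the tiled ordering, and all function names are hypothetical illustrations; this is not the layout algorithm of the cited papers or of OpenCCL.

```python
def grid_edges(n):
    """Edges of an n x n grid of vertices; each vertex is identified by (row, col)."""
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                edges.append(((r, c), (r, c + 1)))
            if r + 1 < n:
                edges.append(((r, c), (r + 1, c)))
    return edges


def row_major_layout(n):
    """Map each vertex to its position when rows are stored one after another."""
    return {(r, c): r * n + c for r in range(n) for c in range(n)}


def tiled_layout(n, tile):
    """Store vertices tile by tile (tile x tile squares), row-major inside a tile."""
    order = []
    for tr in range(0, n, tile):
        for tc in range(0, n, tile):
            for r in range(tr, min(tr + tile, n)):
                for c in range(tc, min(tc + tile, n)):
                    order.append((r, c))
    return {v: i for i, v in enumerate(order)}


def cross_block_edges(edges, position, block_size):
    """Count edges whose endpoints fall into different cache blocks."""
    return sum(
        1 for a, b in edges
        if position[a] // block_size != position[b] // block_size
    )


n, block = 16, 16  # 16 x 16 vertices, 16 vertices per cache block
edges = grid_edges(n)
row_cost = cross_block_edges(edges, row_major_layout(n), block)
tile_cost = cross_block_edges(edges, tiled_layout(n, 4), block)
print(row_cost, tile_cost)
```

With these parameters the tiled order leaves fewer edges straddling two blocks than the row-major order, so a traversal that follows mesh edges would be expected to touch fewer blocks.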

Block-based I/O Model [Aggarwal and Vitter 88] [Figure: CPU or GPU, fast memory or cache, slow memory, and disk, connected by block transfers; access times shown: 10^-6 sec, 10^-4 sec, 1 sec]
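The block-based model can be made concrete with a small simulation that counts block transfers between slow and fast memory. This is a sketch under stated assumptions: an LRU replacement policy, a fast memory of `cache_blocks` blocks, and integer addresses; none of these details come from the slides.

```python
from collections import OrderedDict


def count_block_transfers(accesses, block_size, cache_blocks):
    """Count blocks moved from slow to fast memory for an access sequence (LRU assumed)."""
    cache = OrderedDict()  # block id -> True, ordered from least to most recently used
    transfers = 0
    for addr in accesses:
        block = addr // block_size
        if block in cache:
            cache.move_to_end(block)       # hit: mark as most recently used
        else:
            transfers += 1                 # miss: one whole block is transferred
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return transfers


# The same 64 addresses, visited in two different orders.
sequential = list(range(64))
strided = [8 * (i % 8) + i // 8 for i in range(64)]

seq_cost = count_block_transfers(sequential, block_size=8, cache_blocks=2)
strided_cost = count_block_transfers(strided, block_size=8, cache_blocks=2)
print(seq_cost, strided_cost)
```

Both sequences touch the same data, but the strided order jumps to a new block on every access, so each access triggers a transfer. This is the cost that a cache-coherent layout tries to reduce.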

Applications • View-dependent meshes • View-dependent rendering • Triangle meshes • Isocontour extractions • Hierarchies • Ray tracing • Collision detection

View-Dependent Rendering using LODs Improving GPU vertex cache Utilization GeForce 6800 (January 2005)

Applications • View-dependent meshes • View-dependent rendering • Triangle meshes • Isocontour extractions • Hierarchies • Ray tracing • Collision detection Puget sound, 134 M triangles Isocontour z(x,y) = 500m • Achieve up to 20X improvement on iso-contouring

Applications • View-dependent meshes • View-dependent rendering • Triangle meshes • Isocontour extractions • Hierarchies • Ray tracing • Collision detection • Achieve 30% ~ 300% performance improvement

Advantages • General • Works well for various applications • Cache-oblivious • Can benefit all levels of the memory hierarchy (e.g., CPU/GPU caches, memory, and disk) • No modification of runtime applications • Only layout computation Source code is available as a library called OpenCCL

Overview • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection

Random-Accessible Compressed Data • Compression methods of meshes and hierarchies • Reduce the memory requirements • Supports random accesses on meshes and hierarchies • Can be useful to many different applications [Kim et al. Tech. Report 09; Kim et al., TVCG 09; Yoon and Lindstrom, VIS 07]

Hierarchical-Culling oriented Compact Meshes (HCCMeshes) • Consists of two parts: • i-HCCMeshes (in-core representation) • o-HCCMeshes (out-of-core representation)

Data Access Framework [Figure: the user sends a request to a data pool in main memory and receives data]

Data Access Framework: Out-of-Core Technique [Figure: the user requests data from a data pool in main memory by cluster ID; clusters c0, c1, ..., cn are stored on an external drive, and requested clusters are kept in main memory as cached data]

HCCMeshes Support hierarchical random access! [Figure: the user requests data from a data pool in main memory by cluster ID; the o-HCCMesh on the external drive holds compressed clusters (c0, c1, ..., cm); cached data in main memory includes compressed clusters, the i-HCCMesh, and decompressed clusters]
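The request path in the data access framework (cluster ID in, decompressed cluster out, with a bounded cache in main memory) can be sketched as follows. This is only an illustration: zlib stands in for the actual HCCMesh encoding, the external drive is simulated by a dictionary, and the LRU policy and all names (`ClusterPool`, `compress_cluster`, `decompress_cluster`) are assumptions rather than the OpenRACM API.

```python
import struct
import zlib
from collections import OrderedDict


def compress_cluster(values):
    """Pack floats as little-endian float32 and compress them (zlib as a stand-in)."""
    return zlib.compress(struct.pack(f"<{len(values)}f", *values))


def decompress_cluster(blob):
    """Inverse of compress_cluster: returns the list of float values."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


class ClusterPool:
    """Data pool that serves clusters by ID and caches decompressed clusters."""

    def __init__(self, drive, capacity):
        self.drive = drive            # cluster id -> compressed bytes ("external drive")
        self.capacity = capacity      # max number of decompressed clusters kept in memory
        self.cached = OrderedDict()   # cluster id -> decompressed data, in LRU order
        self.decompressions = 0

    def get(self, cluster_id):
        if cluster_id in self.cached:
            self.cached.move_to_end(cluster_id)
            return self.cached[cluster_id]
        # Miss: fetch only the requested cluster and decompress it independently.
        data = decompress_cluster(self.drive[cluster_id])
        self.decompressions += 1
        self.cached[cluster_id] = data
        if len(self.cached) > self.capacity:
            self.cached.popitem(last=False)
        return data


drive = {
    cid: compress_cluster([float(cid * 4 + k) for k in range(4)])
    for cid in range(6)
}
pool = ClusterPool(drive, capacity=2)
print(pool.get(3))
```

Because each cluster is compressed on its own, any cluster can be fetched without decoding the others, which is the property that random access relies on.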

Main Benefits • Use less memory and a smaller working set • o-HCCMeshes have 20:1 compression ratios • i-HCCMeshes have 6:1 compression ratios • Improve runtime performance

Applications • Whitted-style ray tracing • LOD-based ray tracing • Collision detection • Photon mapping • Non-photorealistic rendering Source code is available as OpenRACM

Results

Overview • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection

Challenges • Generated secondary rays show low ray coherence • This results in low cache utilization • When ray tracing massive models, expensive cache misses occur at several levels (e.g., L1/L2 caches, main memory) [Figure: Landscape (>1000 M), St. Matthew (372 M)]

Goal • Design an efficient algorithm for reordering incoherent secondary rays into coherent ones • Achieve high cache coherence for these rays • Improve the performance of ray tracing

Ray Reordering Framework [Figure components: camera information, ray generation, ray buffer, ray reordering, ray processing, hit points and material information, scene information, and the memory hierarchy (caches such as L1, main memory, disk)] [Moon et al., under review]
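One way to picture the reordering step is to sort buffered rays so that rays touching nearby parts of the scene are processed together. The sketch below uses a Z-order (Morton) key computed from a 3D point per ray, such as an approximate hit point. The choice of key, the quantization to 10 bits per axis, and the function names are assumptions for illustration and are not taken from the cited work.

```python
def morton3(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three integers into one Z-order key."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def quantize(p, lo, hi, bits=10):
    """Map a 3D point inside the box [lo, hi] to integer grid coordinates."""
    cells = (1 << bits) - 1
    return tuple(
        min(cells, max(0, int((p[i] - lo[i]) / (hi[i] - lo[i]) * cells)))
        for i in range(3)
    )


def reorder_rays(points, lo, hi, bits=10):
    """Return ray indices sorted by the Morton key of each ray's point."""
    keys = [morton3(*quantize(p, lo, hi, bits), bits=bits) for p in points]
    return sorted(range(len(points)), key=keys.__getitem__)


# Four rays in the buffer; rays 1 and 3 hit near one corner, rays 0 and 2 near the other.
points = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.95, 0.9, 0.9), (0.1, 0.15, 0.1)]
order = reorder_rays(points, lo=(0.0, 0.0, 0.0), hi=(1.0, 1.0, 1.0))
print(order)
```

After sorting, rays that access the same region of the scene are adjacent in the buffer, so the geometry they need is more likely to already be in cache when they are processed.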

Applications • Path tracing • Photon mapping

Result – Path Tracing (Video) • 104 M triangles (12.8 GB) • 512 x 512 resolution • 100 paths • 8 area lights

Result – Photon Mapping • 128 M triangles (15.7 GB) • Cache holds 19% of all the data • 4 area lights • 13X speedup

Overview • Cache-coherent layouts • Random-accessible compressed meshes • Cache-oblivious ray reordering • Hybrid parallel continuous collision detection

Collision Detection • Collision detection is used in various fields • Games, movies, scientific simulation, and robotics [Figures from C. Lauterbach, PIXAR, and AION]

Discrete VS Continuous Discrete collision detection (DCD) Time step (i) Time step (i-1)

Discrete VS Continuous Continuous collision detection(CCD) Time step (i) Time step (i-1)

Discrete VS Continuous Discrete collision detection (DCD) ? Time step (i) Time step (i-1)

Discrete VS Continuous
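The difference between the two approaches can be shown with a one-dimensional toy case: a point moving in a straight line between time step (i-1) and time step (i), and a thin wall in between. The linear-motion assumption and the function names are illustrative only.

```python
def discrete_hit(x_prev, x_curr, wall_lo, wall_hi):
    """DCD: test only the position at the current time step (x_prev is ignored)."""
    return wall_lo <= x_curr <= wall_hi


def continuous_hit(x_prev, x_curr, wall_lo, wall_hi):
    """CCD: test the whole interval swept between the two time steps."""
    lo, hi = min(x_prev, x_curr), max(x_prev, x_curr)
    return lo <= wall_hi and wall_lo <= hi


# A fast point jumps from x = 0 to x = 10; the wall occupies [4, 5].
missed = discrete_hit(0.0, 10.0, 4.0, 5.0)
caught = continuous_hit(0.0, 10.0, 4.0, 5.0)
print(missed, caught)
```

The discrete test misses the collision because the point is on opposite sides of the wall at the two time steps, while the continuous test catches it by checking the motion in between.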

Motivation • Continuous collision detection • Accurate, but slow for complex models • Hardware trend • CPUs and GPUs are increasing the number of cores • Heterogeneous architectures • Intel Larrabee architecture • Previous approaches • Utilize either multi-core CPUs or GPUs • Not enough performance for interactive applications

Hybrid Parallel CCD [Kim et al. PG 09] • Takes advantage of both: • Multi-core CPU architectures • GPU architectures • Achieves interactive performance for various deforming models consisting of tens or hundreds of thousands of triangles [Figure: CCD using multi-core CPUs and GPUs]

Results • Performance of HPCCD utilizing both CPUs and GPUs Source code is available as a library called OpenCCD

Results

Conclusions • Data explosion and the lower growth rate of data access speed • Discussed four different techniques as data management methods • Cache-coherent layouts • Random-accessible compressed data • Cache-oblivious ray reordering • Hybrid continuous collision detection • Applied to rendering and collision detection • Observed meaningful performance improvement

Acknowledgements • Research collaborators • TaeJoon Kim, DukSu Kim, Pio Claudio, BooChang Moon, YongYoung Byun, JaePil Heo, SeungYong Lee, YongJin Kim, JaeHyuk Heo, John Kim, Peter Lindstrom, Valerio Pascucci, Dinesh Manocha • Funding sources • Microsoft Research Asia • KAIST seed grant • Ministry of Knowledge Economy • Samsung • Korea Research Foundation
