
Data-driven Query Processing for Immersive Computational Turbulence

Kalin Kanov, Department of Computer Science, Johns Hopkins University


Presentation Transcript


  1. Data-driven Query Processing for Immersive Computational Turbulence Kalin Kanov Department of Computer Science • Johns Hopkins University

  2. The Big Picture • Scientific disciplines have developed a computational branch • Models without closed-form solutions are solved numerically • This has led to an explosion of data • Simulation and analysis workloads are data-intensive • Producing/scanning large amounts of data • Management of these data represents a significant challenge • Storage/archiving • Query processing • Visualization

  3. Remote Immersive Analysis • Formerly, analysis performed during the computation • No data stored for subsequent examination • Data-intensive computing breakthroughs have allowed for new interaction with scientific numerical simulations • Turbulence Database Cluster • Stores entire space-time evolution of the simulation • Provides public access to world-class simulations • Implements “immersive turbulence*” approach • Introduces new challenges *E. Perlman, R. Burns, Y. Li, and C. Meneveau. Data exploration of turbulence simulations using a database cluster. In Supercomputing, 2007.

  4. Goals • Develop data-driven query processing techniques • Reduce I/O and computation costs • Reduce or eliminate storage overhead • Exploit domain knowledge and structure • Provide user interfaces that are efficient and flexible • Streamline the process of data ingest

  5. Turbulence Database Cluster

  6. Processing a Batch Query • Redundant I/O • Multiple disk seeks • [Figure: three queries over a 4×4 grid of data atoms, numbered by row from top to bottom as 10 11 14 15 / 8 9 12 13 / 2 3 6 7 / 0 1 4 5 and laid out on disk in the order 0 to 15. Atoms read per query: q1: 0 1 2 3 4 6 8 9 12; q2: 9 11 12 14; q3: 4 5 6 7]

  7. I/O Streaming Evaluation Method • Linear data requirements of the computation allow for: • Incremental evaluation • Streaming over the data • Concurrent evaluation of batch queries

  8. Processing a Batch Query • Sequential I/O • Single pass • [Figure: the same 4×4 grid of atoms and the same three queries. I/O streaming reads atoms 0 1 2 3 4 5 6 7 8 9 11 12 14 in disk order, each tagged with the queries it serves: q1 for 0 1 2 3 4 6 8 9 12; q2 for 9 11 12 14; q3 for 4 5 6 7]
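The contrast between slides 6 and 8 can be sketched in a few lines of Python. This is a minimal illustration, not the cluster's implementation: the atom sets are the ones shown in the figure, while `read_atom` and `contribution` are hypothetical callbacks standing in for the storage layer and for a kernel whose result is a sum of per-atom terms (the "linear data requirements" of slide 7).

```python
from collections import defaultdict

# Atom sets per query, as shown in the slide figure.
queries = {
    "q1": [0, 1, 2, 3, 4, 6, 8, 9, 12],
    "q2": [9, 11, 12, 14],
    "q3": [4, 5, 6, 7],
}

def per_query_reads(queries):
    """Naive evaluation: every query reads its own atoms, so shared atoms are read repeatedly."""
    return [atom for atoms in queries.values() for atom in atoms]

def streaming_schedule(queries):
    """Invert the batch: map each atom to the queries that need it, in on-disk order."""
    needed_by = defaultdict(list)
    for name, atoms in queries.items():
        for atom in atoms:
            needed_by[atom].append(name)
    return [(atom, sorted(needed_by[atom])) for atom in sorted(needed_by)]

def evaluate_streaming(queries, read_atom, contribution):
    """Single pass: read each atom once and update every query that needs it.

    read_atom and contribution are hypothetical callbacks (assumptions of this sketch).
    """
    results = {name: 0.0 for name in queries}
    for atom, names in streaming_schedule(queries):
        data = read_atom(atom)  # one read per atom
        for name in names:
            results[name] += contribution(name, data)  # incremental evaluation
    return results
```

With the figure's batch, the naive approach issues 17 atom reads while the streaming schedule touches 13 distinct atoms in increasing order, which is what makes the I/O sequential.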

  9. Lagrange Polynomial Interpolation • [Equation with two labeled factors: Lagrange coefficients, Data]
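As a reminder of why this kernel suits streaming, here is a minimal one-dimensional sketch of Lagrange interpolation written as "coefficients times data". The function names are illustrative, and the slide's own formula (presumably three-dimensional) is not recoverable from the transcript, so treat this as the textbook form rather than the author's exact expression.

```python
def lagrange_weights(nodes, x):
    """Lagrange coefficients l_i(x) for the given grid nodes; they depend only on positions."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (x - xj) / (xi - xj)
        weights.append(w)
    return weights

def interpolate(nodes, values, x):
    """Interpolated value: a weighted sum of the data, one term per grid node."""
    return sum(w * v for w, v in zip(lagrange_weights(nodes, x), values))
```

Because the result is a plain sum over data points, the terms can be accumulated in whatever order the data arrive from disk. For example, `interpolate([0, 1, 2, 3], [0, 1, 8, 27], 1.5)` reproduces the cubic x³ at 1.5 up to rounding.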

  10. Spatial Differentiation

  11. Derivative Interpolation

  12. 128 Workload • I/O Streaming • Each atom is read only once • Effective cache usage • Join/Order By executes the entire batch as a join • Sorting leads to more sequential access • Over an order of magnitude improvement
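The sorting step can be illustrated as follows. The slides do not name the atom ordering; the grid numbering in the slide 6 figure happens to match a two-dimensional Z-order (Morton) curve, so this sketch assumes that ordering. `morton2d` and `sort_by_atom` are illustrative names, not part of the database's interface.

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of x (even positions) and y (odd positions) into a Z-order index."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)
        code |= ((y >> b) & 1) << (2 * b + 1)
    return code

def sort_by_atom(points):
    """Order query points by the index of the atom containing them (assumes one atom per cell)."""
    return sorted(points, key=lambda p: morton2d(p[0], p[1]))

# Rebuild the 4x4 grid from the figure, top row first (y = 3 down to y = 0).
grid = [[morton2d(x, y) for x in range(4)] for y in range(3, -1, -1)]
```

Sorting the batch by this key means consecutive requests fall on the same or neighboring atoms, so the disk is traversed mostly in one direction instead of seeking back and forth.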

  13. I/O Streaming alleviates I/O bottleneck • Computation emerges as the more costly operation

  14. Particle Tracking • [Diagram: the Web Server/Mediator distributes points based on xp(tm) to DB Node 1 through DB Node N. Each node has a Computational Module and a Storage Layer from which data are retrieved. The links between mediator and nodes are labeled xp(tm) and x*p(tm)]

  15. Particle Tracking • [Diagram: the Web Server/Mediator distributes points based on x*p(tm) to DB Node 1 through DB Node N. Each node has a Computational Module and a Storage Layer from which data are retrieved. The links between mediator and nodes are labeled x*p(tm) and xp(tm+1)]
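The two diagrams suggest a two-round exchange per time step: points are routed by their current position, an intermediate position x*p is computed, and points are routed again by x*p to obtain xp(tm+1). The transcript does not name the integration scheme, so the sketch below assumes a standard predictor-corrector (Heun) step with a steady velocity field. The `owner` partitioning, the one-dimensional periodic domain, and the `velocity` callback are all hypothetical simplifications.

```python
def owner(x, n_nodes, domain=1.0):
    """Hypothetical partitioning: equal slabs of a periodic 1D domain, one per DB node."""
    return int((x % domain) / domain * n_nodes) % n_nodes

def distribute(positions, n_nodes):
    """Mediator side: group particle ids by the node that stores their surrounding data."""
    buckets = {node: [] for node in range(n_nodes)}
    for pid, x in positions.items():
        buckets[owner(x, n_nodes)].append(pid)
    return buckets

def track_step(positions, velocity, dt, n_nodes):
    """One assumed predictor-corrector step with two mediator/node round trips."""
    # Round 1: route by x_p(t_m); nodes evaluate the velocity and return x*_p.
    u0, predicted = {}, {}
    for node, pids in distribute(positions, n_nodes).items():
        for pid in pids:
            u0[pid] = velocity(positions[pid])
            predicted[pid] = positions[pid] + dt * u0[pid]
    # Round 2: route by x*_p; nodes evaluate the velocity there and return x_p(t_m+1).
    updated = {}
    for node, pids in distribute(predicted, n_nodes).items():
        for pid in pids:
            u1 = velocity(predicted[pid])
            updated[pid] = positions[pid] + 0.5 * dt * (u0[pid] + u1)
    return updated
```

Each step costs two rounds of mediator-to-node communication in this sketch, which is the kind of overhead the future-work items on reduced communication and asynchronous processing would target.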

  16. Summary and Future Work • Extend I/O streaming technique to different decomposable kernel computations: • Differentiation • Spatial Interpolation • Temporal interpolation • Filtering and coarse-graining • Provide a flexible user interface • Allow for different filter functions • Allow for new kernel computations • Improve particle tracking routine • Reduce communication between mediator and DB nodes • Asynchronous processing • Caching and pre-fetching

  17. Questions Images courtesy of Kai Buerger (buerger@tum.de)
