1 / 13

Interactive Dynamic Aggregate Queries

Interactive Dynamic Aggregate Queries. Kenneth A. Ross, Junyan Ding Columbia University. Research Experience. Complex query processing Data Warehousing Main memory databases. Students: Kazi Zaman, Junyan Ding. Outline. Past work on Datacubes Scenario Overview

phong
Download Presentation

Interactive Dynamic Aggregate Queries

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Interactive Dynamic Aggregate Queries Kenneth A. Ross, Junyan Ding Columbia University DGRC Evaluation Board Meeting

  2. Research Experience • Complex query processing • Data Warehousing • Main memory databases Students: Kazi Zaman, Junyan Ding DGRC Evaluation Board Meeting

  3. Outline • Past work on Datacubes • Scenario Overview • Performance Goals • Technical Details • Conclusion DGRC Evaluation Board Meeting

  4. Datacubes State Year Grade Sales State Year Grade Sales CA 1997 Regular 90 CA 1997 Regular 90 CA 1997 ALL 90 NY 1997 Premium 70 ALL 1997 Regular 90 CA 1998 Premium 65 CA ALL Regular 90 CUBE BY (sum Sales) ALL 1997 Regular 90 NY 1998 Premium 95 ALL 1997 ALL 160 ALL ALL Regular 90 CA ALL ALL 155 ALL ALL ALL 320 Additional records Large increase in total Size, especially with many dimensions ……. DGRC Evaluation Board Meeting

  5. Level-2 Store Level-1 Store Finest granularity cuboid Query q records in linked lists Selected coarse records in hash table Slot directory Two-Level Framework DGRC Evaluation Board Meeting

  6. Datacube Implementation Issues • Fast query response (2-4 ms/query), but • Rigid: • Need to know dimensions and aggregate functions in advance for precomputation. • Not appropriate when hundreds (PUMS) or thousands (NHANES) of dimensions are available. DGRC Evaluation Board Meeting

  7. Scenario Outline User Web Data Request Dynamic Query Mediator ... Graphical User Interface UnifiedResults TraditionalDBMS Dynamic Query Engine Data Files e.g., PUMS DGRC Evaluation Board Meeting

  8. Engine Decoupled from Interface • Can use a variety of interfaces • Multiple connections to one server • Can “do one thing well” • Client/Server parallelism • Abstract interaction via API DGRC Evaluation Board Meeting

  9. Engine Performance Goals • Interactivedata exploration • Millions of records • Thousands of columns (but look at ten or so at a time) • Aggregates and statistical measures • Fine adjustments at 30 answers/second. DGRC Evaluation Board Meeting

  10. Technical Details • Main Memory Implementation • Multidimensional tree structures • Cache consciousness • Branch Misprediction • SIMD • Asynchronous work DGRC Evaluation Board Meeting

  11. Initial Results • Preliminary implementation • tens of answers/second on PUMS 1% data (>2 million records) • “Vanilla” user interface under construction DGRC Evaluation Board Meeting

  12. Conclusions and Plan • First step towards dynamic aggregate queries • More general dependence on parameters • More ambitious user interfaces DGRC Evaluation Board Meeting

  13. Plan • Integrate with user interface to generate dynamic queries. • Self-tuning capability. • Multiple data sets. • Work with agencies to generate value • For intra-agency analysis • For enhanced data dissemination DGRC Evaluation Board Meeting

More Related