On random sampling over Joins

Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya. Presented by: Srikantha Nema.

Presentation Transcript


  1. On random sampling over Joins • Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya • Presented by: Srikantha Nema

  2. Outline • Semantics of Sample • Difficulty of join Sampling • Algorithms for Sampling • Sampling strategies • New strategies for join Sampling • Experimental evaluation • Conclusions

  3. Terminologies • SAMPLE(R, f) is an SQL operation • R is the relation obtained by evaluating a query Q • f is the fraction of R to be sampled

  4. Semantics of Sample • Sampling with Replacement (WR) • Sampling without Replacement (WoR) • Independent Coin Flips (CF)
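The three semantics differ in whether duplicates can appear and whether the sample size is fixed. The sketch below is one possible reading, assuming the relation fits in memory as a Python list; the function names and the rounding of f·n to an integer sample size are illustrative assumptions, not taken from the slides.

```python
import random

def sample_wr(rows, f, rng):
    """With replacement: round(f * n) independent uniform draws; duplicates allowed."""
    k = round(f * len(rows))
    return [rng.choice(rows) for _ in range(k)]

def sample_wor(rows, f, rng):
    """Without replacement: round(f * n) distinct positions chosen uniformly."""
    k = round(f * len(rows))
    return rng.sample(rows, k)

def sample_cf(rows, f, rng):
    """Coin flips: each tuple is kept independently with probability f,
    so the sample size is random rather than fixed."""
    return [t for t in rows if rng.random() < f]
```

Only WR and WoR return a sample of a predetermined size; under CF the size merely has expectation f·n.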

  5. Difficulty of Join Sampling

  6. Classification of Join Sampling problem • Case A • No information is available for either R1 or R2 • Case B • No information is available for R1 but indexes and/or statistics are available for R2 • Case C • Indexes/statistics are available for R1 and R2

  7. Algorithms for Sampling • Unweighted Sequential WR Sampling • Black-Box U1 • Black-Box U2 • Weighted Sequential WR Sampling • Black-Box WR1 • Black-Box WR2

  8. Unweighted Sequential WR Sampling • Black-Box U1 • Black-Box U2
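The slide names the two black boxes without describing them. As a rough illustration of unweighted sequential WR sampling in a single pass, the sketch below assumes one variant knows the relation size n in advance and the other does not. The function names are hypothetical, and the mapping of these two variants to U1 and U2 is an assumption rather than something the slide states.

```python
import random

def sequential_wr_known_n(stream, n, r, rng):
    """One pass, n known: for the i-th tuple (0-based), each of the draws still
    unassigned picks this tuple with probability 1 / (n - i)."""
    out = []
    remaining = r
    for i, t in enumerate(stream):
        if remaining == 0:
            break
        p = 1.0 / (n - i)
        # Binomial(remaining, p) drawn as a sum of Bernoulli trials.
        copies = sum(rng.random() < p for _ in range(remaining))
        out.extend([t] * copies)
        remaining -= copies
    return out

def sequential_wr_unknown_n(stream, r, rng):
    """One pass, n unknown: r independent one-element reservoirs; the k-th tuple
    seen (1-based) overwrites each slot with probability 1 / k."""
    slots = [None] * r
    for count, t in enumerate(stream, start=1):
        for j in range(r):
            if rng.random() < 1.0 / count:
                slots[j] = t
    return slots
```

The known-n variant can emit sample tuples as the stream goes by, while the unknown-n variant must hold r slots until the stream ends.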

  9. Weighted Sequential Sampling • Black-Box WR1 • Black-Box WR2
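The weighted case can be sketched the same way, with each tuple t carrying a weight w(t) and being drawn with probability proportional to it. The code below assumes positive integer weights, a stream of (tuple, weight) pairs, and one variant per situation: total weight known up front or not. The names are hypothetical and the correspondence to WR1 and WR2 is an assumption.

```python
import random

def weighted_wr_known_total(stream, total_weight, r, rng):
    """One pass, total weight W known: a tuple of weight w takes each still
    unassigned draw with probability w / (W - weight already seen)."""
    out = []
    remaining = r
    seen = 0
    for t, w in stream:
        if remaining == 0:
            break
        left = total_weight - seen
        # Integer comparison avoids rounding trouble on the last tuple.
        copies = sum(rng.random() * left < w for _ in range(remaining))
        out.extend([t] * copies)
        remaining -= copies
        seen += w
    return out

def weighted_wr_unknown_total(stream, r, rng):
    """One pass, W unknown: each of r slots is overwritten by the current
    tuple with probability w / (weight seen so far, including w)."""
    slots = [None] * r
    seen = 0
    for t, w in stream:
        seen += w
        for j in range(r):
            if rng.random() * seen < w:
                slots[j] = t
    return slots
```

In both variants a tuple of weight w ends up in any given sample position with probability w / W, which is the property a join-sampling strategy needs when the weight is the number of matching tuples in the other relation.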

  10. Sampling Strategies (old) • Strategy Naïve-Sample • Strategy Olken-Sample
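The slide gives only the names of the two older strategies. The sketch below is a hedged reading of them for an equi-join of in-memory lists: the naive approach materializes the whole join and then samples it, while the Olken-style approach samples a tuple from R1, picks a matching tuple from R2 through an index, and accepts the pair with probability m2(v)/M, where m2(v) is the number of R2 tuples with join value v and M is the largest such count. The function names, the key-extractor parameters, and the in-memory hash index are assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_index(rel, key):
    """Hash index from join value to the tuples carrying it."""
    index = defaultdict(list)
    for t in rel:
        index[key(t)].append(t)
    return index

def naive_sample(r1, r2, key1, key2, r, rng):
    """Compute the full join, then draw r tuples with replacement."""
    index = build_index(r2, key2)
    joined = [(t1, t2) for t1 in r1 for t2 in index.get(key1(t1), [])]
    return [rng.choice(joined) for _ in range(r)] if joined else []

def olken_sample(r1, r2, key1, key2, r, rng):
    """Rejection sampling: accept a random (t1, t2) with probability m2(v) / M."""
    index = build_index(r2, key2)
    if not any(key1(t) in index for t in r1):
        return []  # empty join: nothing to sample
    max_freq = max(len(v) for v in index.values())
    out = []
    while len(out) < r:
        t1 = rng.choice(r1)
        matches = index.get(key1(t1), [])
        if matches and rng.random() * max_freq < len(matches):
            out.append((t1, rng.choice(matches)))
    return out
```

The rejection step is what makes every join tuple equally likely; without it, tuples of R1 with few matches in R2 would be over-represented.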

  11. New strategies for join Sampling • Strategy Stream-Sample • Strategy Group-Sample • Strategy Frequency-Partition-Sample
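Of the three new strategies, Stream-Sample is the simplest to illustrate. The sketch below is one reading of it, assuming frequency statistics and an index are available for R2: draw a weighted WR sample of R1 in which each tuple's weight is the number of R2 tuples it joins with, then pair each sampled tuple with one uniformly chosen match. It uses `random.choices` for the weighted draw instead of a one-pass sequential sampler, and the function name and key-extractor parameters are hypothetical.

```python
import random
from collections import defaultdict

def stream_sample(r1, r2, key1, key2, r, rng):
    """Weighted WR sample of R1 (weight = match count in R2), then one
    uniformly chosen matching R2 tuple per sampled R1 tuple."""
    index = defaultdict(list)
    for t in r2:
        index[key2(t)].append(t)
    weights = [len(index.get(key1(t), [])) for t in r1]
    if sum(weights) == 0:
        return []  # empty join: nothing to sample
    s1 = rng.choices(r1, weights=weights, k=r)
    return [(t1, rng.choice(index[key1(t1)])) for t1 in s1]
```

A join tuple (t1, t2) is produced with probability m2(v)/|J| · 1/m2(v) = 1/|J|, where |J| is the join size, so no rejection step is needed.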

  12. Strategy Frequency-Partition-Sample

  13. Experimental Evaluation 1

  14. Experimental Evaluation 2

  15. Experimental Evaluation 3

  16. Conclusions • Difficulty of join sampling • Classification of the problem into 3 cases • Strategies for join sampling • New schemes for sequential random sampling for uniform and weighted sampling • More efficient strategies can be developed for the case of a single join • More work is needed to understand the problem of sampling the result of join trees

  17. Thank You
