
Probabilistic Similarity Search for Uncertain Time Series

Presented by CAO Chen, 21st Feb, 2011.


Presentation Transcript


  1. Probabilistic Similarity Search for Uncertain Time Series Presented by CAO Chen 21st Feb, 2011

  2. Outline • Introduction • Background • Time Series • Similarity Search • Motivation & Contribution • Uncertain Time Series Query • Uncertainty Approximation • Step-wise Refinement • Evaluation • Related Literature Review • Q & A CAO Chen, DB Group, CSE, HKUST

  3. Background – Time Series

  4. Background – Time Series (cont’d) • Sources of Time Series Data • Traffic measurements • Uncorrelated • Location tracking of moving objects • Measuring environmental parameters (e.g., temperature) • Correlated

  5. Background – Similarity Search • Similarity Search • Pattern Matching • Shape Matching

  6. Background – Similarity Search (cont’d) • Range Query • Return all tuples that fit between a lower and an upper boundary • The number of results is not known in advance • Slower than top-k because there is no upper bound to prune with • Sequence Matching • Whole matching: sequences of the same length • Subsequence matching
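
As a minimal sketch of the range query and whole matching described on this slide, assuming exact (certain) series of equal length and an Lp distance: the names `lp_distance`, `range_query`, `database`, and the radius `eps` are illustrative assumptions, not taken from the slides.

```python
def lp_distance(x, y, p=2):
    """Lp distance between two exact series (whole matching: same length)."""
    if len(x) != len(y):
        raise ValueError("whole matching requires sequences of the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def range_query(database, query, eps, p=2):
    """Return the keys of all series whose distance to the query is at most eps.

    The size of the result is not known in advance; it depends on eps.
    """
    return [key for key, series in database.items()
            if lp_distance(series, query, p) <= eps]


database = {"a": [0.0, 0.0], "b": [3.0, 4.0]}
print(range_query(database, [0.0, 0.0], eps=1.0))
print(range_query(database, [0.0, 0.0], eps=5.0))
```

With the toy data above, only "a" falls inside the small radius, while both series qualify once the radius reaches 5.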

  7. Motivation & Contribution • Uncertainty • Moving objects • Object identification • Sensor network monitoring

  8. Motivation & Contribution (cont’d) • Contribution • Formalize, for the first time, the notion of uncertain time series • Two novel types of probabilistic range queries over uncertain time series • Pruning strategy based on an approximate representation of uncertainty • Explicit evaluation of the refinement (processing) time cost

  9. Outline • Introduction • Background • Time Series • Similarity Search • Motivation & Contribution • Uncertain Time Series Query • Uncertainty Approximation • Step-wise Refinement • Evaluation • Related Literature Review • Q & A

  10. Probabilistic Queries Over Uncertain TS • Definition of Uncertain Time Series

  11. Probabilistic Queries Over Uncertain TS (cont’d) • Definition of Uncertain Lp-Distance

  12. Probabilistic Queries Over Uncertain TS (cont’d) • Definition of Probabilistic Range Queries
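
The formal definitions on slides 10 to 12 did not survive in this transcript. The following is only a rough sketch, assuming the sample-based model suggested by the later slides: each time slot holds a set of sample observations, every choice of one sample per slot yields one possible exact series, the uncertain Lp-distance is the collection of distances over all such pairs, and a probabilistic range query keeps a series when the fraction of distances within `eps` reaches a threshold `tau`. All function names and the parameters `eps` and `tau` are illustrative assumptions.

```python
from itertools import product


def lp_distance(x, y, p=2):
    """Lp distance between two exact series of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def distance_observations(x, y, p=2):
    """All distances between possible exact series of x and of y.

    An uncertain series is modelled as a list of time slots,
    each slot being a list of sample observations (an assumption).
    """
    return [lp_distance(a, b, p)
            for a in product(*x)      # one sample per slot of x
            for b in product(*y)]     # one sample per slot of y


def range_probability(x, y, eps, p=2):
    """Fraction of distance observations that are at most eps."""
    dists = distance_observations(x, y, p)
    return sum(1 for d in dists if d <= eps) / len(dists)


def qualifies(x, y, eps, tau, p=2):
    """Range predicate: is the probability of being within eps at least tau?"""
    return range_probability(x, y, eps, p) >= tau


x = [[0.0, 1.0], [0.0]]   # two samples in slot 1, one in slot 2
q = [[0.0], [0.0]]        # a query with a single sample per slot
print(range_probability(x, q, eps=0.5))
```

Treating every sample combination as equally likely is itself an assumption of this sketch; the slides do not state how the observations are weighted.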

  13. Challenge in Processing Range Queries with Uncertainty • Naïve Solution • Computing all distance observations • CPU-bound vs. I/O-bound • Long time series and high sample rates (large n) • Naïve Solution • Number of distance computations
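
The cost formula for the naïve solution is missing from this transcript. As a back-of-the-envelope sketch, assuming both series have n time slots with a fixed number of samples per slot and that every combination of one sample per slot must be compared, the number of distance computations grows exponentially in n. The function name and parameters are illustrative assumptions.

```python
def naive_distance_count(n, samples_query, samples_object):
    """Number of exact-series pairs for one query/object pair.

    Each series has n slots; picking one sample per slot gives
    samples ** n possible exact series per uncertain series.
    """
    return (samples_query ** n) * (samples_object ** n)


# Even short series become expensive: 10 slots with 5 samples each.
print(naive_distance_count(n=10, samples_query=5, samples_object=5))
```

This growth is why the computation is dominated by CPU time rather than I/O once n and the sample rate increase.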

  14. Outline • Introduction • Background • Time Series • Similarity Search • Motivation & Contribution • Uncertain Time Series Query • Uncertainty Approximation • Step-wise Refinement • Evaluation • Related Literature Review • Q & A

  15. Approximate Representation

  16. Approximate Representation (cont’d) • Two Levels of Approximate Representation • They differ in whether multiple (K) groups of sample observations exist in one time slot • Only one group at each time slot • K groups by K-means clustering

  17. Distance Approximations

  18. Distance Approximations (cont’d)

  19. Distance Approximations (cont’d) • Lemma 1 • Lemma 2
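
The formulas and lemma statements on slides 17 to 19 are not preserved here. A plausible sketch of the single-group level of approximation, assuming each time slot is summarized by the minimal interval covering its samples: per-slot minimum and maximum gaps between the intervals give a lower and an upper bound on every possible Lp distance. The names `bounding_intervals`, `min_gap`, `max_gap`, and `distance_bounds` are illustrative assumptions, and the K-group level (K-means clustering) is not covered.

```python
def bounding_intervals(uts):
    """Replace the samples of each time slot by their (min, max) interval."""
    return [(min(slot), max(slot)) for slot in uts]


def min_gap(a, b):
    """Smallest possible |u - v| with u in interval a and v in interval b."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return max(0.0, a_lo - b_hi, b_lo - a_hi)


def max_gap(a, b):
    """Largest possible |u - v| with u in interval a and v in interval b."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return max(a_hi - b_lo, b_hi - a_lo)


def distance_bounds(x, y, p=2):
    """Lower and upper bound on all Lp distances between x and y."""
    ix, iy = bounding_intervals(x), bounding_intervals(y)
    lower = sum(min_gap(a, b) ** p for a, b in zip(ix, iy)) ** (1.0 / p)
    upper = sum(max_gap(a, b) ** p for a, b in zip(ix, iy)) ** (1.0 / p)
    return lower, upper


x = [[0.0, 1.0], [2.0, 3.0]]
y = [[4.0], [2.5]]
print(distance_bounds(x, y))
```

The bounds need only one pass over the n slots, instead of one distance computation per combination of samples.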

  20. Probabilistic Bounded Range Queries (PBRQ) • True Hit • True Drop
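
The figure for this slide is missing. One way to read the "True Hit" and "True Drop" labels, assuming a lower and an upper bound on all distance observations are available for each series: if even the upper bound is within the query radius, every observation qualifies; if even the lower bound exceeds it, none does. The function name `filter_step` and the radius `eps` are illustrative assumptions.

```python
def filter_step(lower, upper, eps):
    """Classify a series from its distance bounds alone.

    lower/upper bound every distance observation of the series.
    """
    if upper <= eps:
        return "true hit"     # all observations within eps: probability 1
    if lower > eps:
        return "true drop"    # no observation within eps: probability 0
    return "candidate"        # bounds straddle eps: needs refinement


print(filter_step(0.5, 1.5, eps=2.0))
print(filter_step(3.0, 4.0, eps=2.0))
print(filter_step(1.0, 3.0, eps=2.0))
```

Only the series labelled "candidate" require any further distance computations.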

  21. Outline • Introduction • Background • Time Series • Similarity Search • Motivation & Contribution • Uncertain Time Series Query • Uncertainty Approximation • Step-wise Refinement • Evaluation • Related Literature Review • Q & A

  22. Step-Wise Refinement • When to refine? • Time series that cannot be filtered or decided simply by comparing the interval between the lower and upper bounds • Refinement Goal • To identify an uncertain time series as a true hit or a true drop • Condition to increase the lower bound • Increase in the number of qualified distances
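
To illustrate the stopping idea only, here is a deliberately simplified stand-in: it refines one distance observation at a time rather than following the strategy presented in the talk, whose details are not in this transcript. It assumes the sample-based model (a list of slots, each a list of samples), counts qualified distances to raise a lower bound on the probability, counts disqualified ones to lower an upper bound, and stops as soon as the threshold `tau` decides the outcome. All names and parameters are illustrative assumptions.

```python
from itertools import product


def lp_distance(x, y, p=2):
    """Lp distance between two exact series of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def refine(x, y, eps, tau, p=2):
    """Decide true hit / true drop, computing as few distances as possible.

    Returns (decision, number_of_distances_computed).
    """
    total = 1
    for slot in list(x) + list(y):
        total *= len(slot)                 # number of distance observations
    hits = misses = 0
    for a in product(*x):
        for b in product(*y):
            if lp_distance(a, b, p) <= eps:
                hits += 1                  # raises the probability lower bound
            else:
                misses += 1                # lowers the probability upper bound
            if hits / total >= tau:
                return "true hit", hits + misses
            if (total - misses) / total < tau:
                return "true drop", hits + misses
    raise ValueError("series must contain at least one sample per slot")


x = [[0.0, 1.0], [0.0]]
q = [[0.0], [0.0]]
print(refine(x, q, eps=0.5, tau=0.5))
print(refine(x, q, eps=0.5, tau=0.9))
```

In the first call a single qualified distance already pushes the lower bound to the threshold, so the second observation is never computed.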

  23. Step-Wise Refinement (cont’d) • Refinement heuristics

  24. Outline • Introduction • Background • Time Series • Similarity Search • Motivation & Contribution • Uncertain Time Series Query • Uncertainty Approximation • Step-wise Refinement • Evaluation • Related Literature Review • Q & A

  25. Evaluation • Benchmark • UCR Time Series Data Mining Archive • CBF, GUN/POINT, CONTROL CHART, OSU LEAF • Uncertainty • Samples generated uniformly distributed around the given exact values • Evaluation • Overall Speed-Up • Refinement Speed-Up
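
The slide says only that samples were generated uniformly around the exact values. A minimal sketch of such a generator, assuming a fixed number of samples per time slot and a symmetric interval of half-width `radius` around each value; these parameters, the seed, and the function name are illustrative assumptions, not the settings used in the evaluation.

```python
import random


def make_uncertain(exact, samples_per_slot, radius, seed=0):
    """Turn an exact series into an uncertain one.

    Each exact value v becomes samples_per_slot draws from
    the uniform distribution on [v - radius, v + radius].
    """
    rng = random.Random(seed)   # seeded for reproducibility
    return [[rng.uniform(v - radius, v + radius)
             for _ in range(samples_per_slot)]
            for v in exact]


uncertain = make_uncertain([1.0, 2.0, 3.0], samples_per_slot=4, radius=0.5)
print(len(uncertain), len(uncertain[0]))
```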

  26. Evaluation (cont’d) • Speed-up for Probabilistic Bounded Range Query (PBRQ)

  27. Evaluation (cont’d) • Speed-up for Probabilistic Rank Range Query (PRRQ)

  28. Evaluation (cont’d) • Speed-up w.r.t. scalability

  29. Evaluation (cont’d) • Refinement • S-S: using the proposed strategy • R-R: random processing for both steps • Logarithm of the number of required calculations

  30. Outline • Introduction • Background • Time Series • Similarity Search • Motivation & Contribution • Uncertain Time Series Query • Uncertainty Approximation • Step-wise Refinement • Evaluation • Related Literature Review • Q & A

  31. Q & A • Thank You
