Probabilistic Similarity Search for Uncertain Time Series. Presented by CAO Chen, 21st Feb, 2011.
Probabilistic Similarity Search for Uncertain Time Series Presented by CAO Chen 21st Feb, 2011
Outline
• Introduction
  • Background
    • Time Series
    • Similarity Search
  • Motivation & Contribution
• Uncertain Time Series Query
• Uncertainty Approximation
• Step-wise Refinement
• Evaluation
• Related Literature Review
• Q & A
CAO Chen, DB Group, CSE, HKUST
Background – Time Series
Background – Time Series (cont’d)
• Source of Time Series Data
• Traffic measurements
• Uncorrelated
• Location tracking of moving objects
• Measuring an environmental parameter (temperature)
• Correlated
Background – Similarity Search
• Similarity Search
• Pattern Matching
• Shape Matching
Background – Similarity Search (cont’d)
• Range Query
  • Returns all tuples that fall between a lower and an upper boundary
  • The number of results is not known in advance
  • Slower than top-k because there is no upper bound to prune with
• Sequence Matching
  • Whole matching: sequences of the same length
  • Subsequence matching
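A minimal sketch of a range query with whole matching on exact (certain) time series, assuming the boundary is a distance threshold `eps` around a query and the distance is an Lp-norm. The names `lp_distance`, `range_query`, and `eps` are illustrative and do not come from the slides.

```python
def lp_distance(x, y, p=2):
    """Lp-distance between two exact time series of the same length."""
    if len(x) != len(y):
        raise ValueError("whole matching requires sequences of the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def range_query(database, query, eps, p=2):
    """Return the names of all series whose distance to the query is <= eps.

    The size of the result is not known in advance: it depends on eps.
    """
    return [name for name, series in database.items()
            if lp_distance(series, query, p) <= eps]


database = {
    "a": [0.0, 0.0, 0.0],
    "b": [3.0, 4.0, 0.0],
    "c": [10.0, 10.0, 10.0],
}
result = range_query(database, [0.0, 0.0, 0.0], eps=5.0)
```

Every series has to be compared against `eps`, which is why a range query cannot stop early the way a top-k query can once k good answers are known.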
Motivation & Contribution
• Uncertainty
• Moving objects
• Object identification
• Sensor network monitoring
Motivation & Contribution (cont’d)
• Contribution
  • Formalizes, for the first time, the notion of uncertain time series
  • Two novel types of probabilistic range queries over uncertain time series
  • Pruning strategy based on approximate representations of uncertainty
  • Explicit evaluation of the refinement (processing) time cost
Probabilistic Queries Over Uncertain TS
• Definition of Uncertain Time Series
Probabilistic Queries Over Uncertain TS (cont’d)
• Definition of Uncertain Lp-Distance
Probabilistic Queries Over Uncertain TS (cont’d)
• Definition of Probabilistic Range Queries
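A minimal sketch of one way to read these three definitions. It assumes that an uncertain time series stores a list of equally likely sample observations per timestamp, that the uncertain Lp-distance is the collection of Lp-distances over all combinations of samples, and that a probabilistic range query keeps a series when the fraction of distances within `eps` reaches a threshold `tau`. All function and parameter names are hypothetical.

```python
import itertools


def lp_distance(x, y, p=2):
    """Lp-distance between two exact sequences of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def range_probability(x, y, eps, p=2):
    """Fraction of sample combinations of x and y with Lp-distance <= eps.

    x and y are lists of per-timestamp sample lists (assumed representation).
    """
    hits = 0
    total = 0
    for xs in itertools.product(*x):        # one exact series drawn from x
        for ys in itertools.product(*y):    # one exact series drawn from y
            total += 1
            if lp_distance(xs, ys, p) <= eps:
                hits += 1
    return hits / total


def probabilistic_range_query(database, query, eps, tau, p=2):
    """Names of all uncertain series whose range probability is >= tau."""
    return [name for name, series in database.items()
            if range_probability(series, query, eps, p) >= tau]


x = [[0.0, 2.0], [0.0]]   # two samples at t=0, one sample at t=1
q = [[0.0], [0.0]]        # a query with a single sample per timestamp
prob = range_probability(x, q, eps=1.0)
```

This enumerates every combination of samples, so it is only practical for very short series.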
Challenge in Processing Range Queries with Uncertainty
• Naïve Solution
• Computing all distance observations
• CPU-bound vs. I/O-bound
• Long time series and high sample rates (large n)
• Naïve Solution
• Number of distance computations
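As a rough illustration of why the naive solution is CPU-bound, the sketch below counts distance computations under the assumption that every timestamp of both series carries a fixed number of samples and that all combinations are enumerated. The function name and the sample counts are hypothetical; the slides do not give concrete figures.

```python
def naive_distance_count(n, samples_x, samples_y):
    """Number of exact distances between two uncertain series of length n.

    Assumes samples_x (samples_y) observations at every timestamp of the
    first (second) series and full enumeration of all combinations.
    """
    return (samples_x ** n) * (samples_y ** n)


for n in (2, 5, 10):
    print(n, naive_distance_count(n, 5, 5))
```

With 5 samples per timestamp on both sides, n = 10 already gives 5^20, roughly 9.5 × 10^13 distance computations for a single pair of series.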
Approximate Representation
Approximate Representation (cont’d)
• Two levels of approximate representation
• They differ in whether there are multiple (K) groups of sample observations in one time slot
• Only one group at each time slot
• Multiple (K) groups at each time slot, obtained by K-means clustering
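The two levels can be sketched as follows, assuming each group of samples is summarized by its minimal bounding interval. The one-dimensional K-means below is a plain Lloyd iteration with a deterministic initialization chosen for this example; it is not necessarily the clustering variant used in the presented work, and all names are hypothetical.

```python
def bounding_interval(samples):
    """Smallest interval covering all samples of one group."""
    return (min(samples), max(samples))


def kmeans_1d(samples, k, iterations=20):
    """Split one-dimensional samples into at most k non-empty groups."""
    if k < 1 or iterations < 1:
        raise ValueError("k and iterations must be positive")
    pts = sorted(samples)
    # Deterministic start: centres spread evenly over the sorted samples.
    centres = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in pts:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        new_centres = [sum(g) / len(g) if g else centres[j]
                       for j, g in enumerate(groups)]
        if new_centres == centres:
            break
        centres = new_centres
    return [g for g in groups if g]


def single_interval(series):
    """One bounding interval per time slot."""
    return [bounding_interval(samples) for samples in series]


def clustered_intervals(series, k):
    """Up to k (interval, sample count) pairs per time slot."""
    return [[(bounding_interval(g), len(g)) for g in kmeans_1d(samples, k)]
            for samples in series]


series = [[1.0, 1.2, 0.9, 5.0, 5.1, 4.8]]
coarse = single_interval(series)
fine = clustered_intervals(series, k=2)
```

For the bimodal samples above, the single interval spans 0.9 to 5.1, while the two clustered intervals cover 0.9 to 1.2 and 4.8 to 5.1, which is a much tighter description of where the samples actually lie.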
Distance Approximations
Distance Approximations (cont’d)
Distance Approximations (cont’d)
• Lemma 1
• Lemma 2
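The lemma statements are not reproduced in the slide text. A standard way to bound the Lp-distance between two series that are each approximated by one bounding interval per timestamp is sketched below; it is an assumption that the lemmas have this form, and the function name is hypothetical.

```python
def interval_distance_bounds(x_intervals, y_intervals, p=2):
    """Lower and upper bound on the Lp-distance between any two exact series
    drawn from the given per-timestamp (low, high) intervals."""
    lower_sum = 0.0
    upper_sum = 0.0
    for (x_lo, x_hi), (y_lo, y_hi) in zip(x_intervals, y_intervals):
        # Smallest possible difference: the gap between the intervals,
        # or zero when they overlap.
        gap = max(0.0, x_lo - y_hi, y_lo - x_hi)
        # Largest possible difference: between opposite interval ends.
        span = max(x_hi - y_lo, y_hi - x_lo)
        lower_sum += gap ** p
        upper_sum += span ** p
    return lower_sum ** (1.0 / p), upper_sum ** (1.0 / p)


x_iv = [(0.0, 1.0), (0.0, 1.0)]
y_iv = [(3.0, 4.0), (0.0, 1.0)]
d_lower, d_upper = interval_distance_bounds(x_iv, y_iv)
```

Every sample-level distance lies between `d_lower` and `d_upper`, so a series can be accepted when `d_upper <= eps` or discarded when `d_lower > eps` without enumerating any samples.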
Probabilistic Bounded Range Queries (PBRQ)
• True Hit
• True Drop
Step-Wise Refinement
• When to refine?
  • For time series that could not be filtered or decided simply by comparing the interval between the lower and upper bound
• Refinement Goal
  • To identify an uncertain time series as a true hit or a true drop
• Condition to increase the lower bound
  • An increase in the number of qualified distances
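The refinement loop can be sketched as below, under several simplifying assumptions that are not taken from the slides: the query is an exact series, samples are equally likely, groups are summarized by bounding intervals, and each step splits the widest remaining group in half. The splitting rule is a placeholder, not the refinement heuristic of the presented work, and all names are hypothetical.

```python
import itertools


def probability_bounds(groups_per_ts, query, eps, p=2):
    """Lower and upper bound on P(distance <= eps) from grouped samples."""
    sizes = [sum(len(g) for g in groups) for groups in groups_per_ts]
    sure_hit = 0.0    # probability mass that certainly lies within eps
    sure_drop = 0.0   # probability mass that certainly lies beyond eps
    for combo in itertools.product(*groups_per_ts):
        weight, lo, hi = 1.0, 0.0, 0.0
        for group, q, size in zip(combo, query, sizes):
            g_min, g_max = min(group), max(group)
            weight *= len(group) / size
            lo += max(0.0, g_min - q, q - g_max) ** p
            hi += max(abs(g_max - q), abs(g_min - q)) ** p
        if hi <= eps ** p:
            sure_hit += weight
        elif lo > eps ** p:
            sure_drop += weight
    return sure_hit, 1.0 - sure_drop


def refine_once(groups_per_ts):
    """Split the widest group in half; return False if nothing can be split."""
    best = None
    for i, groups in enumerate(groups_per_ts):
        for j, group in enumerate(groups):
            width = max(group) - min(group)
            if width > 0 and (best is None or width > best[0]):
                best = (width, i, j)
    if best is None:
        return False
    _, i, j = best
    ordered = sorted(groups_per_ts[i][j])
    mid = len(ordered) // 2
    groups_per_ts[i][j:j + 1] = [ordered[:mid], ordered[mid:]]
    return True


def decide(samples_per_ts, query, eps, tau, p=2):
    """Refine step by step until the series is a true hit or a true drop."""
    groups = [[sorted(samples)] for samples in samples_per_ts]
    while True:
        p_lower, p_upper = probability_bounds(groups, query, eps, p)
        if p_lower >= tau:
            return "true hit"
        if p_upper < tau:
            return "true drop"
        if not refine_once(groups):
            # All groups are single values, so the bounds coincide up to
            # floating-point rounding; decide on the lower bound.
            return "true hit" if p_lower >= tau else "true drop"


verdict = decide([[0.0, 1.0, 2.0, 3.0], [0.0]], [0.0, 0.0], eps=1.5, tau=0.5)
```

In this example the initial bounds are 0 and 1, so the series cannot be decided; after one split the lower bound rises to 0.5 because more distances are known to qualify, and the series becomes a true hit. Enumerating group combinations is itself exponential in the series length, so this sketch only illustrates the control flow.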
Step-Wise Refinement (cont’d)
• Refinement heuristics
Evaluation
• Benchmark
  • UCR Time Series Data Mining Archive
  • CBF, GUN/POINT, CONTROL CHART, OSU LEAF
• Uncertainty
  • Samples generated uniformly distributed around the given exact values
• Evaluation
  • Overall Speed-Up
  • Refinement Speed-Up
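The uncertainty generation step can be sketched as follows. The number of samples per timestamp and the `radius` of the uniform distribution are hypothetical parameters; the slides do not state the values used in the experiments.

```python
import random


def make_uncertain(series, samples_per_ts, radius, seed=None):
    """Turn an exact series into an uncertain one.

    Each exact value v is replaced by samples_per_ts observations drawn
    uniformly from [v - radius, v + radius].
    """
    rng = random.Random(seed)
    return [[rng.uniform(v - radius, v + radius) for _ in range(samples_per_ts)]
            for v in series]


exact = [0.0, 1.0, 0.5, -0.3]
uncertain = make_uncertain(exact, samples_per_ts=5, radius=0.2, seed=42)
```

Passing a fixed `seed` makes the generated samples reproducible across runs, which helps when comparing speed-ups between query strategies on the same data.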
Evaluation (cont’d)
• Speed-up for Probabilistic Bounded Range Query (PBRQ)
Evaluation (cont’d)
• Speed-up for Probabilistic Rank Range Query (PRRQ)
Evaluation (cont’d)
• Speed-up w.r.t. scalability
Evaluation (cont’d)
• Refinement
• S-S: using the proposed strategy
• R-R: random processing in both steps
• Logarithm of the number of required calculations
Q & A
• Thank You