Efficient Evaluation of Probabilistic Advanced Spatial Queries on Existentially Uncertain Data • By Man Lung Yiu, Nikos Mamoulis, Xiangyuan Dai, Yufei Tao, and Michail Vaitis • Presented by: John Burich • CIS 601 • Spring 2009
What the paper is about • Introduces the concept of existentially uncertain spatial data. • Describes algorithms for evaluating queries on existentially uncertain spatial data. • Evaluates the performance of the algorithms.
Existentially Uncertain Data • What is it? • Data in which it is uncertain whether the recorded objects actually exist. A good example is a set of points on a map where it is unclear whether the objects represented by those points exist. • Why would you have uncertain data? • A digital satellite image is one source: a couple of pixels may represent a ship at sea, or they may just be caused by the way the light reflected off a wave.
What does existentially uncertain spatial data look like? • If the data represents points on a two-dimensional map, each database record would just hold the point's x, y location along with a probability of existence.
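A record like the one described above could be sketched as follows. This is only an illustration: the class name `UncertainPoint` and its field names are assumptions made here, not names from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainPoint:
    """A 2D point plus the probability that the object really exists.

    Hypothetical record layout for illustration only.
    """
    x: float
    y: float
    prob: float  # existence probability, expected to lie in (0, 1]

    def __post_init__(self):
        # Reject values that cannot be an existence probability.
        if not 0.0 < self.prob <= 1.0:
            raise ValueError("existence probability must be in (0, 1]")


# A possible ship detected in a satellite image, with 20% confidence.
ship = UncertainPoint(x=12.5, y=40.0, prob=0.2)
```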
Key terms - Query types • Range query • On ordinary spatial data, a range query takes a range (area) and returns all of the points inside it. • On existentially uncertain spatial data, a range query returns the points inside the range together with their existence probabilities.
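The probabilistic range query can be sketched with a plain linear scan. This is a minimal sketch, assuming a rectangular range and points stored as `(name, x, y, prob)` tuples; the function name `range_query` and the sample data are made up for illustration, and no R-tree index is used.

```python
def range_query(points, xmin, ymin, xmax, ymax):
    """Return (name, prob) for every point inside the rectangle.

    points: iterable of (name, x, y, prob) tuples (hypothetical layout).
    """
    return [
        (name, prob)
        for name, x, y, prob in points
        if xmin <= x <= xmax and ymin <= y <= ymax
    ]


# Made-up sample data: three points with existence probabilities.
sample = [
    ("a", 1.0, 1.0, 0.9),
    ("b", 5.0, 5.0, 0.3),
    ("c", 2.0, 3.0, 0.5),
]
in_range = range_query(sample, 0.0, 0.0, 3.0, 3.0)
```

Here `in_range` holds the points `a` and `c`, each paired with its existence probability.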
Key terms - Query types • NN query: nearest neighbor query. • Given a query point q, a nearest neighbor query on ordinary spatial data returns the point closest to q; it returns a single result unless several points are at the same distance from q. • On existentially uncertain spatial data, a nearest neighbor query returns a list of points, each with its probability of being the nearest neighbor.
Key terms - Query types • SS query: spatial skyline query. • Spatial dominance: given a set of points Q and two other points p and p’, p spatially dominates p’ if p is closer than p’ to every point in Q. • Spatial skyline: given two point data sets R and Q, R’s spatial skyline with respect to Q contains the objects p in R that are not spatially dominated by any other object in R.
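The two definitions above can be sketched for ordinary (certain) data with a brute-force check. This is a sketch under stated assumptions: it follows the slide's wording that p must be strictly closer than p’ to every point in Q, the function names are made up here, and it is a quadratic scan rather than the paper's R-tree based algorithms.

```python
from math import dist


def dominates(p, other, Q):
    """True if p is strictly closer than `other` to every point in Q."""
    return all(dist(p, q) < dist(other, q) for q in Q)


def spatial_skyline(R, Q):
    """Points of R that no other point of R spatially dominates w.r.t. Q."""
    return [
        p for p in R
        if not any(dominates(o, p, Q) for o in R if o != p)
    ]


# Made-up example: two query points and three data points.
Q = [(0.0, 0.0), (2.0, 0.0)]
R = [(1.0, 0.0), (1.0, 5.0), (0.0, 0.5)]
skyline = spatial_skyline(R, Q)
```

In this example `(1.0, 5.0)` is farther from both query points than `(1.0, 0.0)`, so it is dominated and dropped; the other two points each win on one query point and stay in the skyline.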
Other key terms • RNN query: reverse nearest neighbor query. • Thresholding query • Ranking query • MBR: minimum bounding rectangle.
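The slides do not define the thresholding and ranking variants. The sketch below assumes the usual reading: a thresholding query keeps the results whose probability is at least a threshold t, and a ranking query keeps the m results with the highest probability. The function names and the `(name, probability)` result layout are assumptions made for illustration.

```python
def threshold_query(results, t):
    """Keep the results whose probability is at least t."""
    return [(name, p) for name, p in results if p >= t]


def ranking_query(results, m):
    """Keep the m results with the highest probability, best first."""
    return sorted(results, key=lambda r: r[1], reverse=True)[:m]


# Made-up probabilistic query results as (name, probability) pairs.
results = [("p7", 0.1), ("p6", 0.09), ("p8", 0.162)]
above = threshold_query(results, 0.1)
top2 = ranking_query(results, 2)
```

With these numbers, `above` keeps `p7` and `p8`, and `top2` lists `p8` first, then `p7`.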
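Example: probabilistic NN query • As the arithmetic shows, p7, p6, and p8 are the first, second, and third closest points to the query point, with existence probabilities 0.1, 0.1, and 0.2. A point is the nearest neighbor only if it exists and no closer point exists. • P(p7) = 0.1 • P(p6) = (1 - 0.1)(0.1) = 0.09 • P(p8) = (1 - 0.1)(1 - 0.1)(0.2) = 0.162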
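The arithmetic on this slide can be reproduced with a short sketch: sort the points by distance to q, then multiply each point's existence probability by the probability that none of the closer points exists. The coordinates below are made up so that p7, p6, and p8 come out in that distance order; the function name is hypothetical, ties in distance are ignored, and a linear scan stands in for the R-tree traversal.

```python
from math import dist


def nn_probabilities(points, q):
    """Return (name, probability of being the NN of q), nearest first.

    points: iterable of (name, x, y, prob) tuples (hypothetical layout).
    Assumes existence probabilities are independent and distances distinct.
    """
    ordered = sorted(points, key=lambda p: dist((p[1], p[2]), q))
    none_closer = 1.0  # probability that no closer point exists
    out = []
    for name, _x, _y, prob in ordered:
        out.append((name, none_closer * prob))
        none_closer *= 1.0 - prob
    return out


# Made-up coordinates; existence probabilities taken from the slide.
pts = [("p8", 3.0, 0.0, 0.2), ("p7", 1.0, 0.0, 0.1), ("p6", 2.0, 0.0, 0.1)]
nn = nn_probabilities(pts, (0.0, 0.0))
```

Up to floating-point rounding, `nn` pairs p7 with 0.1, p6 with 0.09, and p8 with 0.162, matching the slide.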
Types of indexing on the database • 2D R-tree • 3D R-tree
List of Algorithms • Probabilistic NN on a 2D R-tree • Probabilistic NN on a 2D R-tree with thresholding • Probabilistic NN on a 2D R-tree with ranking
List of Algorithms on an Augmented 2D R-tree • Probabilistic NN on an augmented 2D R-tree with thresholding • Probabilistic NN on an augmented 2D R-tree with ranking
List of Spatial Skyline Algorithms • Probabilistic Spatial Skyline on a 2D R-tree with thresholding • Probabilistic Spatial Skyline on an augmented 2D R-tree with thresholding
List of RNN Algorithms • Probabilistic RNN on a 2D R-tree with thresholding • Probabilistic RNN on an augmented 2D R-tree with thresholding.
Experimental Evaluation - Data Used • Data set (abbreviation): size • San Joaquin roads (TG): 18263 • Greece roads (GR): 23268 • Long Beach roads (LB): 53145 • LA streets (LA): 131461 • San Francisco roads (SF): 174956 • Tiger streams (TS): 194971
Experimental Evaluation • The performance of the 3D R-tree is not necessarily better than the performance of the augmented 2D R-tree.
Experimental Evaluation • NN queries on the SF data set • a) Thresholding, t = 0.005 • b) Ranking, m = 10
Experimental Evaluation • RNN queries on the SF data set • a) Thresholding, t = 0.005 • b) Ranking, m = 10
Conclusion • The paper presents several ways of evaluating spatial queries on existentially uncertain data. • The methods provided apply only to objects whose existence probabilities are not correlated, so there is room for further research on correlated objects.
Questions?