The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks
D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava
Cathy Wang, 05-04-2006
Outline
• Introduction
• Problem Definition
• The TJA Algorithm
• Conclusion
Introduction
• Works for distributed sensor networks
• Finds the k highest-ranked answers
• Minimizes the number of tuples that have to be transferred
• Resolves queries inside the network, which minimizes bandwidth consumption and delay
Problem Definition
• R: n attributes (sensors), each featuring m objects
• G(V, E): the network graph that interconnects the n vertices in V using the edge set E
[Figure: the network graph G over nodes v1..v5, and the local scores of five objects o1..o5 located at nodes v1..v5]
Problem Definition
• Q = (q1, q2, ..., qn): a top-k query with n attributes
• Score function (monotone): Score(oi) = Σj wj · sim(qj, oij), for j = 1..n
• wj: weight factor of attribute j
• sim(qj, oij): the percentage of similarity to the most similar object in dimension j
• Ex: o1:(s1=100F, s2=90F, s3=80F) and o2:(s1=100F, s2=70F, s3=80F), with wj=1. The top-1 object for the query Q=(max(temp), max(temp), max(temp)) is o1, because Score(o1)=3.0 (i.e. 1*1.0 + 1*1.0 + 1*1.0) and Score(o2)=2.77 (i.e. 1*1.0 + 1*0.77 + 1*1.0)
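The scoring example can be checked with a few lines of Python. This is only a sketch: the helper names `similarity` and `score` are made up for illustration, and normalising each reading by the best reading in its dimension is an assumption about how sim is computed for a max(temp) query. It reproduces the slide's numbers, with 70/90 = 0.777... shown on the slide as 0.77.

```python
def similarity(value, best):
    """Reading expressed as a fraction of the best reading in the same dimension."""
    return value / best


def score(obj, best_per_dim, weights):
    """Weighted sum of per-dimension similarities (a monotone score function)."""
    return sum(w * similarity(v, b) for v, b, w in zip(obj, best_per_dim, weights))


# The two objects from the slide: three temperature readings each, in Fahrenheit.
objects = {"o1": (100, 90, 80), "o2": (100, 70, 80)}
weights = (1, 1, 1)

# For Q = (max(temp), max(temp), max(temp)) the best value per dimension is the maximum.
best_per_dim = tuple(max(column) for column in zip(*objects.values()))

scores = {name: score(obj, best_per_dim, weights) for name, obj in objects.items()}
top1 = max(scores, key=scores.get)

print(scores)
print("top-1:", top1)
```

Because the score is a weighted sum with non-negative weights, raising any single similarity can never lower the total, which is the monotonicity the algorithm relies on.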
The TJA Algorithm
• Three phases:
  • Lower Bound phase: construct a threshold
  • Hierarchical Joining phase: each node eliminates the objects below the lower bound and joins the qualifying objects received from its children
  • Clean-Up phase: the actual top-k results are identified
Lower Bound Phase
• Identify a set of objects that are used to construct a threshold
• list(vi): the elements of node vi, ordered by descending similarity
• listk(vi): the k highest-ranked local objects of list(vi)
• L(vi): partial lower bound, the union of listk(vi) with the partial lower bounds L(vj) received from the children vj of vi
• Complete lower bound: LqueryNode = Ltotal = {l1, l2, ..., lo}, o ≥ k
• Ex: Find the time moment with the highest average temperature; here Ltotal = {1,3}
[Figure: nodes v1..v5 with empty and occupied oij cells, and the unions of object IDs computed along the tree]
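A minimal sketch of the Lower Bound phase, assuming the partial lower bound of a node is the union of its own top-k object IDs with the bounds of its children. The three-node tree, the integer scores and the function name `lower_bound` are hypothetical and are not the data from the slide's figure; recursion stands in for the messages a real network would send.

```python
def lower_bound(node, children, local_scores, k):
    """Return L(node): the node's k best object IDs united with its children's bounds."""
    scores = local_scores[node]
    ranked = sorted(scores, key=scores.get, reverse=True)  # list(vi)
    bound = set(ranked[:k])                                 # listk(vi)
    for child in children.get(node, []):
        bound |= lower_bound(child, children, local_scores, k)
    return bound


# Hypothetical routing tree: v1 is the query node, v2 and v3 are its children.
children = {"v1": ["v2", "v3"]}

# Hypothetical local similarity scores (0..100) per node and object.
local_scores = {
    "v1": {"o1": 90, "o2": 40, "o3": 80, "o4": 30},
    "v2": {"o1": 70, "o2": 20, "o3": 95, "o4": 60},
    "v3": {"o1": 85, "o2": 10, "o3": 50, "o4": 75},
}

l_total = lower_bound("v1", children, local_scores, k=1)
print(sorted(l_total))  # ['o1', 'o3']
```

Only object IDs travel up the tree in this phase, so the cost is at most k IDs per node plus whatever the children contributed.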
Hierarchical Joining Phase
• Propagate Ltotal to all nodes in the network
• Each node vi searches list(vi) and identifies the lowest-ranked object (idx) that belongs to Ltotal
• The objects above idx are the candidates, listidx(vi)
• A leaf node vi forwards listidx(vi) to its parent
• Any other node receives listidx(vj) from its children vj and joins them with its own listidx(vi) to get a partial result
• Superset of the final top-k result: RqueryNode = Rtotal = {r1, r2, ..., ro}, o ≥ k
• Ex: Rtotal = {(O1, 3.63), (O3, 4.05), (O'4, 3.54)}, i.e. objects {1,3,4}
[Figure: nodes v1..v5 with empty and occupied oij cells, and the joins computed along the tree]
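The joining step can be sketched as follows, under two assumptions that the slide does not spell out: the candidate list includes the object at position idx itself, and the join adds up the partial scores of matching object IDs while counting how many nodes contributed. The tree, the scores and the names `candidates` and `hierarchical_join` are hypothetical.

```python
def candidates(scores, l_total):
    """listidx(vi): every local object ranked at or above the lowest-ranked member of Ltotal."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    positions = [i for i, (obj, _) in enumerate(ranked) if obj in l_total]
    idx = max(positions, default=-1)  # -1 means the node holds no object of Ltotal
    return dict(ranked[: idx + 1])


def hierarchical_join(node, children, local_scores, l_total):
    """Return {object: (partial score, number of contributing nodes)} for the subtree of node."""
    partial = {obj: (s, 1) for obj, s in candidates(local_scores[node], l_total).items()}
    for child in children.get(node, []):
        child_result = hierarchical_join(child, children, local_scores, l_total)
        for obj, (s, count) in child_result.items():
            prev_s, prev_count = partial.get(obj, (0, 0))
            partial[obj] = (prev_s + s, prev_count + count)
    return partial


# Hypothetical routing tree and local scores; Ltotal is what a top-1 Lower Bound phase yields here.
children = {"v1": ["v2", "v3"]}
local_scores = {
    "v1": {"o1": 90, "o2": 40, "o3": 80, "o4": 30},
    "v2": {"o1": 70, "o2": 20, "o3": 95, "o4": 60},
    "v3": {"o1": 85, "o2": 10, "o3": 50, "o4": 75},
}
l_total = {"o1", "o3"}

r_total = hierarchical_join("v1", children, local_scores, l_total)
print(r_total)
```

In this run o1 and o3 are scored by all three nodes, while o4 is reported only by v3, so its score is incomplete, much like O'4 in the slide's example.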
Clean-Up Phase
• If incompletely scored objects have an upper bound higher than the k-th complete result, compute the exact scores of these objects (the set R'):
  • The query node requests the exact scores from its children
  • objectR'(vi): each node vi fetches all of its objects that are in R'
  • Each node joins the lists from its children to get the full score of each object in R'; the result is Ctotal
  • The query node gets Ctotal and computes the final top-k answers
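A rough sketch of the clean-up decision follows. The slide does not say how the upper bound is derived, so this example uses a deliberately loose one: every missing local score is assumed to be at most `max_local`. That bound, the function name `clean_up`, and the data are assumptions made for illustration, and the dictionary lookup stands in for the extra round trip through the network.

```python
def clean_up(r_total, local_scores, k, max_local=100):
    """Return (R', final top-k) from partial results {object: (score, contributing nodes)}."""
    n_nodes = len(local_scores)
    complete = {o: s for o, (s, c) in r_total.items() if c == n_nodes}
    incomplete = {o: (s, c) for o, (s, c) in r_total.items() if c < n_nodes}

    # Score of the k-th best completely scored object (assumes at least k exist).
    kth = sorted(complete.values(), reverse=True)[k - 1]

    # R': incomplete objects whose loose upper bound could still beat the k-th result.
    r_prime = {o for o, (s, c) in incomplete.items()
               if s + (n_nodes - c) * max_local > kth}

    # Ctotal: exact scores for R', obtained by joining the local scores of all nodes.
    c_total = {o: sum(scores[o] for scores in local_scores.values()) for o in r_prime}

    final = {**complete, **c_total}
    top_k = sorted(final.items(), key=lambda item: item[1], reverse=True)[:k]
    return r_prime, top_k


# Hypothetical local scores and the partial result a joining phase would produce for them.
local_scores = {
    "v1": {"o1": 90, "o2": 40, "o3": 80, "o4": 30},
    "v2": {"o1": 70, "o2": 20, "o3": 95, "o4": 60},
    "v3": {"o1": 85, "o2": 10, "o3": 50, "o4": 75},
}
r_total = {"o1": (245, 3), "o3": (225, 3), "o4": (75, 1)}

r_prime, top_k = clean_up(r_total, local_scores, k=1)
print(r_prime, top_k)
```

A tighter bound would shrink R' and could skip the extra round entirely; the loose bound here only keeps the example short.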
Conclusion
This paper
• studies the problem of finding the k highest-ranked answers to a user query in a sensor network environment
• uses a fixed number of phases
• uses in-network aggregation to minimize network utilization
Thank You! Have a great break!