Count / Top-k Continuous Queries on P2P Networks. 01/11/2006.
Outline • Problem Definition • P2P Architecture • Count • Top-K • Experiment Setup • Future Work
Streaming Data in P2P • P2P • Dynamically changing topology, large scale, … • Streaming data • Continuous, unbounded, rapid, time-varying, noisy • P2P + Streaming data • Dynamic in both data and topology
Objective and Goal • Objective • Issue a continuous query to estimate count and top-K • Goal • Reduce the communication cost • Lightweight maintenance • Approximate answers • An adaptive and progressive approach
Naïve approach • Flood the overlay continuously • Pros • Closer to the exact answer • Cons • Network congestion • Still not real-time
The State-of-the-Art • Count • Focuses on one-time answers in P2P • Or deals with streaming data only • Top-K • P2P environments without streaming data • Distributed environments that are not P2P
P2P architecture • Assumption • Hierarchical P2P (the focus here) • Super-peer hierarchical structure • The query issuer is a super-peer • Super-peers connect with other super-peers • Each peer belongs to only one super-peer • Pure unstructured P2P
Big picture • Group • Accumulate information within a group based on the constraint and statistics • Report changes • Set constraint • Approximate answer
Group in hierarchical P2P (figure sequence: issuer, coordinator, and peer nodes, with labels 1 to 4)
After partition • Assume we have N objects and K groups after partition (figure: Group1, Group2, Group3)
User-specified epsilon • User-specified ε (precision) (figure: Group1, Group2, Group3)
Consider a group (figure: a coordinator and nodes P1 to P4 holding objects O1, O2, O3)
Each node maintains the distribution information of the objects it owns (figure: count (#) per object at each node, with rates R1 to R4 for nodes P1 to P4)
Initially: polling (figure: the coordinator polls nodes P1 to P4)
Information at the coordinator after polling (figure: count (#) per object over P1 to P4, with values 26, 33, 22)
Statistics information • Figure labels: estimated value, change value for each object, latest real value • R: maximum changing rate (+/-) of objects in each peer • T: updated time stamp

| | P1 | P2 | P3 | P4 | Δ |
|---|---|---|---|---|---|
| O1 | 1/1 | 6/6 | 10/10 | 5/5 | 22 |
| O2 | 11/11 | 13/13 | 5/5 | 9/7 | 36 |
| O3 | 15/15 | 6/6 | 3/3 | 9/9 | 33 |
| R | 0.3 | 0.2 | -0.05 | 0.6 | |
| T | 15 | 15 | 17 | 13 | |
Update to coordinator • Peers report (Δ11, Δ21, Δ31), (Δ12, Δ22, Δ32), (Δ13, Δ23, Δ33) (figure label: T2)
Redistribute epsilon • $w_i = \max(\Delta_i) / C_{x,0}$, where $x$ is the index at which $\Delta_i$ attains its maximum • $\delta_i = w_i \, \varepsilon \, C_{x,0} / \sum_j w_j$
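The redistribution rule above can be sketched in Python. This is only an illustration under assumptions the slides do not state: `deltas[i]` is taken to be the list of per-object changes reported by peer i, `initial_counts[x]` stands in for $C_{x,0}$ (assumed positive), the maximum is taken over absolute changes, and the even split used when nothing has changed is a made-up fallback. The function and parameter names are hypothetical.

```python
def redistribute_epsilon(deltas, initial_counts, epsilon):
    """Return one error budget (delta_i) per peer.

    deltas[i][x]      -- change reported by peer i for object x (assumed layout)
    initial_counts[x] -- stand-in for C_{x,0}; assumed to be positive
    epsilon           -- user-specified precision
    """
    weights, bases = [], []
    for peer_deltas in deltas:
        # x: index of the object with the largest absolute change on this peer
        x = max(range(len(peer_deltas)), key=lambda j: abs(peer_deltas[j]))
        base = initial_counts[x]
        weights.append(abs(peer_deltas[x]) / base)  # w_i = max(delta_i) / C_{x,0}
        bases.append(base)

    total = sum(weights)
    if total == 0:
        # Assumption (not in the slides): no change anywhere, so split evenly.
        return [epsilon * b / len(bases) for b in bases]

    # delta_i = w_i * epsilon * C_{x,0} / sum(w)
    return [w * epsilon * b / total for w, b in zip(weights, bases)]


budgets = redistribute_epsilon([[1, 0], [0, 3]], [10, 30], 0.1)
```

Peers whose objects change faster relative to their initial count receive a larger share of the budget, which matches the intent of weighting by $w_i$.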
Visiting sequence • Pick the peers that would violate δ (figure: nodes P1 to P4)
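Update information (group)

| | P1 | P2 | P3 | P4 | Δ |
|---|---|---|---|---|---|
| O1 | 1/1 | 6/6 | 10/10 | 8/8 | - |
| O2 | 11/11 | 11/11 | 5/5 | 6/6 | - |
| O3 | 15/15 | 5/5 | 3/3 | 11/11 | - |
| R | 0.3 | 0.4 | -0.05 | 0.2 | |
| T | 15 | 30 | 17 | 33 | |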
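For the nodes that were not visited (group)

| | P1 | P2 | P3 | P4 | Δ |
|---|---|---|---|---|---|
| O1 | 1/2 | 6/6 | 10/9 | 8/8 | 25 |
| O2 | 11/13 | 11/11 | 5/4 | 6/6 | 34 |
| O3 | 15/18 | 5/5 | 3/2 | 11/11 | 36 |
| R | 0.3 | 0.4 | -0.05 | 0.2 | |
| T | 15 | 30 | 17 | 33 | |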
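One way a coordinator might choose which peers to visit, and estimate values for the rest, is sketched below. The slides do not give the estimation formula, so the linear extrapolation (last value plus rate times elapsed time) is an assumption and will not reproduce the table values exactly. `PeerStats`, `estimate`, `peers_to_visit`, and `group_counts` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    last_values: dict  # object id -> latest real value reported by the peer
    rate: float        # signed maximum changing rate per time unit (R)
    timestamp: int     # time of the last update (T)


def estimate(stats, now):
    """Extrapolate each object's value linearly (assumed model)."""
    elapsed = now - stats.timestamp
    return {obj: v + stats.rate * elapsed for obj, v in stats.last_values.items()}


def peers_to_visit(group, budgets, now):
    """Ids of peers whose possible drift since the last update exceeds delta."""
    return [
        pid
        for pid, stats in group.items()
        if abs(stats.rate) * (now - stats.timestamp) > budgets[pid]
    ]


def group_counts(group, now):
    """Sum the per-peer estimates into one count per object."""
    totals = {}
    for stats in group.values():
        for obj, value in estimate(stats, now).items():
            totals[obj] = totals.get(obj, 0) + value
    return totals


group = {
    "P1": PeerStats({"O1": 1}, 0.5, 10),
    "P3": PeerStats({"O1": 10}, -0.25, 10),
}
to_visit = peers_to_visit(group, {"P1": 1.5, "P3": 1.5}, now=14)
```

Only the peers returned by `peers_to_visit` need to be contacted; the others contribute their extrapolated values to the group count.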
Un-notified leave • Ping: P1 is dead • Remove P1's information (figure: nodes P1 to P4)
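The handling of an un-notified leave can be sketched as follows. This is a minimal illustration, assuming the coordinator keeps per-peer information in a dictionary and has some `ping` callable that returns `False` for an unreachable peer; both are hypothetical, and a real check would need timeouts and retries.

```python
def remove_dead_peers(group, ping):
    """Drop the stored information of every peer that does not answer a ping.

    group -- dict mapping peer id to the coordinator's stored information
    ping  -- hypothetical callable: ping(peer_id) -> True if the peer is alive
    Returns the ids that were removed.
    """
    dead = [pid for pid in group if not ping(pid)]
    for pid in dead:
        del group[pid]
    return dead


group = {"P1": {"O1": 1}, "P2": {"O1": 6}, "P3": {"O1": 10}, "P4": {"O1": 8}}
removed = remove_dead_peers(group, ping=lambda pid: pid != "P1")
```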
Experiment Setup • Generate synthetic data sets from statistical distributions for • Streaming data • Lifetime of peers • Metrics • Message size • Communication cost • Response latency • Result accuracy
Top-K • Use regression to predict the likely trend of changes • When an updated result is required, the super-peer only needs to ask the doubtful peers about the doubtful objects • It then updates its counting list and returns the top-k objects
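A rough sketch of the regression-based top-k idea follows. The slides do not specify the regression model or how "doubtful" is defined, so this assumes an ordinary least-squares line per object and treats an object as doubtful when its predicted count lies within a chosen `margin` of the k-th predicted score. All names here are hypothetical.

```python
import heapq


def fit_line(samples):
    """Ordinary least squares on (time, count) pairs; returns (slope, intercept)."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return 0.0, mean_c  # a single time point: assume a flat trend
    slope = sum((t - mean_t) * (c - mean_c) for t, c in samples) / var_t
    return slope, mean_c - slope * mean_t


def predict_top_k(history, now, k, margin):
    """Return (top_k_objects, doubtful_objects) from predicted counts at `now`."""
    predicted = {}
    for obj, samples in history.items():
        slope, intercept = fit_line(samples)
        predicted[obj] = slope * now + intercept

    top = heapq.nlargest(k, predicted, key=predicted.get)
    kth_score = predicted[top[-1]]
    # Assumed definition: too close to the k-th score to rank with confidence.
    doubtful = sorted(
        obj
        for obj, score in predicted.items()
        if obj != top[-1] and abs(score - kth_score) <= margin
    )
    return top, doubtful


history = {
    "O1": [(0, 10), (1, 12), (2, 14)],
    "O2": [(0, 20), (1, 19), (2, 18)],
    "O3": [(0, 5), (1, 5), (2, 5)],
}
top, doubtful = predict_top_k(history, now=4, k=1, margin=2)
```

In this example O1 is predicted to overtake O2, but O2 is close enough to be flagged, so only O2 would need to be re-queried before the answer is returned.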
Future Work • Connect and recommend latent good friends for each user • Good friends: the ones with the same interests (behaviors) • Exploit currently connected peers to discover good friends incrementally • Design a system that forms clusters reflecting the current interests of individual peers and connects them based on their similarity, using the user's social network
Advantages • Reduce search time and query traffic by using the friends list • By exploiting the varying strength of ties (arcs/edges), i.e. the degree of friendship, social networks outperform random-walk networks in quickly finding target objects
Example (figure: Level 1 and Level 2)
Example (figure: one item has larger weight than another; labels: Similarity, Score(Ni))