QoS Supported Clustered Query Processing in Large Collaboration of Heterogeneous Sensor Networks. Debraj De and Lifeng Sang, Ohio State University. Workshop on Distributed Collaborative Sensors Networks DCSN’09 (part of CTS’09), May 18 – May 22, 2009.
Outline • Collaborative Heterogeneous Sensor Networks (CHSN) • Problem definition • Existing work • Contribution • Simulation results • Discussion
Collaborative Heterogeneous Sensor Networks (CHSN) System • CHSN: a collaboration of sensor networks deployed worldwide. • Networks in a CHSN can reach one another. • Examples: NASA’s Sensor Web, Sensor Web Enablement (SWE).
Problem Definition • User, Portal, Central Manager (CM), Sensor Network Fabrics. • How to process streams of different kinds of queries efficiently?
Implication Factor (IF) between two sensor networks • If SN1 is queried, does SN2 also need to be queried? • Indicates the correlation between the phenomena measured by the two sensor networks. • Examples of higher correlation and lower-valued Implication Factor (camera sensor): normal detection, less need to query; fire detected, more need to query the sensor for wind speed/direction.
Cost of Query Processing • Implication Factor and Cost of Query Processing
Existing Work • Pipelined processing of queries • Cost of Query Processing (CQP) • Implication Factor (IF) • IAP (Implication Aware Processing): a greedy algorithm that finds a sub-optimal order in which to query the networks.
Existing Work (Contd.) Disadvantages • Simplistic cost model • Large delay for pipelined query processing • No special support for query streams Improvements • Query-specific support: delay/energy/reliability • Notion of QoS • How to process streams of very different kinds of queries efficiently? • Efficiency – Concurrency – Fairness – Scalability
Contribution • QoS specific model for Cost of Query Processing • Clustered Query Processing algorithm • Handling stream of queries
A. Proposed QoS based Query Cost Model • Cost model of processing a single sensor network • Notion of Quality of Service (QoS) • Minimum energy • Maximum reliability • Minimum delay • Query is classified in CM for QoS support
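The classification step on this slide can be illustrated with a minimal sketch. The slides do not say how the CM classifies a query, so the keyword rules, the `QoS` enum, and the `classify_query` function below are all hypothetical names chosen only for illustration.

```python
from enum import Enum


class QoS(Enum):
    """The three QoS goals named on the slide."""
    MIN_ENERGY = "energy"
    MAX_RELIABILITY = "reliability"
    MIN_DELAY = "delay"


# Hypothetical keyword rules: an assumption, not the authors' classifier.
KEYWORDS = {
    QoS.MIN_DELAY: ("urgent", "alarm", "real-time"),
    QoS.MAX_RELIABILITY: ("verify", "confirm", "audit"),
}


def classify_query(text: str) -> QoS:
    """Pick the QoS goal (and hence the cost model) for one query."""
    lowered = text.lower()
    for qos, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return qos
    # Assumed default: queries with no special requirement minimize energy.
    return QoS.MIN_ENERGY
```

The returned `QoS` value would then select which of the three cost models is applied to the query.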
A. Proposed QoS based Query Cost Model (contd.) • Energy Consumption based cost model: • Reliability based cost model: • Delay based cost model:
B. Proposed Clustered Query Processing algorithm • Fully Pipelined query: large delay cost • Fully Parallel query: large energy cost • Compromise: something between fully pipelined and fully parallel • CHSN Graph G = (V, E, W, Y): directed graph G with edge weight and node weight • each node in V is a network fabric • each node has some weight • weight of each directed edge in E is (1-Implication Factor)
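The CHSN graph described above can be sketched as plain dictionaries. The function name `build_chsn_graph`, the dictionary layout, and the example network names are assumptions for illustration; only the rule "edge weight = 1 - Implication Factor" comes from the slide.

```python
def build_chsn_graph(node_weights, implication):
    """Build a directed CHSN graph.

    node_weights: dict mapping network fabric name -> node weight.
    implication:  dict mapping (src, dst) -> Implication Factor in [0, 1].
    Returns a dict with the node weights and the directed edge weights.
    """
    edges = {}
    for (src, dst), factor in implication.items():
        if src not in node_weights or dst not in node_weights:
            raise KeyError(f"unknown network in edge ({src}, {dst})")
        if not 0.0 <= factor <= 1.0:
            raise ValueError(f"Implication Factor out of range: {factor}")
        # Edge weight as defined on the slide: 1 - Implication Factor.
        edges[(src, dst)] = 1.0 - factor
    return {"nodes": dict(node_weights), "edges": edges}


# Example with two hypothetical network fabrics.
graph = build_chsn_graph(
    {"camera": 1.0, "wind": 2.0},
    {("camera", "wind"): 0.25},
)
```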
B. Proposed Clustered Query Processing algorithm (contd.) • CHSN Graph Partitioning problem: determine the best partition V = P1 ∪ P2 ∪ P3 ∪ ... ∪ Pn, such that: a. The sum of the weights of the edges that connect any two different partitions is minimized. b. For all 1 ≤ i ≤ n, |Pi| ≤ K for some fixed K. (|Pi| = sum of the weights of the vertices in partition Pi, K = a defined upper bound.)
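The two conditions of the partitioning problem can be checked for any candidate partition. This is a minimal sketch: the function names and the `assignment` representation (node name mapped to a partition label) are assumptions, not part of the original slides.

```python
def cut_weight(edges, assignment):
    """Condition (a): total weight of edges whose endpoints lie in different partitions."""
    return sum(
        weight
        for (src, dst), weight in edges.items()
        if assignment[src] != assignment[dst]
    )


def partition_sizes(node_weights, assignment):
    """|Pi| for every partition: the sum of the weights of its vertices."""
    sizes = {}
    for node, part in assignment.items():
        sizes[part] = sizes.get(part, 0.0) + node_weights[node]
    return sizes


def is_feasible(node_weights, assignment, max_size):
    """Condition (b): every partition satisfies |Pi| <= K (here max_size)."""
    sizes = partition_sizes(node_weights, assignment)
    return all(size <= max_size for size in sizes.values())
```

A partitioner then searches for the feasible assignment with the smallest `cut_weight`.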
B. Clustered Query Processing algorithm (contd.) • Solution: Constrained Graph Partitioning algorithm by Karypis and Kumar. • Advantage: a. efficient partitioning b. support for partition size constraint c. fast
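To make the size-constrained clustering idea concrete, here is a simple greedy stand-in. It is not the Karypis and Kumar multilevel algorithm named on the slide: it only merges clusters along the heaviest edges first while the size bound allows it. The function name and the data layout are assumptions.

```python
def greedy_partition(node_weights, edges, max_size):
    """Greedy size-constrained clustering (illustrative stand-in heuristic).

    Visits edges from heaviest to lightest and merges the two endpoint
    clusters whenever the merged weight stays within max_size, so heavy
    edges tend to end up inside a cluster rather than in the cut.
    Returns a dict mapping each node to a cluster representative.
    """
    parent = {node: node for node in node_weights}
    size = dict(node_weights)

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    by_weight = sorted(edges.items(), key=lambda item: item[1], reverse=True)
    for (src, dst), _weight in by_weight:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst and size[root_src] + size[root_dst] <= max_size:
            parent[root_dst] = root_src
            size[root_src] += size[root_dst]
    return {node: find(node) for node in node_weights}
```

A node whose own weight already exceeds `max_size` stays in a singleton cluster; this sketch does not try to repair that case.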
How the proposed system works • The query is forwarded to the CM. • The query is classified to choose a cost model. • Clustered query processing • Response aggregation
C. Handling stream of queries • Highly efficient for processing streams of queries in the real world. a. Clustering is kept in soft state: different concurrent clusterings for independent queries. b. Reduced cluster size: less latency for processing. c. To solve the fairness issue: (i) blocking (ii) placing popular networks in different clusters. • The proposed system is flexible enough to support varied real-world issues and requirements.
Simulation result • Total 50 sensor networks, 5 different queries. • Three sets of simulation study with different implication profiles: • Normal scenario (0 ≤ IF ≤ 1) • Networks with high correlation (0 ≤ IF ≤ 0.3) • Networks with low correlation (0.7 ≤ IF ≤ 1) • Comparison algorithms: • Parallel : completely parallel querying. • Random: random order of querying. • IAP: Greedy based algorithm. • Cluster: proposed algorithm – implication aware clustering and then IAP in each cluster.
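The three implication profiles can be reproduced with a small generator. The slides do not state how the IF values were drawn, so the uniform distribution, the fully connected network pairs, and the names `PROFILES` and `random_implication` are assumptions made for this sketch.

```python
import random

# IF ranges taken from the slide; the profile names are hypothetical labels.
PROFILES = {
    "normal": (0.0, 1.0),
    "high_correlation": (0.0, 0.3),
    "low_correlation": (0.7, 1.0),
}


def random_implication(num_networks, profile, seed=None):
    """Draw an Implication Factor for every ordered pair of distinct networks.

    Assumes uniform sampling within the profile's range and an edge
    between every pair of networks.
    """
    low, high = PROFILES[profile]
    rng = random.Random(seed)
    return {
        (src, dst): rng.uniform(low, high)
        for src in range(num_networks)
        for dst in range(num_networks)
        if src != dst
    }


# Example: one draw for the 50-network, high-correlation setting.
implication = random_implication(50, "high_correlation", seed=1)
```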
Simulation result (contd.) Energy Cost of total query • IAP is best, Parallel is worst. • Cluster is worse than IAP by some margin, but much better than Parallel. [Figure: energy cost plots for IF [0, 1], IF [0, 0.3], and IF [0.7, 1]]
Simulation result (contd.) Delay Cost of total query • Parallel is best, Random is worst. • Cluster is worse than Parallel by some margin, but much better than the others. • Overall, the performance of Cluster is a good compromise. [Figure: delay cost plots for IF [0, 1], IF [0, 0.3], and IF [0.7, 1]]
Discussion • A hybrid push-pull model would improve the performance of query processing in each individual network • Efficient query injection or query diffusion across networks to improve query processing performance • Addressing context awareness of query structure: local and global • Wide-area human-centric search using Clustered Query Processing (locality-based clustering)
Conclusion • A QoS supported Cost Model • Implication-Aware clustered query processing algorithm • Flexibility and support for stream of queries
Questions…… THANK YOU !