Republishers in a Publish/Subscribe Architecture for Data Streams
Alasdair J G Gray and Werner Nutt
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh
6th July 2005
Overview
• Motivation
• Publish/Subscribe Architecture
• Query planning
A.J.G. Gray and W. Nutt BNCOD22
Motivation
Scenario:
• Streams generated by distributed sensors
• Users are also distributed
• Use data integration to match users to streams
For example,
• Grid monitoring for logging and bookkeeping
• Sensor networks
[Figure labels: Bookkeeping; Job progress; Grid Monitoring data]
Data Streams as Relations
• Sensor readings can be viewed as
  • tuples
  • conforming to a relational schema
• Example: Network ThroughPut
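As a minimal sketch of a sensor reading viewed as a tuple over a relational schema: the attributes `from`, `tool` and `psize` appear in the queries on later slides, while the class name `ThroughputReading` and the `timestamp` and `throughput` attributes are assumptions added for illustration, since the schema itself did not survive in this transcript.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ThroughputReading:
    """One tuple of a hypothetical network-throughput relation."""
    from_site: str     # 'from' on the slides; renamed because 'from' is a Python keyword
    tool: str          # measurement tool, e.g. 'ping' or 'udp'
    psize: int         # presumably the packet size; the unit is not stated on the slides
    timestamp: float   # assumed attribute, not taken from the slides
    throughput: float  # assumed attribute, not taken from the slides


# A stream is then a sequence of such tuples arriving over time.
reading = ThroughputReading("hw", "ping", 1024, 1120000000.0, 8.5)
row = astuple(reading)  # the same reading as a plain relational tuple
```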
Publish/Subscribe Architecture
• Local as View Approach
• Consumers pose a query over the schema to request streams
• Producers describe their stream using a view on the schema
• Queries and views are selections over a single relation
[Figure labels: Consumers; Producers; Registry; Data Streams]
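The registration step can be sketched roughly as follows, assuming selections are restricted to conjunctions of `attribute = value` tests. The `Registry` class, its method names, the `satisfies` helper and the consumer name `C0` are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass, field


def satisfies(row, condition):
    """True if the row meets every attribute = value test in the condition."""
    return all(row.get(attr) == value for attr, value in condition.items())


@dataclass
class Registry:
    """Hypothetical registry: stores producer views and consumer queries."""
    views: dict = field(default_factory=dict)
    queries: dict = field(default_factory=dict)

    def register_producer(self, name, view):
        self.views[name] = view

    def register_consumer(self, name, query):
        self.queries[name] = query


registry = Registry()
registry.register_producer("S1", {"from": "hw", "tool": "udp"})
registry.register_consumer("C0", {"from": "hw"})  # hypothetical consumer

# A tuple that S1 may publish (it matches S1's view) and that C0 asked for.
row = {"from": "hw", "tool": "udp", "psize": 2048}
s1_may_publish = satisfies(row, registry.views["S1"])
c0_wants_row = satisfies(row, registry.queries["C0"])
```

In this sketch the registry only stores descriptions; it does not relay the tuples themselves.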
Query Planning: Consumer Query
C: from = 'hw' ∧ psize ≥ 1024
S1: from = 'hw' ∧ tool = 'udp'
S2: from = 'hw' ∧ tool = 'ping'
S3: from = 'ral' ∧ tool = 'ping'
S4: from = 'ral' ∧ tool = 'udp'
Problem: The approach does not scale to hundreds of producers and consumers.
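One plausible reading of this planning step is that a producer is relevant to a query when some tuple could satisfy both the producer's view and the query. The slides do not define relevance, so the sketch below is an assumption; the `("eq", v)` / `("ge", v)` encoding and the names `compatible` and `relevant` are illustrative only.

```python
def compatible(c1, c2):
    """Can a single value satisfy both constraints on the same attribute?"""
    (op1, v1), (op2, v2) = c1, c2
    if op1 == "eq" and op2 == "eq":
        return v1 == v2
    if op1 == "eq":   # c2 is a lower bound
        return v1 >= v2
    if op2 == "eq":   # c1 is a lower bound
        return v2 >= v1
    return True       # two lower bounds always overlap


def relevant(view, query):
    """A view is relevant if it does not contradict the query on any shared attribute."""
    return all(compatible(view[a], query[a]) for a in view.keys() & query.keys())


query_c = {"from": ("eq", "hw"), "psize": ("ge", 1024)}
producers = {
    "S1": {"from": ("eq", "hw"), "tool": ("eq", "udp")},
    "S2": {"from": ("eq", "hw"), "tool": ("eq", "ping")},
    "S3": {"from": ("eq", "ral"), "tool": ("eq", "ping")},
    "S4": {"from": ("eq", "ral"), "tool": ("eq", "udp")},
}

relevant_producers = [name for name, view in producers.items() if relevant(view, query_c)]
```

Under these assumptions C would be connected to S1 and S2, and would still have to filter on psize itself. Every consumer contacting every relevant producer directly is the pattern that does not scale.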
Republishers Provide Scalability
C: from = 'hw' ∧ psize ≥ 1024
R3: TRUE
R1: from = 'hw'
R2: from = 'ral'
S1: from = 'hw' ∧ tool = 'udp'
S2: from = 'hw' ∧ tool = 'ping'
S3: from = 'ral' ∧ tool = 'ping'
S4: from = 'ral' ∧ tool = 'udp'
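A rough sketch of why a republisher helps: if the consumer's condition logically implies the republisher's view, and the republisher is assumed to republish every tuple matching its view, the consumer can read from that single republisher and filter locally. The completeness assumption, the constraint encoding and the names `implies` and `covers` are illustrative and not taken from the slides.

```python
def implies(strong, weak):
    """Does satisfying `strong` guarantee satisfying `weak` (same attribute)?"""
    (op_s, v_s), (op_w, v_w) = strong, weak
    if op_w == "eq":
        return op_s == "eq" and v_s == v_w
    # weak is a lower bound: an equal-or-larger value or bound implies it
    return v_s >= v_w


def covers(view, query):
    """True if every tuple wanted by the query also satisfies the view."""
    return all(attr in query and implies(query[attr], view[attr]) for attr in view)


query_c = {"from": ("eq", "hw"), "psize": ("ge", 1024)}
republishers = {
    "R1": {"from": ("eq", "hw")},
    "R2": {"from": ("eq", "ral")},
    "R3": {},  # TRUE: no restriction at all
}

covering = [name for name, view in republishers.items() if covers(view, query_c)]
```

Here both R1 and R3 cover C. Which of the covering republishers a planner should prefer is not stated on this slide, so the sketch only lists the candidates.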
Plans Need to be Maintained
• Queries are long lived
• Set of publishers can change
• Query plans should reflect changes
• What happens when we
  • add a republisher?
  • remove a republisher?
Adding a Republisher: 1st Attempt
R3: TRUE
R1: from = 'hw'
R4: tool = 'ping'
R2: from = 'ral'
S1: from = 'hw' ∧ tool = 'udp'
S2: from = 'hw' ∧ tool = 'ping'
S3: from = 'ral' ∧ tool = 'ping'
S4: from = 'ral' ∧ tool = 'udp'
[Figure annotations: Replan other queries; Maximal relevant; Relevant publishers; Adding a new publisher; Replan R3]
Problem:
• Republishers disconnected from Producers
• Cycle in data flow
Desirable Properties for a Hierarchy
• Correctness: streams answer queries
• Cycle freeness: loops can lead to duplicates
• Uniqueness: hierarchy defined for a set of publishers
• Local planning: Publishers and Consumers only need to communicate with the Registry
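The cycle-freeness property can be checked on a data-flow graph with a standard depth-first search. This is a generic sketch, not the authors' planning algorithm; the `reads_from` encoding and both example graphs are hypothetical, since the actual connections are only visible in the slide figures.

```python
def has_cycle(reads_from):
    """Depth-first search for a cycle; reads_from maps a node to its data sources."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for source in reads_from.get(node, ()):
            state = colour.get(source, WHITE)
            if state == GREY:  # reached a node already on the current path
                return True
            if state == WHITE and visit(source):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in reads_from)


# Hypothetical hierarchy: each republisher reads only from nodes below it.
hierarchy = {
    "R1": ["S1", "S2"],
    "R2": ["S3", "S4"],
    "R3": ["R1", "R2"],
    "C": ["R1"],
}

# Hypothetical faulty plan: R3 and R4 each read from the other.
faulty = {"R3": ["R4"], "R4": ["R3"]}

hierarchy_is_cyclic = has_cycle(hierarchy)
faulty_is_cyclic = has_cycle(faulty)
```

In the faulty plan neither R3 nor R4 reads from a producer, which matches both problems named for the first attempt: republishers disconnected from producers, and a cycle in the data flow.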
Adding a Republisher: 2nd Attempt
C: from = 'hw' ∧ psize ≥ 1024
R3: TRUE
R1: from = 'hw'
R4: tool = 'ping'
R2: from = 'ral'
S1: from = 'hw' ∧ tool = 'udp'
S2: from = 'hw' ∧ tool = 'ping'
S3: from = 'ral' ∧ tool = 'ping'
S4: from = 'ral' ∧ tool = 'udp'
[Figure annotation: Relevant publishers]
Removing a Republisher
C: from = 'hw' ∧ psize ≥ 1024
R3: TRUE
R1: from = 'hw'
R4: tool = 'ping'
R2: from = 'ral'
S1: from = 'hw' ∧ tool = 'udp'
S2: from = 'hw' ∧ tool = 'ping'
S3: from = 'ral' ∧ tool = 'ping'
S4: from = 'ral' ∧ tool = 'udp'
Conclusions
• Republishers:
  • Allow system to scale
  • Complicate query answering problem
• Republishers require special planning
• We have developed algorithms that allow the system to adapt to changes in the set of publishers
• Full details available in HW Technical Report www.macs.hw.ac.uk:8080/techreps/view_record.jsp?id=0031