Design Considerations for High Fan-in Systems: The HiFi Approach
Michael J. Franklin, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu, Owen Cooper, Anil Edakkunni, and Wei Hong
UC Berkeley, Intel Research Berkeley
Presented by Shawn Jeffery, CIDR '05, 1/7/05
Itinerary
• Introduction: High Fan-in Systems
• HiFi Overview
• Initial Prototype
• Ongoing Work and Future Directions
• Conclusions
Shawn Jeffery, HiFi Project, UCB EECS
Introduction
• Receptors everywhere!
  • Wireless sensor networks, RFID technologies, digital home, network monitors, ...
• Somehow need to make sense of this data to provide near real-time decision support
High Fan-in Systems: The “Bowtie”
• Challenges in 3 dimensions:
  • Geography
  • Time
  • Resources
• Large numbers of receptors = large data volumes
• Hierarchical, successive aggregation
Supply-Chain Management (SCM)
Levels of the hierarchy, as listed on the slide:
• Headquarters
• Regional Centers
• Warehouses, Stores
• Dock doors, Shelves
• Receptors
State of the Art
• Not seen as a data management issue
  • Focus on protocol design
  • Different “data models” at each level
  • Reinventing “query languages” at each level
• Piecemeal/stovepipe approach
  • Each type of receptor (RFID, sensors, etc.) handled separately
• Current solutions tend to be hand-coded, script-based approaches
No end-to-end, integrated solution for managing distributed receptor data
HiFi: Cascading Stream Processing in a High Fan-in System
• A data management infrastructure for high fan-in environments
• Uniform Declarative Framework
  • Every node is a data stream processor that speaks SQL-ese: stream-oriented queries at all levels
• Hierarchical, stream-based views as an organizing principle
Hierarchical Query Processing
  SELECT S.area, AVG(S.temp)
  FROM SENSOR_STREAM S [range by '5 sec' slide by '5 sec']
  GROUP BY S.area
• Continuous and Streaming
  • Windows
  • Sharing
• Hierarchical
  • Temporal granularity vs. geographic scope
Example nodes, widest geographic scope first:
• “I provide national monthly values for the US”
• “I provide avg weekly values for California”
• “I provide avg daily values for Berkeley”
• “I provide raw readings for Soda Hall”
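A minimal Python sketch of the windowed, hierarchical aggregation on this slide. It is not TelegraphCQ or HiFi code: the function names are hypothetical, and it assumes tumbling windows that are half-open intervals aligned at time 0. Each node carries AVG as a (sum, count) pair so that a parent node can combine its children's results without averaging averages.

```python
from collections import defaultdict

def tumbling_partials(readings, width):
    """Group (timestamp, area, temp) readings into tumbling windows.

    Returns {(window_start, area): (sum, count)}, where each window
    covers [window_start, window_start + width).
    """
    partials = defaultdict(lambda: (0.0, 0))
    for ts, area, temp in readings:
        window_start = int(ts // width) * width
        total, count = partials[(window_start, area)]
        partials[(window_start, area)] = (total + temp, count + 1)
    return dict(partials)

def merge_partials(children):
    """Combine the partial aggregates reported by several child nodes."""
    merged = defaultdict(lambda: (0.0, 0))
    for child in children:
        for key, (total, count) in child.items():
            m_total, m_count = merged[key]
            merged[key] = (m_total + total, m_count + count)
    return dict(merged)

def finalize(partials):
    """Turn (sum, count) pairs into averages."""
    return {key: total / count for key, (total, count) in partials.items()}

# Two hypothetical leaf nodes reporting for the same area and window.
node_a = tumbling_partials([(0.5, "soda", 20.0), (1.0, "soda", 22.0)], width=5)
node_b = tumbling_partials([(2.0, "soda", 30.0)], width=5)
averages = finalize(merge_partials([node_a, node_b]))
```

Carrying (sum, count) upward is one common way to keep averages correct under successive aggregation; the slide itself does not say how HiFi represents partial results.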
Basic HiFi Architecture
• Hierarchical federation of nodes
• Each node:
  • Data Stream Query Processor (DSQP)
  • HiFi Glue
• Views drive system functionality
• Metadata Repository (MDR)
HiFi Glue:
• DSQP Management
• Query Planning
• Archiving
• Internode coordination and communication
[Figure: several nodes, each a DSQP with HiFi Glue, and an MDR]
HiFi Design Considerations
In the paper…
• Dealing with Real-World Data
• Hierarchical Windowed Views with Sharing
• System Management
• Topological Fluidity
• Query Planning and Data Placement
• Complex Event Processing
• Archiving and Prioritization
• Privacy and Access Control
Envisioning HiFi
Building HiFi
A Tale of Two Systems
• TelegraphCQ
  • Data stream processor
  • Continuous, adaptive query processing with aggressive sharing
• TinyDB
  • Declarative query processing for wireless sensor networks
  • In-network aggregation
Initial Prototype
[Figure: prototype deployment. Labels shown: PC, TelegraphCQ, Stargates, TinyDB, Sensor Networks & RFID Readers, RFID Wrappers]
Initial Prototype
Demoed at VLDB '04
HiFi Design Considerations
• Dealing with Real-World Data
• Hierarchical Windowed Views with Sharing
• System Management
• Topological Fluidity
• Query Planning and Data Placement
• Complex Event Processing
• Archiving and Prioritization
• Privacy and Access Control
CSAVA: Processing RFID Data in HiFi
• RFID data is gross!
  • Lost readings
  • Errant readings
  • Duplicate readings
• Use queries to make the data usable
• CSAVA: Clean, Smooth, Arbitrate, Validate, Analyze
CSAVA: Processing RFID Data in HiFi
Clean
  CREATE VIEW cleaned_rfid_stream AS
  (SELECT receptor_id, tag_id
   FROM rfid_stream rs
   WHERE read_strength >= strength_T)
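The Clean step is a per-tuple filter. A minimal Python sketch of the same idea follows; the dictionary layout of a reading, the function name `clean`, and the sample threshold are assumptions made for illustration, not part of HiFi.

```python
def clean(rfid_stream, strength_t):
    """Yield (receptor_id, tag_id) for readings at or above the threshold."""
    for reading in rfid_stream:
        if reading["read_strength"] >= strength_t:
            yield reading["receptor_id"], reading["tag_id"]

# Hypothetical readings: the second one is too weak and is dropped.
readings = [
    {"receptor_id": "r1", "tag_id": "t1", "read_strength": 0.9},
    {"receptor_id": "r1", "tag_id": "t2", "read_strength": 0.2},
]
cleaned = list(clean(readings, strength_t=0.5))
```

Because the filter looks at one tuple at a time, it needs no window and could run directly at the receptor.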
CSAVA: Processing RFID Data in HiFi
Smooth
  CREATE VIEW smoothed_rfid_stream AS
  (SELECT receptor_id, tag_id
   FROM cleaned_rfid_stream [range by '5 sec', slide by '5 sec']
   GROUP BY receptor_id, tag_id
   HAVING count(*) >= count_T)
Previous stages: Clean
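The Smooth step counts readings per (receptor, tag) pair inside one window and keeps the pairs seen often enough. A small Python sketch, assuming the caller has already cut the cleaned stream into 5-second windows; `smooth` and the sample data are hypothetical.

```python
from collections import Counter

def smooth(window, count_t):
    """Return the (receptor_id, tag_id) pairs seen at least count_t times.

    `window` holds the cleaned readings that fall into one window.
    The result is sorted only to make the output deterministic.
    """
    counts = Counter(window)
    return sorted(pair for pair, n in counts.items() if n >= count_t)

# t1 is read twice by r1, t2 only once, so t2 is treated as noise.
window = [("r1", "t1"), ("r1", "t1"), ("r1", "t2")]
smoothed = smooth(window, count_t=2)
```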
CSAVA: Processing RFID Data in HiFi
Arbitrate
  CREATE VIEW arbitrated_rfid_stream AS
  (SELECT receptor_id, tag_id
   FROM smoothed_rfid_stream rs [range by '5 sec', slide by '5 sec']
   GROUP BY receptor_id, tag_id
   HAVING count(*) >= ALL
     (SELECT count(*)
      FROM smoothed_rfid_stream [range by '5 sec', slide by '5 sec']
      WHERE tag_id = rs.tag_id
      GROUP BY receptor_id))
Previous stages: Smooth, Clean
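The Arbitrate step resolves a tag reported by several receptors by assigning it to the receptor that reported it most often in the window. The Python sketch below follows the `>= ALL` comparison in the query, so tied receptors are all kept; the function name and sample data are hypothetical.

```python
from collections import Counter, defaultdict

def arbitrate(window):
    """Keep each tag only at the receptor(s) that read it most often.

    `window` is an iterable of (receptor_id, tag_id) readings from one window.
    """
    counts = Counter(window)
    best = defaultdict(int)  # highest per-receptor count seen for each tag
    for (_, tag_id), n in counts.items():
        best[tag_id] = max(best[tag_id], n)
    return sorted(pair for pair, n in counts.items() if n == best[pair[1]])

# r1 reads t1 three times, r2 reads it once: t1 is assigned to r1.
window = [("r1", "t1")] * 3 + [("r2", "t1"), ("r2", "t2")]
arbitrated = arbitrate(window)
```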
CSAVA: Processing RFID Data in HiFi
Validate
  CREATE VIEW validated_tags AS
  (SELECT tag_name
   FROM arbitrated_rfid_stream rs [range by '5 sec', slide by '5 sec'],
        known_tag_list tl
   WHERE tl.tag_id = rs.tag_id)
Previous stages: Arbitrate, Smooth, Clean
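The Validate step joins the arbitrated stream with a static table of known tags. A minimal Python sketch, assuming `known_tag_list` is available as a dictionary from tag_id to tag_name; the names and sample data are hypothetical.

```python
def validate(window, known_tag_list):
    """Join (receptor_id, tag_id) readings with the known-tag table.

    Readings whose tag_id is not in the table are dropped, as in an inner join.
    """
    return [
        known_tag_list[tag_id]
        for _, tag_id in window
        if tag_id in known_tag_list
    ]

known_tag_list = {"t1": "milk", "t2": "eggs"}
window = [("r1", "t1"), ("r2", "t9"), ("r2", "t2")]
validated = validate(window, known_tag_list)
```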
CSAVA: Processing RFID Data in HiFi
Analyze
  CREATE VIEW tag_count AS
  (SELECT tag_name, count(*)
   FROM validated_tags vt [range by '5 min', slide by '1 min']
   GROUP BY tag_name)
Previous stages: Validate, Arbitrate, Smooth, Clean
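Unlike the earlier stages, the Analyze query uses a sliding window: a 5-minute range that advances every minute, so each tag can contribute to several consecutive results. The Python sketch below assumes timestamps in seconds, window ends at multiples of the slide, and half-open windows; these conventions and the function name are assumptions, not TelegraphCQ's documented semantics.

```python
import math
from collections import Counter

def sliding_tag_counts(validated, range_s=300, slide_s=60):
    """Count tags per sliding window.

    `validated` is a list of (timestamp_seconds, tag_name).
    Returns {window_end: Counter}, each window covering
    [window_end - range_s, window_end).
    """
    if not validated:
        return {}
    latest = max(ts for ts, _ in validated)
    # First window boundary strictly after the latest reading.
    final_end = (math.floor(latest / slide_s) + 1) * slide_s
    results = {}
    for window_end in range(slide_s, final_end + slide_s, slide_s):
        window_start = window_end - range_s
        results[window_end] = Counter(
            name for ts, name in validated if window_start <= ts < window_end
        )
    return results

validated = [(10, "milk"), (70, "milk"), (70, "eggs"), (400, "milk")]
counts = sliding_tag_counts(validated)
```

The sketch rescans all readings for every window, which keeps it short; a real stream processor would maintain the counts incrementally.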
CSAVA: Processing RFID Data in HiFi
[Figure: the CSAVA stack (Clean, Smooth, Arbitrate, Validate, Analyze) annotated with the operations Aggregate, Convert, and Augment]
CSAVA: Bridging the Physical-Virtual Divide
• An example of HiFi processing, but instrumental in dealing with real-world data
CSAVA       Generalization
Arbitrate   Multiple Receptors
Smooth      Window
Clean       Single Tuple
Complexity of Hierarchical Windowed Query Processing
• Naïve dissemination (unchanged query) introduces a lag in query results
Additive Lag in Hierarchical Windowed Query Processing
  SELECT S.area, AVG(temp)
  FROM SENSOR_STREAM S [range by '5 sec' slide by '5 sec']
  GROUP BY S.area
[Figure: timeline from an Event through a Window at Level 0, Level 1, and Level 2 up to the User; each level passes up its Result Tuple(s). Additive lag!]
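The lag can be worked through with a small timing model. The Python sketch below assumes every level runs the same tumbling-window query, windows are aligned at time 0, a result is emitted exactly when its window closes, and transmission and processing take no time. The function names are hypothetical.

```python
import math

def window_close(t, width):
    """Closing time of the tumbling window [k*width, (k+1)*width) containing t."""
    return (math.floor(t / width) + 1) * width

def result_time(event_time, widths):
    """Time at which an event's effect reaches the user.

    `widths` lists the window width at each level, lowest level first.
    A result emitted at a boundary falls into the next level's following
    window, so each level adds up to one full window of delay.
    """
    t = event_time
    for width in widths:
        t = window_close(t, width)
    return t

# An event at t = 1 s passing through three levels of 5-second windows.
arrival = result_time(1.0, [5, 5, 5])
lag = arrival - 1.0
```

Under these assumptions the event at 1 s surfaces at 15 s, so a query that nominally has a 5-second window answers with up to three windows of delay.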
Sketch of a Solution
• Solution is to use both time-based windows and NOW windows
  SELECT S.area, AVG(temp)
  FROM SENSOR_STREAM S [range by '5 seconds' slide by '5 seconds']
  GROUP BY S.area
[Figure: timeline from an Event to the User; Level 0 uses a time-based window, Level 1 and Level 2 use NOW windows, and Result Tuple(s) pass up at each level]
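A hedged Python sketch of why mixing window types helps. It models a NOW window as forwarding a result in the same instant it arrives, and a time-based window as holding it until the window closes; the level descriptions, the function name, and the zero-delay assumption are illustrative, not HiFi's implementation.

```python
import math

def delivery_time(event_time, levels):
    """Time at which an event's effect reaches the user.

    `levels` is a list of (kind, width) pairs, lowest level first, where
    kind is "range" (tumbling time-based window) or "now" (NOW window).
    """
    t = event_time
    for kind, width in levels:
        if kind == "range":
            # Wait for the tumbling window containing t to close.
            t = (math.floor(t / width) + 1) * width
        elif kind == "now":
            # NOW window: the tuple is processed at the instant it arrives.
            pass
        else:
            raise ValueError(f"unknown window kind: {kind}")
    return t

naive = [("range", 5), ("range", 5), ("range", 5)]
cascaded = [("range", 5), ("now", None), ("now", None)]
naive_arrival = delivery_time(1.0, naive)
cascaded_arrival = delivery_time(1.0, cascaded)
```

In this model only the lowest level waits for a window to close, so the delay no longer grows with the depth of the hierarchy.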
System Management
• Our small deployment:
  • 20+ individual devices (4 types of devices)
  • 5 different platforms (OS + Hardware)
  • Management nightmare
• System-wide management is crucial
  • Both coarse and fine-grained
• Where we’re headed:
  • System monitoring needed: turn the lens inwards to introspect on system state
  • Use uniform declarative framework to provide failover and load balancing
Ongoing Work and Future Directions
• Bridging the physical-virtual divide
  • Generalize CSAVA-type processing to other receptors
• Hierarchical query processing
  • Query planning, dissemination
• Complex event processing
  • Unify event and data processing
• System deployment and management
• Archiving and prioritization
Conclusions
• Receptors everywhere → High Fan-In Systems
• Uniform declarative framework is the key to building these systems
• The HiFi project is exploring this approach
• Our initial prototype
  • Leveraged TelegraphCQ and TinyDB
  • Validated the HiFi approach
  • Identified research directions
• Broad in scope = much work to be done!
Questions? hifi.cs.berkeley.edu