A new model and architecture for data stream management Aurora
Data Stream Management Why on earth would one need it?
The Problem: Tokyo Traffic Control
Stream Processing for Traffic Control • 24-hour real-time control • 1,000 traffic intersections • 15,154 traffic signals • Input • Cameras • Helicopters • Police • Citizen reports • 17,000 vehicle detectors • Onboard vehicle sensors • Traffic jams, accidents & closed streets • Output • Central monitors • 300 traffic information boards • Digital speed signs • Route signs • Affectors • Adjusted traffic signal lights (7,000) • Communications with officers on site
Example Domains • Smart Energy Grid Management • Network Traffic Management • System Monitoring • Road Traffic Monitoring • Military Logistics • Online Auctions • Habitat Monitoring • Immersive Environments
Stream Processing Engines • HADP (Human-Active, DBMS-Passive) vs DAHP (DBMS-Active, Human-Passive) • Events & Triggers • Continuous Queries • Real-time processing • Transient data • Lossy information
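To make the continuous-query idea concrete, here is a minimal Python sketch. It is not Aurora code: the class `ContinuousQuery`, its `push` method, and the `speed` field are hypothetical names chosen for illustration. The query is registered once and then evaluated against every arriving tuple, and nothing is stored for later polling.

```python
class ContinuousQuery:
    """A standing query: registered once, evaluated on every arriving tuple.

    Hypothetical illustration only; not part of Aurora.
    """

    def __init__(self, predicate, on_match):
        self.predicate = predicate  # condition checked for each tuple
        self.on_match = on_match    # trigger fired when the condition holds

    def push(self, tup):
        # The system is active: the arriving data drives the evaluation.
        if self.predicate(tup):
            self.on_match(tup)


alerts = []
jam_query = ContinuousQuery(lambda t: t["speed"] < 10, alerts.append)

# Tuples are transient: each one is seen once as it arrives, then discarded.
for reading in [{"sensor": 1, "speed": 42},
                {"sensor": 2, "speed": 7},
                {"sensor": 3, "speed": 55}]:
    jam_query.push(reading)
```

After the loop, `alerts` holds only the reading from sensor 2.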
Aurora Overview
The Topic • Aurora • The prototype • DBMS / SPE / DSMS • UI • The query language • The project • The authors
The Authors • M.I.T., Department of EECS and Laboratory of Computer Science • Michael Stonebraker • Brandeis University, Department of Computer Science • Daniel J. Abadi • Mitch Cherniack • Brown University, Department of Computer Science • Don Carney • Uğur Çetintemel • Christian Convey • Sangdon Lee • Nesime Tatbul • Stan Zdonik
Talk Overview • Stream Processing Engines • SQuAl • Runtime • Related work
Aurora SQuAl (Stream Query Algebra)
SQuAl Overview • Connection Points • Models • Continuous Query • View • Ad-hoc Query • Operators • Order-agnostic • Order-sensitive
SQuAl Operators • Order-agnostic • Filter • Map • Union • Order-sensitive • BSort • Aggregate • Join • Resample • Quirks!
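The operators above can be approximated with Python generators. This is a simplified sketch, not SQuAl itself: the function names are made up, `filter_box` takes a single predicate, and `union_box` just concatenates finite inputs. The `bsort_box` function assumes the usual description of BSort as an approximate sort that holds a buffer of slack + 1 tuples and emits the smallest whenever the buffer is full.

```python
import heapq
from itertools import chain


def filter_box(stream, predicate):
    """Order-agnostic: pass through only tuples satisfying the predicate."""
    return (t for t in stream if predicate(t))


def map_box(stream, fn):
    """Order-agnostic: apply fn to every tuple."""
    return (fn(t) for t in stream)


def union_box(*streams):
    """Order-agnostic: merge several streams into one (simple concatenation)."""
    return chain(*streams)


def bsort_box(stream, slack, key=lambda t: t):
    """Order-sensitive: approximate sort with a bounded buffer.

    Holds at most slack + 1 tuples; when the buffer is full, the smallest
    one is emitted. Tuples more than `slack` positions out of order can
    still come out unsorted.
    """
    buffer = []
    for seq, tup in enumerate(stream):
        # seq breaks ties so the tuples themselves are never compared
        heapq.heappush(buffer, (key(tup), seq, tup))
        if len(buffer) > slack:
            yield heapq.heappop(buffer)[2]
    while buffer:  # end of a finite input: flush what is left
        yield heapq.heappop(buffer)[2]


readings = [{"id": 1, "speed": 40}, {"id": 2, "speed": 5}]
slow_ids = list(map_box(filter_box(readings, lambda t: t["speed"] < 10),
                        lambda t: t["id"]))
nearly_sorted = list(bsort_box([2, 1, 4, 3, 6, 5], slack=1))
too_disordered = list(bsort_box([3, 2, 1], slack=1))
```

Here `nearly_sorted` comes out fully ordered, while `too_disordered` is `[2, 1, 3]`: a slack of 1 cannot repair a tuple that is two positions out of place, which is one of the quirks of order-sensitive operators.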
Aurora Runtime
Query Optimization • Dynamic Continuous Query Optimization • Inserting projections • Combining boxes • Reordering boxes • Ad-hoc query optimization
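Reordering boxes can be illustrated with a small cost model. The sketch below assumes every box in a chain commutes with the others and is described only by a per-tuple cost and a selectivity (the fraction of tuples it lets through). Under that assumption, ordering boxes by decreasing (1 - selectivity) / cost minimizes the expected work per input tuple. The function names and numbers are illustrative, not taken from Aurora.

```python
from itertools import permutations


def chain_cost(boxes):
    """Expected cost per input tuple for a chain of (cost, selectivity) boxes."""
    total, surviving = 0.0, 1.0
    for cost, selectivity in boxes:
        total += surviving * cost  # only surviving tuples reach this box
        surviving *= selectivity
    return total


def reorder(boxes):
    """Order commuting boxes by decreasing (1 - selectivity) / cost."""
    return sorted(boxes, key=lambda b: (1.0 - b[1]) / b[0], reverse=True)


# (cost per tuple, selectivity); made-up numbers for illustration
boxes = [(4.0, 0.9), (1.0, 0.5), (2.0, 0.1)]
best = reorder(boxes)

# Brute-force check over all orderings of the same three boxes
brute_force_min = min(chain_cost(p) for p in permutations(boxes))
```

With these numbers the cheap, moderately selective box runs first and the expensive, unselective one runs last, and the result matches the brute-force minimum.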
Real-time Scheduling • Timestamped Tuples • Train scheduling • Interbox nonlinearities • Intrabox nonlinearities • Superboxes • Introspection • Static • Run-time
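The idea behind train scheduling can be sketched as batching: instead of invoking a box once per tuple, the scheduler hands it a "train" of queued tuples, so the fixed per-invocation overhead is paid less often. The toy scheduler below is an assumption-laden illustration (the name `run_chain`, the round-robin policy, and counting box calls as the cost are all made up here), not Aurora's scheduler.

```python
from collections import deque


def run_chain(tuples, boxes, train_size):
    """Push tuples through a linear chain of boxes, train_size tuples per call.

    Each box is a function from a list of tuples to a list of tuples.
    Returns (outputs, number_of_box_invocations).
    """
    queues = [deque(tuples)] + [deque() for _ in boxes]
    calls = 0
    while any(queues[:-1]):  # some box still has queued input
        for i, box in enumerate(boxes):
            if not queues[i]:
                continue
            n = min(train_size, len(queues[i]))
            train = [queues[i].popleft() for _ in range(n)]
            queues[i + 1].extend(box(train))  # output feeds the next box
            calls += 1
    return list(queues[-1]), calls


double = lambda ts: [t * 2 for t in ts]
keep_large = lambda ts: [t for t in ts if t > 4]

out_single, calls_single = run_chain(range(6), [double, keep_large], train_size=1)
out_train, calls_train = run_chain(range(6), [double, keep_large], train_size=6)
```

Both runs produce the same output, but tuple-at-a-time scheduling needs 12 box invocations while a train of six needs only 2.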
Handling overload • QoS specifications • Response times • Tuple drops • Values produced • Load Shedding • Not Implemented at the time
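Since load shedding was not implemented at the time, the following is only a sketch of the simplest possible policy: random tuple drops at a rate derived from how far the offered load exceeds capacity. The functions `drop_rate` and `shed` are hypothetical, and a QoS-aware shedder would choose what to drop far more carefully.

```python
import random


def drop_rate(load, capacity):
    """Fraction of tuples to drop so that the remaining load fits capacity."""
    if load <= capacity:
        return 0.0
    return 1.0 - capacity / load


def shed(stream, rate, rng):
    """Randomly drop tuples with probability `rate`; keep the rest in order."""
    return [t for t in stream if rng.random() >= rate]


rng = random.Random(0)  # seeded only to make the sketch repeatable
rate = drop_rate(load=200, capacity=100)  # system is overloaded by 2x
kept = shed(range(1000), rate, rng)
```

Here `rate` is 0.5, so roughly half of the tuples survive. The exact count depends on the random seed.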
Aurora Related work
Related work • STREAM • Stanford University, 2000-2006 • Telegraph • UC Berkeley, 2000-2007? • SASE • UC Berkeley / UMass Amherst, 2006-2008? • Cayuga • Cornell University, 2005-2007? • PIPES • University of Marburg, 2003-2007? • NiagaraCQ • University of Wisconsin-Madison, 1999-2002
Complex Event Processing Today • Oracle • Oracle CEP • Microsoft • MS SQL Server StreamInsight • Open Source • OpenPDC • Aleri • Coral8 • TruViso • StreamBase • Aurora’s Grandchild • IBM • SPADE • Active Middleware Technology
Summary • SPEs address different problems • e.g. dynamic real-time monitoring • DBMS-Active, Human-Passive (DAHP) • Real-time, transient, even lossy data • Aurora evolved into StreamBase • SQuAl evolved into StreamSQL • Many production-quality alternatives
Resample (Ordered) • Based on RRDTool’s philosophy? • Paper: • Simple interpolation • Use The Force, Read The Source: • Average • Count • Sum • Max • Min • LastVal
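The "simple interpolation" reading of Resample can be sketched as linear interpolation onto a regular time grid. This is an illustrative assumption about what the operator does, not a transcription of the Aurora source; the function name `resample_linear` and the (time, value) tuple layout are invented for the example.

```python
import math


def resample_linear(points, step):
    """Interpolate (time, value) points onto multiples of `step`.

    `points` must be sorted by strictly increasing time.
    """
    out = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        tick = math.ceil(t0 / step) * step  # first grid time at or after t0
        while tick < t1:
            frac = (tick - t0) / (t1 - t0)
            out.append((tick, v0 + frac * (v1 - v0)))
            tick += step
    # The last point is only emitted if it already sits on the grid.
    if points and points[-1][0] % step == 0:
        out.append(points[-1])
    return out


samples = resample_linear([(1, 10.0), (5, 30.0)], step=2)
```

For the two input points at times 1 and 5, the grid times 2 and 4 receive the interpolated values 15.0 and 25.0. The aggregate variants listed on the slide (Average, Count, Sum, Max, Min, LastVal) would replace the interpolation step with a function over the tuples falling in each interval.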