Heartbeat Mechanism and its Applications in Gigascope
Vladislav Shkapenyuk (speaker), Muthu S. Muthukrishnan, Rutgers University
Theodore Johnson, Oliver Spatscheck, AT&T Labs – Research
Unblocking streaming operators
• Data stream management systems (DSMS) work with infinite streams of tuples
• How to get answers out of join, aggregation, etc., before the end of time?
  • limit the scope of output tuples that an input tuple can affect
• Two views
  • define a window over the input streams for the blocking operators (STREAM, TelegraphCQ)
  • use a pipelined operator that makes use of an existing sort order (Gigascope, Tribeca)
  • most queries make reference to timestamps
Unblocking streaming operators
• Some stream attributes are labeled with temporal properties (e.g., monotone increasing)
• In an aggregation query, one grouping attribute must have a temporal property ("timestampness"):
    SELECT tb, srcIP, count(*)
    FROM TCP
    GROUP BY time/60 as tb, srcIP
  Since time is monotone increasing, tb is inferred to be monotone increasing too
• Similarly, stream merge (union) and join also need a set of attributes that have temporal properties
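To illustrate why a monotone grouping attribute unblocks aggregation, here is a minimal Python sketch of the query above. It is not Gigascope code: the function name `pipelined_count`, the `(time, srcIP)` input format, and the sorted output order are assumptions made for the example.

```python
from collections import Counter

def pipelined_count(tuples, bucket_seconds=60):
    """Yield (tb, srcIP, count) rows as soon as each time bucket closes.

    `tuples` is an iterable of (time, srcIP) pairs with non-decreasing time
    (hypothetical input format chosen for this sketch).
    """
    current_tb = None
    counts = Counter()
    for time, src_ip in tuples:
        tb = time // bucket_seconds
        if current_tb is not None and tb > current_tb:
            # tb is monotone increasing, so no later tuple can fall into
            # current_tb: its groups are final and can be emitted now.
            for ip, n in sorted(counts.items()):
                yield (current_tb, ip, n)
            counts.clear()
        current_tb = tb
        counts[src_ip] += 1
    # The input only ends in a finite demo; emit the still-open bucket.
    for ip, n in sorted(counts.items()):
        yield (current_tb, ip, n)

rows = list(pipelined_count(
    [(5, "10.0.0.1"), (42, "10.0.0.2"), (59, "10.0.0.1"), (61, "10.0.0.1")]
))
```

The rows for bucket 0 are produced when the tuple with time 61 arrives, not at end of input. If no such tuple ever arrives, the bucket stays open, which is the stall problem the following slides address.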
What if a data stream stalls?
• Consider a query that merges multiple streams
• Presence of tuples carries temporal information, absence doesn't
  • the merge must buffer the live inputs while it waits, which leads to memory overflow
• Similar issues with every operator with multiple input streams (e.g., joins)
Stream Punctuations
• Unblock operators by embedding special marks in the stream
  • indicate the end of a subset of the data
• A stalled stream can notify the parent about the end of the epoch
Lots of issues
• How can these punctuations be generated and propagated?
• How do we integrate such a mechanism into a high-performance DSMS?
Gigascope Architecture
• DSMS designed for monitoring high-rate data streams
  • pure stream database (no stored relations or continuous queries)
  • pipelined operators that rely on temporal properties of the stream
• Two-layer architecture for early data reduction
  • fast, lightweight data reduction queries (LFTA)
  • high-level queries for expensive processing (HFTA)
[Figure: architecture diagram; labels: App, high, high, low, low, low, ring buffer, NIC]
Pipelined Operators
• Aggregation:
    SELECT tb, srcIP, count(*)
    FROM TCP
    GROUP BY time/60 as tb, srcIP
• The merge operator performs a union of two streams R and S in a way that preserves timestamps:
    MERGE R.tb : S.tb
    FROM Inpackets R, Outpackets S
• A join query on streams R and S must contain a join predicate such as R.tb = S.tb:
    SELECT R.sourceIP, R.tb, R.length_sum + S.length_sum
    OUTER_JOIN FROM Inpackets R, Outpackets S
    WHERE R.sourceIP = S.destIP and R.tb = S.tb
Gigascope heartbeats
• Initially designed to collect statistics about operator load
• Special messages propagated using the regular tuple routing mechanism
  • performance monitoring
  • failure detection
Unblocking operators using heartbeats
• Stream punctuation mechanism
  • injects special temporal update tuples into an operator’s output stream
  • notifies the receiving operator about the end of a subset of the data (the end of the time window over which aggregation, stream merge, and join operate)
• Heartbeats are the perfect vehicle for carrying the temporal update tuples
  • regular propagation through the operator DAG
  • unblock all operators on their way in a timely manner
Temporal update tuples
• Temporal update tuples generated by an operator have a schema identical to that of a regular tuple
  • only the values of temporal attributes are initialized (the rest are ignored)
  • future tuples are guaranteed not to violate the temporal properties of the stream
Operator output schema: (Timebucket, SrcIP, DestIP, PacketCount)
Timebucket is monotone increasing
Temporal tuple: (T, Uninitialized, Uninitialized, Uninitialized)
• guarantees that all future tuples will have a Timebucket value >= T
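One way to picture such a tuple is the Python sketch below. The class and field names are hypothetical and only mirror the example schema on this slide; `None` stands in for "Uninitialized", and the `is_temporal_update` flag is an assumption about how a receiver might tell the two kinds of tuple apart.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class OutputTuple:
    """Same schema for regular tuples and temporal update tuples."""
    timebucket: int                      # temporal attribute, monotone increasing
    src_ip: Optional[str] = None         # None models "Uninitialized"
    dest_ip: Optional[str] = None
    packet_count: Optional[int] = None
    is_temporal_update: bool = False     # assumed marker, not from the slides

def make_temporal_update(t: int) -> OutputTuple:
    """Build a temporal update tuple: only the temporal attribute is set."""
    return OutputTuple(timebucket=t, is_temporal_update=True)

def honours_guarantee(update: OutputTuple, later: OutputTuple) -> bool:
    """Check that a later tuple respects the promise Timebucket >= T."""
    return later.timebucket >= update.timebucket

hb = make_temporal_update(100)
```

A receiver that sees `hb` may treat every time bucket below 100 as closed, whether or not any regular tuple follows.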
Heartbeat generation
• Naïve solution
  • operators emit the last produced tuple cast as a temporal tuple
  • too conservative to be useful – heartbeats don’t carry any additional information
• Goal: aggressively generate the values of temporal attributes
  • set attributes to the maximum values we can safely guarantee
Heartbeat generation
• Two approaches
  • infer the values of temporal update tuples from the tuples the operator has received so far
  • infer them from system time
• Inference based on received tuples
  • works when operators observe some tuples, even if those tuples are filtered out by selection predicates
  • works on every level of query execution
• Inference based on system clock
  • works even with completely stalled streams
  • only for time-based temporal attributes
  • potentially dangerous
Inferring temporal attributes
• Every operator maintains the state required to correctly generate temporal update tuples
  • last seen values of all temporal attributes referenced in the select clause
  • operator-specific state
• Attribute values for temporal tuples are computed using inference rules
    SELECT tb, srcIP, count(*)
    FROM TCP
    GROUP BY time/60 as tb, srcIP
  If the last seen value of time is X, infer that the value of tb for the temporal update tuple should be X/60
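The inference rule for this query can be sketched in a few lines of Python. The class name and methods are hypothetical; the sketch assumes integer timestamps in seconds and integer division for `time/60`.

```python
class TimeBucketInference:
    """Tracks the last seen `time` and derives tb for a temporal update tuple."""

    def __init__(self, bucket_seconds=60):
        self.bucket_seconds = bucket_seconds
        self.last_time = None  # last seen value of the temporal attribute

    def observe(self, time):
        # Called for every input tuple the operator sees.
        if self.last_time is None or time > self.last_time:
            self.last_time = time

    def temporal_update_tb(self):
        # Rule from the slide: last seen time X  ->  tb = X/60.
        if self.last_time is None:
            return None  # nothing observed yet, so nothing can be promised
        return self.last_time // self.bucket_seconds

state = TimeBucketInference()
state.observe(125)
```

After observing time 125 the operator can promise tb >= 2, even if it has not yet emitted any output row for bucket 2.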
Inferring temporal attributes
• What if the stream is completely stalled?
  • inference from received tuples cannot advance the values of temporal attributes
• Inference based on system time
  • works if the temporal attribute can be correlated with the system clock (usually the case in network streams)
  • unsafe for high-level operators (need to reason about propagation delays)
  • need to be careful about clock skew
• Gigascope uses skew information entered by the admin to infer the values of temporal attributes
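A rough Python sketch of clock-based inference follows. It assumes that stream timestamps and the system clock are both in seconds, that the admin-supplied skew is an upper bound on how far stream timestamps may lag the clock, and that the operator sits close enough to the source for propagation delay to be ignored. The function name and parameters are made up for the example and do not describe how Gigascope stores skew information.

```python
def clock_based_lower_bound(now, max_skew, last_seen=None):
    """Lower bound on the timestamp of any future tuple.

    now       -- current system time in seconds (passed in for testability)
    max_skew  -- assumed admin-entered bound on clock skew, in seconds
    last_seen -- last timestamp observed on the stream, if any
    """
    bound = now - max_skew
    if last_seen is not None:
        # Timestamps are monotone, so the last seen value is also a bound.
        bound = max(bound, last_seen)
    return bound

# Stalled stream: nothing seen, clock reads 1000 s, skew bound is 5 s.
tb = clock_based_lower_bound(now=1000, max_skew=5) // 60
```

If the skew bound is too optimistic, the promise can be violated by a late tuple, which is why the slide calls this approach potentially dangerous.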
Selection & merge operators
• Selection operator (filtering):
  • saves the last seen values of temporal attributes regardless of whether the tuple passes the selection predicate
• Merge (stream union):
  • combines multiple streams while preserving ordering properties
  • requires buffering of input streams
  • maintains the maximum timestamp value observed on every input: S1_max, S2_max, …, Sn_max
  • uses MIN(S1_max, S2_max, …, Sn_max) to generate the temporal update tuple
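The merge logic can be sketched as follows. This is an illustrative Python model, not Gigascope's implementation: the class `MergeOperator`, its method names, and the use of a heap as the buffer are assumptions. Each input is assumed to deliver tuples in non-decreasing timestamp order.

```python
import heapq

class MergeOperator:
    """Order-preserving union of several timestamp-ordered inputs."""

    def __init__(self, num_inputs):
        self.max_seen = [None] * num_inputs  # S1_max .. Sn_max
        self.buffer = []                     # min-heap of (ts, seq, payload)
        self.seq = 0                         # tie-breaker for equal timestamps

    def _advance(self, input_id, ts):
        if self.max_seen[input_id] is None or ts > self.max_seen[input_id]:
            self.max_seen[input_id] = ts

    def on_tuple(self, input_id, ts, payload):
        self._advance(input_id, ts)
        heapq.heappush(self.buffer, (ts, self.seq, payload))
        self.seq += 1
        return self._release()

    def on_heartbeat(self, input_id, ts):
        # A temporal update tuple advances the input's bound without data.
        self._advance(input_id, ts)
        return self._release()

    def temporal_update_value(self):
        # MIN(S1_max, ..., Sn_max); undefined until every input has reported.
        if any(m is None for m in self.max_seen):
            return None
        return min(self.max_seen)

    def _release(self):
        bound = self.temporal_update_value()
        out = []
        while bound is not None and self.buffer and self.buffer[0][0] <= bound:
            ts, _, payload = heapq.heappop(self.buffer)
            out.append((ts, payload))
        return out

m = MergeOperator(2)
first = m.on_tuple(0, 1, "a")    # input 1 is silent, so "a" is buffered
second = m.on_tuple(0, 2, "b")   # still buffered
third = m.on_heartbeat(1, 2)     # heartbeat from input 1 unblocks the merge
```

Without the heartbeat on the second input the buffer would keep growing. With it, both buffered tuples are released and the merge can itself emit a temporal update tuple carrying the value 2.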
Aggregation & sampling operator
• Maintains a hash table of aggregates for the current time window
  • when the time window advances, the table content is flushed
  • uses traffic shaping (slow flush) to avoid flushing excessive amounts of data at once
• Slow flush can lead to incorrect generation of temporal tuples
  • if there are unflushed tuples in the hash table, generate temporal tuples based on the unflushed tuples
  • otherwise, use the last seen values saved by the operator
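The two cases can be written down as a small Python function. Taking the minimum time bucket among the unflushed groups is an assumption made for this sketch (the slide only says the value is based on the unflushed tuples); the function and parameter names are hypothetical.

```python
def aggregation_temporal_value(unflushed_buckets, last_seen_time, bucket_seconds=60):
    """Pick the time bucket to put into a temporal update tuple.

    unflushed_buckets -- time buckets of groups still waiting in the slow flush
    last_seen_time    -- last input timestamp saved by the operator, or None
    """
    if unflushed_buckets:
        # Rows for these buckets are still to be emitted, so the promise
        # must not run ahead of the oldest one (assumed rule).
        return min(unflushed_buckets)
    if last_seen_time is None:
        return None
    return last_seen_time // bucket_seconds

during_flush = aggregation_temporal_value([7], last_seen_time=485)
after_flush = aggregation_temporal_value([], last_seen_time=485)
```

During the slow flush the operator only promises bucket 7, and once the table is empty it can advance the promise to bucket 8.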
Join operators
• A stream join between R and S relates a timestamp from R to a timestamp in S (e.g., R.ts = S.ts)
  • critical for guaranteeing bounded memory
  • supports inner, left, right, and full outer equi-joins
• Maintains the maximum timestamp values observed on each stream (Rmax and Smax)
  • Rmax and Smax can be composite structures storing the max values of all attributes that are part of the timestamp
• Infers the attribute values of temporal update tuples based on MIN(Rmax, Smax)
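To show how MIN(Rmax, Smax) bounds memory, the Python sketch below implements an inner equi-join on a single integer time bucket plus one key. It is a simplified model under those assumptions, not Gigascope's join: the class `TimeBucketJoin` and its methods are hypothetical, and outer joins and composite timestamps are left out.

```python
from collections import defaultdict

class TimeBucketJoin:
    """Inner join on (tb, key) that discards state once a bucket is closed."""

    def __init__(self):
        self.max_seen = {"R": None, "S": None}           # Rmax and Smax
        self.buf = {"R": defaultdict(list), "S": defaultdict(list)}

    def on_tuple(self, side, tb, key, value):
        self.buf[side][tb].append((key, value))
        return self.on_heartbeat(side, tb)

    def on_heartbeat(self, side, tb):
        if self.max_seen[side] is None or tb > self.max_seen[side]:
            self.max_seen[side] = tb
        return self._close_buckets()

    def temporal_update_value(self):
        if None in self.max_seen.values():
            return None
        return min(self.max_seen.values())                # MIN(Rmax, Smax)

    def _close_buckets(self):
        bound = self.temporal_update_value()
        out = []
        if bound is None:
            return out
        # Buckets strictly below the bound can receive no more input
        # from either side, so they can be joined and dropped.
        open_buckets = set(self.buf["R"]) | set(self.buf["S"])
        for tb in sorted(b for b in open_buckets if b < bound):
            r_rows = self.buf["R"].pop(tb, [])
            s_rows = self.buf["S"].pop(tb, [])
            for r_key, r_val in r_rows:
                for s_key, s_val in s_rows:
                    if r_key == s_key:
                        out.append((tb, r_key, r_val, s_val))
        return out

j = TimeBucketJoin()
j.on_tuple("R", 1, "10.0.0.1", 10)
j.on_tuple("S", 1, "10.0.0.1", 20)
j.on_tuple("R", 2, "10.0.0.1", 5)
joined = j.on_heartbeat("S", 2)   # S is quiet, but its heartbeat closes tb 1
```

Only the still-open bucket 2 remains buffered after the heartbeat, which is the bounded-memory property the slide refers to.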
Experimental Evaluation
• Two main data feeds
  • DAG4.3GE Gigabit Ethernet interfaces
  • 100,000 packets/sec (about 400 Mbit/sec)
• One low-rate control data feed
  • 100 Mbit interface
  • good representative of a backup interface
• Dual 2.8 GHz P4 server w/ 4 GB of RAM, FreeBSD 4.8
Merge Query
High-level aggregation:
    SELECT tb, protocol, srcIP, destIP, srcPort, destPort, count(*)
    FROM DataProtocol
    GROUP BY time/10 as tb, protocol, srcIP, destIP, srcPort, destPort
[Figure: query plan; labels: High-level Aggregation, Stream Merge (x2), Low-level Aggregation (x3); inputs: control, main1, main2]
Outer Join Query
Query flow1:
    SELECT tb, protocol, srcIP, destIP, srcPort, destPort, count(*) as cnt
    FROM [main0_and_control].DataProtocol
    GROUP BY time/10 as tb, protocol, srcIP, destIP, srcPort, destPort;
Query flow2:
    SELECT tb, protocol, srcIP, destIP, srcPort, destPort, count(*) as cnt
    FROM main1.DataProtocol
    GROUP BY time/10 as tb, protocol, srcIP, destIP, srcPort, destPort;
Query full_flow:
    SELECT flow1.tb, flow1.protocol, flow1.srcIP, flow1.destIP, flow1.srcPort, flow1.destPort, flow1.cnt, flow2.cnt
    OUTER_JOIN FROM flow1, flow2
    WHERE flow1.srcIP = flow2.srcIP and flow1.destIP = flow2.destIP
      and flow1.srcPort = flow2.srcPort and flow1.destPort = flow2.destPort
      and flow1.protocol = flow2.protocol and flow1.tb = flow2.tb
Outer Join Query
[Figure: query plan; labels: Outer Join, High-level Aggregation (x2), Stream Merge, Low-level Aggregation (x3); inputs: backup, main1, main2]
Performance Evaluation
CPU load
• w/ heartbeats enabled – 37.5%
• w/ heartbeats disabled – 37.3%
Other heartbeat applications
• Fault tolerance
  • heartbeats regularly propagate through query DAGs
  • easy detection of failed nodes
• System performance analysis
  • every heartbeat message is timestamped by the receiving node
  • timestamp traces are perfect for analyzing queuing delays
• Distributed query optimization
  • every heartbeat message carries runtime statistics (operator selectivities, sampling rates, in/out rates, memory footprint, etc.)
  • collected statistics can be fed to a distributed query optimizer
Conclusions
• Punctuation-carrying heartbeats
  • effective at unblocking streaming operators on all levels
  • significantly reduce query memory utilization
  • capable of working at multi-Gigabit line speeds
• Variety of other uses
  • fault tolerance, performance analysis, distributed query optimization
• Part of the production version of Gigascope