360 likes | 498 Views
Denial of Service in Content-based Publish/Subscribe Systems. M.A.Sc. Candidate: Alex Wun Thesis Supervisor: Hans-Arno Jacobsen Department of Electrical and Computer Engineering Department of Computer Science University of Toronto. v0.4. Background Context of Thesis Work.
E N D
Denial of Service in Content-based Publish/Subscribe Systems M.A.Sc. Candidate: Alex Wun Thesis Supervisor: Hans-Arno Jacobsen Department of Electrical and Computer Engineering Department of Computer Science University of Toronto 2007 v0.4
Background Context of Thesis Work • PADRES middleware platform • Content-based Publish/Subscribe (CPS) • Originally inspired by distributed dashboard and job scheduling requirements • Increasingly motivated by enterprise application integration • Need to investigate different facets of security for CPS systems • Security amongst top concern in many application scenarios 2007
Contributions of Thesis Work DoS Characteristics Attack Taxonomy Attack Experiments DoS Resilience DoS Prevention Policy Model Commonality Model Matching Algorithm Policy Framework 2007
Content-based Publish/Subscribe Publication (Tuple) P P Publishers [(event,prescription), (patientID,123), (age,63), (drug,X) …] Storing Filters (Functions) Subscrip- tions Broker Network [(event=prescription), (age>50)] [(event=prescription), (drug=Y)] Subscribers S S Subscriptions (Boolean Functions) “Matching” 2007
Matching Performance Optimizations • Often based on exploiting similarities (overlap) between subscriptions • Avoid unnecessary subscription and predicate evaluations • Can we abstract these optimizations? • Formalize content-based Matching Plans (order of subscription and predicate evaluations) • Quantify performance of existing optimizations • Discover future potential optimizations 2007
Commonality Model For a subscription set Disjunctive Commonality Expression • Per-Link Matching • DNF Subscriptions or • Shared predicates • Clustering on subscription classes or attributes • “Pruning” strategies (e.g., number of attributes) Conjunctive Commonality Expression A set of commonality expressions is a subscription topology. 2007
Example: Link-Group Topology Depth First Algorithm to determine probabilistically optimal matching plan [Greiner2006] in 2007
Example: Link-Group Topology Low Selectivity X X High Selectivity o o
Example: Cluster Topology Simulation Experimental (in PADRES) o • Dramatic scalability effects of clustering in CPS • Observed trend depends on proportion of commonalities not number of predicates X . . .
Extended Implication Relationships Between subscriptions Between predicates Between commonalities 2007
Simple Implication Expressions Mixed operatorlists currently not supported 2007
Matching Engine Architecture Shared pred. index (conj. comm.) All predicates index Subscription index … … … … Subscription pool Predicate pool Map Sorted List (Map) Overlay links(disj. comm.) Node elements 2007
Matching Engine Architecture True True False D.C. False Node Elements • Node Element • Subscription • Predicate • Overlay link • (conj. comm.) • (DNF subs) D.C. Implication Lists 2007
Subscription InsertionPredicate Insertion Shared pred. index (conj. comm.) All predicates index Subscription index … … … … Subscription pool Conj. Comm. Predicate pool Unknown predicate priorities default to head of list Overlay links(disj. comm.) 2007
Subscription InsertionImplication List Update a P > 3 4 5 6 7 8 9 P’s True -> True list 3 4 5 6 7 8 9 Xi’s False -> False list 3 4 5 6 7 8 9 P’s False -> False list 2007
Performance Experiments • Generated subscription workloads from ~50 to ~200,000 predicates • {5,10,15,20} Avg. Predicates x {10,100,1000,10000} Subscriptions • 4 Different subscription topologies • Low/High clustering (5/200 classes) • Low/High sharing (subscription overlap) • Randomly generated and matched 100 publications 2007
High Cluster High Sharing Low Sharing Low Cluster
High Cluster High Sharing Low Sharing Low Cluster
High Cluster High Sharing Low Sharing Low Cluster
High Cluster High Sharing Low Sharing Low Cluster
Conclusions • Model captures many existing and potential optimization techniques • Implication list approach significantly reduces number of predicate evaluations in all workloads • Superior for expensivepredicates • Implementation trade-off: Control cascade overhead/usage • Cluster/Index implication lists as well • Optimize iteration over marked nodes • Additional clustering/indexing beyond only event class • Future work • Additional conjunctive/disjunctive commonalities, implication relationships? • Implication relationships relevant to message distribution? • Rule-based implementation of implication/commonality algorithm? Thank You – Questions? 2007
*** Extra Slides *** 2007
Publication matchingCommonality Phase Shared pred. index (conj. comm.) All predicates index Subscription index … … … … Subscription pool Predicate pool Iterate and evaluatewhile TC is false Termination Condition: All overlay links have been decided Overlay links(disj. comm.) 2007
Publication MatchingImplication Cascade True True False D.C. True False Cascade and Mark If not alreadydetermined, Evaluate D.C. D.C. False True “Advanced” implications handled with a method call triggered by state change(e.g. Predicate becomes true, calls countTruePredicate() on subscriptions) 2007
Publication MatchingSubscription Phase Shared pred. index (conj. comm.) All predicates index Subscription index … … … … Subscription pool Predicate pool Iterate and evaluatewhile TC is false Overlay links(disj. comm.) + Cascade and Mark + Cascade and Count 2007
Publication MatchingCleanup Phase • There is no cleanup phase • A counter (Vm) is incremented at the start of each publication matching phase • All determined results are versioned (Vd) • A determined result is stale if Vd < Vm • To avoid overflow, reset counter every: • 64bit counter ~= 16x10^18 pubs • @1000 pub/s ~ 16x10^15 s • ~32x10^6 s/year ~ 0.5x10^9 years 2007
Publication MatchingSorted Lists • Commonality/predicate lists sorted by (p+1/N) • p is the predicate selectivity • N is the number of subscriptions sharing the predicate • Subscriptions sorted by (1-p)n • p is average predicate selectivity • n is number of predicates • Predicate hash sorted by predicate value • Commonality/predicate/subscription sorting is meant to be extendable with different priority equations • Include predicate cost, length of implication lists, etc … 2007
High Cluster High Sharing Low Sharing Low Cluster 2007
High Cluster High Sharing Low Sharing Low Cluster 2007
Content-based Publish/Subscribe Databases Query (Boolean Function) Subscriptions (Boolean Functions) Inverse Problems Storing Functions Subscrip- tions Tables Storing Data Scalable Performance Query Plans Matching Plans? Publication (Tuple) DB Rows (Tuples) 2007