290 likes | 299 Views
Scalable Trigger Processing. Eric N. Hanson et al ICDE 1999 Presented by Shyamshankar D. Motivation. Use of Triggers Integrity constraint checking Alerting etc. Commercial database systems Limited triggering capabilities 1-100 triggers/table.
E N D
Scalable Trigger Processing Eric N. Hanson et al ICDE 1999 Presented by Shyamshankar D
Motivation • Use of Triggers • Integrity constraint checking • Alerting etc. • Commercial database systems • Limited triggering capabilities • 1-100 triggers/table. • Doesn’t scale to internet and World wide web requirements
An Example • A Continuous query • Stock holding: 100*IBM • Query: Inform me whenever the price of the stocks crosses 10,000$ Create Trigger stock-watch from quotes q on update(q.price) when q.name=‘IBM’ and 100*q.price > 10,000 do raise event ThresholdCrossed(100*q.price).
What Next? • The problem • The key concept • TriggerMan system architecture • Predicate Index • Trigger processing • Concurrency
The Problem • Requires millions of triggers. • Steps for trigger processing • Event monitoring • Condition evaluation • Executing the trigger action • Response time for the database operations.
The Key Concept • If large number of triggers are created, then most of them have the same format. • Triggers share the same expression signature except that one constant has been substituted by another. • Group predicates from trigger conditions based on expression signatures and store these in an efficient main memory data structure.
The TriggerMan System • Trigger system built on Informix database • Could also be implemented in a trigger system in a DBMS server. • Trigger syntax create trigger <triggerName> [in setName] [optionalFlags] from fromList [on eventSpec] [when condition] [group by attributeList] [having groupCondition] do action
Example 1 Update Fred’s salary when Bob’s salary is updated create trigger updateFred from emp on update(emp.salary) when emp.name = ’Bob’ do execSQL ’update emp set salary=:NEW.emp.salary where emp.name=’’Fred’’’
Example 2 house(hno,address,price,nno,spno) salesperson(spno,name,phone) represents(spno,nno) neighborhood(nno,name,location) create trigger IrisHouseAlert on insert to house from salesperson s, house h, represents r when s.name = ‘Iris’ and s.spno=r.spno and r.nno=h.nno do raise event NewHouseInIrisNeighborhood(h.hno, h.address)
Components • TriggerMan Datablade • Data Sources • Local/remote tables/streams • TriggerMan Client applications • Create drop triggers etc. • TriggerMan Driver • Condition testing and action TmanTest() fn. • TriggerMan console • Direct user interaction interface
Trigger Condition structure • Expression signature • Expression signature consists of • Data source ID • Operation code • Generalized Expression Data src: emp Event : update = Emp.name CONSTANT
Condition structure (contd) • Steps to obtain canonical representation of WHEN clause • Translate expression to CNF • Group each conjunct by data source they refer. • Predicate will be of form (C11 OR C12 OR ..) AND ... (Ck1 OR …), where each Cij refers to a tuple variable. • Very large number of predicates created only if different triggers contain distinct CONSTANT values. • If expression has m constants replace xth constant by CONSTANTX
The Tables • Primary tables • trigger_set (tsID, name, comments, creation_date, isEnabled) • Trigger (triggerID, tsID, name, comments, trigger_text, creation_date, isEnabled, …) • Trigger cache in main memory for recently accessed triggers.
Predicate Index • Tables • expression_signature(sigID, dataSrcID, signatureDesc, constTableName, constantSetSize, constantSetOrganization) • const_tableN(exprID, triggerID, nextNetworkNode, const1, … constK, restOfPredicate) • Root of predicate index linked to set of data source predicate index • Each data source contains an expression signature list • For each expression signature containing constants, a constant table. • Index expressions on most selective conjunct.
Condition Testing • For Predicate to be satisfied all conjuncts should be true. • This is checked using an A-Treat network. • A-Treat network is a discrimination network for trigger condition testing.
A-Treat network(Hans 92) Define rule SalesClerk If emp.sal>30,000 And emp.dno=dept.dno And dept.name=“sales” And emp.jno=job.jno And job.title=“clerk” Then Action
Processing trigger definition • Parse the trigger and validate it • Convert the when clause to conjunctive normal form and group the conjuncts by the distinct sets of tuple variables they refer to • Form a trigger condition graph. Undirected graph with node for each tuple variable and edge for join predicates. • Build the A-Treat network
Processing trigger definition(contd) • For each selection predicate over a alpha node • If predicate with same signature, not seen before • Add signature of predicate to list and to expression_signature table • If signature has a constant placeholder in it, create a constant table for the signature. • Else if predicate has constants, add a row to the constant table for the expression
Alternate Organization strategies • Storage for the expression signature’s equivalence class: • Main memory lists • Main memory index • Non-indexed database table • Indexed database table • For each expression signature use a structure depending on number of triggers. efficiency Scalability
Processing update descriptors • On getting an update descriptor/token (data src ID, oper code, old/new tuple) • Locate data source predicate index from root of predicate index. • For each expression signature, find constant matching the token. • Check additional predicate clauses against the token. • When all predicate clauses of a trigger have matched, pin the trigger in main memory • If trigger condition is satisfied, execute action.
Some optimizations For I = 1 to N Create trigger Ti from R when R.a=100 do … • List of trigger ids for R.a=100 • Constant sets and trigger Ids for equality conditions stored as • Lists • Clustered constant index • All entries of [const1,..constk] stored together • Enables fast retrieval of triggers ids together
Concurrent processing • Different types of concurrency • Token level concurrency • Multiple tokens in parallel • Condition level concurrency • Test multiple selection condns against token concurrently • Rule action concurrency • Process multiple actions fired at same time. • Data level concurrency • Current implementation supports token level concurrency
Concurrency N=[NUM_CPUS*TMAN_CONCURRENCY_LEVEL] • N driver processes • Each driver process calls TmanTest() after T time • T=future work • Balance switching overhead and avoid long executions
TmanTest() while(total execution time of this invocation of TmanTest < THRESHOLD and work is left in the task queue) { Get a task from the task queue and execute it. Yield the processor so other Informix tasks can use it } if task queue is empty return TASK_QUEUE_EMPTY return TASKS_REMAINING
Concurrency(contd) • Task can be • Process a token to check matching rule • Run a rule action • Process a token against a set of conditions • Process a token to run a set of rule actions triggered by token.
Concurrency For k=1 to M Create trigger T_K from R when R.company = "IBM" do raise event notify_user("userK", …. ) • Partition trigger sets into N sets.
Conclusion • An architecture to build a truly scalable trigger systems. • Future work • Scalability for temporal conditions • Aggregates