Distributed Triggers for Peer Data Management
Verena Kantere (1), Iluju Kiringa (2), Qingqing Zhou (3), John Mylopoulos (3), Greg McArthur (3)
(1) National Technical University of Athens, (2) University of Ottawa, (3) University of Toronto
CoopIS 2006
Motivation
We consider a P2P database overlay in the health domain.
Mrs Smith visits Dr F with abdominal aches.
(Figure: three peers, Doctor, Specialist, and Pharmacist, with the requests "Get recent tests" and "Get recent prescriptions".)
Motivation cont’d
Situation:
• If Dr F works in a walk-in clinic, he does not wish to keep complete records on his irregular patients → data coordination at query time and update time is enough to meet the immediate need for information
• Yet, for his regular patients, Dr F wants to keep as complete a medical history as possible → data updates must be complemented by a more general coordination mechanism
Contributions
• We propose distributed triggers
• Using them to enable peer data exchange and coordination: we define a distributed rule language
• Defining execution semantics for distributed triggers: we extend the standard semantics of centralized SQL3 triggers
• Dealing with the transient character of peer DBMSs (pDBMSs): we present appropriate trigger termination and acquaintance protocols
• Discussing the viability of our solution for coordinating updates in pDBMSs: we present preliminary experimental results
Example of Distributed Trigger
Trigger 1: When Dr F prescribes a medicine to a patient, he wants the prescription to be inserted both
- in his own database
- in the database of the pharmacist, P_DB
→ insertions alone are not enough: P_DB should be updated automatically when such new information occurs.
Need for distributed triggers: events, conditions, and actions may concern different databases.
    CREATE TRIGGER prescriptionInsertion
    AFTER INSERT ON Dr_F_Prescription
    REFERENCING NEW AS NewPresc IN Dr_F_DB
    FOR EACH ROW
    BEGIN
      INSERT INTO Dr_P_Prescription VALUES NewPresc IN P_DB
    END
Event: the AFTER INSERT ... IN Dr_F_DB clause. Action: the BEGIN ... END block.
Next: • Define distributed triggers • Discuss execution semantics
Distributed Trigger Language
Example of a distributed trigger:
    CREATE TRIGGER prescriptionInsertion
    AFTER INSERT ON Dr_F_Prescription
    REFERENCING NEW AS NewPresc IN Dr_F_DB
    FOR EACH ROW
    BEGIN
      INSERT INTO Dr_P_Prescription VALUES NewPresc IN P_DB
    END
• We extend the definition of centralized SQL3 triggers
• The trigger name must be unique over all pDBMSs
• Single event, single condition, many actions
• The action is a set of separate transactions
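The structural constraints listed above (unique name, one event, at most one condition, several actions that may target different peers) can be pictured with a small data model. This is only an illustrative sketch: the class names `Action`, `DistributedTrigger`, and `TriggerRegistry` and all field names are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    peer: str        # database (peer) in which the action runs, e.g. "P_DB"
    statement: str   # text of the action statement


@dataclass(frozen=True)
class DistributedTrigger:
    name: str                          # must be unique over all pDBMSs
    timing: str                        # "BEFORE", "AFTER" or "DETACHED AFTER"
    event: str                         # single event, e.g. "INSERT"
    event_table: str
    event_peer: str                    # peer on which the event occurs
    actions: tuple                     # one or more Action objects
    condition: Optional[str] = None    # at most one condition
    condition_peer: Optional[str] = None


class TriggerRegistry:
    """Hypothetical registry that enforces the name-uniqueness requirement."""

    def __init__(self):
        self._by_name = {}

    def register(self, trigger):
        if trigger.name in self._by_name:
            raise ValueError(f"trigger name already in use: {trigger.name}")
        if not trigger.actions:
            raise ValueError("a trigger needs at least one action")
        self._by_name[trigger.name] = trigger
        return trigger
```

In a real P2P setting there is no single registry; how uniqueness is enforced across peers is not specified on the slide, so the in-memory dictionary here is a stand-in.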
Execution Semantics
The extended execution semantics of an SQL3 statement S is:
EXECUTE STATEMENT(S)
1. Save the current database state change (if any)
2. Determine the set of tuples changed by S: A(S)
3. Create a state change C = (R, E, T), where R is the table mentioned in S, E is the update operation mentioned in S, and T is the set of changes to be performed on A(S)
4. PROCESS BEFORE TRIGGERS(C) (i.e., these triggers are executed before S itself is executed)
   • Adjusted for distributed triggers (discussed later)
5. Apply the changes T to the database
6. PROCESS AFTER TRIGGERS(C)
   • Adjusted for distributed triggers (discussed later)
7. PROCESS DETACHED AFTER TRIGGERS(C)
   • Added for distributed triggers (discussed later)
8. Restore the state change saved in Step 1 (if any)
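The eight steps can be traced with a toy, single-peer sketch. It assumes an in-memory table store and handles only INSERT; the `Database` class and the three callback parameters are hypothetical stand-ins for the trigger-processing procedures named on the slide.

```python
class Database:
    """Toy in-memory store: table name -> list of rows."""

    def __init__(self):
        self.tables = {}
        self.current_change = None


def execute_statement(db, table, op, rows, before, after, detached_after):
    if op != "INSERT":
        raise NotImplementedError("this sketch only handles INSERT")
    saved = db.current_change              # step 1: save current state change
    affected = list(rows)                  # step 2: tuples changed by S, A(S)
    change = (table, op, affected)         # step 3: C = (R, E, T)
    db.current_change = change
    try:
        before(change)                     # step 4: PROCESS BEFORE TRIGGERS(C)
        db.tables.setdefault(table, []).extend(affected)  # step 5: apply T
        after(change)                      # step 6: PROCESS AFTER TRIGGERS(C)
        detached_after(change)             # step 7: PROCESS DETACHED AFTER TRIGGERS(C)
    finally:
        db.current_change = saved          # step 8: restore saved state change
    return change
```

Saving and restoring the state change around the body is what lets a trigger action issue its own statements (nested calls) without losing the outer statement's context.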
Execution Semantics cont’d
The semantics involve:
• One coordinator: the peer that controls the outcome of the trigger processing
  • The coordinator is the database on which the event occurs (this can be different from the peer on which the trigger is defined!)
• A set of participants:
  • One participant peer for condition evaluation
  • Several participant peers for action execution
Note that:
• Actions are separated into sets, one per distinct peer
• Local actions are executed before all other actions:
  • The local part reduces to a centralized SQL3 trigger
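The two notes above (actions grouped per peer, local actions first) amount to a small planning step on the coordinator. The sketch below assumes that "local" means "on the coordinator's own database"; the function name `plan_actions` is hypothetical.

```python
def plan_actions(coordinator, actions):
    """Group (peer, statement) pairs per peer, with the coordinator's group first.

    Returns a list of (peer, [statements]) in execution order.
    """
    groups = {}
    for peer, statement in actions:
        groups.setdefault(peer, []).append(statement)
    plan = []
    if coordinator in groups:
        # The local part behaves like a centralized SQL3 trigger and runs first.
        plan.append((coordinator, groups.pop(coordinator)))
    # Remaining peers keep the order in which they first appeared.
    plan.extend(groups.items())
    return plan
```

For example, with coordinator `Dr_F_DB` and actions on `P_DB` and `Dr_F_DB`, the plan starts with the `Dr_F_DB` group and then lists one group for `P_DB`.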
BEFORE & AFTER Triggers
Row-level triggers: if condition evaluation does not interfere with action execution, we can pre-compute the conditions for all rows → reduction of communication load
This holds for:
• BEFORE triggers
• AFTER triggers, if no action influences the tables involved in the condition
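A minimal sketch of this optimization, assuming the condition is evaluated on a remote peer through a hypothetical `evaluate_batch` call (one round trip per call) and that the tables read by the condition and written by the actions are known in advance:

```python
def can_precompute(timing, condition_tables, action_tables):
    """Return True when conditions may be evaluated for all rows up front."""
    if timing == "BEFORE":
        return True
    if timing == "AFTER":
        # Safe only if no action writes to a table the condition reads.
        return not (set(condition_tables) & set(action_tables))
    return False  # other timings: stay conservative (assumption of this sketch)


def run_row_trigger(rows, timing, condition_tables, action_tables,
                    evaluate_batch, run_action):
    """Fire a row-level trigger; return the number of condition round trips."""
    rows = list(rows)
    if not rows:
        return 0
    if can_precompute(timing, condition_tables, action_tables):
        verdicts = evaluate_batch(rows)      # one round trip for all rows
        for row, ok in zip(rows, verdicts):
            if ok:
                run_action(row)
        return 1
    trips = 0
    for row in rows:                         # one round trip per row
        trips += 1
        if evaluate_batch([row])[0]:
            run_action(row)
    return trips
```

With n changed rows, the batched path needs one condition message exchange instead of n, which is the communication saving the slide refers to.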
Detached AFTER Triggers
• The semantics of BEFORE and AFTER triggers comply with the immediate mode of SQL3 → no trigger execution if peers are offline
• Yet 'offline' is a valid state for pDBMSs → we introduce a variation of AFTER triggers: DETACHED AFTER triggers
• Intuition: execute parts of the trigger in both the online and the offline state
  • considered for execution whether peers are online or offline
  • treated as AFTER triggers if all peers are online
  • an optional execution mode, chosen by the user
  • considered for local firing in the offline state
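The difference between the three timings when some peer is unreachable can be summarized as a small decision function. This is a sketch under the assumption that online status is known as a set of peer names; the mode labels it returns are hypothetical names chosen for illustration.

```python
def choose_mode(timing, involved_peers, online_peers):
    """Decide how a trigger instance is handled given which peers are online."""
    all_online = set(involved_peers) <= set(online_peers)
    if all_online:
        # DETACHED AFTER triggers are treated as plain AFTER triggers here.
        return "immediate"
    if timing == "DETACHED AFTER":
        # Fire the local part now; remote parts wait until peers reconnect.
        return "local-now-remote-later"
    # BEFORE and AFTER follow the immediate mode: no execution while offline.
    return "not-executed"
```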
Detached AFTER Triggers cont’d
Trigger 2: Dr F wants to be informed about lab tests performed on any of his patients by Dr S
→ Dr_F_DB should be updated for tests occurring while it is offline
    CREATE TRIGGER testInsertion
    DETACHED AFTER INSERT ON Dr_S_Tests
    REFERENCING NEW AS NewTest IN Dr_S_DB
    FOR EACH ROW
    WHEN EXISTS
      SELECT P.pid FROM Dr_F_Patients P
      WHERE P.pid = NewTest.pid AND P.primary_dr = 'F'
    BEGIN
      INSERT INTO Dr_F_Tests VALUES NewTest IN Dr_F_DB
    END
Detached AFTER Triggers cont’d
Trigger 1: When Dr F prescribes a medicine to a patient, he wants the prescription to be inserted both in his own database and in the database of the pharmacist, P_DB
• The prescription should not be left 'hanging' if P_DB is offline
• Otherwise the patient might receive the medicine only after a long delay
• If P_DB is offline, Dr F would like to contact another pharmacist
Processing Distributed Triggers
Other issues we consider for processing distributed triggers in a pDBMS setting:
• Termination protocol: what happens to triggers when a peer leaves the network?
• Acquaintance protocol: what happens when a peer needs to join, leave, connect to, or disconnect from the network?
Termination Protocol
The termination protocol considers: 1) timeouts 2) disconnections
On a timeout/disconnection, the slides list four cases:
• Case A: 1. Drops the trigger instance 2. Reinitiates DETACHED AFTER triggers
• Case B: 1. Continues execution 2. If DETACHED, saves the remote actions
• Case C: 1. The coordinator failed in state EVENT 2. Stop and ignore later condition messages
• Case D: 1. The coordinator failed in state LOCAL (in case of a local action) or REMOTE (in case of a remote action) 2. Stop and ignore later action messages
Acquaintance Protocols
A peer can perform the following actions that concern its status in the P2P network:
• Establish acquaintances
• Abolish acquaintances
• Connect to the P2P system
• Disconnect from the P2P system
Among other things, acquaintance protocols use a data structure called a peer marker
→ Informally, a peer marker M_Pi(Pj) of Pi for Pj stores information about triggers processed in the offline state
Disconnecting from the P2P System
A peer Pi that goes offline disables all its existing acquaintances, i.e., for each acquainted peer Pj:
1. Pi and Pj disable their respective existing mappings.
2. If Pi is in the middle of executing a trigger, it terminates the trigger as follows:
   - if Pi is in state EVENT:
     • if the condition is local, continue the execution
     • otherwise: (a) for a BEFORE/AFTER trigger, discard the instance; (b) for a DETACHED AFTER trigger, log the instance in the respective marker
Disconnecting from the P2P System cont’d
   - if Pi is in state CONDITION: continue the execution
3. Pi and Pj disable all BEFORE triggers defined in either of them
4. Pi and Pj disable all AFTER triggers that are defined in either of them and are not characterized as DETACHED
5. Pi and Pj exchange markers for DETACHED AFTER triggers
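The five disconnection steps can be sketched for one acquainted pair (Pi, Pj) as follows. Everything here is a simplification for illustration: the `Peer` and `TriggerInstance` classes, their fields, and the idea that "exchanging markers" means each side stores a copy of the other's marker are assumptions, not details given in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class TriggerInstance:
    name: str
    timing: str               # "BEFORE", "AFTER" or "DETACHED AFTER"
    state: str                # "EVENT" or "CONDITION" (the two states covered)
    condition_is_local: bool


@dataclass
class Peer:
    name: str
    mappings_enabled: dict = field(default_factory=dict)   # acquaintance -> bool
    trigger_enabled: dict = field(default_factory=dict)    # (name, timing) -> bool
    markers: dict = field(default_factory=dict)            # acquaintance -> logged names
    received_markers: dict = field(default_factory=dict)   # acquaintance -> marker copy


def terminate_instance(pi, pj, inst):
    """Step 2: decide the fate of one trigger instance running on Pi."""
    if inst.state == "CONDITION" or inst.condition_is_local:
        return "continue"
    if inst.timing == "DETACHED AFTER":
        pi.markers.setdefault(pj.name, []).append(inst.name)  # log in M_Pi(Pj)
        return "logged"
    return "discard"                                          # BEFORE / AFTER


def disconnect(pi, pj, running):
    """Run steps 1-5 for the pair (Pi, Pj); return the outcome per instance."""
    pi.mappings_enabled[pj.name] = False                      # step 1
    pj.mappings_enabled[pi.name] = False
    outcomes = {i.name: terminate_instance(pi, pj, i) for i in running}  # step 2
    for peer in (pi, pj):                                     # steps 3 and 4
        for key in peer.trigger_enabled:
            if key[1] in ("BEFORE", "AFTER"):
                peer.trigger_enabled[key] = False
    pj.received_markers[pi.name] = list(pi.markers.get(pj.name, []))     # step 5
    pi.received_markers[pj.name] = list(pj.markers.get(pi.name, []))
    return outcomes
```

DETACHED AFTER triggers stay enabled after the call, which is what allows them to be considered for local firing while the peer is offline.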
Experimental Results
(Plots not reproduced here; panel titles only.)
• Number of Triggers vs. Time
• Network Delay vs. Time
• Trigger ratio executed in each peer vs. Time
• Trigger cost (complexity) vs. Time
• Roll Back Ratio % vs. Time
• Roll Back Cost vs. Time
Related Work
• G. Vargas-Solar, C. Collet, and H.G. Ribeiro. Active Services for Federated Databases. In ACM Symposium on Applied Computing, pages 356–360, Como, Italy, 2000.
• A. Sheth and M. Rusinkiewicz. Management of Interdependent Data: Specifying Dependency and Consistency Requirements. In Proc. of the Workshop on the Management of Replicated Data, Houston, TX, November 1990.
• R. Arizio, B. Bomitali, M.L. Demarie, A. Limongiello, and P.L. Mussa. Managing Inter-Database Dependencies with Rules + Quasi-Transactions. In Third International Workshop on Research Issues in Data Engineering: Interoperability in Multidatabase Systems, pages 34–41, Vienna, April 1993.
• iSpheres: www.ispheres.com
• KnowNow: www.knownow.com