Transaction-based Grid Data Replication Using OGSA-DAI
Presented by Yin Chen
February 2007

What is replication?
• Initial copying of data, followed by synchronization of updates
• It is not caching
  • Caching is a client-side phenomenon
  • Caching exists only to improve response time
• It is not a backup
  • A backup is not automatically overwritten when the original data is modified
  • A backup normally cannot be accessed directly

Why do we need it?
• Data consolidation (central audit and analysis)
• Data distribution (for branch labs)
• Performance
  • Access efficiency (moving data near the processing)
  • Load balancing (distributing the access load)
• Security (data protection)
• Availability (off-line access)
• Reliability (disaster recovery, avoiding a single point of failure)

Challenges of Grid database replication
• How to copy large volumes of data among heterogeneous databases
• How to maintain data consistency in a highly distributed network environment
• How to discover and self-repair failed components

Problems of existing technologies
• Existing Grid “replication” systems
  • E.g. the EDG Replica Manager, the Globus Data Replication Service, SRB
  • Support copying of large datasets
  • Yet deal only with files
  • Too simple (e.g. no support for updates or database replication)
  • Not consistent
• Relational database replication tools
  • E.g. Oracle, Sybase, DB2, and MySQL replication
  • Very flexible (e.g. partial copies, bidirectional updates)
  • Yet not suited to virtual organizations (e.g. cannot copy large data; replicas are difficult to search for)

Relational Database Replication Mechanism: Architecture
[Figure: architecture diagram with data flow directions. Labelled components: Metadata Catalogue, Data Resource, Data Replica, Replication Control Service, Transfer Service.]

Relational Database Replication Mechanism: Replication control workflow
[Figure: workflow diagram starting from a Request. Labelled components: Metadata Catalogue, Data Resource, Replication Target, Transfer Service, and the Replication Control Service with Metadata Search Engine, Initiator, Selector, Starter, and Metadata Register.]

OGSA-DAI activities (ongoing)
• High-level APIs to interact with relational replication mechanisms:
  • CreateReplicaDatabase()
  • DropReplicaDatabase()
  • ConfigReplication()
  • CleanUp() (cleans up the replication configuration)
  • StartReplication()
  • StopReplication()
  • MonitorReplication() (checks the status of each process)
• Control the workflow of data replication, e.g.
  • sequence.addChild(createDB2ReplicaDB);
  • sequence.addChild(configDB2Replication);
  • sequence.addChild(startDB2Replication);

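The sequencing idea on this slide can be illustrated with a small stand-alone Python sketch. It is not OGSA-DAI code: the `Activity` and `Sequence` classes, the `add_child` and `run` methods, and the `state` dictionary are hypothetical stand-ins that only mirror the shape of the `sequence.addChild(...)` calls above. The step names are borrowed from the API list; their bodies are placeholders that just record progress and check ordering.

```python
class Activity:
    """One named workflow step; `action` is a callable run when the step executes."""

    def __init__(self, name, action):
        self.name = name
        self.action = action


class Sequence:
    """Runs child activities strictly in the order they were added."""

    def __init__(self):
        self.children = []

    def add_child(self, activity):
        self.children.append(activity)

    def run(self):
        completed = []
        for activity in self.children:
            activity.action()  # an exception here stops the whole sequence
            completed.append(activity.name)
        return completed


# Hypothetical replication state, standing in for a real database system.
state = {"replica_db": False, "configured": False, "running": False}


def create_replica_database():
    state["replica_db"] = True


def config_replication():
    if not state["replica_db"]:
        raise RuntimeError("replica database must exist before configuration")
    state["configured"] = True


def start_replication():
    if not state["configured"]:
        raise RuntimeError("replication must be configured before it is started")
    state["running"] = True


sequence = Sequence()
sequence.add_child(Activity("CreateReplicaDatabase", create_replica_database))
sequence.add_child(Activity("ConfigReplication", config_replication))
sequence.add_child(Activity("StartReplication", start_replication))
completed = sequence.run()
```

In this sketch, a step that runs before its prerequisite raises an error, which is one reason to express the steps as an ordered sequence rather than as independent calls.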
IBM DB2 SQL Replication
• Admin: creates the replication criteria, held in control tables
• Capture: uses the log or triggers to capture changes into a temporary table
• Apply: on a schedule, applies the accumulated transactions to the target DB
• Alert Monitor: monitors replication and notifies users
• Supports after-image copy and before-image copy (which allows rollback)
• Allows copying of subsets, simple views, and complex joins and unions
• Asynchronous replication; a schedule can be specified

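The capture and apply steps can be sketched in miniature with Python's standard `sqlite3` module. This is a toy illustration of trigger-based capture into a staging table followed by a batched apply, not IBM DB2 SQL Replication itself: the `samples` and `change_log` tables, the trigger names, and the `apply_changes` function are all assumptions made for the example.

```python
import sqlite3


def make_source():
    """Create an in-memory source database whose triggers log every change."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE samples (id INTEGER PRIMARY KEY, label TEXT);
        -- Staging table: one row per captured change, with before and after images.
        CREATE TABLE change_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            op TEXT, id INTEGER, before_label TEXT, after_label TEXT
        );
        CREATE TRIGGER cap_ins AFTER INSERT ON samples BEGIN
            INSERT INTO change_log (op, id, before_label, after_label)
            VALUES ('I', NEW.id, NULL, NEW.label);
        END;
        CREATE TRIGGER cap_upd AFTER UPDATE ON samples BEGIN
            INSERT INTO change_log (op, id, before_label, after_label)
            VALUES ('U', NEW.id, OLD.label, NEW.label);
        END;
        CREATE TRIGGER cap_del AFTER DELETE ON samples BEGIN
            INSERT INTO change_log (op, id, before_label, after_label)
            VALUES ('D', OLD.id, OLD.label, NULL);
        END;
    """)
    return db


def make_target():
    """Create an empty in-memory replica with the same table layout."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, label TEXT)")
    return db


def apply_changes(source, target):
    """Replay the accumulated changes on the target in one transaction, then clear them."""
    rows = source.execute(
        "SELECT seq, op, id, after_label FROM change_log ORDER BY seq"
    ).fetchall()
    with target:  # commits on success, rolls back if any statement fails
        for seq, op, row_id, after in rows:
            if op == "I":
                target.execute("INSERT INTO samples VALUES (?, ?)", (row_id, after))
            elif op == "U":
                target.execute("UPDATE samples SET label = ? WHERE id = ?", (after, row_id))
            else:
                target.execute("DELETE FROM samples WHERE id = ?", (row_id,))
    if rows:
        with source:
            source.execute("DELETE FROM change_log WHERE seq <= ?", (rows[-1][0],))
    return len(rows)


source, target = make_source(), make_target()
source.execute("INSERT INTO samples VALUES (1, 'alpha')")
source.execute("INSERT INTO samples VALUES (2, 'beta')")
source.execute("UPDATE samples SET label = 'gamma' WHERE id = 2")
source.execute("DELETE FROM samples WHERE id = 1")

# Captured changes, including before images, as they sit in the staging table.
log_before = source.execute(
    "SELECT op, before_label, after_label FROM change_log ORDER BY seq"
).fetchall()

applied = apply_changes(source, target)
replica_rows = target.execute("SELECT id, label FROM samples ORDER BY id").fetchall()
```

The replica sees nothing until `apply_changes` runs, which is the asynchronous behaviour the slide describes; in a real deployment that call would be driven by a schedule. Only the after images are replayed here, while the stored before images are what a rollback would use.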
Features
• Combine relational database replication with Grid technologies, to gain the benefits of both
• Keep the features of relational database replication
• Support more scalable, secure, high-performance data access
• Explore the abilities of OGSA-DAI to control workflows

Information
• Project members:
  • Dave Berry (NeSC, UK)
  • Patrick Dantressangle (IBM, Hursley)
  • Yin Chen (NeSC, UK)
  • Simon Laws (IBM, Hursley)
• Project website:
  • http://www.aiai.ed.ac.uk/~ychen/ibm_ogsadai/ibm-ogsadai-index.html