Database Replication

Database Replication. Heinz Stockinger, CERN-EP/CMC, University of Vienna. Outline: Requirements for Distributed and Replicated Databases; Objectivity/DRO 5.2 tests between CERN and Japan; Limitations of DRO; Possible replication methods; Communication between federations with CORBA.


Presentation Transcript


  1. Database Replication. Heinz Stockinger, CERN-EP/CMC, University of Vienna

  2. Outline
     • Requirements for Distributed and Replicated Databases
     • Objectivity/DRO 5.2 tests between CERN and Japan
     • Limitations of DRO
     • Possible replication methods
     • Communication between federations with CORBA

  3. Distributed/Replicated Databases
     • CMS foresees Regional Centres (RCs) to enhance computing facilities
     • [figures based on Monarc]
     • raw data: 100% CERN, 5% RC
     • reconstructed data: 100% CERN, 50-100% RC
     • tag data: 100% CERN, 100% RC
     [Diagram labels: Regional Centre (three times)]

  4. Some use cases
     • reconstruction at CERN: copies of these data should be available in RCs after a certain delay
     • tag data should be available “everywhere”
     • analysis of data will be done at CERN and RCs
     • not all data are replicated:
     • enable remote data access with a reasonably low response time

  5. Objectivity/Data Replication Option
     • three possible replication methods:
     • 1. Synchronous replication: write to a replicated DB needs a quorum of all other replicas
       • replicate (empty DB) - populate
     • 2. “Asynchronous” methods:
       2.a) populate - replicate
       2.b) replicate - off-line - populate - on-line (recalculation of quorum in V5.2)
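The quorum requirement on slide 5 can be illustrated with a small sketch. The slides do not say how Objectivity/DRO computes its quorum, so the simple majority rule below is an assumption, and `has_quorum` and `replicated_write_allowed` are hypothetical names, not part of any Objectivity API.

```python
from typing import Mapping


def has_quorum(reachable: int, total: int) -> bool:
    """Return True if a strict majority of all replicas is reachable.

    Assumption: a plain majority rule; the real DRO rule may differ.
    """
    if total <= 0 or not 0 <= reachable <= total:
        raise ValueError("need 0 <= reachable <= total and total > 0")
    return 2 * reachable > total


def replicated_write_allowed(replica_up: Mapping[str, bool]) -> bool:
    """Decide whether a synchronous write may proceed.

    `replica_up` maps a replica host name to whether it is reachable.
    """
    reachable = sum(1 for up in replica_up.values() if up)
    return has_quorum(reachable, len(replica_up))


if __name__ == "__main__":
    # Host names taken from the CERN - Japan test setup on slide 6.
    status = {"monarc01": True, "arksol1": True, "arksol2": False}
    print(replicated_write_allowed(status))
```

Under this assumed rule, taking a replica off-line changes `total`, which is one way to read the "recalculation of quorum" mentioned for method 2.b.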

  6. Objectivity/DRO 5.2 tests CERN - Japan
     • 3 different methods have been tested on a LAN
     • tests extended to WAN (thanks to Youhei Morita!, Monarc)
     • DRO 5.1 tests already done by Hiroyuki Sato (KEK)
     • between CERN and Japan: 2 Mbps network link
     • machines used:
       • monarc01, at CERN
       • arksol1, at KEK
       • arksol2, at KEK

  7. Replicate - populate on WAN
     Small objects: 1 int

                                         8k page, Cache250     32k page, Cache10         32k page, Cache250
     Data written                        2.2 MB                2.7 MB                    2.7 MB
     DB1 local write                     ~2.5 s, ~880 kB/s     ~2.5 s, ~1080 kB/s        ~2.7 s, ~1000 kB/s
     DB1 remote write                    ~56 s, ~40 kB/s       ~48-72 s, ~36-54 kB/s     ~38 s, ~71 kB/s
     “Repl” write 1 (DB1 in AP1, AP2)    ~124 s, ~2x17 kB/s    ~62-71 s, ~2x37/43 kB/s   ~50 s, ~2x54 kB/s
     “Repl” write 2 (DB1 in AP1-AP3)     ~131 s, ~3x17 kB/s    ~73 s, ~3x37 kB/s         ~64 s, ~3x42 kB/s

     • Less time for commit
     • FTP for comparison: ~77 kB/s
     [Diagram labels: DB1, AP1, AP2, AP3, AMS, WAN]

  8. Populate - replicate on WAN
     • RPC timeout has to be adjusted
     • test done with 32k page:
       • replicate 4.5 MB
       • ~356 s = ~12 kB/s
     • compare to replicating an empty DB:
       • replicate 1.8 MB
       • ~180 s = ~10 kB/s
     • see Hiroyuki Sato: much overhead on handshake
     • replicated write looks much better
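The rates on slide 8 follow directly from size divided by time. The sketch below redoes that arithmetic and relates it to the 2 Mbps CERN - Japan link from slide 6. It assumes 1 MB = 1000 kB and 1 Mbps = 1000 kbit/s; the function names are illustrative only.

```python
def throughput_kb_per_s(megabytes: float, seconds: float) -> float:
    """Average transfer rate in kB/s, assuming 1 MB = 1000 kB."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return megabytes * 1000.0 / seconds


def link_utilisation(kb_per_s: float, link_mbps: float) -> float:
    """Fraction of the nominal link capacity used by a transfer.

    A link of `link_mbps` megabits per second carries at most
    link_mbps * 1000 / 8 kilobytes per second.
    """
    link_kb_per_s = link_mbps * 1000.0 / 8.0
    return kb_per_s / link_kb_per_s


if __name__ == "__main__":
    populated = throughput_kb_per_s(4.5, 356)  # populate - replicate, 32k page
    empty = throughput_kb_per_s(1.8, 180)      # replicating an empty DB
    print(f"populated DB: {populated:.1f} kB/s")
    print(f"empty DB:     {empty:.1f} kB/s")
    print(f"link share:   {link_utilisation(populated, 2.0):.1%}")
```

A 2 Mbps link carries at most 250 kB/s, so a replication rate of about 12 kB/s uses roughly 5% of the nominal capacity, which is consistent with the remark about handshake overhead.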

  9. Conclusion on Objectivity/DRO
     • Features are not adequate for event data
     • DRO only works for a single federation
     • but we still want to have fully replicated Objy DB files

  10. My aims for replicated federations
     • current solution:
       • reserve DB IDs for federations
       • manually copy the DB: FTP the DB file, then attach it at the destination federation
     • don’t want to use manual FTP to copy DB files
     • want to automate this procedure
     • want to add other features like
       • replicate only containers or certain objects
       • asynchronous replication (on demand)
       • communication between federations (e.g. to synchronise fd catalogue operations)
     • want to study possible update mechanisms

  11. Replication algorithm for Objy DB files
     • DB IDs have to be identical
     • synchronisation only needs to be done on the DB catalogue
     • total order of DB creation among all federations (DB IDs can be reserved for certain federations)
     • there is no need to synchronise the actual writing of data
     • relax consistency of data (too often used in replication research but in HEP it is a real feature)
     • transfer DB files asynchronously to federations when time and resources are available
     • can be done in batch mode
     • network traffic can be monitored: send a file when the network is not heavily used
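Two ideas from slide 11 can be sketched in a few lines: reserving disjoint DB ID ranges per federation, and sending queued DB files only while the network is lightly used. This is a minimal sketch, not the author's implementation. The names `reserve_db_ids` and `BatchTransferQueue`, the range size, the utilisation threshold and the `send` callback are all assumptions introduced for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Sequence, Tuple


def reserve_db_ids(federations: Sequence[str], ids_per_federation: int,
                   first_id: int = 1) -> Dict[str, range]:
    """Give each federation its own contiguous, non-overlapping ID range."""
    ranges: Dict[str, range] = {}
    start = first_id
    for name in federations:
        ranges[name] = range(start, start + ids_per_federation)
        start += ids_per_federation
    return ranges


class BatchTransferQueue:
    """Queue of (db_file, destination) pairs, sent only on a quiet network."""

    def __init__(self, max_utilisation: float = 0.5) -> None:
        self.max_utilisation = max_utilisation
        self.pending: Deque[Tuple[str, str]] = deque()

    def enqueue(self, db_file: str, destination: str) -> None:
        self.pending.append((db_file, destination))

    def drain(self, utilisation: Callable[[], float],
              send: Callable[[str, str], None]) -> List[Tuple[str, str]]:
        """Send files in FIFO order while the measured load stays low.

        `utilisation` is re-read before every file, so a transfer batch
        stops as soon as the network becomes busy again.
        """
        sent: List[Tuple[str, str]] = []
        while self.pending and utilisation() <= self.max_utilisation:
            item = self.pending.popleft()
            send(*item)
            sent.append(item)
        return sent


if __name__ == "__main__":
    print(reserve_db_ids(["CERN", "RC1", "RC2"], 1000))
    queue = BatchTransferQueue(max_utilisation=0.5)
    queue.enqueue("events_0001.DB", "RC1")
    queue.drain(lambda: 0.2, lambda f, d: print(f"sending {f} to {d}"))
```

Because the ID ranges never overlap, a DB created in one federation can keep the same ID when its file is attached elsewhere, so only catalogue operations need to be synchronised.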

  12. CORBA: A Communication mechanism
     • Federations need to “communicate” with each other
     • assume each RC (including CERN) has one federation
     [Diagram: ObjyFD at CERN, RC1 and RC2, each with a CORBA server, connected over the WAN]

  13. CORBA as replication server
     • Each federation has a CORBA replication server that will:
       • transfer files (entire DB files or only parts of a DB)
       • exchange synchronisation messages (RPC)
       • execute updates on the local federation
       • propagate new data (changes) to other federations
     • the usage of CORBA servers is a research project and needs a deep understanding of how CORBA can be applied to:
       • efficient data transfer
       • remote access to data

  14. Efficient data transfer
     • CORBA does not have particularly outstanding “high-performance” features, but:
       • the transfer of DB files to RCs does not have to be in real time (can have a certain delay)
       • the communication mechanism and the client-server paradigm dominate
     • I plan to convert parts of the event data into XML
       • and send the XML files over the network
       • integrate the XML files into the remote federation
     • this work is part of the WISDOM project
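The plan to ship event data as XML can be sketched with Python's standard `xml.etree.ElementTree` module. The slides do not describe the XML format used in the WISDOM project, so the element names `event` and `attribute` and the flat key/value layout below are purely hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Mapping, Tuple


def event_to_xml(event_id: int, attributes: Mapping[str, object]) -> str:
    """Serialise one event into an XML string (hypothetical layout)."""
    root = ET.Element("event", id=str(event_id))
    for name, value in attributes.items():
        child = ET.SubElement(root, "attribute", name=name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_event(xml_text: str) -> Tuple[int, Dict[str, str]]:
    """Parse the XML back; values come back as strings."""
    root = ET.fromstring(xml_text)
    attributes = {child.get("name"): child.text
                  for child in root.findall("attribute")}
    return int(root.get("id")), attributes


if __name__ == "__main__":
    text = event_to_xml(42, {"run": 7, "tag": "muon"})
    print(text)
    print(xml_to_event(text))
```

The receiving federation would still have to map the parsed values back onto persistent objects; type information is lost in this flat layout, which a real exchange format would need to carry explicitly.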

  15. Preliminary work on Partial Replication in Objy
     • investigation of copying single containers
     • problem with associations that lead to other containers
     • principle tests: can a single container with associations be copied without knowing about the schema? NO:
       • the deep copy of Objectivity is not sufficient
       • 1:1 associations can’t be deep copied
       • ooCopyInit() can’t take a parameter (always copies to the same container)
     [Diagram labels: container, DB]
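The difficulty with associations that lead to other containers can be made concrete with a toy model. The sketch below uses plain Python dictionaries and tuples, not the Objectivity API; object names, container names and the function `dangling_associations` are invented for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def dangling_associations(container_of: Dict[str, str],
                          associations: Iterable[Tuple[str, str]],
                          container: str) -> List[Tuple[str, str]]:
    """List associations that start in `container` but end outside it.

    If only `container` is copied to another federation, each returned
    (source, target) pair would point at an object that is missing there.
    """
    return [(source, target)
            for source, target in associations
            if container_of[source] == container
            and container_of[target] != container]


if __name__ == "__main__":
    # Toy data: three objects spread over two containers.
    container_of = {"track1": "contA", "hit1": "contA", "calib1": "contB"}
    associations = [("track1", "hit1"), ("track1", "calib1")]
    print(dangling_associations(container_of, associations, "contA"))
```

Finding such links requires knowing which fields are associations, which is one reason a schema-unaware copy of a single container fails.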

  16. Active Schema
     • can be used to create persistent objects and
     • associations between persistent objects
     • no need to get the correct handle/reference to the object to be copied/created
     [Diagram labels: container, DB]

  17. Open research questions
     • To what extent can a transparent system for Regional Centres be established?
     • Remote data access with CORBA servers:
       • read from the “closest replica” in terms of network cost and server load
       • monitoring tools have to be explored and integrated into a “remote access engine”
     • What happens if metadata (DB schema) changes?
     • How to deal with different versions of objects?
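Choosing the “closest replica” by network cost and server load can be sketched as a weighted minimum. How the two quantities would be measured and combined is exactly the open question on slide 17, so the linear cost formula, the `load_weight` parameter and the sample numbers below are assumptions.

```python
from typing import Dict, Tuple


def closest_replica(replicas: Dict[str, Tuple[float, float]],
                    load_weight: float = 1.0) -> str:
    """Return the site with the lowest combined cost.

    `replicas` maps a site name to (network_cost, server_load), both in
    arbitrary but comparable units supplied by some monitoring tool.
    Assumed cost model: network_cost + load_weight * server_load.
    """
    if not replicas:
        raise ValueError("no replicas available")
    return min(replicas,
               key=lambda site: replicas[site][0] + load_weight * replicas[site][1])


if __name__ == "__main__":
    measurements = {
        "CERN": (1.0, 0.9),  # far away and busy
        "RC1": (0.3, 0.2),   # close and lightly loaded
        "RC2": (0.2, 0.8),   # closest, but heavily loaded
    }
    print(closest_replica(measurements))
    print(closest_replica(measurements, load_weight=0.0))
```

With equal weighting the lightly loaded RC1 wins even though RC2 is nearer on the network; setting `load_weight` to 0 reduces the choice to network cost alone.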

  18. Summary
     • Needs for replicated data for the CMS experiment have been investigated
     • currently available replication features of Objectivity have been studied and tested
     • Objectivity/DRO does not satisfy our needs
     • alternative solutions have to be provided
     • I have chosen CORBA as a communication mechanism for further research
