
Prototype Tests for a Distributed File Catalogue

Andreas Joachim Peters, Pierre Tissot, Slawomir Biegluk, Vagner Morais


Presentation Transcript


  1. Prototype Tests for a Distributed File Catalogue
     Andreas Joachim Peters, Pierre Tissot, Slawomir Biegluk, Vagner Morais

  2. Summary
     • Distributed File Catalogue - Architecture
     • MySql Replication
     • Interface Functionalities
     • Implementation Structure
     • Global & Local Databases
     • Database Structure
     • Commands of the Global Mode
     • Testbed
     • ToDo list - Performance tests
     • Summary

  3. Architecture
     [Diagram: Central Services and Sites A, B and C, shown in GLOBAL MODE and LOCAL MODE. Component labels: Scheduler, CE, Global Com., GC, FC, Master, Slave. Arrow labels: Assign JOB, Parallel queries, command, insert, submit.]

  4. AliEn2 Catalogue
     [Diagram: Central Services and Sites A, B and C. Labels: LFN->GUID, GUID->PFN, FC, SE DB, insert.]

  5. MySql Replication
     • The master records all write queries in the binary log
     • Slaves read the binary log from the master and run the queries locally
     • A master can have many slaves
     • A slave can have only one master
     • A server can be both a master and a slave
     • Masters and slaves can selectively filter queries:
        • Database level
        • Table level
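The replication rules on this slide can be illustrated with a toy in-memory model. This is only a sketch of the idea (a log that slaves replay from their last position, with an optional database-level filter), not MySQL's actual mechanism; the class names, the `sync` method and the sample keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Master:
    """Toy master: applies writes locally and appends them to a binary log."""
    data: dict = field(default_factory=dict)
    binlog: list = field(default_factory=list)

    def write(self, db, key, value):
        self.data[(db, key)] = value
        self.binlog.append((db, key, value))  # every write query is logged


@dataclass
class Slave:
    """Toy slave: replays the master's log starting at its last read position."""
    master: Master                           # a slave has exactly one master
    replicate_do_db: Optional[set] = None    # assumed database-level filter
    position: int = 0
    data: dict = field(default_factory=dict)

    def sync(self):
        for db, key, value in self.master.binlog[self.position:]:
            if self.replicate_do_db is None or db in self.replicate_do_db:
                self.data[(db, key)] = value
        self.position = len(self.master.binlog)


master = Master()
full_copy = Slave(master)                             # replicates everything
dcat_only = Slave(master, replicate_do_db={"dCat"})   # filters by database

master.write("dCat", "/alice/file1", "guid-1")
master.write("gC", "cmd-1", "ls /alice")
for slave in (full_copy, dcat_only):
    slave.sync()
```

One master feeds both slaves, and the filtered slave ends up holding only the `dCat` entry.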

  6. MySql Replication
     [Diagram: one Master and three Slaves. Labels: Read access, Write access, Update forwarded.]

  7. MySql Replication
     • Replication settings:
        • Master
           • server-id = <id>
           • log-bin = <filename>
           • replicate-do-db = <db_name>
           • replicate-do-table = <db_name.tb_name>
        • Slave
           • server-id = <id>
           • master-host = <hostname>
           • master-user = <username>
           • master-password = <password>
     • Reasons to use replication:
        • Load balancing
        • Put data closer to users
        • Make backups easier
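As a rough illustration of how these settings end up in an option file, the sketch below renders a `[mysqld]` section for each role. The option names are the ones on the slide; every value (ids, host name, user, password, database name) is a placeholder, and `render_mysqld_section` is a hypothetical helper. Two assumptions to note: the `replicate-do-*` filters are placed in the slave section here because MySQL applies them on the replicating server, and later MySQL releases dropped the `master-*` file options in favour of the `CHANGE MASTER TO` statement, so this layout matches the MySQL generation the slides describe.

```python
def render_mysqld_section(options):
    """Render a [mysqld] option-file section from (name, value) pairs."""
    lines = ["[mysqld]"]
    lines += [f"{name} = {value}" for name, value in options]
    return "\n".join(lines)


# Placeholder values only; none of them come from the presentation.
master_options = [
    ("server-id", 1),
    ("log-bin", "mysql-bin"),
]
slave_options = [
    ("server-id", 2),
    ("master-host", "master.example.org"),
    ("master-user", "replica"),
    ("master-password", "secret"),
    ("replicate-do-db", "dCat"),   # assumed placement: filter on the slave
]

print(render_mysqld_section(master_options))
print(render_mysqld_section(slave_options))
```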

  8. MySql Replication
     [Diagram: replication setup between Master and Slave over SSL. Step labels: Start Daemon, Create Replica user, Start Daemon, Change Master, Start Slave, Reset Master; the steps are numbered 1 to 6 in the figure.]

  9. Interface Functionalities
     • File operations (mkdir, register, ls, rm, rmdir, chgrp, chown, chmod, find)
     • Access Control List (mkacl, rmacl)
     • Metaview extension (mkmeta, mkmetatag, setmetatag, rmmeta, rmmetatag)
     • Global functionality (global, lsglobal, synchglobal, globaloutput)

  10. Implementation Structure
      [Diagram: components mtpoolserver, UI.pm, dCat.pm, SEAccess.pm, DistributedDB.pm, Database.pm, GlobalMode.pm; connection label: IPC.]

  11. Global & Local Databases
      [Diagram: Master and Slave servers. Database labels: gC, dCat, gC.]

  12. Database Structure
      dCat tables:
      • {SENAME}__M0
      • {SENAME}__ACL
      • {SENAME}__META
      • {SENAME}__{MVName}__META
      • {SENAME}__{MVName}__VIEW
      [Diagram labels: Slave dCat, Master dCat.]
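The naming scheme above can be sketched as a small helper that expands the `{SENAME}` and `{MVName}` placeholders. The function name and the sample values `CERN` and `run2004` are hypothetical and serve only to show the pattern.

```python
def dcat_table_names(se_name, metaviews=()):
    """List the dCat tables for one storage element, following the slide's pattern."""
    names = [f"{se_name}__M0", f"{se_name}__ACL", f"{se_name}__META"]
    for mv_name in metaviews:
        # Each metaview contributes one META table and one VIEW table.
        names += [f"{se_name}__{mv_name}__META", f"{se_name}__{mv_name}__VIEW"]
    return names


print(dcat_table_names("CERN", ["run2004"]))
```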

  13. Database Structure
      Table: {SENAME}__M0
         Column   Type
         ctime    timestamp
         owner    varchar
         aclId    int
         path     varchar
         lfn      varchar
         pfn      varchar
         size     int
         gowner   varchar
         guid     varchar(36)
         type     varchar(1)
         perm     varchar(3)
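To make the column list concrete, here is a minimal sketch that creates the table and registers one file. It uses Python's built-in `sqlite3` as a stand-in for MySQL, and everything beyond the column names and types is an assumption: the storage element name `CERN`, the sample values, and the reading of `path` as the directory, `lfn` as the file name and `perm` as an octal string.

```python
import sqlite3

SE_NAME = "CERN"  # hypothetical storage element name

conn = sqlite3.connect(":memory:")  # SQLite stand-in for the MySQL server
conn.execute(f"""
    CREATE TABLE {SE_NAME}__M0 (
        ctime  TIMESTAMP,
        owner  VARCHAR,
        aclId  INT,
        path   VARCHAR,
        lfn    VARCHAR,
        pfn    VARCHAR,
        size   INT,
        gowner VARCHAR,
        guid   VARCHAR(36),
        type   VARCHAR(1),
        perm   VARCHAR(3)
    )
""")

# Register one file; all values are made up for illustration.
conn.execute(
    f"INSERT INTO {SE_NAME}__M0 "
    "VALUES (CURRENT_TIMESTAMP, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("alice", 0, "/alice/sim/", "hits.root",
     "root://se.example.org//data/0001", 1024, "aliprod",
     "0f8fad5b-d9cb-469f-a165-70867728950e", "f", "644"),
)

# A listing of one directory then becomes a simple query on path.
rows = conn.execute(
    f"SELECT lfn, size, perm FROM {SE_NAME}__M0 WHERE path = ?",
    ("/alice/sim/",),
).fetchall()
print(rows)
```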

  14. Database Structure
      Table: {SENAME}__ACL
         Column   Type
         aclID    int
         owner    char
         ctime    timestamp
         gowner   char
         cowner   char
         perm     char(1)

  15. Database Structure
      Table: {SENAME}__META
         Column   Type
         name     varchar
         owner    varchar
         ctime    timestamp
         treedir  varchar
         subqry   varchar
         aclId    int
         gowner   varchar
         perm     varchar(3)

  16. Database Structure
      Table: {SENAME}__{METAVIEWNAME}__META
         Column      Type
         guid        varchar(36)
         <metatag>   <type>
         ...
      Table: {SENAME}__{METAVIEWNAME}__VIEW
         Column      Type
         lfn         varchar
         guid        varchar(36)
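The `<metatag> <type>` row suggests that each metadata tag becomes its own column, which is one way to read the "schema evolution" mentioned in the closing summary. The sketch below shows that reading with SQLite as a stand-in for MySQL. The `mkmetatag` function here is a hypothetical illustration named after the catalogue command, not the authors' implementation, and the names `CERN`, `run` and `energy` are invented.

```python
import sqlite3

SE_NAME, MV_NAME = "CERN", "run"          # hypothetical names
META_TABLE = f"{SE_NAME}__{MV_NAME}__META"
ALLOWED_TYPES = {"INT", "VARCHAR", "TIMESTAMP"}

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE {META_TABLE} (guid VARCHAR(36))")


def mkmetatag(conn, table, tag, sql_type):
    """Add a metadata tag as a new column (assumed meaning of mkmetatag)."""
    # Identifiers cannot be bound as parameters, so validate them explicitly.
    if not tag.isidentifier() or sql_type not in ALLOWED_TYPES:
        raise ValueError("invalid tag name or type")
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {tag} {sql_type}")


mkmetatag(conn, META_TABLE, "energy", "INT")
conn.execute(
    f"INSERT INTO {META_TABLE} (guid, energy) VALUES (?, ?)",
    ("0f8fad5b-d9cb-469f-a165-70867728950e", 7000),
)

# PRAGMA table_info yields one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute(f"PRAGMA table_info({META_TABLE})")]
print(columns)
```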

  17. Database Structure
      Global Commands tables:
      • {SENAME}__global_output
      • sesinfo
      [Diagram labels: Master gC, Slave gC.]

  18. Database Structure
      Table: global_commands
         Column      Type
         commandID   int
         owner       varchar
         gowner      varchar
         timestamp   timestamp
         command     varchar
         parameters  varchar

  19. Database Structure
      Table: {SENAME}_global_output
         Column      Type
         seID        int
         commandID   int
         timestamp   timestamp
         outval      int
         outmsg      varchar

  20. Database Structure
      Table: sesinfo
         Column         Type
         seID           int
         name           varchar
         distname       varchar
         address        varchar
         rdbms          varchar
         lastcommandid  int

  21. Commands of the Global Mode
      • global - register a command to be called globally
      • lsglobal - list the registered global commands
      • synchglobal - synchronize a site with the global command list
      • globaloutput - print the output of a globally called command
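How the four commands might fit together can be sketched with a small in-memory model. The semantics are an assumption inferred from the table columns (`commandID`, `outval`, `outmsg`, `lastcommandid`): each site remembers the last command it executed and, on synchronization, runs only the newer ones. The class names, method names and `fake_run` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalCommands:
    """Central command list (in-memory stand-in for global_commands)."""
    commands: list = field(default_factory=list)

    def register(self, owner, command, parameters):   # "global"
        command_id = len(self.commands) + 1
        self.commands.append((command_id, owner, command, parameters))
        return command_id

    def ls(self):                                      # "lsglobal"
        return list(self.commands)


@dataclass
class Site:
    """One site with its own output store and last executed command id."""
    name: str
    last_command_id: int = 0                     # mirrors sesinfo.lastcommandid
    output: dict = field(default_factory=dict)   # stand-in for global_output

    def synchglobal(self, central, run):
        # Execute only the commands registered since the last synchronization.
        for command_id, _owner, command, parameters in central.commands:
            if command_id > self.last_command_id:
                self.output[command_id] = run(command, parameters)
                self.last_command_id = command_id

    def globaloutput(self, command_id):
        return self.output.get(command_id)       # (outval, outmsg) or None


def fake_run(command, parameters):
    """Pretend to execute a catalogue command; returns (outval, outmsg)."""
    return 0, f"{command} {parameters} ok"


central = GlobalCommands()
cid = central.register("alice", "mkdir", "/alice/new")
site_a = Site("SiteA")
site_a.synchglobal(central, fake_run)
site_a.synchglobal(central, fake_run)   # nothing new, so nothing is re-run
print(site_a.globaloutput(cid))
```

Tracking a per-site high-water mark keeps synchronization idempotent: calling it twice does not execute a command twice.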

  22. Testbed
      [Diagram: four hosts (pcaliense01.cern.ch, login2.tlc2.uh.edu, pcepalice34.cern.ch, pcepalice66.cern.ch) holding gC and dCat databases in Master and Slave roles.]

  23. ToDo list - Performance tests
      • A Perl client program is written to test each type of operation (insert, query, delete, etc.)
      • Each test is performed several times and the average is taken
      • Any entries are removed before the next test run
      • Tests are performed at the local and central databases
      • Some tests:
         • Mean add time with increasing catalogue size
         • Add rate for increasing number of clients
         • Query rate for increasing number of clients
         • Delete rate for increasing number of clients
         • Update rate for increasing number of clients
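The measurement procedure (repeat each test, clear the entries between runs, take the average) can be sketched as follows. The slides describe a Perl client against MySQL; this sketch uses Python with an in-memory SQLite table instead, and the table name `catalogue`, its two columns and the function `mean_add_time` are hypothetical.

```python
import sqlite3
import statistics
import time


def mean_add_time(conn, n_entries, repeats=3):
    """Average time per insert over several runs, clearing the table each run."""
    per_run = []
    for _ in range(repeats):
        conn.execute("DELETE FROM catalogue")   # remove entries before the run
        conn.commit()
        start = time.perf_counter()
        for i in range(n_entries):
            conn.execute(
                "INSERT INTO catalogue (lfn, guid) VALUES (?, ?)",
                (f"/test/file{i}", f"guid-{i}"),
            )
        conn.commit()
        per_run.append((time.perf_counter() - start) / n_entries)
    return statistics.mean(per_run)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catalogue (lfn VARCHAR, guid VARCHAR(36))")

# Mean add time for increasing catalogue size.
for size in (100, 1000):
    print(size, mean_add_time(conn, size))
```

Measuring the add rate for an increasing number of clients would wrap the same loop in several concurrent workers, which this single-connection sketch does not attempt.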

  24. Summary
      • We have implemented a distributed File Catalogue based on pure replication technology, with Meta Data (schema evolution) and ACL support
      • The FC allows an independent “local” operation mode at every site and a “global” operation mode, e.g. for job scheduling
      • Replication offers a realtime backup of the complete catalogue
      • We have developed scripts for a fast setup of secure replication over SSL
      • First performance figures look very promising (local inserts/listings ~3-4 ms, replication more or less instantaneous)
      • To be done: direct comparison with other File Catalogues:
         • LFC - LCG File Catalogue
         • AliEn2 Centralized Catalogue
         • FireMan (?)
      • A distributed catalogue could offer large improvements in performance, scalability and site autonomy.
