Multiple Federations Issues A Discussion of When and How Multiple Objectivity/DB Federations are Used in the High Energy Physics Community DIRK DÜLLMANN & ROMUALD KNAP - RD45 COLLABORATION - CERN
Consistent Multiple Federations • Currently we use multiple consistent FDs • Decoupling • avoid unwanted lock contention • avoid unwanted changes to schema or catalogue • Partial copy for distribution or backup • transfer data to machines with slow and/or unreliable network connections to the original FD • save a backup of central metadata • Mainly workarounds for deficiencies in Objectivity/DB • lock contention on global resources (catalogue) • lack of private schema/catalogue • lack of security • lack of support for partial backups
Decoupling Processes with different Priority • Problem: exclude locks from low priority processes • e.g. production data acquisition needs to perform independently of locks held by offline analysis • Solution: two federations with independent lock servers • the offline FD starts as a copy of the online FD • the copied FD uses a different FD id and typically a different lockserver host • periodically copy a consistent state of all updated database files to the offline FD • alternatively one could attach read-only files to both federations • Limitations: • simple if data flow is uni-directional • special care needed if data is propagated back • complete file copies needed for each updated file (e.g. registries)
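The periodic copy of updated database files could be driven by a small script. The following is a minimal sketch only: it assumes the online and offline database files live in two plain directories and that "updated" can be detected from file modification times. Both assumptions, and the function names, are illustrative and not part of the original slides. The sketch does not check locks, so it would have to run inside a window in which the files are known to be in a consistent state.

```python
import os
import shutil
import tempfile


def updated_files(src_dir, dst_dir):
    """Return names of files in src_dir that are missing from dst_dir or newer than the copy there."""
    changed = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        if not os.path.isfile(src):
            continue
        # Assumption: a newer modification time means the database file was updated.
        if not os.path.exists(dst) or os.stat(src).st_mtime_ns > os.stat(dst).st_mtime_ns:
            changed.append(name)
    return changed


def sync_updated(src_dir, dst_dir):
    """Copy every updated file as a whole; no partial or incremental transfer is attempted."""
    names = updated_files(src_dir, dst_dir)
    for name in names:
        # copy2 also copies the modification time, so an unchanged file is not copied again.
        shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, name))
    return names
```

Whole-file copies mirror the limitation noted on the slide: a file that changes slightly, such as a registry, is still transferred completely.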
De-coupling Schema & Catalogue • Problem: share data, but allow some additional private schema & catalogue • Production data (read-only) is shared between many end users • Each of them needs private databases and schema • Solution: Multiple Cloned Federations • User federations start as clones of the production federation • complete catalogue and schema are copied • files are shared read-only between master and clones • Clones could use the same FD id and the same lockserver • Users • add new schema to their clone • add new databases to their clone • add instances of private classes to their databases
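De-coupling Schema & Catalogue [Figure: the production federation (Prod Boot, Prod FD) with shared databases DB1, DB2, DB3, DBn; cloned user federations (User1 Boot / User1 FD, User2 Boot / User2 FD, each a Clone FD with Private Schema) with private databases U1DB1, U1DB2, U2DB1, U2DB2 holding objects using the new schema.]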
Decoupling Schema & Catalogue • Limitations: • Special care for propagating updates • between different users or production federation and users • databases pre-allocated to users • named schema pre-allocated to user • Many copies of the (large) federation file • Inefficient since user changes are very small • Single lockserver may become bottleneck • Could use multiple lockservers and different FD id if shared databases are not updated
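One plausible reading of "databases pre-allocated to users" is that each user receives a reserved block of database ids, so that private databases created in different clones cannot collide if they are later propagated. The sketch below illustrates that reading only; the starting id, the block size and the function names are hypothetical and do not come from the slides or from Objectivity/DB.

```python
def allocate_id_ranges(users, first_id, ids_per_user):
    """Give each user a disjoint, contiguous block of database ids."""
    ranges = {}
    next_id = first_id
    for user in users:
        ranges[user] = range(next_id, next_id + ids_per_user)
        next_id += ids_per_user
    return ranges


def owner_of(db_id, ranges):
    """Return the user whose block contains db_id, or None if the id is not pre-allocated."""
    for user, block in ranges.items():
        if db_id in block:
            return user
    return None


# Example with made-up numbers: two users, 100 ids each, starting at 1000.
ranges = allocate_id_ranges(["user1", "user2"], first_id=1000, ids_per_user=100)
```

With such a table, an administrator could tell from a database id alone which clone created the database.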
Federation Copies • Objy tools to be used are oocopyfd or ooinstallfd • not oonewfd + ooschemaupgrade + ooattachdb • ooattachdb reads every single object in a database • currently a scalability problem for large databases • not needed if databases are attached with their original database id • a bug in ooattachdb (now fixed) was breaking e.g. oobackup & oorestore • Use ooattachdb only if really needed • e.g. if db ids have changed • otherwise consider oochangedb -catalogonly • oochangedb cannot create a new catalogue entry
Federation Copies • oocopyfd • ensures that the original FD is in a consistent state • copies all files to another location (may require ooams to be running on the destination host) • ooinstallfd • requires the user to transfer all files • requires the user to make sure that the files are in a consistent state • Partial Copies • Copy only part of the database files • make sure they don’t contain references that will be dangling in the destination FD • ooinstallfd -nocheck will complain, but set up the catalogue for all files found • one may later “detach” other files from the catalogue using oodeletedb -catalogonly
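The dangling-reference check for a partial copy can be illustrated as a set difference. This sketch assumes the application already knows which databases each database refers to and supplies that as a plain dictionary; it does not query Objectivity/DB, and the file names used are made up.

```python
def dangling_references(selected, references):
    """Map each selected database to the databases it references that are absent from the copy."""
    selected = set(selected)
    problems = {}
    for db in sorted(selected):
        # Anything referenced but not selected would dangle in the destination FD.
        missing = sorted(set(references.get(db, ())) - selected)
        if missing:
            problems[db] = missing
    return problems


# Hypothetical reference map: events.DB points into calib.DB and raw.DB.
references = {"events.DB": ["calib.DB", "raw.DB"], "calib.DB": [], "raw.DB": []}
report = dangling_references(["events.DB", "calib.DB"], references)
```

An empty result means the chosen subset is closed under references; otherwise the report lists what would have to be added to the copy or accepted as unreachable.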
Federation Backup • Our backups are typically partial copies! • Often mostly metadata backups, since data is often kept read-only on tape (HPSS) • experiment registry • schema & catalogue • oobackup/oorestore don’t support partial backups • they would bring in all data from tape • no way to declare data to be read-only • Making sure that the backup is consistent • partial federation copy to temporary area (e.g. flip-flop) • release locks on original • install backup federation in temporary area • check new backup for “consistency” • e.g. using ootidy or oodump >/dev/null
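The "flip-flop" temporary area can be read as two alternating backup directories, where the new copy is always written to the one that does not hold the last good backup. The sketch below shows that bookkeeping under this assumption; the directory names, the CURRENT marker file and both function names are hypothetical, and the copy and consistency check themselves are left to the caller.

```python
import os
import tempfile


def next_backup_area(root, areas=("flip", "flop")):
    """Pick the area that does not hold the last good backup."""
    marker = os.path.join(root, "CURRENT")
    current = None
    if os.path.exists(marker):
        with open(marker) as f:
            current = f.read().strip()
    return areas[1] if current == areas[0] else areas[0]


def commit_backup(root, area):
    """Record area as the good backup; call this only after its consistency check has passed."""
    marker = os.path.join(root, "CURRENT")
    tmp = marker + ".tmp"
    with open(tmp, "w") as f:
        f.write(area)
    # Replace the marker in one step so a crash never leaves a half-written marker.
    os.replace(tmp, marker)
```

Because the marker is switched only after the check, a failed or inconsistent copy leaves the previous backup untouched and still marked as current.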
Making sure that an FD is in a consistent state • Goal: copy all database files • in the same transactional state • as one atomic operation • No updates during the copy process • requires that no update locks are held on the copied databases (read locks are ok) • the copy process should acquire a read lock for each copied database to make sure that no update will be allowed during the copy • alternatively: use the list of locks from oolockmon • if an update lock is found: try again • need a reliable way to wait for db/fd locks • Perl example script available • HepODBMS/examples/admin
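The "if an update lock is found: try again" alternative amounts to a retry loop around the copy. The sketch below is not the Perl script mentioned on the slide. It takes two caller-supplied callables, both hypothetical: `list_update_locks`, which would wrap whatever lock listing is available (for example parsed oolockmon output, whose format is not assumed here), and `do_copy`, which performs the file copy.

```python
import time


def copy_when_unlocked(list_update_locks, do_copy, retries=5, delay=1.0):
    """Run do_copy() once no update lock is reported immediately before and after the copy."""
    for _ in range(retries):
        if list_update_locks():
            # An update lock is held: wait and try again.
            time.sleep(delay)
            continue
        result = do_copy()
        # Re-check afterwards; a lock seen now means an update may have overlapped the copy.
        if not list_update_locks():
            return result
        time.sleep(delay)
    raise RuntimeError("no lock-free window found")
```

Polling before and after narrows the race but does not close it, since an update could start and finish while the copy runs. Holding a read lock on each copied database, as described on the slide, is the stronger guarantee.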
Summary • Multiple federations are used in production as a workaround for Objectivity/DB shortcomings • significant management effort needed from the db-admin • Implementing a consistent FD copy is often application dependent and not trivial • Better support for multiple, consistent Objectivity/DB federations would be a significant help