3D Project Status Report Maria Girone, CERN IT on behalf of the LCG 3D project http://lcg3d.cern.ch LHCC Comprehensive Review 19th-20th Nov 2007
Introduction • Set up a distributed database infrastructure for the WLCG (according to the WLCG MoU) • RAC as building-block architecture • Several 8-node clusters at Tier0 • Typically 2-node clusters at Tier1 • Oracle Streams replication used to form a database backbone between Tier0 and the Tier1 sites; in production since April '07 • Tier0 and 10 Tier1 sites • ATLAS and LHCb • Online and offline at Tier0 • ATLAS, CMS and LHCb • Frontier/SQUID from Fermilab, which distributes and caches database data via a web protocol, is used by CMS • Database services listed among the critical services by the experiments
Building Block: Database Clusters • Tier 0: all networking and storage redundant • Scalability and high availability achieved • Storage and CPU scale independently • Maintenance operations w/o down-time!
Oracle Streams • New or updated data are detected and queued for transmission to the destination databases • Database changes are captured from the redo log and propagated asynchronously as Logical Change Records (LCRs) • All changes remain queued until successfully applied at all destinations • Need to control the change rate at the source in order to minimise the replication latency • 2 GB/day of user data to Tier 1 can be sustained with the current DB setups
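The retention rule above (a change stays queued until every destination has applied it) can be sketched as a small model. This is illustrative Python, not Oracle code, and the site names are placeholders:

```python
from collections import deque

class LCRQueue:
    """Toy model of a Streams queue shared by several destination sites."""
    def __init__(self, destinations):
        self.destinations = set(destinations)
        self.queue = deque()              # entries: (lcr, sites still pending)

    def capture(self, lcr):
        # In the real system this record is mined from the redo log.
        self.queue.append((lcr, set(self.destinations)))

    def ack(self, site, lcr):
        # A destination confirms successful apply of one record.
        for entry in self.queue:
            if entry[0] == lcr:
                entry[1].discard(site)
        # Records are purged only once *all* sites have applied them.
        while self.queue and not self.queue[0][1]:
            self.queue.popleft()

q = LCRQueue({"RAL", "IN2P3"})
q.capture("update t1")
q.ack("RAL", "update t1")
assert len(q.queue) == 1                  # still retained: IN2P3 pending
q.ack("IN2P3", "update t1")
assert len(q.queue) == 0                  # applied everywhere, now purged
```

This also illustrates why the change rate at the source must be controlled: a single slow or unreachable destination keeps records queued for everyone.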
Downstream Capture & Network Optimisations • Downstream database capture, which de-couples the Tier0 production databases from Tier1 or network problems, is in place for ATLAS and LHCb • TCP and Oracle protocol optimisations yielded significant throughput improvements (almost a factor of 10)
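A back-of-the-envelope calculation shows why TCP window tuning alone can account for an order-of-magnitude gain on a high-latency WAN link. The window sizes and round-trip time below are illustrative assumptions, not measured values from the 3D setup:

```python
def max_throughput_mbit(window_bytes, rtt_ms):
    """Upper bound for one TCP stream: window size divided by round-trip time."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

# Assumed values: 64 KB default window vs 4 MB tuned window, 100 ms RTT.
default = max_throughput_mbit(64 * 1024, 100)
tuned   = max_throughput_mbit(4 * 1024 * 1024, 100)

# ~5 Mbit/s vs ~335 Mbit/s: widening the window removes the latency cap,
# consistent with the order-of-magnitude improvement quoted above.
```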
Frontier/Squid • Numerous significant performance improvements since CSA'06 • experiment data model • Frontier client/server software • CORAL integration • CMS is confident that possible cache coherency issues can be avoided by • cache expiration windows for data and meta-data • a policy implemented by the client applications • Also used successfully in CSA'07
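The cache-expiration policy can be sketched as a simple time-to-live cache. This is an assumed model of the behaviour described above, not the actual Frontier/Squid client code; the one-hour window is a hypothetical value:

```python
import time

class TTLCache:
    """Payloads are reused only within a freshness window, then re-fetched."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}                   # key -> (value, insert_time)

    def get(self, key, fetch, now=None):
        now = time.time() if now is None else now
        hit = self.store.get(key)
        if hit and now - hit[1] < self.ttl:
            return hit[0]                 # fresh enough: serve from cache
        value = fetch()                   # expired or missing: go to the DB
        self.store[key] = (value, now)
        return value

cache = TTLCache(ttl_seconds=3600)        # e.g. one-hour expiration window
v1 = cache.get("calib", fetch=lambda: "v1", now=0)
v2 = cache.get("calib", fetch=lambda: "v2", now=1000)   # within TTL: cached v1
v3 = cache.get("calib", fetch=lambda: "v3", now=4000)   # past TTL: re-fetched
```

The coherency trade-off is visible here: within the window a client may see slightly stale data, which is acceptable when the expiration policy is chosen by the client application to match how often the underlying data changes.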
Streams Replication Operation • Streams procedures included in the Oracle Tier0 physics database service team's operations • Optimized the redo-log retention on the downstream database to allow a sufficient re-synchronisation window (5 days) without recall from tape • Preparing a review of the most recurrent problems for the WLCG Service Reliability Workshop, 26-30 Nov 2007 • Will be the input for further automation of service procedures
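The retention sizing behind the 5-day window is simple arithmetic. The 2 GB/day change rate comes from the Streams slide above; the redo overhead factor is a hypothetical value used only to show the shape of the calculation:

```python
def retention_gb(user_gb_per_day, redo_overhead, days):
    """Disk needed to keep `days` worth of redo at the given change rate."""
    return user_gb_per_day * redo_overhead * days

# Assumptions: ~2 GB/day user data, 3x redo overhead (hypothetical), 5 days.
needed = retention_gb(user_gb_per_day=2.0, redo_overhead=3.0, days=5)
# -> 30 GB of on-disk redo covers the re-synchronisation window without tape.
```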
Some Issues • Several bugs reported to Oracle have been fixed or are being fixed • CERN openlab has laid excellent groundwork! • Some user-defined types are not supported by Streams • Reformatting is done with filtering rules • LFC was hit by a known Oracle bug because it was running on an older Instant Client version • Synchronization with the AA Instant Client established • Set up read-only replicas at the 5 other sites together with a code change to support user ACLs • Further automate the split-and-merge procedure used when a site has to be dropped/re-synchronized • Progress in Oracle 11g, but we need a stop-gap solution • Implement failover for the downstream capture component
Streams Problems • Capture process aborted with error ORA-01280: Fatal LogMiner error • Fixed by patches 5581472 and 5170394, related to the capture process and LogMiner • Bug 6163622: SQL apply degrades with larger transactions • Fixed by applying patch 6163622 • Bug 5093060: Streams 5000-LCR limit causes unnecessary flow control at the apply site • Fixed by applying patch 5093060 • Bug: with recyclebin=on, ORA-26687 on parent after child table dropped • Metalink note 412449.1 • Fixed in 10.2.0.4 and 11g • Apply gets stuck: APPLY SERVER WAITING FOR EVENT • Generic performance issue on RAC • Fixed by applying patch 5500044 (fix for bug 5977546) and one-off patch 5964485 • OEM agent blocks Streams processes • Fixed by applying patch 5330663 • Workarounds found for all the open bugs: • ORA-600 [KWQBMCRCPTS101] after dropping a propagation job • ORA-26687 after a table is dropped when there are two Streams setups between the same source and destination databases • ORA-00600 [KWQPCBK179], memory leak from a propagation job • Observed after applying the fix patch 9
Service Levels and Policies • DB service level according to the WLCG MoU • Need more production experience to confirm manpower coverage at all Tier 1 sites • Piquet service being set up at Tier 0 to replace the existing 24x7 (best-effort) service • Streams interventions for now 8x5 • Criticality of this service rated by most experiments (ATLAS, CMS and LHCb) as "very high" or "high" • Proposals from the CERN Tier 0 have also been accepted by the collaborating Tier 1 sites • Backup and recovery • RMAN-based backups mandatory; data retention period 1 month • Security patch frequency and application procedure • Database software upgrade procedure • Patch validation window
Database & Streams Monitoring • Monitoring and diagnostics have been extended and integrated into the experiments' dashboards • Weekly/monthly database and replication performance summaries have been added • Extensive data and plots about replication activity, server usage and server availability are available from the 3D wiki site (HTML or PDF) • A summary plot with LCR rates during the last week is shown on the 3D home page and can be referenced/included in other dashboard pages • Complemented by weekly Tier 0 database usage reports, which have been in use for more than a year
Integration with WLCG Procedures and Tools • 3D monitoring and alerting have been integrated with WLCG procedures and tools • A dedicated workshop at SARA in March 2007 focussed on this • Interventions announced according to the established WLCG procedures • e.g. EGEE broadcasts, GOCDB entries • To help reporting to the various coordination meetings, we also collect all 3D intervention plans on the 3D wiki • Web-based registration will be replaced as soon as a common intervention registry is in production
Intervention & Streams Reports • A factor of 5 improvement in apply speed was achieved by tuning the filtering rules
Tier 1 DB Scalability Tests • The experiments have started to evaluate/confirm the estimated size of the server resources at Tier 1 • Number of DB nodes and CPUs • Memory, network and storage configuration • Need a realistic workload, which is now becoming available as the experiment s/w frameworks approach complete coverage of their detectors • ATLAS conducted two larger tests with ATHENA jobs against IN2P3 (shared Solaris server) and CNAF (Linux) • Total throughput of several thousand jobs/h achieved with some 50 concurrent test jobs per Tier 1 • LHCb s/w framework integration done and scalability tests starting as well • Lower throughput requirements assumed than for ATLAS • Tests with several hundred concurrent jobs in progress
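A simple throughput model connects the two numbers quoted for the ATLAS tests. The mean job duration below is a hypothetical value chosen only to show that ~50 concurrent jobs can plausibly yield several thousand jobs/h:

```python
def jobs_per_hour(concurrent_jobs, mean_job_minutes):
    # Each slot completes 60/mean_job_minutes jobs per hour; slots run in parallel.
    return concurrent_jobs * 60 / mean_job_minutes

# Assumption: ~50 concurrent jobs, ~1 minute per short conditions-access job.
rate = jobs_per_hour(concurrent_jobs=50, mean_job_minutes=1.0)
# -> 3000 jobs/h, i.e. "several thousand jobs/h" as observed.
```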
Database Resource Requests • Experiment resource requests for Tier 1 unchanged for more than a year • 2 (3) node DB cluster for LHCb (ATLAS) • Fibre-channel-based shared storage • 2 Squid nodes for CMS • Standard worker node with local storage • Setup shown to sustain the replication throughput required for conditions data (1.7 GB/d for ATLAS) • ATLAS has switched to production with the Tier 1 sites • LHCb requested read-only LFC replicas at all 6 Tier 1 sites, as for conditions • Successful replication tests for LFC between CERN and CNAF • Updated requests for LHC startup were collected at the WLCG workshop at CHEP'07 • No major h/w extension requested at that time
Online-Offline Replication • Oracle Streams is used by ATLAS, CMS and LHCb • Joint effort with ATLAS on replication of PVSS data between online and offline • Required to allow detector groups to analyse detailed PVSS logs without adverse impact on the online database • Significantly higher rates required than for COOL-based conditions, which have been confirmed in extensive tests • Some 6 GB of user data per day • Oracle Streams seems an appropriate technology for this area as well
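Converting the quoted daily volumes into sustained rates makes the comparison with conditions data concrete. This is pure unit conversion using the 6 GB/day PVSS figure above and the 1.7 GB/day ATLAS conditions figure quoted earlier in this report:

```python
def sustained_kb_per_s(gb_per_day):
    """Average sustained rate implied by a daily data volume."""
    return gb_per_day * 1024 * 1024 / 86400   # GB/day -> KB/s

pvss = sustained_kb_per_s(6.0)    # ~73 KB/s sustained for PVSS data
cond = sustained_kb_per_s(1.7)    # ~21 KB/s for ATLAS conditions data
# PVSS replication needs roughly 3.5x the sustained rate of conditions data.
```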
Backup & Recovery Exercise • Organised a dedicated database workshop at CNAF in June 2007 on a recovery exercise • Show that the database implementation and procedures at each site are working • Show that coordination and re-synchronisation after a site recovery work • Show that replication procedures continue unaffected while some other sites are under recovery • Exercise well appreciated by all participants • Several set-up problems have been resolved during this hands-on activity with all sites present • Six sites have now successfully completed a full local recovery and re-synchronisation • The remaining sites will be scheduled shortly after the Service Reliability Workshop
More Details • LCG 3D wiki • Interventions, performance summaries • http://lcg3d.cern.ch • Recent LCG 3D workshops • Monitoring and Service Procedures w/s @ SARA • http://indico.cern.ch/conferenceDisplay.py?confId=11365 • Backup and Recovery w/s @ CNAF • http://indico.cern.ch/conferenceDisplay.py?confId=15803 • Next @ the LCG Service Reliability Workshop • http://indico.cern.ch/conferenceOtherViews.py?view=standard&confId=20080
Approved by the MB on 13.03.07: TAGS at volunteer Tier-1 sites (BNL, TRIUMF, ...)
Updated Request from ATLAS • Not yet approved; may need a funding discussion • ATLAS provides more info at https://twiki.cern.ch/twiki/bin/view/Atlas/DatabaseVolumes
Summary • The LCG 3D project has set up a world-wide distributed database infrastructure for LHC • Close collaboration between the LHC experiments and LCG sites • With more than 100 DB nodes at CERN plus several tens of nodes at Tier 1 sites, this is one of the largest distributed database deployments world-wide • Large-scale experiment tests have validated the experiment resource requests implemented by the sites • Backup & recovery tests have been performed to validate the operational procedures for error recovery • Regular monitoring of the database and Streams performance is available to the experiments and sites • Tier 0+1 ready for experiment ramp-up to LHC production. Next steps: • Replication of ATLAS muon calibrations back to CERN • Test as part of CCRC'08 • Started testing 11gR1 features; production deployment plan for 11gR2 by end of 2008