WLCG Information Officer: the role and the challenges
F. Donno, CERN/IT-ES

Summary
• Who is the WLCG IO, or who am I?
• What will I be doing?
• How would I interact with other involved parties?

The WLCG Information System
• ~1200 resource-level, ~374 site-level, ~100 top-level = ~1600 servers
• [Figure: information flow from the Information Providers (static and dynamic information) to the Consumer; the labels "??" and "Failure ?" mark open points in the chain]

Who am I
• Job description: “Single point of contact for:
  • Improving the operational aspects of the service (→ Deployment)
  • Ensuring coherence of the Information System infrastructure and its evolution (configuration, accuracy of information, migration planning, etc.) (→ Coherency and accuracy)
  • Collecting input from sites and users to facilitate an organized exchange with IS developers.” (→ Listen to customers)
• Please send e-mail to Flavia.Donno@cern.ch

What will I be doing?
• Listen to customers: the experiments
  • ALICE: WLCG IS not used. Interested in stable and reliable CREAM-CE status info.
  • ATLAS: “WLCG IS is a very low priority component”
    • “We consider the fact that all information in the BDII has to be continually, actively published to be a serious design flaw of the system and it's fundamentally why we are unable to rely on it for ATLAS operations.”
  • CMS: no official statement.
  • LHCb: seems interested in information consolidation. Considers the splitting of dynamic and static information a priority.

What will I be doing?
• Listen to customers: the experiments
  • Collected queries: mostly semi-static information is needed
  • Support for fail-over and caching in services (à la FTS): WMS
  • Proposal to improve the stability of the IS: OK! See Deployment
  • No Glue 2.0 in short- to medium-term plans.
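
The request for fail-over and caching in services can be illustrated with a small sketch. It assumes a client that caches semi-static information for a fixed time and, when the information system cannot be reached, keeps serving the last value it saw. The class name `CachedLookup`, the `fetch` callable and the TTL are hypothetical and only serve the illustration; none of them is an API of WMS, FTS or the BDII.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedLookup:
    """Cache semi-static information and fall back to stale data on failure.

    Hypothetical helper for illustration only; not part of WMS, FTS or BDII.
    """

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # assumed to query the information system
        self._ttl = ttl_seconds      # how long a cached value counts as fresh
        self._clock = clock          # injectable so the behaviour can be tested
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        now = self._clock()
        cached = self._cache.get(key)
        # Fresh cache hit: no query is sent at all.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        try:
            value = self._fetch(key)
        except Exception:
            # Information system unreachable: serve the stale value if any.
            return cached[1] if cached is not None else None
        self._cache[key] = (now, value)
        return value
```

With such a wrapper, a short outage of the queried instance only affects keys that were never fetched before; everything else is answered from the cache until the service is back.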

What will I be doing?
• Listen to customers: the sites
  • No explicit input yet
  • In general, demand for easier deployment and information feeding
  • Would like to avoid feeding resource information to experiment-specific data collectors
  • Will launch a query on LCG-ROLLOUT
  • Involvement in top-level BDII deployment proposal?
  • Following reported problems

What will I be doing?
• Listen to customers: the developers
  • EMI services. To do: talk to project teams.
  • Storage Info providers, software subsystem, CE voview. Who is responsible?
  • What else is outside the EMI scope?
  • Define what is really needed
  • Can some of the subsystems disappear?

What will I be doing?
• Listen to customers: other consumers
  • Monitoring/accounting tools, gstat, management
  • Accuracy of published information
  • On-going definition of essential objects
  • Did we miss anybody?

Some planning: Listen to customers [DRAFT PROPOSAL]
• Focus on
  • Compilation of a WLCG usage profile
    • A first draft approved by the experiments to be ready by end of January 2011
    • This will be taken as reference for information consolidation and prioritization of activities
    • Risks:
      • not fully descriptive (please help!)
      • frequent reviews needed
  • Working group on evolution of WLCG IS?
  • Continuing investigations

Deployment
• Proposal presented at the MB on September 28, 2010: “Deployment Strategy for Top-Level BDII”
  • Query load
    • High CPU capacity (150 MB of information)
  • Network latency
    • It affects query response time and reliability
    • Query a “close” instance
  • Support
    • Critical service: monitoring, reliable hardware setup
    • Releases must be carefully followed.
• At least 2 top-level BDII instances per continent:
  • TRIUMF and BNL or FNAL
  • CERN, RAL, KIT, PIC, CNAF, NIKHEF
  • Taiwan and KEK or Tokyo
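
One way to read "query a close instance" is to rank the candidate top-level BDII instances by measured network latency. The sketch below does that with a plain TCP connect time; it is an illustration under assumptions, not the strategy of the proposal. The function names are hypothetical, and port 2170 is assumed here as the port on which the BDII LDAP service listens.

```python
import socket
import time
from typing import Callable, Iterable, List, Optional, Tuple

BDII_PORT = 2170  # assumption: LDAP port conventionally used by BDII servers


def tcp_connect_time(host: str, port: int = BDII_PORT,
                     timeout: float = 3.0) -> Optional[float]:
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return None


def rank_instances(
    hosts: Iterable[str],
    probe: Callable[[str], Optional[float]] = tcp_connect_time,
) -> List[Tuple[str, float]]:
    """Return the reachable hosts sorted by increasing probe time."""
    timings = []
    for host in hosts:
        elapsed = probe(host)
        if elapsed is not None:      # unreachable instances are dropped
            timings.append((host, elapsed))
    return sorted(timings, key=lambda item: item[1])
```

The first entry of the returned list is the "closest" instance, and the remaining entries give a natural order in which to try alternatives if it stops answering. Connect time is only a rough proxy; a real selection would probably also consider query response time and load.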

Some planning: Deployment [DRAFT PROPOSAL]
• Focus on
  • Top-level BDII deployment strategy
    • To be defined and discussed with all involved parties (EGI, OSG, T1s)
    • Failover strategy based on basic network topology/performance?
    • First detailed draft by February 2011?
  • Participation in OSG working group to define requirements for an OSG top-level BDII
    • Report available.
    • Presented/discussed on November 9, 2010 at FNAL.

Some planning: Coherency and accuracy [DRAFT PROPOSAL]
• Focus on
  • Information cleaning
    • Removal of the deprecated GlueLocation object, replicated per tag published by the subcluster: responsible for BDII pollution!
    • GGUS ticket #63478
  • Storage Information Providers accuracy
  • Looking into gstat
    • Focusing on Tier-2 resources
    • Developing parsing tools (to be fed into gstat)
      • Installed Capacity
      • Share
      • CPU scaling factors
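
To make the cleaning step concrete, the sketch below filters entries of a given object class out of an LDIF dump, which is the kind of operation the removal of GlueLocation objects amounts to. It is a simplified illustration: the parser ignores LDIF line continuations and base64-encoded values, the function names are hypothetical, and it is not the tool used or planned by the developers.

```python
from typing import Dict, List

Entry = Dict[str, List[str]]


def parse_ldif(text: str) -> List[Entry]:
    """Parse a simplified LDIF dump into a list of attribute dictionaries.

    Simplification: folded lines and base64 ("::") values are not handled.
    """
    entries: List[Entry] = []
    current: Entry = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # LDIF comment
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an "attribute: value" line
        current.setdefault(name.strip().lower(), []).append(value.strip())
    if current:
        entries.append(current)
    return entries


def drop_object_class(entries: List[Entry],
                      object_class: str = "GlueLocation") -> List[Entry]:
    """Return only the entries that do not carry the given objectClass."""
    unwanted = object_class.lower()
    return [
        entry for entry in entries
        if unwanted not in (v.lower() for v in entry.get("objectclass", []))
    ]
```

Comparing `len(parse_ldif(dump))` before and after `drop_object_class` gives a quick estimate of how much of a dump consists of the deprecated objects.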

How would I interact with other involved parties?
• Experiments
  • Identification of IS-related people
  • Sent e-mails to individuals; personal discussions
  • Mailing list?
  • Statement on WLCG IS usage
• Developers
  • Within EMI: need to establish an official communication channel? What can WLCG ask? How can the original plans be modified? What is the expected response time?
  • Other subsystems not part of EMI. Need explicit WLCG commitment.
    • Storage Information providers (castor, xroot, bestman)
    • lcg-info-dynamic-scheduler
    • Installed software publishing
  • Use GGUS SU

How would I interact with other involved parties?
• Sites
  • Use LCG-Rollout
  • Subscribed to the relevant GGUS SUs
  • Direct contacts in case of problems
  • Spread the news that the WLCG IO is there ;-)
• Other consumers
  • Mainly GGUS, e-mail exchange with individuals
  • Report to WLCG MB/GB for further input?

How would I interact with other involved parties?
• Operations
  • EGI: initial discussions with the Chief Operations Officer (T. Ferrari). Need to establish communication channels and procedures.
  • OSG: working group established. Communication channels to be defined.
  • NDGF: contacts with NDGF's contact for the CERN experiments (O. Smirnova). Established a channel for important announcements.

Conclusions
• WLCG Information System
  • Not simply a BDII
• WLCG Information Officer
  • Contact person to help improve the reliability of the service and of the information provided
  • Interacting with experiments, sites, EMI, EGI, OSG, NDGF, WLCG management
• First planning
  • Use case collection
  • Top-level BDII deployment strategy
  • Information consolidation and cleaning
• Initial communication channels established
• Questions? Send e-mail to Flavia.Donno@cern.ch