180 likes | 195 Views
Explore the architecture, mechanisms, conclusions, and requests learned from data management in the EU DataGrid project. Discover the challenges and solutions in I/O and storage, catalogs, and information services.
E N D
Lessons learned from Data Management in the EU DataGrid Peter Kunszt CERN IT/DB EU DataGrid Data Management Peter.Kunszt@cern.ch Openlab 17.3.2003
Outline • The EU DataGrid • Data Management Architecture • Mechanisms used, conclusions and requests Openlab 17.3.2003
EDG overview : goals • DataGrid is a project funded by European Union whose objective is to exploit and build the next generation computing infrastructure providing intensive computation and analysis of shared large-scale databases. • Enable data intensive sciences by providing world wide Grid test beds to large distributed scientific organisations ( “Virtual Organisations, VOs”) • Start ( Kick off ) : Jan 1, 2001 End : Dec 31, 2003 • Applications/End Users Communities : HEP, Earth Observation, Biology • Specific Project Objetives: • Middleware for fabric & grid management • Large scale testbed • Production quality demonstrations • Contribute to Open Standards and international bodies ( GGF, Industry&Research forum) Openlab 17.3.2003
EDG overview : Main Partners • CERN – International (Switzerland/France) • CNRS - France • ESA/ESRIN – International (Italy) • INFN - Italy • NIKHEF – The Netherlands • PPARC - UK Openlab 17.3.2003
EDG overview : Assistant Partners • Industrial Partners • Datamat (Italy) • IBM-UK (UK) • CS-SI (France) • Research and Academic Institutes • CESNET (Czech Republic) • Commissariat à l'énergie atomique (CEA) – France • Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI) • Consiglio Nazionale delle Ricerche (Italy) • Helsinki Institute of Physics – Finland • Institut de Fisica d'Altes Energies (IFAE) - Spain • Istituto Trentino di Cultura (IRST) – Italy • Konrad-Zuse-Zentrum für Informationstechnik Berlin - Germany • Royal Netherlands Meteorological Institute (KNMI) • Ruprecht-Karls-Universität Heidelberg - Germany • Stichting Academisch Rekencentrum Amsterdam (SARA) – Netherlands • Swedish Research Council - Sweden Openlab 17.3.2003
EDG overview : structure , work packages • The EDG collaboration is structured in 12 Work Packages • WP1: Work Load Management System • WP2: Data Management • WP3: Grid Monitoring / Grid Information Systems • WP4: Fabric Management • WP5: Storage Element • WP6: Testbed and demonstrators • WP7: Network Monitoring • WP8: High Energy Physics Applications • WP9: Earth Observation • WP10: Biology • WP11: Dissemination • WP12: Management } Applications Openlab 17.3.2003
Grid Data Management Dependencies Media Hardware Operating System Local File System Network Software Protocols Storage System Performance Reliability Availability Usability Openlab 17.3.2003
Replica Catalog GDMP Computing Element NFS WN WN WN WN edg-replica-manager WN WN WN WN WN WN WN WN WN WN WN WN RFIO EDG Architecture v1.x User Interface GridFTP server Storage Element Staging daemon Castor Openlab 17.3.2003
I/O and Storage • Modes of I/O on the EDG testbed • NFS mounted • RFIO (Castor) • GridFTP • Mass Storage • Castor Openlab 17.3.2003
I/O and Storage • Conclusions / shortcomings • NFSv2 on Linux not really suitable • Does not scale to large Fabrics • Cannot access remote files • No proper security mapping • RFIO needs work • Security and user control is not suitable for Grid users • Remote I/O also has security issues • Not standard, i.e. needs to be especially deployed at Grid sites • GridFTP • Buggy protocol • Compatibility issues between versions • Only one implementation Openlab 17.3.2003
I/O and Storage Request for • I/O level • fine grained access control lists • a wide variety of protocols • a wide variety of authentication, authorization and policy layers • Storage management level • data pinning and lifetime management • space reservation capabilities • transparent mass storage bindings • inter-storage copy and communication Openlab 17.3.2003
Catalogs Replica Catalog: Storing logical to physical name mappings • EDG used the Globus Replica-Catalog: • LDAP-based • Single point of access • Logical Name scheme is bound to Physical Name • Conclusions • Such a solution does not scale for file catalogs, i.e. LDAP-based solutions are not suitable • Users did not like the Logical Name being restricted • Request for • Fine grained access control of catalog data • Consistency checking in catalog • Pre-registration RLS, RMC to address most issues Openlab 17.3.2003
Catalogs Information Services: storing service status information • EDG used Globus-MDS (Meta-computing Directory Service) • distributed LDAP with a given schema • local information services and global indices • Conclusions • too many synchronization problems • not scalable enough • insufficient caching mechanisms • Request for • Robust up-to-date information service in general • Management layer (schema evolution, ACL) • Different capabilities for different kind of information (location info, archived info, statistics, tickers) R-GMA to address most issues Openlab 17.3.2003
More Lessons Learned: Manageability • Virtual Organization management: • The user base of a VO was managed in EDG through a single LDAP catalog. • VO membership needs to be properly exposed/interpreted by all services, applying VO and site policies • The administration of the VO catalog needs to be simplified, better ease of use. • VOMS = Virtual Organization Membership Service • To address most of these issues, first version to be deployed this year Openlab 17.3.2003
More Lessons Learned: Security • Security Infrastructure is a hard problem • Was not properly tackled by EDG. • GSI is a means to authenticate but not to authorize • Issues • Delegation of rights to services • Service certificates • Automatic renewal of certificates • Kerberos tickets from certificates • User’s private keys VOMS to address some issues, extended capabilities of services – new EDG security design Openlab 17.3.2003
More Lessons Learned:Generic issues • No clear concept in Grid community how to deal with data access and storage • Grid Database bindings are not well tested yet. • There is a clear drive for common interfaces between components – Open Grid Service Architecture effort. • Service discovery and service monitoring architecture not well defined yet. Openlab 17.3.2003
And let’s not forget: Organization and Sociology • The Grid is an inherently distributed environment also in this sense. • Reaching agreements is hard work. • Common Timeline • Common Interfaces • Common Procedures • Common Policies • Definition of supported hardware • Common User support structure • Forcing solutions down people’s throats does not work. • Diversity of local policies is too large, but needs to be accomodated. Openlab 17.3.2003
Summary and Outlook • Our first experimental phase of Grids is almost over • We now have some experience with trying to run a production Grid and know its biggest deficiencies • Components causing the most trouble are being replaced now. • New experience will be gained with LCG-1 in the second half of this year. • There are a lot of opportunities for openlab! Openlab 17.3.2003