Computing and Data Management for CMS in the LHC Era
Ian Willers, Koen Holtman, Frank van Lingen, Heinz Stockinger
Caltech, CERN, Eindhoven University of Technology, University of West England, University of Vienna
Overview
• CMS Computing and Data Management
• CMS Grid Requirements
• CMS Grid work - File Replication
• CMS Data Integration
CERN Data Handling and Computation for Physics Analysis
[Diagram: data flow with the processing stages detector, event filter (selection & reconstruction), event reprocessing, event simulation, batch physics analysis and interactive physics analysis; the data products shown are raw data, event summary data, processed data and analysis objects (extracted by physics topic).]
The LHC Detectors
[Figure: CMS, ATLAS, LHCb]
• Raw recording rate 0.1 – 1 GB/sec
• 3.5 PetaBytes / year
• ~10^8 events/year
HEP Computing Status
• High Throughput Computing
  • throughput rather than performance
  • resilience rather than ultimate reliability
  • long experience in exploiting inexpensive mass market components
  • management of very large scale clusters is a problem
• Mature Mass Storage model
  • data resides on tape – cached on disk
  • light-weight private software for scalability, reliability, performance
  • PetaByte scale object persistency/database products
[Photographs: CPU servers; disk servers]
Regional Centres – a Multi-Tier Model
[Diagram: CERN (Tier 0); Tier 1 centres IN2P3, RAL and FNAL; Tier 2 sites Uni n, Lab a, Uni b and Lab c; department and desktop levels. Link speeds shown: 2.5 Gbps, 622 Mbps and 155 Mbps.]
More realistically - a Grid Topology
[Diagram: CERN (Tier 0); Tier 1 centres IN2P3, RAL and FNAL; Tier 2 sites Uni n, Lab a, Uni b and Lab c; department and desktop levels. Link speeds shown: 2.5 Gbps, 622 Mbps and 155 Mbps, plus one link labelled DHL.]
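To put the link speeds on the tier diagrams in perspective, the following is a small arithmetic sketch of how long a sample would take to cross each link. The 1 TB sample size and the `efficiency` parameter are assumptions made for illustration; only the three link speeds come from the slides.

```python
# Link speeds quoted on the tier diagrams, in bits per second.
LINKS_BPS = {"2.5 Gbps": 2.5e9, "622 Mbps": 622e6, "155 Mbps": 155e6}

ONE_TB = 1e12  # hypothetical sample size (decimal terabyte), not from the slides


def transfer_hours(size_bytes, link_bps, efficiency=1.0):
    """Hours needed to move size_bytes over a link used at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bps * efficiency) / 3600


if __name__ == "__main__":
    for name, bps in LINKS_BPS.items():
        print(f"{name}: {transfer_hours(ONE_TB, bps):.1f} hours per TB")
```

At full link efficiency a terabyte takes under an hour on the 2.5 Gbps link and more than half a day on a 155 Mbps link, which is one way to read the DHL label on the second diagram.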
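The LHC Computing Centre
[Diagram: CERN Tier 1 with national Tier 1 centres (UK, USA, France, Italy, Germany, ...), Tier 2 sites, universities (Uni a, Uni b, Uni n, Uni x, Uni y) and labs (Lab a, Lab b, Lab c, Lab m); a regional group and a physics group are marked across the sites.]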
CMS Grid Requirements
What is the GRID?
• The word ‘GRID’ has entered the buzzword stage where it has lost any meaning
• Everybody is re-branding everything to be a ‘Grid’ (including us)
• Historically: term ‘grid’ invented to denote a hardware/software system in which CPU power in diverse locations is made available easily in a universal way
  • Getting CPU power as easy as getting power out of a wall-socket (comparison to power grid)
• ‘Data Grid’ later coined to describe a system in which access to large volumes of data is as easy
What does it do for us?
• CMS uses distributed hardware to do computing now and in future
• We need to get the software to make this hardware work
• The interest in ‘grid’ is leading to a lot of outside software we might be able to use
• We now have specific collaborations between CMS people (and other data intensive science) and Grid people (computer scientists) to develop grid software more specifically tailored to our needs
• In the end, operating our system is our problem
Grid Projects Timeline
[Timeline figure spanning Q2 00 to Q3 01; placement of events by quarter is not recoverable. Events shown: Submit GriPhyN proposal, $12.5M; GriPhyN approved, $11.9M; Outline of US-CMS Tier plan; Submit PPDG proposal, $12M; PPDG approved, $9.5M; EU DataGrid approved, $9.3M; Submit DTF proposal, $45M; DTF approved?; Submit iVDGL preproposal; Submit iVDGL proposal, $15M; iVDGL approved?; 1st Grid coordination meeting; 2nd Grid coordination meeting; Caltech-UCSD install Proto-T2; Install initial Florida proto-T2.]
• Good potential to get useful software components from these projects, BUT this requires a lot of thought and communication on our part
Services
• Provided by CMS
  • Mapping between objects and files (persistency layer)
  • Local and remote extraction and packaging of objects to/from files
  • Consistency of software configuration for each site
  • Configuration meta-data for each sample
  • Aggregation of sub-jobs
  • Policy for what we want to do (e.g. priorities for what to run “first”, the production manager)
  • Some error recovery too…
• Not needed from anyone
  • Auto-discovery of arbitrary identical/similar samples
• Needed from somebody
  • Tool to implement common CMS configuration on remote sites?
• Provided by the Grid
  • Distributed job scheduler: if a file is remote the Grid will run appropriate CMS software (often remotely; split over systems)
  • Resource management, monitoring, and accounting tools and services (to be expanded)
  • Query estimation tools (to what depth?)
  • Resource optimisation … with some user hints / control (coherent management of local “copies”, replication, caching…)
  • Transfer of collections of data
  • Error recovery tools (from e.g. job/disk crashes…)
  • Location information of Grid-managed files
  • File management such as creation, deletion, purging, etc.
  • Remote “virtual login” and authentication / authorisation
Current ‘Grid’ of CMS
• We are currently operating software built according to this model in CMS distributed production
• Production manager tells GDMP to stage data, then invokes ORCA/CARF (maybe via local job queue)
• ORCA uses Objectivity to read/write objects
[Diagram labels: Production manager; Build; .orcarc and other ORCA config; Import; request list (filenames); GDMP; Remote grid storage; Local federation files; ORCA (maybe via local job queue); objects; Grid site.]
2003 CMS data grid system vision
[Diagram: a single CMS data grid job]
Objects and files
• CMS computing is object-oriented, database oriented
  • Fundamentally we have a persistent data model with 1 object = 1 piece of physics data (KB-MB size)
• Much of the thinking in the Grid projects and Grid community is file oriented
  • ‘Computer center’ view of large applications
  • Do not look inside application code
  • Think about application needs in terms of CPU batch queues, disk space for files, file staging and migration
• How to reconcile this?
• CMS requirements 2001-2003:
  • Grid project components do not need to deal with objects directly
  • Specify file handling requirements in such a way that a CMS layer for object handling can be built on top
  • Risky strategy but seemed only way to move forward
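The idea of a CMS object layer built on top of file-oriented Grid services can be sketched as follows. This is a minimal illustration, not the CMS persistency layer: `ObjectLocation`, `ObjectFileMap`, the object ids and the file name are all hypothetical, and `fetch_file` stands in for whatever file access the Grid provides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class ObjectLocation:
    file_name: str  # file known to the Grid layer
    offset: int     # byte offset of the object inside the file
    length: int     # object size in bytes


class ObjectFileMap:
    """Hypothetical CMS-side layer that resolves object ids to files."""

    def __init__(self) -> None:
        self._locations: Dict[str, ObjectLocation] = {}

    def register(self, object_id: str, location: ObjectLocation) -> None:
        self._locations[object_id] = location

    def files_needed(self, object_ids: Iterable[str]) -> List[str]:
        """Translate an object-level request into the file list a Grid tool sees."""
        return sorted({self._locations[oid].file_name for oid in object_ids})

    def read(self, object_id: str, fetch_file: Callable[[str], bytes]) -> bytes:
        """Fetch the containing file through the Grid, then slice out the object."""
        loc = self._locations[object_id]
        data = fetch_file(loc.file_name)
        return data[loc.offset:loc.offset + loc.length]


# Usage with an in-memory stand-in for Grid file access.
grid_files = {"events_001.db": b"HITS....TRACKS.."}
mapping = ObjectFileMap()
mapping.register("hits/42", ObjectLocation("events_001.db", 0, 4))
mapping.register("tracks/42", ObjectLocation("events_001.db", 8, 6))
print(mapping.files_needed(["hits/42", "tracks/42"]))
print(mapping.read("tracks/42", grid_files.__getitem__))
```

The Grid component only ever sees file names; everything object-specific stays inside the CMS-side map, which is the separation the requirements above ask for.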
Relevant CMS documents
• Main Grid requirements document: CMS Data Grid System Overview and Requirements. CMS Note 2001/037. http://kholtman.home.cern.ch/kholtman/cmsreqs.ps , .pdf
• Official hardware details: CMS Interim Memorandum of Understanding: The Costs and How They are Calculated. CMS Note 2001/035.
• Workload model: HEPGRID2001: A Model of a Virtual Data Grid Application. Proc. of HPCN Europe 2001, Amsterdam, p. 711-720, Springer LNCS 2110. http://kholtman.home.cern.ch/kholtman/hepgrid2001/
• Workload model in terms of files: to be written
• Shorter term requirements: many discussions and answers to questions in e-mail archives (EU DataGrid in particular)
• CMS computing milestones: relevant, but no official reference to a frozen version
CMS Grid work - File Replication
Introduction
• Replication is well known in distributed systems and important for Data Grids
• main focus on High Energy Physics community
  • sample Grid application
  • distributed computing model
• European DataGrid Project
  • file replication tool (GDMP) already in production
  • based on Globus Toolkit
• scope is now increased:
  • Replica Catalog, GridFTP, preliminary mass storage support
  • functionality is still extensible to meet future needs
• GDMP: one of the main software systems for the EU DataGrid testbed
Globus Replica Catalog
• intended as fundamental building block
• keeps track of multiple physical files (replicas)
  • mapping of a logical to several physical files
• catalog contains three types of objects:
  • collection
  • location
  • logical file entry
• catalog operations like insert, delete, query
  • can be used directly on the Replica Catalog
  • or with replica management system
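The mapping from one logical file to several physical files, and the three object types, can be illustrated with a toy in-memory catalog. This is a sketch under stated assumptions and not the Globus Replica Catalog API: the class name, method signatures, site names and `gsiftp://...example` URL prefixes are all invented for the example.

```python
class ToyReplicaCatalog:
    """In-memory illustration only; this is not the Globus Replica Catalog API."""

    def __init__(self):
        self.collections = {}  # collection name -> set of logical file names
        self.locations = {}    # (collection, site) -> (URL prefix, set of logical file names)
        self.entries = {}      # logical file name -> attributes such as size

    def insert(self, collection, site, url_prefix, lfn, **attributes):
        self.collections.setdefault(collection, set()).add(lfn)
        _, files = self.locations.setdefault((collection, site), (url_prefix, set()))
        files.add(lfn)
        self.entries.setdefault(lfn, {}).update(attributes)

    def delete(self, collection, site, lfn):
        """Remove one replica; the logical file stays registered in the collection."""
        _, files = self.locations.get((collection, site), ("", set()))
        files.discard(lfn)

    def query(self, collection, lfn):
        """Return the physical file names of every replica of a logical file."""
        return sorted(
            f"{prefix}/{lfn}"
            for (coll, _site), (prefix, files) in self.locations.items()
            if coll == collection and lfn in files
        )


catalog = ToyReplicaCatalog()
catalog.insert("cms-prod", "cern", "gsiftp://cern.example/pool", "run1.db", size=1024)
catalog.insert("cms-prod", "fnal", "gsiftp://fnal.example/pool", "run1.db", size=1024)
print(catalog.query("cms-prod", "run1.db"))
```

A physical file name is derived here from a location's URL prefix plus the logical file name, so a query for one logical file returns one entry per site that holds a replica.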
GridFTP
• Data transfer and access protocol for secure and efficient data movement
• extends the standard FTP protocol
• Public-key-based Grid Security Infrastructure (GSI) or Kerberos support (both accessible via GSI-API)
• Third-party control of data transfer
• Parallel data transfer
• Striped data transfer
• Partial file transfer
• Automatic negotiation of TCP buffer/window sizes
• Support for reliable and re-startable data transfer
• Integrated instrumentation, for monitoring ongoing transfer performance
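Parallel and partial transfer both rest on addressing a file by byte range. The sketch below shows only that idea, using threads and an in-memory buffer; it does not speak the GridFTP protocol, and `split_ranges` and `parallel_copy` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(file_size, streams):
    """Split [0, file_size) into at most `streams` contiguous (offset, length) ranges."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
            offset += length
    return ranges


def parallel_copy(source: bytes, streams: int) -> bytes:
    """Copy `source` by moving each byte range in its own worker thread."""
    target = bytearray(len(source))

    def copy_range(byte_range):
        offset, length = byte_range
        # Each worker writes a disjoint slice, so no locking is needed here.
        target[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(copy_range, split_ranges(len(source), streams)))
    return bytes(target)


print(split_ranges(10, 3))
print(parallel_copy(b"abcdefghij", 3))
```

A partial file transfer is the same operation restricted to a single `(offset, length)` pair.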
Grid Data Mirroring Package
• General read-only file replication system
• subscription - consumer/producer - on demand replication
• several command line tools for automatic replication
• now using Globus replica catalog
• replication steps:
  • pre-processing: file type specific
  • actual file transfer: needs to be efficient and secure
  • post-processing: file type specific
  • insert into replica catalog: name space management
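The four replication steps listed above can be sketched as a small pipeline with per-file-type hooks. This is an illustration of the sequence only, not GDMP code: the `Replicator` class, the hook dictionaries, the `"objectivity"` type key and the `/pool/` path are assumptions made for the example.

```python
from typing import Callable, Dict, List


def _no_op(_name: str) -> None:
    """Default hook for file types that need no extra processing."""


class Replicator:
    def __init__(self, transfer: Callable[[str], str]) -> None:
        self.transfer = transfer  # moves a logical file, returns its new physical name
        self.pre_hooks: Dict[str, Callable[[str], None]] = {}
        self.post_hooks: Dict[str, Callable[[str], None]] = {}
        self.catalog: Dict[str, List[str]] = {}  # logical name -> physical names

    def replicate(self, lfn: str, file_type: str) -> str:
        self.pre_hooks.get(file_type, _no_op)(lfn)    # 1. pre-processing
        pfn = self.transfer(lfn)                      # 2. actual file transfer
        self.post_hooks.get(file_type, _no_op)(pfn)   # 3. post-processing
        self.catalog.setdefault(lfn, []).append(pfn)  # 4. insert into replica catalog
        return pfn


steps: List[str] = []
replicator = Replicator(transfer=lambda lfn: steps.append("transfer") or f"/pool/{lfn}")
replicator.pre_hooks["objectivity"] = lambda lfn: steps.append("pre")
replicator.post_hooks["objectivity"] = lambda pfn: steps.append("post")
replicator.replicate("run1.db", "objectivity")
print(steps, replicator.catalog)
```

Keeping the hooks keyed by file type leaves the transfer step generic, so a new file type only needs its own pre- and post-processing entries.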
GDMP Architecture
[Diagram components: Request Manager; Security Layer; Replica Catalog Service; Data Mover Service; Storage Manager Service]
Replica Catalog Service
• Globus replica catalog (RC) for global file name space
• GDMP provides a high-level interface on top
• new file information is published in the RC
  • LFN, PFN, file attributes (size, timestamp, file type)
• GDMP also supports:
  • automatic generation of LFNs & user defined LFNs
  • clients can query RC by using some filters
• currently use a single, central RC (based on LDAP)
  • we plan to use a distributed RC system in the future
• Globus RC successfully tested at several sites
  • mainly with OpenLDAP
  • currently testing Oracle 9i: Oracle Internet Directory (OID)
Data Mover Service
• require secure and fast point-to-point file transfer
  • major performance issues for a Data Grid
• layered architecture: high-level functions are implemented via calls to lower level services
• GridFTP seems to be a good candidate for such a service
  • promising results
• the service also needs to deal with network failures
  • use built-in error correction and checksums
  • restart option
• we will further explore “pluggable” error handling
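The combination of a restart option and a checksum can be sketched as a chunked transfer that resumes from the last confirmed offset and verifies the result at the end. This is not the GDMP Data Mover: `FlakyChannel` and `transfer_with_restart` are hypothetical names, and SHA-256 is chosen here only because it is in the Python standard library.

```python
import hashlib


class FlakyChannel:
    """Stand-in for a network link that fails on chosen call numbers."""

    def __init__(self, fail_on):
        self.fail_on = set(fail_on)
        self.calls = 0

    def send(self, chunk: bytes) -> bytes:
        self.calls += 1
        if self.calls in self.fail_on:
            raise ConnectionError("simulated network failure")
        return chunk


def transfer_with_restart(source: bytes, send, chunk_size=4, max_retries=3) -> bytes:
    received = bytearray()
    retries = 0
    while len(received) < len(source):
        offset = len(received)  # restart point: everything before it has arrived
        try:
            received += send(source[offset:offset + chunk_size])
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
    # Compare checksums of source and copy before declaring success.
    if hashlib.sha256(received).digest() != hashlib.sha256(source).digest():
        raise IOError("checksum mismatch after transfer")
    return bytes(received)


channel = FlakyChannel(fail_on={2})
copy = transfer_with_restart(b"event summary data", channel.send)
print(copy, channel.calls)
```

Because the loop always resumes from `len(received)`, a failure costs one chunk rather than the whole file, which matters most on the slower links.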
Storage Management Service
• use external tools for staging (different for each MSS)
• we assume that each site has a local disk pool = data transfer cache
• currently, GDMP triggers file staging to the disk pool
• if a file is not located on the disk pool but requested by a remote site, GDMP initiates a disk-to-disk file transfer
• sophisticated space allocation is required (allocate_storage(size))
• the RC stores file locations on disk and default location for a file is on disk
• similar to Objectivity - HPSS; different in Hierarchical Resource Manager (HRM) by LBNL
• plug-ins to HRM (based on CORBA communication)
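A disk pool used as a data transfer cache, with an `allocate_storage(size)` call, might look roughly like the sketch below. Only the name `allocate_storage(size)` comes from the slide; the `DiskPool` class, the `stage` method and the least-recently-used eviction policy are assumptions, and a real mass storage interface would be considerably more involved.

```python
from collections import OrderedDict


class DiskPool:
    """Toy disk pool acting as a cache in front of mass storage."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.files = OrderedDict()  # file name -> size, least recently used first

    def used(self) -> int:
        return sum(self.files.values())

    def allocate_storage(self, size: int) -> bool:
        """Make room for `size` bytes, evicting least recently used files if needed."""
        if size > self.capacity:
            return False
        while self.capacity - self.used() < size:
            self.files.popitem(last=False)  # evicted files are assumed to remain on tape
        return True

    def stage(self, name: str, size: int) -> bool:
        """Bring a file into the pool, or refresh it if it is already cached."""
        if name in self.files:
            self.files.move_to_end(name)
            return True
        if not self.allocate_storage(size):
            return False
        self.files[name] = size
        return True


pool = DiskPool(capacity=100)
pool.stage("a.db", 60)
pool.stage("b.db", 30)
pool.stage("c.db", 50)
print(list(pool.files))
```

Eviction is safe in this model only because the disk pool is a cache and the tape copy is the master, as in the mass storage model described earlier in the talk.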
References
• GDMP has been enhanced with more advanced data management features
• http://cmsdoc.cern.ch/cms/grid
• further development and integration for a DataGrid software milestone are under way
• http://www.eu-datagrid.org
• object replication prototype is promising
• detailed study of GridFTP shows good performance
• http://www.globus.org/datagrid
CMS Data Integration
CRISTAL: Movement of data, production specifications
[Diagram: Regional Centres and CERN]
• When detector parts are transferred from one Local Centre to another, all data associated with the part must be transferred to the destination Centre.
Motivation
• Currently many construction databases (one object oriented) and ASCII files (XML)
• Users generate XML files
• Users want XML from data sources
• Collection of sources: OO, Relational, XML files
• Users not aware of sources (location, underlying structure and format)
• One query language to access data sources
• Databases and sources distributed
Query Engine
[Diagram labels: XQuery; Extended XQuery; Query engine (two instances, connected over the WAN); Translation; Serialize; Source schema; sources Construction DB V1, Construction DB V2, Object oriented, XML.]
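The goal of one query interface over heterogeneous sources, with results serialized as XML, can be sketched with two source adapters. This is a simplified illustration and not the query engine from the talk: a plain field=value filter stands in for XQuery, and the `XmlSource` and `SqlSource` classes, the `part` schema and the sample ids are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


class XmlSource:
    def __init__(self, xml_text: str) -> None:
        self.root = ET.fromstring(xml_text)

    def select(self, field: str, value: str):
        return [dict(el.attrib) for el in self.root.iter("part") if el.get(field) == value]


class SqlSource:
    COLUMNS = ("id", "type")

    def __init__(self, connection: sqlite3.Connection) -> None:
        self.connection = connection

    def select(self, field: str, value: str):
        if field not in self.COLUMNS:  # only known column names reach the SQL text
            raise ValueError(f"unknown field: {field}")
        rows = self.connection.execute(
            f"SELECT id, type FROM part WHERE {field} = ?", (value,)
        )
        return [{"id": row[0], "type": row[1]} for row in rows]


def query_all(sources, field: str, value: str) -> str:
    """Run the same filter on every source and serialize the merged rows as XML."""
    result = ET.Element("result")
    for source in sources:
        for row in source.select(field, value):
            ET.SubElement(result, "part", row)
    return ET.tostring(result, encoding="unicode")


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE part (id TEXT, type TEXT)")
db.execute("INSERT INTO part VALUES ('c2', 'crystal')")
xml_source = XmlSource('<parts><part id="c1" type="crystal"/><part id="p1" type="capsule"/></parts>')
answer = query_all([xml_source, SqlSource(db)], "type", "crystal")
print(answer)
```

The caller never learns which source a row came from, which corresponds to the motivation that users should not need to know the location, structure or format of the underlying sources.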
References
• http://fvlingen.home.cern.ch/fvlingen/articles.html
• Seamless Integration of Information (Feb 2000), CMS Note 2000/025
• XML interface for object oriented databases, in proceedings of ICEIS 2001
• XML for domain viewpoints, in proceedings of SCI 2001
• The Role of XML in the CMS Detector Description Database, to be published in proceedings of CHEP 2001