CHEP04 Track4: Distributed Computing Services Summary of the parallel session “Distributed Computing Services” Massimo Lamanna / CERN, October 1st 2004
Parallel sessions • Monday • 12 contributions • Main focus: data management • Wednesday • 8 contributions • Main focus: middleware • Wednesday “Special Security Session” • 10 contributions • Running in parallel with the “middleware” track • Summary transparencies from Andrew McNab • Thursday • 12 contributions • Main focus: monitor and workload
Monday • [142] Don Quijote - Data Management for the ATLAS Automatic Production System by Mr. BRANCO, Miguel • [190] Managed Data Storage and Data Access Services for Data Grids by ERNST, Michael • [204] FroNtier: High Performance Database Access Using Standard Web Components in a Scalable Multi-tier Architecture by PATERNO, Marc • [218] On Distributed Database Deployment for the LHC Experiments by DUELLMANN, Dirk • [253] Experiences with Data Indexing services supported by the NorduGrid middleware by SMIRNOVA, Oxana • [278] The Evolution of Data Management in LCG-2 by Mr. BAUD, Jean-Philippe • [328] The Next Generation Root File Server by Mr. HANUSHEVSKY, Andrew • [334] Production mode Data-Replication framework in STAR using the HRM Grid by Dr. HJORT, Eric • [345] Storage Resource Managers at Brookhaven by RIND, Ofer • [392] File-Metadata Management System for the LHCb Experiment by Mr. CIOFFI, Carmine • [414] Data Management in EGEE by NIENARTOWICZ, Krzysztof • [460] SAMGrid Integration of SRMs by Dr. KENNEDY, Robert
Wednesday • [383] Experience with POOL from the first three Data Challenges using the LCG by GIRONE, Maria • [247] Middleware for the next generation Grid infrastructure by LAURE, Erwin • [184] The Clarens Grid-enabled Web Services Framework: Services and Implementation by STEENBERG, Conrad • [305] First Experiences with the EGEE Middleware by KOBLITZ, Birger • [430] Global Distributed Parallel Analysis using PROOF and AliEn by RADEMAKERS, Fons • [162] Software agents in data and workflow management by BARRASS, T A • [500] Housing Metadata for the Common Physicist Using a Relational Database by ST. DENIS, Richard • [196] Lattice QCD Data and Metadata Archives at Fermilab and the International Lattice Data Grid by NEILSEN, Eric • [536] Huge Memory systems for data-intensive science by MOUNT, Richard
Wednesday (Special Security Session) • [224] Evaluation of Grid Security Solutions using Common Criteria by NAQVI, SYED • [463] Mis-use Cases for the Grid by SKOW, Dane • [164] Using Nagios for intrusion detection by CARDENAS MONTES, Miguel • [189] Secure Grid Data Management Technologies in ATLAS by BRANCO, Miguel • [249] The GridSite authorization system by MCNAB, Andrew • [439] Building Global HEP Systems on Kerberos by CRAWFORD, Matt • [104] Authentication/Security services in the ROOT framework by GANIS, Gerardo • [122] A Scalable Grid User Management System for Large Virtual Organizations by CARCASSI, Gabriele • [191] Virtual Organization Membership Service eXtension (VOX) by FISK, Ian • [194] G-PBox: a policy framework for Grid environments by RUBINI, Gianluca
Thursday • [69] Resource Predictors in HEP Applications by HUTH, John • [318] The STAR Unified Meta-Scheduler project, a front end around evolving technologies for user analysis and data production by LAURET, Jerome • [321] SPHINX: A Scheduling Middleware for Data Intensive Applications on a Grid by CAVANAUGH, Richard • [417] Information and Monitoring Services within a Grid Environment by WILSON, Antony • [420] Practical approaches to Grid workload and resource management in the EGEE project by SGARAVATTO, Massimo • [490] Grid2003 Monitoring, Metrics, and Grid Cataloging System by MAMBELLI, Marco; KIM, Bockjoo • [89] MonALISA: An Agent Based, Dynamic Service System to Monitor, Control and Optimize Grid based Applications by LEGRAND, Iosif • [274] Design and Implementation of a Notification Model for Grid Monitoring Events by DE BORTOLI, Natascia • [338] BaBar Book Keeping project - a distributed meta-data catalog of the BaBar event store by SMITH, Douglas • [388] A Lightweight Monitoring and Accounting System for LHCb DC04 Production by SANCHEZ GARCIA, Manuel • [393] Development and use of MonALISA high level monitoring services for Meta-Schedulers by EFSTATHIADIS, Efstratios • [377] DIRAC - The Distributed MC Production and Analysis for LHCb by TSAREGORODTSEV, Andrei
Structure of the talk • Security • “Data Management” • “Middleware” • “Monitor and Workload” • Conclusions and outlook • I would like to thank the track co-coordinator, Ruth Pordes, and all the session chairs (Conrad Steenberg, Robert Kennedy, Andrew McNab, Oxana Smirnova) • Disclaimer: it would not have been useful simply to list *all* the talks. This summary reflects my personal view (and biases), trying to extract the key points from all the material shown and discussed in the “Distributed Computing Services” parallel session
Security: Themes • Pre-Grid services like ssh on Grid machines are already under attack! • People are developing tools to look for attacks. • Grids still need to interface to security used by pre-Grid systems like Kerberos, AFS and WWW • We are developing tools to manage 1000s of users in big experiments. • Application-level software developers are starting to interface to Grid security systems. Summary from Andrew McNab
Security: Technologies • Feeding the output of local security-detection software (Tripwire, etc.) into Nagios was presented. • VOMS, VOMRS/VOX, GUMS all discussed for distributing authorization information about users • GridSite provides Grid extensions to Apache. • Kerberos sites still need to be supported/included. • The implications of a Web Services future are on everyone's mind... Summary from Andrew McNab
Security: For non-Security people! • Developers: • Most attacks possible because of poor software quality (buffer overflows etc) • Some evidence that stolen Grid credentials have been tried out also: they will go after middleware bugs next • Site administrators: • Local exploits are now really important, not just network exploits (Grids have 1000s of “local” users.) • You will need monitoring to differentiate between “Grid worms” and “Grid jobs” (they look the same!) Summary from Andrew McNab
Data management: Themes • At least three threads: • New tools/services becoming reality • Approaching maturity • Experiments confronted with the existing (and evolving) data management layer • Comparison with the talks presented at CHEP03 is very instructive: it looks like a lot has been achieved in this field in the last year and a half!
Data Management: New tools/services becoming reality • Impressive demonstration of the maturity level reached by POOL together with the 3 LHC experiments • 400+ TB, the same order as previous exercises using Objectivity/DB • Key ingredients: experience plus the experiments' requirements and pressure • Interplay of database technology and native grid services for data distribution and replication • FroNtier (FNAL, running experiments) • Decouple development and user data access • Scalable • Many commodity tools and techniques (Squid caching; a toy sketch follows below) • Simple to deploy • LCG 3D (CERN, LHC experiments) • Sustainable infrastructure • SAM, BaBar DM • Experience with running experiments • gLite Data Management • New technology and experience • Convergence foreseen and envisageable
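The FroNtier point above rests on a simple idea: read-only database queries go over plain HTTP, so a commodity cache such as Squid can absorb repeated requests before they ever reach the database tier. The sketch below illustrates that pattern only; the server name, proxy address and URL layout are invented for illustration and are not the real FroNtier interface.

```python
# Conceptual sketch: database access over HTTP through a caching proxy.
# All endpoints here are hypothetical placeholders.
import urllib.parse
import urllib.request

SQUID_PROXY = "http://squid.example.org:3128"                 # assumed local cache
FRONTIER_SERVER = "http://frontier.example.org:8000/query"    # assumed query endpoint

def cached_query(sql: str) -> bytes:
    """Send a read-only query through the caching proxy and return the raw payload."""
    url = FRONTIER_SERVER + "?" + urllib.parse.urlencode({"q": sql})
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": SQUID_PROXY})
    )
    with opener.open(url, timeout=30) as response:
        # Identical queries are served from the Squid cache instead of the
        # database, which is what makes the multi-tier setup scale.
        return response.read()

if __name__ == "__main__":
    payload = cached_query("SELECT * FROM calibration WHERE run = 12345")
    print(len(payload), "bytes received")
```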
xrootd • Rich but efficient server protocol • Combines file serving with P2P elements • Allows client hints for improved performance • Pre-read, prepare, client access & processing hints • Multiplexed request stream • Multiple parallel requests allowed per client (illustrated in the sketch below) • An extensible base architecture • Heavily multi-threaded • Clients get dedicated threads whenever possible • Extensive use of OS I/O features • Async I/O, device polling, etc. • Load-adaptive reconfiguration • Key element in the proposal for Huge-Memory Systems for Data-Intensive Science (R. Mount)
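To make the "multiplexed request stream" bullet concrete, here is a small sketch of the client-side idea: several read requests are in flight at once over the same connection and are handled as their responses arrive, rather than strictly in order. This is a generic asyncio illustration of the concept, not the xrootd wire protocol or client API.

```python
# Conceptual illustration of a multiplexed request stream (not xrootd itself):
# one client issues several reads in parallel and consumes responses as they complete.
import asyncio
import random

async def read_chunk(offset: int, size: int) -> bytes:
    """Stand-in for an asynchronous remote read; latency is simulated."""
    await asyncio.sleep(random.uniform(0.01, 0.1))   # pretend network/disk latency
    return bytes(size)                               # dummy payload of `size` bytes

async def main() -> None:
    chunk_size = 64 * 1024
    # Issue ten reads in parallel and process whichever response arrives first.
    requests = [read_chunk(offset, chunk_size)
                for offset in range(0, 10 * chunk_size, chunk_size)]
    for finished in asyncio.as_completed(requests):
        chunk = await finished
        print(f"got {len(chunk)} bytes")

asyncio.run(main())
```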
Data management: Approaching maturity… • SRM implementations • Not trivial • But being demonstrated • Great news!
Experiments confronted with the existing (and evolving) data management layer • Cf. most of the plenary talks, e.g. A. Boehnlein, P. Elmer, D. Stickland, N. Katayama, I. Bird, ... • Track 4 talks: • Grids, not Grid • Heterogeneity of the grid resources (cf. ATLAS/Don Quijote) • Independent evolution and experience (NorduGrid) • Production mode (experiments' data challenges) • Evolution of LCG-2 Data Management
Middleware: Themes • New generation of middleware becoming available • Some commonality in technology • Service Oriented Architecture; Web Services • gLite (EGEE project) • Web services • GAE • Interactivity as a goal, as opposed to “production” mode • RPC-based web service framework (Clarens) • Emphasis on discovery services and high-level services (orchestration) • Compatibility with gLite to be explored • DIRAC • XML-RPC: no need for WSDL... (a minimal sketch follows below) • Instant-messaging protocol for inter-service/agent communication • Connection-based; outbound connectivity only • Interacts with other experiment-specific services (cf. “File-Metadata Management System for the LHCb Experiment”) • Agent-based systems • DIRAC, GAE, PhEDEx • (First) feedback coming (developments in the experiments, ARDA)
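The DIRAC bullets above (XML-RPC without WSDL, agents with outbound-only connectivity) can be illustrated with Python's standard XML-RPC modules. The method name, port and payload below are invented placeholders; this is a sketch of the style, not the actual DIRAC interface.

```python
# Minimal sketch of an XML-RPC service plus a pulling agent, in the spirit of
# the "no WSDL needed" RPC approach. Names and port are hypothetical.
from xmlrpc.server import SimpleXMLRPCServer
import threading
import xmlrpc.client

def request_job(resource_profile: dict) -> dict:
    """Service side: hand out a (dummy) job matching the agent's profile."""
    return {"job_id": 42, "executable": "simulate.sh",
            "site": resource_profile.get("site")}

server = SimpleXMLRPCServer(("localhost", 8800), logRequests=False, allow_none=True)
server.register_function(request_job, "request_job")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Agent side: only outbound connectivity is needed; the agent pulls work
# whenever its resource is free.
proxy = xmlrpc.client.ServerProxy("http://localhost:8800", allow_none=True)
job = proxy.request_job({"site": "LCG.CERN.ch", "free_slots": 4})
print("received job:", job)
```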
Middleware • gLite middleware • Lightweight (existing) services • Easily and quickly deployable • Use existing services where possible as a basis for re-engineering • Interoperability • Allow for multiple implementations • Performance/scalability & resilience/fault tolerance • Large-scale deployment and continuous usage • Portable • Being built on Scientific Linux and Windows • Co-existence with deployed infrastructure • Reduce requirements on participating sites • Flexible service deployment • Multiple services running on the same physical machine (if possible) • Co-existence with LCG-2 and OSG (US) is essential for the EGEE Grid service • Service-oriented approach • Follow WSRF standardization • No mature WSRF implementations exist to date, so start with plain WS • WSRF compliance is not an immediate goal, but we follow the WSRF evolution • WS-I compliance is important
Middleware: Themes • The dynamics of the evolution of the middleware is very complex • Experience injected into the projects • Previous/other projects • Experiment contributions • Essential inputs: cf. the CMS TMDB/PhEDEx presentation • Close feedback loop • ARDA in the case of gLite • Users/Data Challenges • Large(r) user community being exposed
PROOF • Interactive analysis + parallelism (ROOT) • PROOF on the Grid (2003: demo with AliEn; gLite by end 2004) • PROOF Analysis interface (Portal)
Monitoring systems • Many different monitoring systems in use (Ganglia, MDS, GridICE, MonALISA, R-GMA, the LHCb DIRAC system, ...) • In different combinations on different systems (LCG-2, Grid2003 GridCat, BNL SUMS, etc.) • Positive point: hybrid systems are possible! • Essential to have “global” views (planning, scheduling, ...) • Different systems are capable of coexisting (Grid3 uses 3 of them) • MonALISA very widely used • Used in a very large and diversified set of systems (computing fabric, network performance tests, applications like VRVS, resource brokering in STAR SUMS, security, ...). 160+ sites. • The situation is getting clearer at the system level. Less clear (at least to me) for application monitoring.
Workload management systems • BNL STAR SUMS system • Emphasis on stability • Running experiment! • Lots of users! • A front end to local and distributed RMS, acting as a client to multiple, heterogeneous RMS • A flexible, open architecture: an object-oriented framework with plug-and-play features • A good environment for further development • Standards (such as a high-level JDL) • Scalability of other components (MonALISA work, immediate use) • Used in STAR for real physics (usage and publication list) • Used for distributed / Grid simulation job submission • Used successfully by other experiments
Workload Management System • EGEE gLite WMS is being released • Evolution of the EDG WMS • Provides both “push” and “pull” modes
Optimisation and accounting • Similar concepts at work in different activities • “Phenomenological” estimates based on a few parameters (J. Huth et al.) • Parametrize the required application time as T = g * (a + b * n_events); g contains the CPU power and the compilation flags (optimised/debug), linear with event size (a toy predictor is sketched below) • Warning: in a multi-VO, multi-user environment the situation could be much more complicated • BNL STAR SUMS: minimize the (estimated) transit time • Observe the TT and act accordingly (uses MonALISA) • Up to now OK only for systems out of the saturation zone • Sphinx project (GAE) • EGEE gLite • Inside the WMS • It looks like we are approaching the phase where “Grid Accounting” will really be distinguished from “Grid Monitoring” and static resource allocation • EGEE gLite (WMS talk) • Relatively easier for a single-VO system • LHCb DIRAC accounting system (still a reporting system coupled to the monitoring system)
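The phenomenological estimate above is just T = g * (a + b * n_events). A minimal sketch follows; the coefficient values are made up for illustration, and in practice a and b would be fitted from past runs while g is calibrated per CPU and build.

```python
# Sketch of the run-time parametrization T = g * (a + b * n_events).
# The default coefficients are assumed values, not measured ones.
def estimated_runtime(n_events: int,
                      a: float = 30.0,   # fixed per-job overhead in seconds (assumed)
                      b: float = 2.5,    # seconds per event (assumed)
                      g: float = 1.2) -> float:
    """Return the predicted wall-clock time in seconds for a job."""
    # g folds in the CPU power and compilation flags (optimised vs debug build).
    return g * (a + b * n_events)

print(estimated_runtime(10_000))   # e.g. a 10k-event job
```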
Sphinx Measurements
Pull approach: DIRAC workload management • Realizes the PULL scheduling paradigm • Agents request jobs whenever the corresponding resource is free • Uses Condor ClassAds and the Matchmaker to find jobs suited to the resource profile (a toy matching sketch follows below) • Agents steer job execution on site • Jobs report their state and environment to the central Job Monitoring service • Averaged 420 ms match time over 60,000 jobs • Queued jobs grouped by categories • Matches performed by category • Typically 1,000 to 20,000 jobs queued
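The sketch below shows the pull-mode idea only: queued jobs are grouped by category and an agent's request is matched against its resource profile. It deliberately uses plain Python dictionaries rather than Condor ClassAds, and all requirement names and numbers are invented.

```python
# Toy pull-mode matchmaking: not ClassAds, just the shape of the idea.
from collections import defaultdict

# Queued jobs grouped by category, as described above.
queues = defaultdict(list)
queues["simulation"].append({"job_id": 1, "requirements": {"os": "SLC3", "min_memory_mb": 512}})
queues["analysis"].append({"job_id": 2, "requirements": {"os": "SLC3", "min_memory_mb": 2048}})

def matches(requirements: dict, resource: dict) -> bool:
    """Check a job's requirements against an agent's resource profile."""
    return (requirements["os"] == resource["os"]
            and requirements["min_memory_mb"] <= resource["memory_mb"])

def pull_job(resource: dict, category: str):
    """Agent side: take the first queued job in a category that fits this resource."""
    for job in queues[category]:
        if matches(job["requirements"], resource):
            queues[category].remove(job)
            return job
    return None

agent_resource = {"os": "SLC3", "memory_mb": 1024}
print(pull_job(agent_resource, "simulation"))   # -> job 1
print(pull_job(agent_resource, "analysis"))     # -> None (not enough memory)
```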
Metadata • Considerable efforts going on in the community • The SAM-Grid Team and the Metadata Working Group, EGEE gLite, LCG ARDA • Material from running experiments (notably CDF and BaBar), HEPCAL, the LHC experiments, ... • Cf. also: “Lattice QCD Data and Metadata Archives at Fermilab and the International Lattice Data Grid”, Neilsen/Simone • On the border between “generic” middleware and “experiment-specific” software • There is probably a need for a generic layer (a sketch of what such an interface might look like follows below) • It will emerge by distilling the experience on the “experiment-specific” side together with technology considerations
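As a way to picture the "generic layer" hypothesis, here is a deliberately minimal interface: attach key/value metadata to logical file names and query on it. The class, method names and LFNs are invented for illustration and do not correspond to any of the catalogues mentioned above.

```python
# Hypothetical minimal metadata-catalogue interface, for illustration only.
class MetadataCatalogue:
    def __init__(self):
        self._entries = {}    # logical file name -> metadata dict

    def set_metadata(self, lfn: str, **metadata) -> None:
        """Attach or update key/value metadata for a logical file name."""
        self._entries.setdefault(lfn, {}).update(metadata)

    def query(self, **criteria) -> list:
        """Return the LFNs whose metadata matches all the given key/value pairs."""
        return [lfn for lfn, md in self._entries.items()
                if all(md.get(k) == v for k, v in criteria.items())]

catalogue = MetadataCatalogue()
catalogue.set_metadata("lfn:/lhcb/dc04/run1234.dst", run=1234, datatype="DST", year=2004)
catalogue.set_metadata("lfn:/lhcb/dc04/run1235.dst", run=1235, datatype="DST", year=2004)
print(catalogue.query(datatype="DST", year=2004))
```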
Metadata: BaBar system • Mirroring system in place (heterogeneous technologies in use: MySQL and Oracle) • Publish/synch system developed in house • Distribution down to users' laptops
Conclusions and outlook • Experience is still (and will be) more effective than pure technology • Look at the running experiments! • See the powerful boost from the large data challenges! • SRM is the candidate to be a first high-level middleware service • Good news! • Why only SRM? • What about, for example, other data management tools? • Metadata catalogues? • ... • The fast evolution shows the vitality and enthusiasm of the HEP community • How can we use it to progress even faster? • What should we do to converge on other high-level services? • gLite is a unique opportunity: we should not miss it • Grid as a monoculture is not realistic • A recipe? Some ingredients, at least... • The physics and the physicists! • Analysis is still somewhat missing. More and broader experience needed • Diverse contributions and technology choices, but convergence is possible!