Ian Bird, CERN; Rob Gardner, University of Chicago. Grid Middleware & Tools session summary
Introduction
• 82 abstracts submitted
• 36 oral presentations (7 sessions), 44 posters, 2 withdrawn
• Categories cover a broad range:
  • Experiment experiences
  • Data Management
  • Workload Management
  • Monitoring, Information, Accounting
  • Security & Authorization
  • Fabric & Deployment
D0 reprocessing on OSG – Amber Boehnlein. Common theme: making sites reliable requires debugging sites and systems one by one
Job agents – pilot jobs. Monitoring the AliEn grid environment – Pablo Saiz
SRM v2.2 – Flavia Donno. An 18-month effort to agree on, build, test, and deploy the new version
dCache – one of several mass storage systems
• Patrick Fuhrmann – overview of dCache developments
• Gerd Behrmann – distributed instance for NDGF
LCG data management tools: LFC, DPM, FTS – Markus Schulz
Examples of services that consider deployment & management issues
CORAL – distributed database access – Dirk Duellmann
Pilot jobs – and variants: Such a good idea – everyone wants one …
Stuart Paterson – optimizations in DIRAC. Marianne Bargiotti – integrity checking in DIRAC
Pilots can move intelligence into the job. Paul Nilsson – Panda experience
gLite WMS developments – Marco Cecchi
Igor Sfiligoi – comparison of WMS (CHEP'07, Victoria)
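As a rough illustration of what "moving intelligence into the job" can mean, the sketch below shows the general pilot pattern: a placeholder job lands on a worker node, checks the environment, and only then pulls real work from a central queue. It is a minimal sketch, not the design of Panda, DIRAC, or any other system named in this summary. `TaskQueue`, `check_environment`, `run_pilot`, and the environment keys are hypothetical names invented for the example.

```python
from collections import deque


class TaskQueue:
    """Hypothetical central task queue owned by the experiment, not by the grid."""

    def __init__(self, tasks):
        self._tasks = deque(tasks)

    def fetch(self):
        """Return the next payload, or None when the queue is empty."""
        return self._tasks.popleft() if self._tasks else None

    def remaining(self):
        return len(self._tasks)


def check_environment(env):
    """Pilot-side sanity check; the keys used here are illustrative assumptions."""
    return env.get("scratch_gb", 0) >= 1 and env.get("software_ok", False)


def run_pilot(queue, env):
    """Validate the worker node first, then pull and run payloads until none are left."""
    results = []
    if not check_environment(env):
        return results  # broken node: no real payload is ever bound to it
    while True:
        task = queue.fetch()
        if task is None:
            break
        results.append(task())
    return results


queue = TaskQueue([lambda: "reco-1 done", lambda: "reco-2 done"])
bad = run_pilot(queue, {"scratch_gb": 0, "software_ok": True})
good = run_pilot(queue, {"scratch_gb": 10, "software_ok": True})
print(bad, good, queue.remaining())
```

The pilot on the unhealthy node exits without consuming any payload, so the work stays queued for a pilot that lands on a working node.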
Experiment dashboards – Julia Andreeva. Monitoring from the VO/user perspective
GridICE monitoring – Guido Cuscela. Permits different views of running jobs
James Casey – advances in monitoring of grid services
Stephen Burke – 6 years' experience with the GLUE schema. Martin Flechl – details on integration of information systems
David Groep – glExec. Supporting pilot jobs
Greig Cowan – using DPM over the WAN
Addressing failover for core operations services – Alfredo Pagano. Various strategies
Platform LSF – Robert Stober. Integrating heterogeneous clusters
Observations
• Solutions exist for most needs now, though certainly not all are perfect yet
• Experiment layer relatively deep
• Plethora of workload management systems
• Not so many for data management …
• Service management issues are starting to be addressed by some services (DPM, LFC, FTS, GridSite, CORAL)
• But in general, little thought has been given to how site managers should manage services
• Interoperability / interoperation
Observations
• Workload management
  • Everyone wants pilot (aka glidein) jobs (and everyone has written a system to submit them)
• Commonality: to reach a reliable service, experiments need to systematically debug the sites being used
  • D0, CMS, dashboards, …
• Sophisticated systems to monitor, debug, recover
  • DIRAC, dashboards, grid service monitoring, etc.
  • To improve reliability and help debug the system