LIGO LSC DataGrid Workshop March 24-26, 2005 Livingston Observatory
Part One: Introduction • A: Workshop Agenda and Pragmatics • B: Defining “the Grid” • C: Who’s Who in the Grid World • D: Overview of the LSC DataGrid • E: Lab 1: Getting Started
Workshop Agenda • Thursday, March 24 • Introduction • Grid Security • Data Management • Friday, March 25 • Job Management • Workflow Management • MyProxy (Coming Attractions!) • Saturday, March 26 • Local Presentations
Preparation for the Labs • We assume a RedHat 9 installation, although other platforms may well work too. • We assume you’ve installed the LSC DataGrid Client Toolkit. • We assume your security credentials are already in place. • A quick pre-flight check follows below.
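A minimal pre-flight sketch (it assumes the client toolkit's setup script has already been sourced and that your credentials live in the standard Globus locations; adjust paths if your site differs):

    which grid-proxy-init            # should resolve to the client toolkit's bin directory
    ls $HOME/.globus/usercert.pem    # your user certificate (default location)
    ls $HOME/.globus/userkey.pem     # your private key (default location)

If any of these fail, sort them out before Thursday's security session.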
Bio-Imperatives • Food • Lunches • Dinner • Plumbing
Temporal Disclaimer • The state of the art is: the art is always changing. • Grid infrastructure standards are, however, firming up. • For the most part, we’re going to be talking about how things work at the moment. • We’ll warn you when we go into Coming Attractions mode.
Who Are Those Guys? • GRIDS Center • David Gehrig, NCSA-UIUC • Mike Freemon, NCSA-UIUC • Jaime Frey, University of Wisconsin-Madison
“Grid” • Buzzword of the year(s). • In enterprise computing, different meanings at different times. • It often simply means “cluster computing.” • In research, it usually means…
Definition: 1998 “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” Ian Foster and Carl Kesselman: The Grid: Blueprint for a New Computing Infrastructure
Definition: 2002 “A Grid is a system that • coordinates resources that are not subject to centralized control • using standard, open, general-purpose protocols and interfaces • to deliver nontrivial qualities of service.” Ian Foster, ANL: What is the Grid? A Three-Point Checklist
A Working Definition • A distributed computing environment that coordinates • Computational jobs • Data placement • Information management • Scales from one computer to thousands • Capable of working across many administrative domains
National Middleware Initiative • Middleware: an evolving layer of services that resides between the network and more traditional applications for managing security, access, and information exchange • www.nsf-middleware.org • Funds GRIDS Center • Funds Open Grid Computing Environment
GRIDS Center • www.grids-center.org • Grid Research Integration, Deployment, and Support Center • Mission: making grid technology deployable and useful outside the development labs • Packaging • Education
The Globus Alliance • www.globus.org • Creates core infrastructure services • Sponsors include: • DARPA, DoE, NSF, NASA • e-Science (UK), Vetenskapsrådet (Sweden), KTH (Royal Institute of Technology, Stockholm) • IBM, Microsoft Research, Cisco Systems
Globus: Participating Institutions • Argonne National Laboratory • Information Sciences Institute/USC • University of Chicago • University of Edinburgh (UK) • Center for Parallel Computers (Sweden) • “Globus Academic Affiliates”
Globus Toolkit: GT3 • Software services and libraries • Resource monitoring, discovery, and management • Security • File management • Note! GT4: Expected release sixth quarter of 2004
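For a flavor of what those services look like from the command line, here is a minimal sketch using the pre-WS clients shipped with the toolkit (the hostname is a placeholder, not a real resource):

    grid-proxy-init                                       # create a short-lived proxy from your certificate
    globus-job-run gatekeeper.example.org /bin/hostname   # run a trivial command through GRAM
    globus-url-copy file:///tmp/data gsiftp://gatekeeper.example.org/tmp/data   # move a file with GridFTP

We will return to each of these in the security, data management, and job management sessions.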
PyGlobus • www-itg.lbl.gov/gtg/projects/pyGlobus/ • Lawrence Berkeley National Laboratory • An interface to the Globus toolkit using the Python scripting language
Condor • A serial/parallel job management system for a pool of compute nodes: • job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. • Can be used with Globus Toolkit • www.cs.wisc.edu/condor/ • We’ll use “local Condor” and Condor-G
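As an illustration of the submit-file style Condor uses (a sketch with hypothetical filenames; the details are covered in the Job Management session), a minimal vanilla-universe job looks like this and is handed to the local pool with condor_submit:

    # hostname.sub -- minimal local Condor submit description file
    universe   = vanilla
    executable = /bin/hostname
    output     = hostname.out
    error      = hostname.err
    log        = hostname.log
    queue

    condor_submit hostname.sub       # submit the job to the local Condor pool
    condor_q                         # watch it move through the queue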
iVDGL:International Virtual Data Grid Laboratory • www.ivdgl.org • Goals • Deploy a Grid laboratory • Use Grid software tools in experiments • Support delivery of Grid technologies • Education and outreach • iVDGL pacman and VDT • LSC is an active participant
GriPhyN: Grid Physics Network • www.griphyn.org • Coalesced around four experiments • Compact Muon Solenoid and ATLAS (“A Toroidal LHC ApparatuS”) at LHC/CERN • Laser Interferometer Gravitational-wave Observatory • Sloan Digital Sky Survey • Petabytes of data annually
VDT: Virtual Data Toolkit • www.cs.wisc.edu/vdt/ • Goal: to make it as easy as possible for users to deploy, maintain and use grid middleware • Initially developed by GriPhyN and iVDGL • Now includes the LHC Computing Grid (LCG) and the Particle Physics Data Grid (PPDG).
VDT: Components • Basic Grid Services • Condor, Globus • Virtual Data Tools • Virtual Data System • Utilities • Such as GSI-OpenSSH
What is the LSC DataGrid? • A collection of LSC computational and storage resources… • … linked through Grid middleware… • … into a uniform LSC data analysis environment.
LSC DataGrid Sites • Tier 1: Caltech • Tier 2: UWM and PSU • Tier 3: UT-Brownsville and Salish Kootenai College (SKC) • Linux clusters at GEO sites Birmingham, Cardiff and the Albert Einstein Institute (AEI) • LDAS instances at Caltech, MIT, PSU, and UWM
For this Workshop • LSC DataGrid Sites • ldas-grid.ligo.caltech.edu • ldas-grid.ligo-wa.caltech.edu • ldas-grid.ligo-la.caltech.edu • We’ll use ldas-grid.ligo-la.caltech.edu as our head node • Full list of LSC DataGrid resources at www.lsc-group.phys.uwm.edu/lscdatagrid/resources • More discussion of LSC DataGrid later
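Once your proxy is in place, a quick sanity check against the workshop head node might look like this (a sketch; it assumes the gatekeeper on ldas-grid.ligo-la.caltech.edu is reachable from your machine):

    globusrun -a -r ldas-grid.ligo-la.caltech.edu            # authentication-only "ping" of the gatekeeper
    globus-job-run ldas-grid.ligo-la.caltech.edu /bin/date   # run a trivial command on the head node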
Lab 1 — Getting Started This lab will verify: • Your software is installed correctly • Your sacrifices have pleased the webgod Ping • Your security credential (i.e. proxy certificate) is okay • Your environment variables won’t suddenly go away
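The checks in the lab boil down to commands along these lines (a rough sketch, not the lab handout itself):

    echo $GLOBUS_LOCATION     # the toolkit environment variable should point at your installation
    grid-proxy-info           # shows your proxy's subject, strength, and time remaining
    globus-job-run ldas-grid.ligo-la.caltech.edu /bin/true   # end-to-end test: credentials, network, gatekeeper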
Credits • Some slides in this presentation were adapted from • GriPhyN Grid Summer Workshop 2004 • The Globus Consortium