LIGO LSC DataGrid Workshop March 24-26, 2005 Livingston Observatory
Part One: Introduction • A: Workshop Agenda and Pragmatics • B: Defining “the Grid” • C: Who’s Who in the Grid World • D: Overview of the LSC DataGrid • E: Lab 1: Getting Started
Workshop Agenda • Thursday, March 24 • Introduction • Grid Security • Data Management • Friday, March 25 • Job Management • Workflow Management • MyProxy (Coming Attractions!) • Saturday, March 26 • Local Presentations
Preparation for the Labs • We assume a RedHat 9 installation. • Other platforms may work just as well. • We assume you’ve installed the LSC DataGrid Client Toolkit. • We assume your security credentials are already in place.
Bio-Imperatives • Food • Lunches • Dinner • Plumbing
Temporal Disclaimer • The state of the art is: the art is always changing. • Grid infrastructure standards are, however, firming up. • For the most part, we’re going to be talking about how things work at the moment. • We’ll warn you when we go into Coming Attractions mode.
Who Are Those Guys? • GRIDS Center • David Gehrig, NCSA-UIUC • Mike Freemon, NCSA-UIUC • Jaime Frey, University of Wisconsin-Madison
“Grid” • Buzzword of the year(s). • In enterprise computing, different meanings at different times. • It often simply means “cluster computing.” • In research, it usually means…
Definition: 1998 “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” Ian Foster and Carl Kesselman: The Grid: Blueprint for a New Computing Infrastructure
Definition: 2002 “A Grid is a system that • coordinates resources that are not subject to centralized control • using standard, open, general-purpose protocols and interfaces • to deliver nontrivial qualities of service.” Ian Foster, ANL: What is the Grid? A Three-Point Checklist
A Working Definition • A distributed computing environment that coordinates • Computational jobs • Data placement • Information management • Scales from one computer to thousands • Capable of working across many administrative domains
NSF Middleware Initiative • Middleware: an evolving layer of services that resides between the network and more traditional applications for managing security, access, and information exchange • www.nsf-middleware.org • Funds GRIDS Center • Funds Open Grid Computing Environment
GRIDS Center • www.grids-center.org • Grid Research Integration, Deployment, and Support Center • Mission: making grid technology deployable and useful outside the development labs • Packaging • Education
The Globus Alliance • www.globus.org • Creates core infrastructure services • Sponsors include: • DARPA, DoE, NSF, NASA • e-Science (UK), Vetenskapsrådet (Sweden), KTH (Royal Institute of Technology, Stockholm) • IBM, Microsoft Research, Cisco Systems
Globus: Participating Institutions • Argonne National Laboratory • Information Sciences Institute/USC • University of Chicago • University of Edinburgh (UK) • Center for Parallel Computers (Sweden) • “Globus Academic Affiliates”
Globus Toolkit: GT3 • Software services and libraries • Resource monitoring, discovery, and management • Security • File management • Note! GT4: Expected release sixth quarter of 2004
PyGlobus • www-itg.lbl.gov/gtg/projects/pyGlobus/ • Lawrence Berkeley National Laboratory • An interface to the Globus toolkit using the Python scripting language
Condor • A serial/parallel job management system for a pool of compute nodes: • job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. • Can be used with Globus Toolkit • www.cs.wisc.edu/condor/ • We’ll use “local Condor” and Condor-G
iVDGL: International Virtual Data Grid Laboratory • www.ivdgl.org • Goals • Deploy a Grid laboratory • Use Grid software tools in experiments • Support delivery of Grid technologies • Education and outreach • iVDGL pacman and VDT • LSC is an active participant
GriPhyN: Grid Physics Network • www.griphyn.org • Coalesced around four experiments • Compact Muon Solenoid and ATLAS (“A Toroidal LHC ApparatuS”) at LHC/CERN • Laser Interferometer Gravitational-wave Observatory • Sloan Digital Sky Survey • Petabytes of data annually
VDT: Virtual Data Toolkit • www.cs.wisc.edu/vdt/ • Goal: to make it as easy as possible for users to deploy, maintain and use grid middleware • Initially developed by GriPhyN and iVDGL • Now includes LHC Computing Grid (LCG) and Particle Physics Data Grid (PPDG).
VDT: Components • Basic Grid Services • Condor, Globus • Virtual Data Tools • Virtual Data System • Utilities • Such as GSI-OpenSSH
What is the LSC DataGrid? • A collection of LSC computational and storage resources… • … linked through Grid middleware… • … into a uniform LSC data analysis environment.
LSC DataGrid Sites • Tier 1: Caltech • Tier 2: UWM and PSU • Tier 3: UT-Brownsville and Salish Kootenai College (SKC) • Linux clusters at GEO sites Birmingham, Cardiff and the Albert Einstein Institute (AEI) • LDAS instances at Caltech, MIT, PSU, and UWM
For this Workshop • LSC DataGrid Sites • ldas-grid.ligo.caltech.edu • ldas-grid.ligo-wa.caltech.edu • ldas-grid.ligo-la.caltech.edu • We’ll use ldas-grid.ligo-la.caltech.edu as our head node • Full list of LSC DataGrid resources at www.lsc-group.phys.uwm.edu/lscdatagrid/resources • More discussion of LSC DataGrid later
Lab 1 — Getting Started This lab will verify: • Your software is installed correctly • Your sacrifices have pleased the webgod Ping • Your security credential (i.e. proxy certificate) is okay • Your environment variables won’t suddenly go away
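The checks listed for Lab 1 can be approximated with a small preflight script. The following is only a sketch under stated assumptions, not the lab's actual procedure: the command names (`grid-proxy-info`, `globus-url-copy`, `condor_q`) and the `GLOBUS_LOCATION` variable are guesses at what a client installation provides, and the host check is a DNS lookup rather than a real ping. It does not inspect the proxy certificate itself; it only confirms that the tool normally used for that is on the search path.

```python
import os
import shutil
import socket

# Assumed names, not taken from the slides; adjust to your installation.
ASSUMED_COMMANDS = ["grid-proxy-info", "globus-url-copy", "condor_q"]
ASSUMED_ENV_VARS = ["GLOBUS_LOCATION"]
HEAD_NODE = "ldas-grid.ligo-la.caltech.edu"  # head node named in the slides


def missing_commands(commands, path=None):
    """Return the commands that cannot be found on the search path."""
    return [c for c in commands if shutil.which(c, path=path) is None]


def missing_env_vars(names, environ=None):
    """Return the variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [n for n in names if not env.get(n)]


def host_resolves(hostname, resolver=socket.gethostbyname):
    """Return True if the hostname resolves (a DNS check, not a ping)."""
    try:
        resolver(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def preflight(commands=ASSUMED_COMMANDS, env_vars=ASSUMED_ENV_VARS,
              hostname=HEAD_NODE, environ=None, path=None,
              resolver=socket.gethostbyname):
    """Collect the results of all three checks into one dictionary."""
    return {
        "missing_commands": missing_commands(commands, path=path),
        "missing_env_vars": missing_env_vars(env_vars, environ=environ),
        "head_node_resolves": host_resolves(hostname, resolver=resolver),
    }
```

Calling `preflight()` with no arguments checks the current environment; empty lists and a `True` value suggest the basics are in place. The `environ`, `path`, and `resolver` parameters exist only so the checks can be exercised without touching the real system.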
Credits • Some slides in this presentation were adapted from presentations from • GriPhyN Grid Summer Workshop 2004 • The Globus Consortium