GriPhyN Project Overview • Paul Avery, University of Florida, avery@phys.ufl.edu • GriPhyN NSF Project Review, 29-30 January 2003, Chicago
GriPhyN = Experiments + CS + Grids • GriPhyN = Grid Physics Network • Computer Scientists (Globus, Condor, SRB, …) • Physicists from 4 frontier physics/astronomy expts. • GriPhyN basics (2000 – 2005) • $11.9M (NSF) + $1.6M (matching) • 17 universities, SDSC, 3 labs, ~80 people • Integrated Outreach effort (UT Brownsville) • Management • Paul Avery (Florida) co-Director • Ian Foster (Chicago) co-Director • Mike Wilde (Argonne) Project Coordinator • Rick Cavanaugh (Florida) Deputy Coordinator
GriPhyN Institutions (Sep. 2000): U Florida • U Chicago • Boston U • Caltech • U Wisconsin, Madison • USC/ISI • Harvard • Indiana • Johns Hopkins • Texas A&M • Stanford • U Illinois at Chicago • U Penn • U Texas, Brownsville • UC Berkeley • U Wisconsin, Milwaukee • UC San Diego • SDSC • Lawrence Berkeley Lab • Argonne • Fermilab • Brookhaven (map legend: funded by GriPhyN)
GriPhyN Vision • Create tools to enable collaborative research • Large research teams • … by global scientific communities • International distribution of people and resources • … at petascale levels • PetaOps + PetaBytes + Performance • … in a transparent way • Scientists think in terms of their science
GriPhyN Science Drivers • US-CMS & US-ATLAS: HEP experiments at LHC/CERN, 100s of Petabytes (data from 2007) • LIGO: gravity wave experiment, 100s of Terabytes (data from 2002) • Sloan Digital Sky Survey: digital astronomy (1/4 sky), 10s of Terabytes (data from 2001) • Common demands: massive CPU, large distributed datasets, large distributed communities, with both community and data growth over time
GriPhyN Goals • Conduct CS research to achieve vision • Virtual Data as unifying principle • Planning, execution, performance monitoring • Disseminate through Virtual Data Toolkit • Integrate into GriPhyN science experiments • Common Grid tools, services • Impact other disciplines • HEP, biology, medicine, virtual astronomy, engineering • Other Grid projects • Educate, involve, and train students in IT research • Undergrads, grads, postdocs • Underrepresented groups
Goal: PetaScale Virtual-Data Grids (architecture diagram) • Users range from single researchers and workgroups to production teams, working through interactive user tools • Tool layers: request planning & scheduling tools, request execution & management tools, virtual data tools • Supporting services: resource management services, security and policy services, other Grid services • Underneath: distributed resources (code, storage, CPUs, networks), raw data sources, and transforms • Scale targets: PetaOps • Petabytes • Performance
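To make the "virtual data" idea that unifies these tools concrete, here is a minimal, hypothetical Python sketch: derived datasets are registered as recipes (a transformation plus its inputs) and are materialized on demand, either from an existing replica or by re-running the derivation. The class and method names below are illustrative assumptions only, not the actual GriPhyN Virtual Data Toolkit or Chimera/VDL API.

```python
# Conceptual sketch of "virtual data": datasets are described by recipes
# (transformation + inputs) and are (re)derived on demand, so a request can
# be satisfied from an existing replica or by re-executing the derivation.
# Names are illustrative -- not the GriPhyN VDT / Chimera / VDL interface.

from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Derivation:
    """Recipe: which transformation produces a dataset from which inputs."""
    transform: Callable[..., object]
    inputs: List[str] = field(default_factory=list)


class VirtualDataCatalog:
    def __init__(self) -> None:
        self.derivations: Dict[str, Derivation] = {}  # dataset name -> recipe
        self.replicas: Dict[str, object] = {}         # already-materialized data

    def register(self, name: str, transform: Callable[..., object],
                 inputs: List[str]) -> None:
        self.derivations[name] = Derivation(transform, inputs)

    def materialize(self, name: str) -> object:
        """Return the dataset, re-deriving it (and its inputs) only if needed."""
        if name in self.replicas:                     # use an existing replica
            return self.replicas[name]
        recipe = self.derivations[name]
        args = [self.materialize(dep) for dep in recipe.inputs]
        data = recipe.transform(*args)                # "plan + execute" the request
        self.replicas[name] = data                    # cache for later requests
        return data


# Usage: raw data already exists; reconstructed and analysis products are virtual
catalog = VirtualDataCatalog()
catalog.replicas["raw"] = list(range(10))
catalog.register("reco", lambda raw: [x * 2 for x in raw], ["raw"])
catalog.register("analysis", lambda reco: sum(reco), ["reco"])
print(catalog.materialize("analysis"))  # derives "reco", then "analysis" -> 90
```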
Example: Global LHC Data Grid (CMS experiment) • Online System → Tier 0 (CERN Computer Center, >20 TIPS) at 100-200 MBytes/s • Tier 0 → Tier 1 national centers (USA, UK, Russia, Korea, …) at 2.5-10 Gbits/s • Tier 1 → Tier 2 centers at 2.5 Gbits/s • Tier 2 → Tier 3 institutes at ~0.6 Gbits/s • Tier 3 → Tier 4 (physics caches, PCs, other portals) at 1 Gbits/s
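As a rough back-of-the-envelope check of the scale these rates imply (assuming, as is conventional for accelerator experiments, roughly $10^{7}$ seconds of effective running per year):

$$
100~\mathrm{MB/s} \times 10^{7}~\mathrm{s/yr} \approx 10^{15}~\mathrm{bytes/yr} = 1~\mathrm{PB/yr},
$$

so the 100-200 MBytes/s stream from the online system alone accumulates petabytes per year at Tier 0; with several experiments, reprocessing passes, and simulation, this is consistent with the "100s of Petabytes" quoted for the LHC program.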
GriPhyN Project Challenges • We balance and coordinate • CS research with “goals, milestones & deliverables” • GriPhyN schedule/priorities/risks with those of the 4 experiments • General tools developed by GriPhyN with specific tools developed by the 4 experiments • Data Grid design, architecture & deliverables with those of other Grid projects • Appropriate balance requires • Tight management, close coordination, trust • We have (so far) met these challenges • But this requires constant attention and good will
GriPhyN Management (organization chart) • Project Directors: Paul Avery, Ian Foster • External Advisory Committee • Project Coordination: Mike Wilde, Rick Cavanaugh • Architecture: Carl Kesselman • Industrial Connections: Ian Foster / Paul Avery • Outreach/Education: Manuela Campanelli • iVDGL: Rob Gardner • Applications (Coord.: R. Cavanaugh): ATLAS (Rob Gardner), CMS (Rick Cavanaugh), LIGO (Albert Lazzarini), SDSS (Alexander Szalay) • VDT Development (Coord.: M. Livny): Requirements, Definition & Scheduling (Miron Livny); Integration, Testing, Documentation, Support (Alain Roy); Globus Project & NMI Integration (Carl Kesselman) • CS Research (Coord.: I. Foster): Virtual Data (Mike Wilde); Request Planning & Scheduling (Ewa Deelman); Execution Management (Miron Livny); Measurement, Monitoring & Prediction (Valerie Taylor) • Inter-Project Coordination (R. Pordes): HICB (Larry Price); HIJTB (Carl Kesselman); PPDG (Ruth Pordes); TeraGrid, NMI, etc. (TBD); International, EDG etc. (Ruth Pordes) • External links: DOE Science, Internet2, NSF PACIs, physics experiments, EDG, LCG, other Grid projects, iVDGL
External Advisory Committee • Members • Fran Berman (SDSC Director) • Dan Reed (NCSA Director) • Joel Butler (former head, FNAL Computing Division) • Jim Gray (Microsoft) • Bill Johnston (LBNL, DOE Science Grid) • Fabrizio Gagliardi (CERN, EDG Director) • David Williams (former head, CERN IT) • Paul Messina (former CACR Director) • Roscoe Giles (Boston U, NPACI-EOT) • Met with us 3 times: 4/2001, 1/2002, 1/2003 • Extremely useful guidance on project scope & goals
Integration of GriPhyN and iVDGL • International Virtual-Data Grid Laboratory • A global Grid laboratory (US, EU, Asia, …) • A place to conduct Data Grid tests “at scale” • A mechanism to create common Grid infrastructure • A laboratory for Grid tests by other disciplines • Tight integration with GriPhyN • Testbeds • VDT support • Outreach • Common External Advisory Committee • International participation • DataTAG (EU) • UK e-Science programme: supports 6 CS Fellows
GriPhyN/iVDGL Basics • Both NSF funded, overlapping periods • GriPhyN: $11.9M (NSF) + $1.6M (match) (2000–2005) • iVDGL: $13.7M (NSF) + $2M (match) (2001–2006) • Basic composition • GriPhyN: 12 universities, SDSC, 3 labs (~82 people) • iVDGL: 16 institutions, SDSC, 3 labs (~84 people) • Large overlap: people, institutions, experiments • GriPhyN (Grid research) vs iVDGL (Grid deployment) • GriPhyN: 2/3 “CS” + 1/3 “physics” (0% H/W) • iVDGL: 1/3 “CS” + 2/3 “physics” (20% H/W) • Virtual Data Toolkit (VDT) in common • Testbeds in common
iVDGL Institutions • U Florida (CMS) • Caltech (CMS, LIGO) • UC San Diego (CMS, CS) • Indiana U (ATLAS, iGOC) • Boston U (ATLAS) • U Wisconsin, Milwaukee (LIGO) • Penn State (LIGO) • Johns Hopkins (SDSS, NVO) • U Chicago (CS) • U Southern California (CS) • U Wisconsin, Madison (CS) • Salish Kootenai (Outreach, LIGO) • Hampton U (Outreach, ATLAS) • U Texas, Brownsville (Outreach, LIGO) • Fermilab (CMS, SDSS, NVO) • Brookhaven (ATLAS) • Argonne Lab (ATLAS, CS) • Site categories: T2 / Software • CS support • T3 / Outreach • T1 / Labs (not funded)
US-iVDGL Sites (Spring 2003) (map of Tier1, Tier2 and Tier3 sites) • Sites: SKC, Boston U, Wisconsin, Michigan, PSU, BNL, Fermilab, LBL, Argonne, J. Hopkins, NCSA, Indiana, Hampton, Caltech, Oklahoma, Vanderbilt, UCSD/SDSC, FSU, Arlington, UF, FIU, Brownsville • Possible partners: EU, CERN, Brazil, Australia, Korea, Japan
Example: US-CMS Grid Testbed (map) • Sites: Wisconsin, Fermilab, Caltech, UCSD, FSU, Florida, FIU, CERN, Korea, Brazil
iVDGL Management & Coordination (organization chart) • U.S. piece: US Project Directors, US External Advisory Committee, US Project Steering Group, Project Coordination Group • Teams: Facilities Team, Core Software Team, Operations Team, Applications Team, GLUE Interoperability Team, Outreach Team • Collaborating Grid projects: GriPhyN (Mike Wilde), DataTAG, TeraGrid, EDG, LCG?, Asia • Collaborating applications: BTEV, ALICE, D0, CMS HI, PDC, Bio, Geo, ? • International piece
Meetings in 2000-2001 • GriPhyN/iVDGL meetings • Oct. 2000 All-hands Chicago • Dec. 2000 Architecture Chicago • Apr. 2001 All-hands, EAC USC/ISI • Aug. 2001 Planning Chicago • Oct. 2001 All-hands, iVDGL USC/ISI • Numerous smaller meetings • CS-experiment • CS research • Liaisons with PPDG and EU DataGrid • US-CMS and US-ATLAS computing reviews • Experiment meetings at CERN
Meetings in 2002 • GriPhyN/iVDGL meetings • Jan. 2002 EAC, Planning, iVDGL Florida • Mar. 2002 Outreach Workshop Brownsville • Apr. 2002 All-hands Argonne • Jul. 2002 Reliability Workshop ISI • Oct. 2002 Provenance Workshop Argonne • Dec. 2002 Troubleshooting Workshop Chicago • Dec. 2002 All-hands technical ISI + Caltech • Jan. 2003 EAC SDSC • Numerous other 2002 meetings • iVDGL facilities workshop (BNL) • Grid activities at CMS, ATLAS meetings • Several computing reviews for US-CMS, US-ATLAS • Demos at IST2002, SC2002 • Meetings with LCG (LHC Computing Grid) project • HEP coordination meetings (HICB)
Progress: CS, VDT, Outreach • Lots of good CS research (Later talks) • Installation revolution: VDT + Pacman (Later talk) • Several major releases this year: VDT 1.1.5 • VDT/Pacman vastly simplify Grid software installation • Used by all experiments • Agreement to use VDT by LHC Computing Grid Project • Grid integration in experiment s/w (Later talks) • Expanding education/outreach (Later talk) • Integration with iVDGL • Collaborations: PPDG, NPACI-EOT, SkyServer, QuarkNet • Meetings, brochures, talks, …
Progress: Student Participation • Integrated student involvement • CS research • VDT deployment, testing, support • Integrating Grid tools in physics experiments • Cluster building, testing • Grid software deployment • Outreach, web development • Integrated postdoc involvement • Involvement in all areas • Necessary where students alone are not sufficient
Global Context: Data Grid Projects • U.S. Infrastructure Projects • GriPhyN (NSF) • iVDGL (NSF) • Particle Physics Data Grid (DOE) • TeraGrid (NSF) • DOE Science Grid (DOE) • Major EU and Asian projects • European Data Grid (EDG) (EU, EC) • EDG-related national projects (UK, Italy, France, …) • CrossGrid (EU, EC) • DataTAG (EU, EC) • LHC Computing Grid (LCG) (CERN) • Japanese project • Korean project
U.S. Project Coordination: Trillium • Trillium = GriPhyN + iVDGL + PPDG • Large overlap in leadership, people, experiments • Benefits of coordination • Common S/W base + packaging: VDT + Pacman • Low overhead for collaborative or joint projects: security, monitoring, newsletter, production grids, demos • Wide deployment of new technologies, e.g. Virtual Data • Stronger, more extensive outreach effort • Forum for US Grid projects • Joint view, strategies, meetings and work • Unified entity to deal with EU & other Grid projects • “Natural” collaboration across DOE and NSF projects • Funding agency interest?
International Grid Coordination • Close collaboration with EU DataGrid (EDG) • Many connections with EDG activities • HICB: HEP Inter-Grid Coordination Board • Non-competitive forum, strategic issues, consensus • Cross-project policies, procedures and technology • International joint projects • HICB-JTB Joint Technical Board • Definition, oversight and tracking of joint projects • GLUE interoperability group • Participation in LHC Computing Grid (LCG) • Software Computing Committee (SC2) • Project Execution Board (PEB) • Grid Deployment Board (GDB)
Creation of WorldGrid • Joint US-EU Grid deployment • GriPhyN contribution: VDT • WorldGrid is a major driver for VDT • Demonstrated at IST2002 (Copenhagen) • Demonstrated at SC2002 (Baltimore) • Becoming a major outreach tool in 2003 • Meeting in February to continue development
WorldGrid Sites (map)
What Coordination Takes
Extending GriPhyN’s Reach • Dynamic workspaces proposal • Expansion of virtual data technologies to global analysis communities • FIU: creation of “CHEPREO” in Miami area • HEP research, participation in WorldGrid • Strong minority E/O effort, coordinated with GriPhyN/iVDGL • Research & int’l network: Brazil / South America • Also MRI, SciDAC, and other proposals
Summary • CS research • Unified approach based around Virtual Data • Virtual Data, Planning, Execution, Monitoring • Education/Outreach • Student & postdoc involvement at all levels • New collaborations with other E/O efforts, WorldGrid • Organization and management • Clear management coordinating CS + experiments • Collaboration/coordination, US and international • Research dissemination, broad impact • Wide deployment of VDT (US, WorldGrid, EDG, LCG) • Demo projects, experiment testbeds, major productions • New projects extending virtual data technologies
Grid References • Grid Book • www.mkp.com/grids • GriPhyN • www.griphyn.org • iVDGL • www.ivdgl.org • PPDG • www.ppdg.net • TeraGrid • www.teragrid.org • Globus • www.globus.org • Global Grid Forum • www.gridforum.org • EU DataGrid • www.eu-datagrid.org