An Introduction to the TeraGrid • Jeffrey P. Gardner • Pittsburgh Supercomputing Center • gardnerj@psc.edu
National Science Foundation TeraGrid The world’s largest collection of supercomputers CIG MCW, Boulder, CO
Pittsburgh Supercomputing Center • Founded in 1986 • Joint venture between Carnegie Mellon University, University of Pittsburgh, and Westinghouse Electric Co. • Funded by several federal agencies as well as private industry • Main source of support is the National Science Foundation
Pittsburgh Supercomputing Center • PSC is the third-largest NSF-sponsored supercomputing center • BUT last year we provided over 60% of the computer time used by NSF research • AND PSC most recently had the most powerful supercomputer in the world (for unclassified research)
Pittsburgh Supercomputing Center • SCALE: • 3000 processors • SIZE: • 1 basketball court • COMPUTING POWER: • 6 TeraFlops (6 trillion floating-point operations per second) • Will do in 3 hours what a PC will do in a year • The Terascale Computing System (TCS) at the Pittsburgh Supercomputing Center. Upon entering production in October 2001, the TCS was the most powerful computer in the world for unclassified research.
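The "3 hours versus a year" comparison can be checked with a little arithmetic. The sketch below is only a sanity check of the slide's figures: it assumes a 365-day year and that both machines sustain their quoted rates, and it derives the PC speed the comparison implies (the slide does not state one).

```python
# Back-of-the-envelope check of "will do in 3 hours what a PC will do in a year".
# Assumptions (not from the slide): a 365-day year, and sustained rates equal
# to the quoted peak figure.

HOURS_PER_YEAR = 365 * 24          # 8760 hours

tcs_flops = 6e12                   # 6 TeraFlops, as quoted for the TCS
tcs_hours = 3                      # time the TCS needs for the workload

# How many times faster the TCS must be for the claim to hold.
speedup = HOURS_PER_YEAR / tcs_hours

# The PC speed implied by that ratio.
implied_pc_flops = tcs_flops / speedup

print(f"Required speedup: {speedup:.0f}x")
print(f"Implied PC speed: {implied_pc_flops / 1e9:.2f} GigaFlops")
```

Under these assumptions the claim corresponds to a PC running at roughly 2 GigaFlops.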
Pittsburgh Supercomputing Center • HEAT GENERATED: • 2.5 million BTUs • (169 lbs of coal per hour) • AIR CONDITIONING: • 900 gallons of water per minute • (375 room air conditioners) • BOOT TIME: • ~3 hours
Pittsburgh Supercomputing Center
NCSA: National Center for Supercomputing Applications • SCALE: • 1774 processors • ARCHITECTURE: • Intel Itanium2 • COMPUTING POWER: • 10 TeraFlops • The TeraGrid cluster at NCSA
TACC: Texas Advanced Computing Center • SCALE: • 1024 processors • ARCHITECTURE: • Intel Xeon • COMPUTING POWER: • 6 TeraFlops • The TeraGrid cluster “LoneStar” at TACC
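The processor counts and TeraFlops figures quoted for the three machines allow a rough per-processor comparison. This is a minimal sketch that simply divides the quoted totals; it assumes the TeraFlops numbers are aggregate figures for the listed processor counts, which the slides do not state explicitly.

```python
# Rough per-processor throughput from the figures quoted on the slides.
# Assumption: each TeraFlops figure is the aggregate for the listed processors.

machines = {
    # name: (processors, aggregate flops)
    "PSC TCS":       (3000, 6e12),
    "NCSA cluster":  (1774, 10e12),
    "TACC LoneStar": (1024, 6e12),
}

per_processor_gflops = {
    name: flops / processors / 1e9
    for name, (processors, flops) in machines.items()
}

for name, gflops in per_processor_gflops.items():
    print(f"{name:14s} {gflops:5.2f} GigaFlops per processor")
```

The spread in per-processor rates is one concrete illustration of how heterogeneous the TeraGrid resources are.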
Before the TeraGrid: Supercomputing “The Old-Fashioned Way” • Each supercomputer center was its own independent entity • Users applied for time at a specific supercomputer center • Each center supplied its own: • compute resources • archival resources • accounting • user support
The TeraGrid Strategy • Creating a unified user environment… • Single user support resources • Single authentication point • Common software functionality • Common job management infrastructure • Globally-accessible data storage • …across heterogeneous resources • 7+ computing architectures • 5+ visualization resources • diverse storage technologies • Create a unified national HPC infrastructure that is both heterogeneous and extensible
The TeraGrid Strategy • TeraGrid Resource Partners • A major paradigm shift for HPC resource providers • Make NSF resources useful to a wider community • Strength through uniformity! Strength through diversity!
TeraGrid Components • Compute hardware • Intel/Linux clusters • Alpha SMP clusters • IBM POWER3 and POWER4 clusters • SGI Altix SMPs • Sun visualization systems • Cray XT3 (PSC July 20) • IBM Blue Gene/L (SDSC Oct 1)
TeraGrid Components • Large-scale storage systems • hundreds of terabytes for secondary storage • Very high-speed network backbone (40 Gb/s) • bandwidth for rich interaction and tight coupling • Grid middleware • Globus, data management, … • Next-generation applications
Building a System of Unprecedented Scale • 40+ teraflops compute • 1+ petabyte online storage • 10-40 Gb/s networking
TeraGrid Resources
“Grid-Like” Usage Scenarios Currently Enabled by the TeraGrid • “Traditional” massively parallel jobs • Tightly-coupled interprocessor communication • Storing vast amounts of data remotely • Remote visualization • Thousands of independent jobs • Automatically scheduled amongst many TeraGrid machines • Use data from a distributed data collection • Multi-site parallel jobs • Compute upon many TeraGrid sites simultaneously • TeraGrid is working to enable more!
Allocations Policies • Any US researcher can request an allocation • Policies/procedures posted at: • http://www.paci.org/Allocations.html • Online proposal submission • https://pops-submit.paci.org/
Allocations Policies • Different levels of review for different size allocations • DAC: “Development Allocation Committee” • up to 30,000 Service Units (“SUs”, 1 SU ≈ 1 CPU hour) • only a one-paragraph abstract required • must focus on developing an MRAC or NRAC application • accepted continuously! • MRAC: “Medium Resource Allocation Committee” • <200,000 SUs/year • reviewed every 3 months • next deadline July 15, 2005 (then October 21) • NRAC: “National Resource Allocation Committee” • >200,000 SUs/year • reviewed every 6 months • next deadline July 15, 2005 (then January 2006)
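The three review levels amount to a simple lookup on the size of the request. The sketch below is illustrative only: the function name `review_committee` is hypothetical, and the handling of a request of exactly 200,000 SUs is an assumption, since the slide gives "<200,000" for MRAC and ">200,000" for NRAC and leaves the boundary undefined. It also ignores the DAC requirement that the work target a later MRAC or NRAC application.

```python
# Illustrative mapping from request size to review committee.
# Thresholds are from the slide; the function name and the treatment of
# exactly 200,000 SUs (sent to NRAC here) are assumptions.

DAC_LIMIT_SUS = 30_000      # DAC: up to 30,000 SUs, accepted continuously
MRAC_LIMIT_SUS = 200_000    # MRAC: below 200,000 SUs/year, reviewed quarterly


def review_committee(requested_sus: int) -> str:
    """Return the committee that would review a request of this size."""
    if requested_sus <= 0:
        raise ValueError("requested_sus must be positive")
    if requested_sus <= DAC_LIMIT_SUS:
        return "DAC"
    if requested_sus < MRAC_LIMIT_SUS:
        return "MRAC"
    return "NRAC"           # reviewed every 6 months


for request in (10_000, 150_000, 500_000):
    print(f"{request:>7,} SUs -> {review_committee(request)}")
```

Since 1 SU is roughly 1 CPU hour, a 30,000 SU development allocation corresponds to about 10 hours on 3000 processors.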
Accounts and Account Management • Once a project is approved, the PI can add any number of users by filling out a simple online form • User account creation usually takes 2-3 weeks • TG accounts are created on ALL TG systems for every user • a single US mail packet arrives for each user • accounts and usage are synched through a centralized database
Roaming and Specific Allocations • R-Type: “roaming” allocations • can be used on any TG resource • usage is debited to a single (global) allocation maintained in a central database • S-Type: “specific” allocations • can only be used on the specified resource • (All S-only awards come with 30,000 roaming SUs to encourage roaming usage of TG)
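The difference between the two allocation types can be pictured as a small bookkeeping model. This is a hypothetical sketch, not the actual TeraGrid accounting system: the `Allocation` class, its fields, and the resource labels "site-a" and "site-b" are all invented for illustration. It only encodes the two rules stated above: a roaming allocation can be charged from any resource, and a specific allocation only from its own.

```python
# Hypothetical model of R-type (roaming) and S-type (specific) allocations.
# Class, field, and resource names are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Allocation:
    balance_sus: float
    resource: Optional[str] = None   # None means roaming (any TG resource)

    @property
    def is_roaming(self) -> bool:
        return self.resource is None

    def charge(self, resource: str, sus: float) -> None:
        """Debit usage on `resource` against this allocation."""
        if not self.is_roaming and resource != self.resource:
            raise ValueError(f"allocation is specific to {self.resource!r}")
        if sus > self.balance_sus:
            raise ValueError("insufficient service units")
        self.balance_sus -= sus


# An S-type award plus the 30,000 roaming SUs that accompany it.
specific = Allocation(balance_sus=100_000, resource="site-a")
roaming = Allocation(balance_sus=30_000)

roaming.charge("site-a", 500)        # roaming SUs work anywhere
roaming.charge("site-b", 1_500)
specific.charge("site-a", 10_000)    # allowed: matches the award's resource

try:
    specific.charge("site-b", 1)     # rejected: wrong resource
except ValueError as err:
    print("rejected:", err)

print("roaming balance:", roaming.balance_sus)
print("specific balance:", specific.balance_sus)
```

Keeping the roaming balance in one object mirrors the idea of a single global allocation tracked in a central database, regardless of which site the job ran on.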
Useful links • TeraGrid website • http://www.teragrid.org • Policies/procedures posted at: • http://www.paci.org/Allocations.html • TeraGrid user information overview • http://www.teragrid.org/userinfo/index.html • Summary of TG Resources • http://www.teragrid.org/userinfo/guide_hardware_table.html • Summary of machines with links to site-specific user guides (just click on the name of each site) • http://www.teragrid.org/userinfo/guide_hardware_specs.html • Email: help@teragrid.org