TeraGrid
A National Production Cyberinfrastructure Facility
Scott Lathrop
TeraGrid Director of Education, Outreach and Training
University of Chicago and Argonne National Laboratory
lathrop@mcs.anl.gov
www.teragrid.org
TeraGrid: Integrating NSF Cyberinfrastructure
[Map labels: Buffalo, Wisc, Cornell, Utah, Iowa, Caltech, USC-ISI, UNC-RENCI, UC/ANL, PU, NCAR, PSC, IU, NCSA, ORNL, SDSC, TACC]
TeraGrid is a facility that integrates computational, information, and analysis resources at the San Diego Supercomputer Center, the Texas Advanced Computing Center, the University of Chicago / Argonne National Laboratory, the National Center for Supercomputing Applications, Purdue University, Indiana University, Oak Ridge National Laboratory, the Pittsburgh Supercomputing Center, and the National Center for Atmospheric Research.
TeraGrid Vision
• TeraGrid will create integrated, persistent, and pioneering computational resources that will significantly improve our nation’s ability and capacity to gain new insights into our most challenging research questions and societal problems.
• Our vision requires an integrated approach to the scientific workflow, including obtaining access, application development and execution, data analysis and management, and collaboration.
TeraGrid Objectives
• DEEP Science: Enabling Petascale Science
  • Make science more productive through an integrated set of very-high-capability resources
  • Address key challenges prioritized by users
  (tera = 10^12; peta = 10^15)
• WIDE Impact: Empowering Communities
  • Bring TeraGrid capabilities to the broad science community
  • Partner with science community leaders and educators
• OPEN Infrastructure, OPEN Partnership
  • Provide a coordinated, general-purpose, reliable set of services and resources
  • Partner with campuses and grids
TeraGrid DEEP SCIENCE Objectives
Enabling Petascale Science
• Make science more productive through an integrated set of very-high-capability resources
  • Address key challenges prioritized by users
• Ease of Use: TeraGrid User Portal
  • Significant and deep documentation and training improvements
  • Addresses user tasks related to allocations, accounts
• Breakthroughs: Advanced Support for TeraGrid Applications (ASTA)
  • Hands-on, “embedded” consultant to help teams bridge a gap
  • Seven user teams have been helped
  • Eight user teams currently receiving assistance
  • Five proposed projects with new user teams
• New capabilities driven by user surveys
  • WAN parallel file system for remote I/O (move data only once!)
  • Enhanced workflow tools (added GridShell, VDS)
TeraGrid Resources Over 100 TeraFlops in Computing Resources
TeraGrid Usage
[Chart labels: PACI Systems; 33% Annual Growth]
TeraGrid PIs by Institution as of May 2006
[Map legend: Blue: 10 or more PIs; Red: 5-9 PIs; Yellow: 2-4 PIs; Green: 1 PI]
TeraGrid User Community
The Development Allocations Committee (DAC) accepts requests to develop applications, experiment with TeraGrid platforms, or use TeraGrid systems for classroom instruction. The 160 DAC proposals received in FY06 continue the strong growth in new users investigating the use of TeraGrid for their science.
Ease of Use: TeraGrid User Portal
• Account Management
  • Manage my allocation(s)
  • Manage my credentials
  • Manage my project users
• Information Services
  • TeraGrid resources & attributes
  • Job queues
  • Load and status information
• Documentation
  • User Info documentation
  • Contextual help for interfaces
• Consulting Services
  • Help desk information
  • Portal feedback channel
• Allocation Services
  • How to apply for allocations
  • Allocation request/renewal
Eric Roberts (ericrobe@tacc.utexas.edu)
Modeling Information Processing and Public Opinion
How do people assess political candidates? How do campaign events and new information change their views? Sung-youn Kim, University of Iowa, and Milton Lodge and Charles Taber, Stony Brook University, New York, saw gaps between existing models and empirical findings. They integrated both cognitive and affective information-processing theories into a computational model that simulates how voters’ political opinions fluctuate during a campaign. Using data from the 2000 National Annenberg Election Survey (NAES), they constructed virtual voters, each with a unique mindset. Campaign messages were gleaned from news accounts and reduced to simple sentences (e.g., “Bush said Gore is dishonest” or “Gore said Bush is anti-abortion”). The computational model parses each sentence, retrieves relevant concepts from the long-term memory of each agent, and updates each agent’s knowledge and attitudes accordingly. Using 100 agents, this simulation was repeated 100 times, returning results that accord well with the actual 2000 polling data. Because of this complexity and the sheer computational intensity of the simulation, the researchers relied on the computational power of the TeraGrid, employing systems at SDSC and NCSA.
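The loop described on this slide (parse a sentence, retrieve what each agent already believes, update attitudes, repeat over many agents and many runs) can be illustrated with a toy sketch. This is not the researchers' model. The `Agent` class, the `parse` helper, the trait lexicon, the update rule, and the `rate` parameter are all hypothetical and were chosen only to show the general shape of such a simulation.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical virtual voter; attitudes map a name to a value in [-1, 1]."""
    attitudes: dict = field(default_factory=dict)

    def update(self, source, target, valence, rate=0.1):
        # Retrieve what the agent already thinks of the speaker and the target.
        trust = self.attitudes.get(source, 0.0)
        current = self.attitudes.get(target, 0.0)
        # Assumed rule: move toward the message's valence, weighted by
        # how much the agent likes the speaker (weight lies in [0, rate]).
        weight = rate * (1.0 + trust) / 2.0
        self.attitudes[target] = current + weight * (valence - current)


def parse(sentence):
    """Parse '<source> said <target> is <trait>' into (source, target, valence)."""
    traits = {"dishonest": -1.0, "honest": 1.0}  # hypothetical lexicon
    source, rest = sentence.split(" said ", 1)
    target, trait = rest.split(" is ", 1)
    return source, target, traits[trait]


def run_campaign(messages, n_agents=100, seed=0):
    """Run one simulated campaign and return the mean attitude per candidate."""
    rng = random.Random(seed)
    agents = [Agent({"Bush": rng.uniform(-1, 1), "Gore": rng.uniform(-1, 1)})
              for _ in range(n_agents)]
    for sentence in messages:
        source, target, valence = parse(sentence)
        for agent in agents:
            agent.update(source, target, valence)
    return {name: sum(a.attitudes[name] for a in agents) / n_agents
            for name in ("Bush", "Gore")}


messages = ["Bush said Gore is dishonest", "Gore said Bush is dishonest"]
# 100 agents per run, repeated 100 times with different random seeds.
results = [run_campaign(messages, seed=s) for s in range(100)]
```

Each run is independent of the others, which is one reason a simulation of this shape spreads naturally across many processors.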
Social and Behavioral Science Gateway
Understanding neural, cognitive, and social behaviors can depend on the ability to uncover coherent patterns among disparate data collected at different times and places. Researchers correlate subjects’ videotaped reactions with eye movements, heart rates, electroencephalogram results, and answers to written surveys. “It’s complicated,” says Chicago psychologist Bennett Bertenthal, and technological infrastructure has not kept pace with research needs. Bertenthal and his team are working to develop a set of cybertools to help collect and analyze behavioral data, collaborating with the TeraGrid Science Gateways Program. With NSF support, the Social Informatics Data (SID) Grid is a vast and sophisticated warehouse of readings and measurements meant to foster collaboration and to spur the development of standards for gathering and coding data, “which,” Bertenthal says, “right now is not part of the landscape.” In two years, when the SID Grid is complete, psychologists, sociologists, anthropologists, economists, and neuroscientists can share notes across the globe and use software interfaces like the ones shown above to synthesize numerous forms of streaming data at once.
Adapted from University of Chicago Magazine
Magnetic Nanocomposites
Wang (PSC)
• Direct quantum mechanical simulation on Cray XT3.
• Goal: nano-structured material with potential applications in high-density data storage: 1 particle/bit.
• Need to understand the influence of these nanoparticles on each other.
• A petaflop machine would enable realistic simulations for nanostructures of ~50 nm (~5M atoms).
• LSMS, the locally self-consistent multiple scattering method, is a linear-scaling ab initio electronic structure method (Gordon Bell Prize winner).
• Achieves as high as 81% of peak performance on the Cray XT3.
Wang (PSC), Stocks, Rusanu, Nicholson, Eisenbach (ORNL), Faulkner (FAU)
VORTONICS
Boghosian (Tufts)
• Physical challenges: reconnection and dynamos
  • Vortical reconnection governs establishment of steady state in Navier-Stokes turbulence
  • Magnetic reconnection governs heating of the solar corona
  • The astrophysical dynamo problem: exact mechanism and space/time scales are unknown and represent important theoretical challenges
• Computational challenges: enormous problem sizes, memory requirements, and long run times
  • Requires relaxation on a space-time lattice of 5-15 terabytes
  • Requires geographically distributed domain decomposition (GD3): DTF, TCS, Lonestar
  • Real-time visualization at UC/ANL
  • Insley (UC/ANL), O’Neal (PSC), Guiang (TACC)
[Figure: Homogeneous turbulence driven by force of Arnold-Beltrami-Childress (ABC) form]
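Domain decomposition, as named on this slide, splits one large lattice into pieces that separate machines work on in parallel. The sketch below shows only the simplest version of the idea, cutting a lattice into contiguous slabs along one axis. The `decompose` function, the lattice size, and the relative weights given to each system are hypothetical; the slide does not say how VORTONICS actually partitions its lattice.

```python
def decompose(n_planes, weights):
    """Split n_planes lattice planes into contiguous slabs, one per system.

    weights maps a system name to its relative share of the work.
    Cumulative boundaries guarantee the slabs cover every plane exactly once.
    """
    total = sum(weights.values())
    slabs, start, cumulative = {}, 0, 0
    for name, weight in weights.items():
        cumulative += weight
        stop = round(n_planes * cumulative / total)
        slabs[name] = range(start, stop)
        start = stop
    return slabs


# Hypothetical weights: the first system gets twice the share of the others.
slabs = decompose(1024, {"DTF": 2, "TCS": 1, "Lonestar": 1})
for name, planes in slabs.items():
    print(name, planes.start, planes.stop)
```

In a real run, neighboring slabs would also need to exchange their boundary planes at every step, and across geographically separate sites that exchange travels over the wide-area network.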
TeraShake / CyberShake
Olsen (SDSU), Okaya (USC)
Largest and most detailed earthquake simulation of the southern San Andreas fault.
First calculation of physics-based probabilistic hazard curves for Southern California using full waveform modeling rather than traditional attenuation relationships.
Computation and data analysis at multiple TeraGrid sites.
Workflow tools enable work at a scale previously unattainable by automating the very large number of programs and files that must be managed.
TeraGrid staff: Cui (SDSC), Reddy (GIG/PSC)
[Figure: Major Earthquakes on the San Andreas Fault, 1680-present: 1906 M 7.8; 1857 M 7.8; 1680 M 7.7]
[Figure: Simulation of a magnitude 7.7 seismic wave propagation on the San Andreas Fault. 47 TB data set.]
Searching for New Crystal Structures
Deem (Rice)
• Searching for new 3-D zeolite crystal structures in crystallographic space
• Requires 10,000s of serial jobs through TeraGrid
• Using MyCluster/GridShell to aggregate all the computational capacity on the TeraGrid for accelerating the search
• TG staff: Walker (TACC) and Cheeseman (Purdue)
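Running tens of thousands of independent serial jobs is often called task farming: each job needs no communication with the others, so a pool of workers can take them in any order. The sketch below illustrates that pattern on a single machine with Python's standard `concurrent.futures` module. It does not use MyCluster or GridShell, and `score_candidate` is a hypothetical placeholder that returns a made-up figure of merit instead of evaluating a real crystal structure.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def score_candidate(seed):
    """Hypothetical stand-in for one serial job: score one candidate structure."""
    rng = random.Random(seed)
    return seed, rng.random()  # (job id, made-up score; lower is better)


def farm(seeds, workers=4):
    """Run every job on a worker pool and return results sorted by score."""
    # Jobs are independent, so the pool may run them in any order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(score_candidate, seeds))
    return sorted(results, key=lambda result: result[1])


best = farm(range(1000))[:5]
```

A thread pool keeps the example short. For CPU-bound jobs on one machine a process pool would be the usual choice, and on a grid the pool would be replaced by a job scheduler.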
TeraGrid WIDE IMPACT Objectives
Empowering Communities
• Bring TeraGrid capabilities to the broad science community
  • Partner with science community leaders: “Science Gateways”
• Science Gateways Program
  • Originally ten partners, now 21 and growing
  • Reaching over 100 Gateway partner institutions (PIs)
  • Anticipating an order-of-magnitude increase in users via Gateways
• Education, Outreach, and Training
  • National collaborations integrating TeraGrid resources
TeraGrid Science Gateways Initiative: Community Interface to Grids
[Diagram labels: TeraGrid, Grid-X, Grid-Y]
• Common Web Portal or application interfaces or grid (database access, computation, workflow, etc.).
• “Back-End” use of TeraGrid computation, information management, visualization, or other services.
• Standard approaches so science gateways may readily access resources in any cooperating Grid without technical modification.
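The last point on this slide, that a gateway should reach any cooperating grid without technical modification, is in software terms an argument for a shared interface. The sketch below illustrates the idea with one abstract base class and two interchangeable back ends. Every class and method name here is hypothetical; none of them is a real TeraGrid or gateway API, and the `submit` methods only return a label where a real adapter would call grid services.

```python
from abc import ABC, abstractmethod


class GridBackend(ABC):
    """Hypothetical common interface that every cooperating grid implements."""

    @abstractmethod
    def submit(self, application: str, inputs: dict) -> str:
        """Start a back-end job and return an opaque job identifier."""


class TeraGridBackend(GridBackend):
    def submit(self, application, inputs):
        # A real adapter would call the grid's job services here.
        return f"teragrid:{application}:{len(inputs)}"


class GridXBackend(GridBackend):
    def submit(self, application, inputs):
        return f"grid-x:{application}:{len(inputs)}"


class Gateway:
    """Portal-side code; it depends only on the GridBackend interface."""

    def __init__(self, backend: GridBackend):
        self.backend = backend

    def run(self, application: str, **inputs) -> str:
        return self.backend.submit(application, inputs)


# The same gateway code runs unchanged against either back end.
job_a = Gateway(TeraGridBackend()).run("blast", query="ACGT")
job_b = Gateway(GridXBackend()).run("blast", query="ACGT")
```

Because `Gateway` never names a concrete grid, adding a third back end means writing one more adapter class and leaving the portal code alone.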
Science Gateway Partners
• Open Science Grid (OSG)
• Special PRiority and Urgent Computing Environment (SPRUCE, UChicago)
• National Virtual Observatory (NVO, Caltech)
• Linked Environments for Atmospheric Discovery (LEAD, Indiana)
• Computational Chemistry Grid (GridChem, NCSA)
• Computational Science and Engineering Online (CSE-Online, Utah)
• GEOsciences Network (GEON, SDSC)
• Network for Earthquake Engineering Simulation (NEES, SDSC)
• SCEC Earthworks Project (USC)
• Astrophysical Data Repository (Cornell)
• CCR ACDC Portal (Buffalo)
• Network for Computational Nanotechnology and nanoHUB (Purdue)
• GIScience Gateway (GISolve, Iowa)
• Biology and Biomedicine Science Gateway (UNC RENCI)
• Open Life Sciences Gateway (OLSG, UChicago)
• The Telescience Project (UCSD)
• Grid Analysis Environment (GAE, Caltech)
• Neutron Science Instrument Gateway (ORNL)
• TeraGrid Visualization Gateway (ANL)
• BIRN (UCSD)
• Gridblast Bioinformatics Gateway (NCSA)
• Earth Systems Grid (NCAR)
• SID Grid (UChicago)
TeraGrid Science Gateway Partner Sites
[Map: TG-SGW-Partners]
21 Science Gateway Partners (and growing); over 100 partner institutions
TeraGrid Education, Outreach and Training
The mission is to engage larger and more diverse communities of researchers, educators and learners in discovering, using, and contributing to TeraGrid.
The goals are to:
• Enable awareness and broader community access to TeraGrid
• Promote diversity among all activities
• Foster partnerships to sustain and scale up best practices
K-12 Outreach
• Computational science workshops for K-12 teachers and pre-service students
• GEMS to engage young girls in math and science
• Summer workshops for students: hands-on science
• SC06 Education Programs engage teachers and faculty
• SDSC TeacherTECH engages K-12 teachers and students with sustained interaction
• SDSC Data Portal incorporates scientific data into the curricula
Higher Education Outreach
• Summer Grid workshop for undergraduate students
• Computational science workshops: faculty and students
  • E.g., computational chemistry, computational biology, etc.
• MSI workshop: campus infrastructure
• Developing Bioinformatics Programs
• SC07-09 Education Programs engage faculty and students
• Science Gateways in education
  • LEAD science gateway education portal
  • nanoHUB science gateway used in many campus courses
• HPC workshops
• Research experiences for undergraduates (REUs)
• On-line tutorials: CI topics and college courses
Community Outreach
• CIP seminars, CI-Channel: live webcasts and recorded sessions
• National conferences: Grace Hopper TeraGrid panel, SCxx annual conferences, etc.
• Science Impact stories: via web site, press releases, brochures
• TeraGrid Speaker’s Bureau: conferences, workshops, meetings
• Katrina: After the Storm: Civic Engagement Through Arts, Humanities and Technology, part of HASTAC’s Information Year 2006-2007
• Annual TeraGrid Conference: planning underway for 2007 in Washington, DC.
SC07-09 Education Program Goals
• Three-year (SC|07-09) Education Program to provide continuity and broader, sustained impact in education
• Increase participation of larger, more diverse communities in the SC Conference
• Integrate HPC into undergraduate science, technology, engineering and mathematics classrooms
• Recruiting Institutions NOW!
SC07-09 Education Program Year-round Activities
• Attend annual SC Conference
• Week-long summer workshops distributed around the country
• Regular visits to institutions for workshops and working with administrators
• Mentoring of faculty and students
• Course materials development
• Posting of materials to ACM and NSF-NSDL digital libraries
• Committees to plan and organize events
TeraGrid OPEN Objectives
Infrastructure and Partnerships
• Provide a coordinated, general-purpose, reliable set of services and resources
  • Partner with grids and facilities
• Streamlined Software Integration
  • Evolved architecture to leverage standards, web services
• Campus Partnership Programs
  • User access, physical and digital asset federation, outreach
TeraGrid “Open” Initiatives
• Working with Campuses: toward Integrated Cyberinfrastructure
  • Access for Users: Authentication and Authorization
  • Additional Capacity: Integrated resources
  • Additional Services: Integrated data collections
  • Broadening Participation: Affiliates program with diverse institutions
  • Training for faculty, staff, and students
• Technology Foundations
  • Security, Authentication, and Accounting
  • Service-based Software Architecture
Lower Integration Barriers; Improved Scaling
• Initial Integration: Implementation-based
  • Coordinated TeraGrid Software and Services (CTSS)
  • Provide software for heterogeneous systems, leverage specific implementations to achieve interoperation.
  • Evolving understanding of “minimum” required software set for users
• Emerging Architecture: Services-based
  • Core services: capabilities that define a “TeraGrid Resource”
    • Authentication & Authorization Capability
    • Information Service
    • Auditing/Accounting/Usage Reporting Capability
    • Verification & Validation Mechanism
  • Significantly smaller than the current set of required components.
  • Provides a foundation for value-added services.
    • Each Resource Provider selects one or more added services, or “kits”
    • Core and individual kits can evolve incrementally, in parallel
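The services-based architecture on this slide amounts to a small data model: a fixed set of core services that defines a TeraGrid Resource, plus optional kits that each Resource Provider chooses. The sketch below expresses that model in code. The identifiers (`CORE_SERVICES`, `ResourceProvider`, the service and kit names written as strings) are hypothetical labels derived from the slide text and do not correspond to any actual TeraGrid software component.

```python
from dataclasses import dataclass, field

# The four core capabilities listed on the slide, as hypothetical identifiers.
CORE_SERVICES = frozenset({
    "authentication_authorization",
    "information_service",
    "auditing_accounting_usage_reporting",
    "verification_validation",
})


@dataclass
class ResourceProvider:
    name: str
    services: set = field(default_factory=set)  # core services deployed
    kits: set = field(default_factory=set)      # optional value-added kits

    def missing_core(self):
        """Return the core services this provider has not deployed yet."""
        return sorted(CORE_SERVICES - self.services)

    def is_teragrid_resource(self):
        # Only the core set defines a resource; kits are added on top.
        return not self.missing_core()


site = ResourceProvider("example-site", set(CORE_SERVICES),
                        {"job_execution", "data_movement"})
partial = ResourceProvider("partial-site", {"information_service"})
```

Keeping the core check separate from the kit list mirrors the slide's point that the core and the individual kits can evolve independently.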
Example Value-Added Service Kits
• Job Execution
• Application Development
• Science Gateway Hosting
• Application Hosting
  • dynamic service deployment
• Data Movement
• Data Management
• Science Workflow Support
• Visualization
[Diagram labels: Data Collections; Instruments & Sensors; Science/Education Portal; Colleagues]
Cyberinfrastructure User Advisory Committee
• Patricia Teller, University of Texas - El Paso
• Gwen Jacobs, Montana State University
• Thomas Cheatham, University of Utah
• Luis Lehner, Louisiana State University
• Roy Pea, Stanford University
• Cathy Wu, Georgetown University
• Philip Maechling, University of Southern California
• Gerhard Klimeck, Purdue University
• Bennett Bertenthal, University of Chicago
• PK Yeung, Georgia Institute of Technology
• Alex Ramirez, Hispanic Association of Colleges and Universities
• Nora Sabelli, Center for Innovative Learning Technologies
TeraGrid: More Information www.TeraGrid.org