TeraGrid Science Gateways Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways wilkinsn@sdsc.edu NSF Program Officers, September 10, 2008
Today I hope to answer • What are gateways? • Why are gateways worth the effort? • What do they allow scientists to do that they couldn't without gateways? What are some specific examples of this? Why are these examples important? • Impact on education and workforce development • Why sustainable gateways are important • We’ll demonstrate these with individual examples
May, 2007 Gateway presentation at the NSF: How many of you were here? • 4 hour recap in two slides • Web developments, explosion of digital data are leading to the increased importance of gateways • 16 years after the availability of Mosaic, full impact on science yet to be felt • Many studies point to the impact of the internet on science • Public perception of the value of science increases with their use of science-based websites • Web usage model resonates with scientists • But, need persistency if the Web is to have a profound impact on science
NSF has a long history in combining science and technology • PACI, ITR, STCs • Leadership continues today • 5 great presentations • Gerhard Klimeck, Purdue, nanoHUB • Dennis Gannon, Indiana University, LEAD • Sudhakar Pamidighantam, UIUC, GridChem • John McGee, RENCI, TeraGrid Bioportal • Shaowen Wang, UIUC, GISolve
Today, there are approximately 29 gateways using the TeraGrid
Does a gateway have to use TeraGrid to be a gateway? • No, I just talk about those that do because of my funding • But my position exposes me to a variety of gateways, many • Using high end resources is more work and is not recommended unless it serves a demonstrated need • Gateways are an excellent way to extend the impact of high-end resources • Are they all funded by TeraGrid? • Can TeraGrid claim success for all gateways? • No, we don’t make the gateways you use, we make the gateways you use better
Tremendous Opportunities Using the Largest Shared Resources - Challenges too! • What’s different when the resource doesn’t belong just to me? • Resource discovery • Accounting • Security • Proposal-based requests for resources (peer-reviewed access) • Code scaling and performance numbers • Justification of resources • Gateway citations • Tremendous benefits at the high end, but even more work for the developers • Potential impact on science is huge • Small number of developers can impact thousands of scientists • But need a way to train and fund those developers and provide them with appropriate tools
Why are gateways worth the effort? • Increasing range of expertise needed to tackle the most challenging scientific problems • How many details do you want each individual scientist to need to know? • PBS, RSL, Condor • Coupling multi-scale codes • Assembling data from multiple sources • Collaboration frameworks

Condor-G submit file:

    # Full path to executable
    executable=/users/wilkinsn/tutorial/bin/mcell
    # Working directory, where Condor-G will write
    # its output and error files on the local machine.
    initialdir=/users/wilkinsn/tutorial/exercise_3
    # To set the working directory of the remote job, we
    # specify it in this globus RSL, which will be appended
    # to the RSL that Condor-G generates
    globusrsl=(directory='/users/wilkinsn/tutorial/exercise_3')
    # Arguments to pass to executable.
    arguments=nmj_recon.main.mdl
    # Condor-G can stage the executable
    transfer_executable=false
    # Specify the globus resource to execute the job
    globusscheduler=tg-login1.sdsc.teragrid.org/jobmanager-pbs
    # Condor has multiple universes, but Condor-G always uses globus
    universe=globus
    # Files to receive stdout and stderr.
    output=condor.out
    error=condor.err
    # Specify the number of copies of the job to submit to the condor queue.
    queue 1

PBS batch script:

    #! /bin/sh
    #PBS -q dque
    #PBS -l nodes=1:ppn=2
    #PBS -l walltime=00:02:00
    #PBS -o pbs.out
    #PBS -e pbs.err
    #PBS -V
    cd /users/wilkinsn/tutorial/exercise_3
    ../bin/mcell nmj_recon.main.mdl

Globus RSL:

    +(
      &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
       (executable="/users/birnbaum/tutorial/bin/mcell")
       (arguments=nmj_recon.main.mdl)
       (count=128)
       (hostCount=10)
       (maxtime=2)
       (directory="/users/birnbaum/tutorial/exercise_3")
       (stdout="/users/birnbaum/tutorial/exercise_3/globus.out")
       (stderr="/users/birnbaum/tutorial/exercise_3/globus.err")
    )
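To make the "how many details" question concrete, here is a minimal sketch of what a gateway back end might do on the scientist's behalf: take only the input file name and a couple of sizing choices, then fill in the PBS and Condor-G boilerplate itself. The paths, queue name and scheduler contact are copied from the slide. The `JobRequest` class, the `SITE` dictionary and both function names are hypothetical and are not taken from any actual TeraGrid gateway.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """The only details the scientist supplies (hypothetical structure)."""
    input_file: str
    walltime_minutes: int = 2
    nodes: int = 1
    procs_per_node: int = 2


# Site-specific settings a gateway would keep in its own configuration.
# Values come from the slide; the dictionary itself is an assumption.
SITE = {
    "queue": "dque",
    "workdir": "/users/wilkinsn/tutorial/exercise_3",
    "executable": "/users/wilkinsn/tutorial/bin/mcell",
    "scheduler": "tg-login1.sdsc.teragrid.org/jobmanager-pbs",
}


def render_pbs_script(req, site=SITE):
    """Return the text of a PBS batch script for this request."""
    hours, minutes = divmod(req.walltime_minutes, 60)
    lines = [
        "#! /bin/sh",
        f"#PBS -q {site['queue']}",
        f"#PBS -l nodes={req.nodes}:ppn={req.procs_per_node}",
        f"#PBS -l walltime={hours:02d}:{minutes:02d}:00",
        "#PBS -o pbs.out",
        "#PBS -e pbs.err",
        "#PBS -V",
        f"cd {site['workdir']}",
        f"{site['executable']} {req.input_file}",
    ]
    return "\n".join(lines) + "\n"


def render_condor_g_submit(req, site=SITE):
    """Return the text of a Condor-G submit file for this request."""
    lines = [
        f"executable={site['executable']}",
        f"initialdir={site['workdir']}",
        f"globusrsl=(directory='{site['workdir']}')",
        f"arguments={req.input_file}",
        "transfer_executable=false",
        f"globusscheduler={site['scheduler']}",
        "universe=globus",
        "output=condor.out",
        "error=condor.err",
        "queue 1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    request = JobRequest(input_file="nmj_recon.main.mdl")
    print(render_pbs_script(request))
    print(render_condor_g_submit(request))
```

In this sketch the scientist sees a single field (the model file) while the scheduler syntax stays inside the gateway, which is the division of labor the slide argues for.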
Not just ease of use: What can scientists do that they couldn’t do previously? • LEAD - access to radar data • NVO – access to sky surveys • OOI – access to sensor data • PolarGrid – access to polar ice sheet data • SIDGrid – analysis tools • GridChem – developing multiscale coupling • How would this have been done before gateways?
Gateways can further investments in other projects • Increase access • To instruments, we’ll see an example today • Increase capabilities • To analyze data, we’ll see an example today • Improve workforce development • For underserved populations, we’ll see an example today • Increase outreach • Increase public awareness • Public sees value in investments in large facilities • Slice bread • Pack the kids’ lunch, etc.
Gateways in the marketplace: Kids control telescopes and share images • “In seconds my computer screen was transformed into a live telescopic view” • “Slooh's users include newbies and professional astronomers in 70 countries” • Observatories in the Canary Islands and Chile, Australia coming soon • 5000 images/month since 2003 • Increases public support for investment in these facilities
Gateways Greatly Expand Access • Almost anyone can investigate scientific questions using high end resources • Not just those in the research groups of those who request allocations • Gateways allow anyone with a web browser to explore • Opportunities can be uncovered via Google • My 11-year-old son discovered nanoHUB.org himself while his class was studying Bucky Balls • Fosters new ideas, cross-disciplinary approaches • Encourages students to experiment • But used in production too • Significant number of papers resulting from gateways including GridChem, nanoHUB • Scientists can focus on challenging science problems rather than challenging infrastructure problems
TeraGrid Pathways Activities • 2 Gateway components • Adapt gateways for educational use by underrepresented communities • GEON – SDSC, Navajo Tech • Teach participants from underrepresented communities how to build gateways • PolarGrid – IU, ECSU
Navajo Technical College and gateways • Incorporating the use of gateways in their curricula • GEON, GISolve areas of initial interest
PolarGrid • Cyberinfrastructure Center for Polar Science (CICPS) • Experts in polar science, remote sensing and cyberinfrastructure • Indiana, ECSU, CReSIS • Satellite observations show disintegration of ice shelves in West Antarctica and speed-up of several glaciers in southern Greenland • Most existing ice sheet models, including those used by IPCC cannot explain the rapid changes http://www.polargrid.org/polargrid/images/4/42/C0050-polargrid-big.m4v Source: Geoffrey Fox
Components of PolarGrid • Expedition grid consisting of ruggedized laptops in a field grid linked to a low power multi-core base camp cluster • Prototype and two production expedition grids feed into a 17 Teraflops "lower 48" system at Indiana University and Elizabeth City State (ECSU) split between research, education and training. • Gives ECSU a top-ranked 5 Teraflop MSI high performance computing system • Access to expensive data • High-end resources for analysis • MSI student involvement Source: Geoffrey Fox
Recent Gateways using TeraGrid Significantly • SCEC • SIDGrid • CIG
SCEC using gateway to produce hazard map • PSHA hazard map for California using newly released Earthquake Rupture Forecast (UCERF2.0) calculated using SCEC Science Gateway • Warm colors indicate regions with a high probability of experiencing strong ground motion in the next 50 years. • High resolution map, significant CPU use
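The slide does not say how the 50-year probabilities on the map are computed. As a rough illustration only, the sketch below converts an annual rate of exceeding some ground-motion level into a probability over a time window, assuming exceedances follow a Poisson process, which is a common convention in probabilistic seismic hazard analysis. The function name and the example rate are hypothetical and are not values from the SCEC calculation.

```python
import math


def exceedance_probability(annual_rate, years=50.0):
    """Probability of at least one exceedance in `years`.

    Assumes exceedances arrive as a Poisson process with a constant
    annual rate (an assumption of this sketch, not stated on the slide):
        P = 1 - exp(-annual_rate * years)
    """
    if annual_rate < 0 or years < 0:
        raise ValueError("annual_rate and years must be non-negative")
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-annual_rate * years)


if __name__ == "__main__":
    # Illustrative rate: one exceedance every 475 years on average.
    probability = exceedance_probability(1.0 / 475.0, years=50.0)
    print(f"{probability:.3f}")
```

A hazard map repeats a calculation of this kind at every grid point, which is one reason a high resolution map consumes significant CPU time.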
Social Informatics Data Grid • Heavy use of “multimodal” data. • Subject might be viewing a video, while a researcher collects heart rate and eye movement data. • Events must be synchronized for analysis, large datasets result • Extensive analysis capabilities are not something that each researcher should have to create for themselves. http://www.ci.uchicago.edu/research/files/sidgrid.mov
Social scientists have traditionally worked in isolated labs without the capability to share data or insights with others. • SIDGrid enables a number of capabilities. • Data that is expensive to collect can now be shared with others, increasing the potential for scientific impact. • Geographically distant researchers can collaborate on the analysis of the same data set. • Complex analysis tools and workflows are now available for all to use, rather than having each lab duplicate efforts. • All researchers now have access to the highest quality computational resources • SIDGrid uses TeraGrid resources for computationally-intensive tasks such as media transcoding, algorithms for pitch analysis of audio tracks, and fMRI image analysis • SIDGrid is unique among social science data archive projects • Focused on streaming data which change over time • Provides the ability to investigate multiple datasets, collected at different time scales, simultaneously • Active users of the SIDGrid system include a human neuroscience group and linguistic research groups from the University of Chicago and the University of Nottingham, UK
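One way to picture the synchronization of streams collected at different time scales: for each event in one stream (for example an eye-movement timestamp), look up the most recent sample from another stream (for example heart rate). The sketch below does this with a binary search. It is only an illustration of the idea, not SIDGrid's implementation. The function name and the sample data are hypothetical, and the stream is assumed to be sorted by time.

```python
from bisect import bisect_right


def align_latest(reference_times, stream):
    """For each reference time, return the latest stream value at or before it.

    `stream` is a list of (timestamp, value) pairs sorted by timestamp.
    Reference times that precede the first sample map to None.
    """
    sample_times = [t for t, _ in stream]
    aligned = []
    for ref in reference_times:
        # Index of the last sample whose timestamp is <= ref.
        idx = bisect_right(sample_times, ref) - 1
        aligned.append(stream[idx][1] if idx >= 0 else None)
    return aligned


if __name__ == "__main__":
    # Hypothetical data: heart rate sampled once per second,
    # eye-movement events at irregular times (seconds).
    heart_rate = [(0.0, 62), (1.0, 64), (2.0, 63)]
    gaze_event_times = [0.25, 0.5, 1.5, 2.0]
    print(align_latest(gaze_event_times, heart_rate))
```

Holding the last known value is only one alignment policy. Interpolating between samples or windowed averaging would be equally reasonable choices, depending on the analysis.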
CIG • 40 institutional members • 9 foreign affiliates • Researchers request synthetic seismograms for any given earthquake • Allows scientists to understand the ground motion associated with any given earthquake • Requested and received advanced support from TeraGrid
Advanced support for OCI resources: Including gateway integration • Same peer review process used to request resources • 30,000 CPUs • + 6 months of Nancy • Reviews based on appropriate use of resources, science is not reviewed if already funded • Petascale • Multisite workflows • Gateways • Domain expertise Or someone really talented
Support is Very Targeted • Start with well-defined objectives • Focus on efficient or novel use of OCI resources • Minimum .25 FTE for months to a year • Enough investment to really understand and help solve complex problems • Must have commitment from PIs • Want to make sure work is incorporated into production codes and gateways • Good candidates for targeted support include: • Large, high impact projects • Ability to influence new communities • Happy for feedback from directorates on important projects • Lessons learned move into training and documentation
Gateway white paper recommends sustained funding • Gateways can be used for the most challenging problems, but • Scientists won’t rely on something that they are not confident will be around for the duration • We see this with software, but even more so with gateway infrastructure • A sustained gateway program can • Reduce duplication of effort • Sporadic development with many small programs • Increase diversity of end users • Increase skill set diversity of developers • Bring together teams to address the toughest problems
Recommend 10-year program with interim reviews • Characteristics of 5-year or less cycles • Build exciting prototypes with input from scientists • Work with early adopters to extend capabilities • Tools are publicized, more scientists interested • Funding ends • Scientists who invested their time to use new tools are disillusioned • Less likely to try something new again • Start again on new short-term project • Need to break this cycle
Begin with user-driven workshops • What are the most fundamental capabilities in each directorate? • What is the next PDB? nanoHUB? Earth System Grid? • What is the community calling for? • Curated data collections • Which collections? • Simulation, visualization and analysis • Collaboration tools or workspaces • Generation of complex workflows • Access to instruments, sensor or radar data that have limited exposure today • Merit review and assessment will be critical to a long-term program
When might a gateway be appropriate? • Researchers using defined sets of tools in different ways • Same executables, different input • GridChem, CHARMM • Creating multi-scale or complex workflows • Datasets • Common data formats • National Virtual Observatory • Earth System Grid • Some groups have invested significant efforts here • caBIG, extensive discussions to develop common terminology and formats • BIRN, extensive data sharing agreements • Difficult to access data/advanced workflows • Sensor/radar input • LEAD, GEON
Tremendous Potential for Gateways • In only 16 years, the Web has fundamentally changed human communication • Science Gateways can leverage this amazingly powerful tool to: • Transform the way scientists collaborate • Streamline conduct of science • Influence the public’s perception of science • Reliability, trust, continuity are fundamental to truly change the conduct of science through the use of gateways • High end resources can have a profound impact • The future is very exciting!
Thank you for your attention • For more information • www.teragrid.org • wilkinsn@sdsc.edu • Live demonstration of the Neutron Science Gateway • Vickie Lynch, Oak Ridge National Laboratory
Afternoon Agenda • 2:00 pm Break • (aka recover from this talk, ask questions) • 2:15 pm Track 2 Resources • 2:15-2:35 pm Ranger • Jay Boisseau, Texas Advanced Computing Center • 2:35-2:55 pm Kraken • Bruce Loftis, National Institute for Computational Sciences • 2:55-3:15 pm Track 2c • Nick Nystrom, Pittsburgh Supercomputing Center • 3:15 pm Blue Waters • John Towns, National Center for Supercomputing Applications • 3:30 pm Open discussion with all presenters