TeraGrid Science Gateways in Barcelona. Nancy Wilkins-Diehr, TeraGrid Area Director for Science Gateways, wilkinsn@sdsc.edu. EGEE 2009, September 21-25, 2009
A technical talk about gateways • No videos!
How did the Gateway program develop? A natural result of the impact of the internet on worldwide communication and information retrieval • Implications for the conduct of science are still evolving • 1980s: early gateways, such as the National Center for Biotechnology Information BLAST server, with search results sent by email; still a working portal today • 1992: Mosaic web browser developed • 1995: “International Protein Data Bank Enhanced by Computer Browser” • 2004: TeraGrid project director Rick Stevens recognized growth in scientific portal development and proposed the Science Gateway Program • Today: Web 3.0 and programmatic exchange of data between web pages • Simultaneous explosion of digital information • Growing analysis needs in many, many scientific areas • Sensors, telescopes, satellites, digital images and video • #1 machine on the Top500 today is 1000x more powerful than all combined entries on the first list in 1993 • Only 17 years since the release of Mosaic!
Why are gateways worth the effort? • Increasing range of expertise needed to tackle the most challenging scientific problems • How many details do you want each individual scientist to need to know? • PBS, RSL, Condor • Coupling multi-scale codes • Assembling data from multiple sources • Collaboration frameworks

Condor-G submit file:

    # Full path to executable
    executable=/users/wilkinsn/tutorial/bin/mcell
    # Working directory, where Condor-G will write
    # its output and error files on the local machine.
    initialdir=/users/wilkinsn/tutorial/exercise_3
    # To set the working directory of the remote job, we
    # specify it in this globus RSL, which will be appended
    # to the RSL that Condor-G generates
    globusrsl=(directory='/users/wilkinsn/tutorial/exercise_3')
    # Arguments to pass to executable.
    arguments=nmj_recon.main.mdl
    # Condor-G can stage the executable
    transfer_executable=false
    # Specify the globus resource to execute the job
    globusscheduler=tg-login1.sdsc.teragrid.org/jobmanager-pbs
    # Condor has multiple universes, but Condor-G always uses globus
    universe=globus
    # Files to receive stdout and stderr.
    output=condor.out
    error=condor.err
    # Specify the number of copies of the job to submit to the condor queue.
    queue 1

PBS script:

    #! /bin/sh
    #PBS -q dque
    #PBS -l nodes=1:ppn=2
    #PBS -l walltime=00:02:00
    #PBS -o pbs.out
    #PBS -e pbs.err
    #PBS -V
    cd /users/wilkinsn/tutorial/exercise_3
    ../bin/mcell nmj_recon.main.mdl

Globus RSL:

    +( &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
       (executable="/users/birnbaum/tutorial/bin/mcell")
       (arguments=nmj_recon.main.mdl)
       (count=128)
       (hostCount=10)
       (maxtime=2)
       (directory="/users/birnbaum/tutorial/exercise_3")
       (stdout="/users/birnbaum/tutorial/exercise_3/globus.out")
       (stderr="/users/birnbaum/tutorial/exercise_3/globus.err")
    )
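One way to picture what a gateway hides from the scientist: the portal collects a few values from a web form and writes the Condor-G submit description itself. The following is a minimal sketch, not code from any actual gateway. The function name `render_condor_g_submit` and its parameters are hypothetical; the keys it emits are the ones shown in the submit file on this slide.

```python
def render_condor_g_submit(executable, workdir, arguments, gatekeeper):
    """Return a Condor-G submit description as text.

    The keys mirror the submit file on the slide. The function name and
    its parameters are hypothetical and exist only for illustration.
    """
    lines = [
        f"executable={executable}",
        f"initialdir={workdir}",
        # Remote working directory, passed through as extra Globus RSL.
        f"globusrsl=(directory='{workdir}')",
        f"arguments={arguments}",
        "transfer_executable=false",
        f"globusscheduler={gatekeeper}",
        "universe=globus",
        "output=condor.out",
        "error=condor.err",
        "queue 1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values taken from the slide's example.
    print(render_condor_g_submit(
        executable="/users/wilkinsn/tutorial/bin/mcell",
        workdir="/users/wilkinsn/tutorial/exercise_3",
        arguments="nmj_recon.main.mdl",
        gatekeeper="tg-login1.sdsc.teragrid.org/jobmanager-pbs",
    ))
```

The scientist only chooses an input file; the paths, scheduler contact string, and universe stay inside the gateway.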
Gateways democratize access to high-end resources • Almost anyone can investigate scientific questions using high-end resources • Not just those in the research groups of those who request allocations • Gateways allow anyone with a web browser to explore • Opportunities can be uncovered via Google • My 11-year-old son discovered nanoHUB.org himself while his class was studying Bucky Balls • Foster new ideas, cross-disciplinary approaches • Encourage students to experiment • But used in production too • Significant number of papers resulting from gateways, including GridChem and nanoHUB • Over 600 Google Scholar references to Robetta • Scientists can focus on challenging science problems rather than challenging infrastructure problems
Today, there are approximately 35 gateways using the TeraGrid
Not just ease of use: What can scientists do that they couldn’t do previously? • Linked Environments for Atmospheric Discovery (LEAD) – access to radar data • National Virtual Observatory (NVO) – access to sky surveys • Ocean Observatories Initiative (OOI) – access to sensor data • PolarGrid – access to polar ice sheet data • SIDGrid – expensive datasets, analysis tools • GridChem – coupling multiscale codes • How would this have been done before gateways?
What makes a gateway a TeraGrid gateway? • Any gateway that uses TeraGrid resources • Are they all developed by TeraGrid? • No, we don’t make the gateways you use; we make the gateways you use better • The strength of the program lies in the development of end user interfaces by those in the community • We work to provide flexible services to meet a variety of needs in a scalable way • TeraGrid does provide staff to assist with gateway use of the resources • Anyone can request support via the same peer review process used to request CPU hours or a data allocation • Are gateways required to be developed with a certain software stack? • No, gateways can be developed using any front end technology • In general, they will use TeraGrid’s common software stack (CTSS) to interact with the TeraGrid • Condor, Globus, and MyProxy are particularly relevant for gateways • http://www.teragrid.org/userinfo/software/ctss_results.php
What kinds of technologies do we see in use today? • http://www.teragridforum.org/mediawiki/index.php?title=Science_Gateway_Use_Cases • Not really meant to be a public link, but created as the result of a spring 2009 survey of gateways to provide feedback to Globus
CC=community credential, UC=user credential, DA=dynamic accounts
Observations? • Lots of GridSphere • Support has decreased; I expect to see transitions away from this • TeraGrid User Portal moving from GridSphere to Liferay (liferay.com) • Some Java CoG • Open Grid Computing Environments (OGCE) • http://www.collab-ogce.org • SimpleGrid • Developed to quickly demonstrate the moving parts of a gateway • Build a gateway in an afternoon • HUBzero • http://www.hubzero.org • We see less and less patience from potential developers for complicated deployments • Movement toward Django, Drupal, gadgets
OGCE and SimpleGrid tutorials: featured at the DOE SciDAC meeting, June 2009
OGCE and Gateways • We develop and package software for use by TeraGrid Science Gateways and other resources • BioVLAB uses OGCE tools to run on Amazon • A lot of this comes from active gateways • Information Services (GPIR, QBETS): TeraGrid User Portal • Workflow tools: LEAD • Resource Discovery Service, File Browser Applet: TGUP, GridChem • SIDGrid, OLSG • We contribute code back to these projects (Diagram labels: Gateways, OGCE Software)
Objectives • Create a user-centric Web 2.0 environment as a science gateway interface • Use TeraGrid to support domain-specific scientific computing • Develop Grid-enabled application services to access TeraGrid capabilities • Understand a GISolve-based workflow to steer analyses on TeraGrid • http://gisolve.org/ • http://www.cigi.uiuc.edu/doku.php/projects/simplegrid/index
Technologies • Web 2.0 development • JavaScript • AJAX • Yahoo UI (YUI) • Twitter API • Google Maps • PHP-based server-side code development • Web service client programming • MySQL database programming
Recent additional technical requirement for gateways • Community accounts are used as service accounts to launch jobs for gateway users • Scalable, but doesn’t let TeraGrid discover the number of end users • Must add unique user-identifying attributes, attribute-based authentication • Allows programmatic counting of end gateway users as required by the NSF • Goal is to have this work complete by September 2009 • We won’t quite make it • http://www.teragridforum.org/mediawiki/index.php?title=Science_Gateway_Credential_with_Attributes
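The counting problem can be illustrated with a small sketch. It assumes, purely for illustration, that each job launched through a community account is logged as a record carrying a per-user attribute. The field names `gateway` and `gateway_user`, the function `count_end_users`, and the sample records are all hypothetical and do not reflect the actual TeraGrid accounting format.

```python
from collections import defaultdict


def count_end_users(job_records):
    """Return a dict mapping each gateway to its number of distinct end users.

    Each record is a dict with hypothetical keys "gateway" and
    "gateway_user". Records without a user attribute cannot be
    attributed to anyone and are skipped.
    """
    users_by_gateway = defaultdict(set)
    for record in job_records:
        user = record.get("gateway_user")
        if user:
            users_by_gateway[record["gateway"]].add(user)
    return {gw: len(users) for gw, users in users_by_gateway.items()}


if __name__ == "__main__":
    # Made-up records: every job runs under one community account per
    # gateway, so only the attribute distinguishes the end users.
    jobs = [
        {"gateway": "gw-a", "gateway_user": "alice"},
        {"gateway": "gw-a", "gateway_user": "bob"},
        {"gateway": "gw-a", "gateway_user": "alice"},
        {"gateway": "gw-b", "gateway_user": "carol"},
        {"gateway": "gw-b", "gateway_user": None},
    ]
    print(count_end_users(jobs))
```

Without the attribute, every job in the sample would look like it came from one of two community accounts, which is exactly the visibility gap the slide describes.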
Recent area of technical focus: GRAM5 • Stability and scalability are of supreme importance for gateways using TeraGrid • Many improvements to GRAM as a result of heavy gateway utilization on TeraGrid • Led to development of GRAM5 • TeraGrid platform size (60,000 processors on Ranger) also motivates scalability improvements
3 steps to connect a gateway to TeraGrid • Request an allocation • Only a one-paragraph abstract required for up to 200k CPU hours • Register your gateway • Visibility on public TeraGrid page • Request a community account • Run jobs for others via your portal • Staff support is available! • www.teragrid.org/gateways
GCE09 at SC09 in Portland: Grid Computing Environments Workshop • Friday, November 20, 2009 • http://www.collab-ogce.org/gce09 • Topics include • Integration of Web 2.0 technologies with science gateways • Gateways to cloud computing services • Applications of virtual world technologies to science and education gateways • Grid portal and gateway deployments, including User Portals, Application Portals, Science Gateways, Education Portals, User interface/usability studies • Design and architecture of Grid portals, portal containers, and gateways • Tools and frameworks that make developing Grid Portals and gateways easier • Portal security models and solutions • Middleware solutions in support of scientific portals and Gateways including Web Services, workflow and mashup composers and engines, and similar capabilities • Non-browser gateways: desktop and mobile computing gateways • Summary and survey papers
Tremendous Opportunities Using the Largest Shared Resources - Challenges too! • What’s different when the resource doesn’t belong just to me? • Resource discovery • Accounting • Security • Proposal-based requests for resources (peer-reviewed access) • Code scaling and performance numbers • Justification of resources • Gateway citations • Tremendous benefits at the high end, but even more work for the developers • Potential impact on science is huge • Small number of developers can impact thousands of scientists • But need a way to train and fund those developers and provide them with appropriate tools
Thank you for your attention! Questions? Nancy Wilkins-Diehr, wilkinsn@sdsc.edu, www.teragrid.org