TeraGrid Science Gateways
Nancy Wilkins-Diehr, TeraGrid Area Director for Science Gateways, wilkinsn@sdsc.edu
TeraGrid Round Table, October 7, 2010
What have the gateways been up to?
• Ultrascan
  • Borries Demeler, UT; Suresh Marru, Raminder Singh, IU
• Gateway software listing
• Wrap up of support for Arroyo, RENCI science portal
  • But hopefully not the end of TG usage by those groups
• Dark Energy Survey
  • Jim Myers, Michelle Gower, NCSA
• CUE presentation
  • Derek Simmel, PSC
  • login-env, build, comm, math, tg
• GRAM5
  • Making use of TG portal user forum for discussion
  • Interest in sharing experiences with OSG
  • Update on Inca tests (able to recreate load from “Gateway Debug 2007”)
  • Gateway experiences: hung processes when errors pile up
  • SGE job manager issues
  • Nice work by David Carver (TACC), Suresh Marru (IU), Stu Martin (ANL)
• Expressed Sequence Tag gateway
  • Archit Kulshrestha, IU
• CIPRES
  • Over 600 users on TG Apr-June
  • 2.7M hours awarded 7/1/10, “model gateway proposal”
  • But able to use much more than this
• Gateways in the extension year
• Gateway study
Analytical Ultracentrifugation
Emerging computational tool for the study of proteins
• Samples from researchers all over the world
  • Some (Germany, Australia) have their own ultracentrifuges and use only the analysis capabilities; others send samples to UT to spin
• Spin the samples at high speeds, learn about macromolecule properties
• Monte Carlo simulations
• Observations are electronically digitized and stored for further mathematical analysis
The Center for Analytical Ultracentrifugation of Macromolecular Assemblies, UT Health Sciences
Source: Suresh Marru, IU
Comprehensive data analysis environment
• Management of analytical ultracentrifugation data for single users or entire facilities
• Support for storage, editing, sharing and analysis of data
• HPC facilities used for 2-D spectrum analysis and genetic algorithm analysis
  • TeraGrid (~2M CPU hours used)
  • Technical University of Munich
  • Juelich Supercomputing Center
• Portable graphical user interface
• MySQL database backend for data management
• Over 30 active institutions
Source: Suresh Marru, IU
Gateway and ASTA support
A growing trend
• TeraGrid advanced support
  • Fault tolerance
  • Workflows
  • Use of multiple TG resources (using Lonestar, expanding to QueenBee and Ranger, using Quarry for test server, waiting for GRAM5 on Ranger)
  • Community account implementation
  • Remote steering
  • Improved UI (no manual specification of CPU time)
• Applying lessons learned from GridChem and LEAD, incorporating new features into OGCE
  • LEAD is portlet-based, GridChem is a Java Swing client-side app, and Ultrascan is a PHP- and Perl-based gateway; all can use OGCE
• Big MPI app that forks off many independent runs; improvements here will be tackled by TG's advanced support team
Source: Suresh Marru, IU
Gateway software listing
• Populate TeraGrid’s information service with gateway software information
• Similar to RP software listings
  • But RP listings are maintained at RPs; IIS pulls from those sources
  • With gateways, we are thinking they fill in a form and push the info to IIS
• http://www.renci.org/~jdr0887/gawsr-howto/
Dark Energy Survey
• We know the universe is expanding, but the expansion is accelerating for unknown reasons
• DES is a telescope experiment to constrain various theories
  • 4m telescope in Chile; Fermi and others developing new lens; working with simulated data until telescope goes online in 2011
• 200 TB raw data over 5 years, 4 PB of derived products: lots of filtering
• Thousands of jobs run on TeraGrid each week with very few failures
• Removing light from bright stars, airplanes, clouds; calibration
• Telescope operated by staff; users will use the portal afterward to query particular stars or regions of the sky
Source: Jim Myers and Michelle Gower, NCSA
Condor DAGMan, Condor-G, pre-WS GRAM, GridFTP, elf/ogrescript for monitoring (developed at NCSA), Oracle
• Challenges
  • Efficiently managing small jobs in a big-batch world
  • Database stresses: block updates instead of individual transactions for better performance, indexing strategies, narrow vs wide tables
• ~100 front-end users, expected to grow in production
• Changing paradigms from Sloan Digital Sky Survey: data now too large for bulk downloads and full table scans
Source: Jim Myers and Michelle Gower, NCSA
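The "block updates instead of individual transactions" point can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the Oracle database named on the slide; the table name, column names, and block size are hypothetical and chosen only for illustration. It shows the two commit patterns side by side, not the DES project's actual code.

```python
import sqlite3

def make_db():
    # In-memory SQLite database standing in for the real Oracle backend (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (object_id INTEGER PRIMARY KEY, flux REAL)")
    return conn

def insert_individually(conn, rows):
    """One transaction per row: commit after every single insert."""
    for row in rows:
        conn.execute("INSERT INTO objects (object_id, flux) VALUES (?, ?)", row)
        conn.commit()

def insert_in_blocks(conn, rows, block_size=1000):
    """One transaction per block: many rows share a single commit."""
    for start in range(0, len(rows), block_size):
        block = rows[start:start + block_size]
        conn.executemany("INSERT INTO objects (object_id, flux) VALUES (?, ?)", block)
        conn.commit()

# Hypothetical catalog rows: (object_id, flux).
rows = [(i, i * 0.5) for i in range(5000)]

conn_a = make_db()
insert_individually(conn_a, rows)
conn_b = make_db()
insert_in_blocks(conn_b, rows)

# Both strategies store the same data; only the number of commits differs.
count_a = conn_a.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
count_b = conn_b.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
```

With 5,000 rows, the first function issues 5,000 commits and the second issues 5. On a disk-backed or networked database each commit typically carries a fixed cost, which is why grouping rows into blocks tends to help.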
Expressed Sequence Tag (EST) Pipeline
• Integrate existing computational biology software
• Expand compute capacity by using TeraGrid
• Take raw genome data in the FASTA format and run a series of applications on it
• RepeatMasker, PaCE, CAP3 and BLAST used to generate the final assembled output
• EST Pipeline is based on the SWARM Web Service, which provides a web service interface to clients and also manages bulk job submission, using the Birdbath API to submit to Condor
• Workflow is configured using a PHP-based gateway that allows users to upload input data and select programs to run
Source: Archit Kulshrestha, IU
Expressed Sequence Tag Assembly
• ESTs are a collection of random cDNA sequences, sequenced from a cDNA library or sequencing devices.
  • Typical inputs are on the order of millions of sequences
  • Newer 454 devices produce higher volume and are relatively easier to obtain and operate
  • Stored in a file using the FASTA format
• The ESTs are clustered and assembled to form contigs.
• The contigs are then used to identify potential unknown genes by BLASTing against a known protein database.
Source: Archit Kulshrestha, IU
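Since the ESTs are stored in FASTA format, a minimal reader sketch may help make the input concrete. It assumes the common FASTA convention (a header line starting with ">" followed by one or more sequence lines); the function name and the two sample records are hypothetical and are not taken from the EST pipeline itself.

```python
import io
from typing import Iterator, TextIO, Tuple

def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped across several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

# Two made-up EST records, the first wrapped over two lines.
example = io.StringIO(
    ">est_0001 hypothetical read\n"
    "ACGTAC\n"
    "GGTA\n"
    ">est_0002\n"
    "TTGACA\n"
)
records = list(read_fasta(example))
```

Because the reader is a generator, it handles one record at a time, which matters when inputs run to millions of sequences.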
Application Runtime Characteristics
Source: Archit Kulshrestha, IU
Results
The results are from a single 2 million job run and hence may not be an accurate model of the wait time. However, other than in the case of BLAST, the wait times were not a significant component of the total time.
Long waits due to long queue times for small jobs.
Previous run time was 5 days, compared to 2 now.
Serial waits eliminated.
Had hooks to Inca to determine when jobs were down.
Failure rate quite low: 10-12 out of thousands.
Source: Archit Kulshrestha, IU
Cyberinfrastructure for Phylogenetic Research (CIPRES)
• Enables large-scale phylogenetic reconstructions
• Parallel versions of applications such as MrBayes, RAxML and GARLI run on TeraGrid
• Easy-to-use graphical user interface
Current Status:
CIPRES Portal users consumed 1,200,000 TeraGrid CPU hours between Dec 2009 and June 2010. This was 3 times our projected use. A new award of 2.7 million CPU hours was made on July 1, 2010.
The portal provides access to parallel versions of MrBayes, RAxML, and GARLI, which all scale well on TG resources. The portal staff has worked with TG special projects group personnel and community developers to provide access to the fastest versions of MrBayes and RAxML available anywhere.
Access to BEST, a variant of MrBayes, is planned in the near future. A GPU platform called BEAGLE will be used to provide access to BEAST on TeraGrid (Lincoln), also in the near future. The toolkit will be expanded to provide access to other community codes that are appropriate for use on TeraGrid.
Source: Mark Miller, SDSC
Usage Statistics for CIPRES Portal on TG, 12/1/2009 – 5/31/2010
Source: Mark Miller, SDSC
Intellectual Merit:
• The CIPRES portal is cited in at least 35 publications
  • This includes publications in Nature, PNAS, and Cell.
• Highlights of scientific findings:
  • New Family Tree for Arthropoda: A team of scientists compared genetic sequences from 75 arthropod species and drew a new family tree for the most successful phylum of animals on Earth. This work represents an important advance in the century-old problem of arthropod evolution.
  • Genome Sequence of a Transitional Eukaryote: A group of scientists sequenced the genome of Naegleria gruberi, a single-cell organism that is a key transitional species between prokaryotes and eukaryotes. This work provides new insights into the origins of subcellular organelles.
  • Co-evolution of Beetles and Flowering Plants: A group of researchers studied the evolutionary history of angiosperms and the beetles that interact with them. The work provided compelling experimental evidence for the long-postulated co-evolution of these two symbiotic groups.
Source: Mark Miller, SDSC
Broad Impacts:
• 77% of all jobs have been submitted from locations in the USA. Submissions are received regularly from researchers at top-tier institutions such as Harvard, Yale, and Stanford.
• Jobs are received regularly from academic institutions in 17 EPSCoR states.
• Job submissions have been received from 34 countries on 5 continents.
• At least 5 undergraduate classes are known to use the portal routinely. This is likely an underestimate (based on Web log patterns).
• More than 45,000 jobs have been run on the Portal over its lifetime. Between Dec 1, 2009 and June 30, 2010, users ran 6,108 parallel jobs on the TeraGrid.
Source: Mark Miller, SDSC
Broad Impacts: Impacts on Productivity:
Average wall time for RAxML and GARLI jobs decreased 3-4 fold with the shift to TeraGrid resources. Moreover, the number of RAxML jobs has doubled relative to the rate of submission on the CIPRES Portal running on the CIPRES cluster alone. Thus, TeraGrid access is helping users finish their jobs faster and also to make more runs per unit time.
The average wall time for MrBayes jobs increased 2-fold on the TeraGrid, but the number of jobs decreased by approximately 33%. This trend reflects users’ ability to run much larger and longer jobs on TeraGrid than on the CIPRES cluster. The increased maximum run-time limit for MrBayes submissions to Abe (168 hours on Abe vs. 72 hours on the CIPRES cluster) allowed users to complete their long runs with a single large submission, thus eliminating the need to make smaller, incremental runs.
Source: Mark Miller, SDSC
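The effect of the higher run-time limit can be made concrete with a small worked calculation. The 168-hour and 72-hour limits come from the slide. The 150-hour analysis length is a hypothetical example, and the sketch assumes a run can be split into restartable pieces with no restart overhead.

```python
import math

# Run-time limits quoted on the slide.
ABE_LIMIT_HOURS = 168
CIPRES_CLUSTER_LIMIT_HOURS = 72

def submissions_needed(total_hours: float, max_hours_per_job: float) -> int:
    """Number of back-to-back submissions needed to cover total_hours,
    assuming ideal restarts with no overhead (a simplifying assumption)."""
    return math.ceil(total_hours / max_hours_per_job)

# Hypothetical MrBayes analysis needing 150 hours of wall time.
run_hours = 150
on_abe = submissions_needed(run_hours, ABE_LIMIT_HOURS)
on_cluster = submissions_needed(run_hours, CIPRES_CLUSTER_LIMIT_HOURS)
```

Under these assumptions the 150-hour run fits in one submission on Abe but needs three incremental submissions on the CIPRES cluster. This is consistent with the slide's observation that MrBayes jobs became fewer but longer.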
Broad Impacts: Improved User Access to TG:
100 – 150 new users per month access TG resources; the number of repeat users is growing…
Source: Mark Miller, SDSC
New gateway activities in the extension year
• Helpdesk support expanded
  • From 0.2 FTE in PY5 to 1.7 in Extension [NCSA, Purdue]
  • Helpdesk and Condor support, new GIS communities, SimpleGrid extensions
• Accounting
  • Improved views for gateways now that we have attributes [TACC]
• Community accounts
  • Continued work toward improved standardization [NICS]
• Prebuilt VMs with gateway software
  • OGCE, SimpleGrid [IU, NCSA]
• Online tutorials with CI Tutor and the EOT team
  • OGCE, SimpleGrid [IU, NCSA]
• More example-based documentation
  • Less talk, more action, short videos, based on user feedback [NCSA, SDSC]
• Remote vis for gateways [ORNL]
Targeted Support in the Extension
All staff available for assignments as new projects come in
• Cactus
  • Meet the needs of several groups with large TG allocations [LSU]
• GridChem, PolarGrid, Ultrascan
  • Scheduling, vis, Matlab processing, processing of centrifuge data for large international project [IU]
• CCSM-ESG
  • Continuing work to combine capabilities [NCAR, Purdue]
• Uintah, computational fluids [NCAR, Utah]
• SNS [ORNL]
• CIPRES [SDSC]
• OpenSocial for gateways [U Chicago]
• Improved use of remote vis resources [ORNL]
• Condor and cloud support [Purdue]
Gateway Sustainability Study
Small, non-TG, EAGER grant
• Characteristics of short funding cycles
  • Build exciting prototypes with input from scientists
  • Work with early adopters to extend capabilities
  • Tools are publicized, more scientists interested
  • Funding ends
  • Scientists who invested their time to use new tools are disillusioned
  • Less likely to try something new again
  • Start again on new short-term project
• Need to break this cycle
• EAGER grant to look at characteristics of successful gateways and domain areas where a gateway could have a big impact
  • 4 focus group meetings over 2 years
  • First 2 held June 2010
• www.sciencegateways.org
Thank you for your attention!
Questions?
Nancy Wilkins-Diehr, wilkinsn@sdsc.edu
www.teragrid.org