NMI Testbed GRID Utility for Virtual Organization Art Vandenberg Avandenberg@gsu.edu Director, Advanced Campus Services Georgia State University
NSF Supported • This material is based in part upon work supported by the National Science Foundation under Grant No. ANI-0123937 and Grant No. ITR-0312636. • Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
Overview • NMI Testbed GRID – “virtual organization” • Participating sites • Resources for VO • Catalog of grid applications • Example: genome alignment for VO • Plans for May-August 2004
Vision – NMI Testbed GRID VO • NMI Integration Testbed Program • NSF #ANI-0123937 • Explore grid capability – interoperability • Researchers & faculty • Across heterogeneous sites • Integrated with enterprise middleware • A utility grid using NMI components • Non-specialized, open, transparent
Collaborative environment VO • Beyond application specific grids • Leverage enterprise middleware • Identity management, authN, authZ... • Strive for transparent access • Portals • Ease of use: submit, monitor, retrieve data • Security policy & technology • Federation of cooperating sites
Participating sites - the VO • Testbed sites – push interoperation limits • Georgia State University • Texas Advanced Computing Center • University of Alabama at Birmingham • University of Alabama at Huntsville • University of Michigan • University of Southern California • University of Virginia
Site resources – VO • Testbed sites – interoperation challenges • GSU: Shibboleth, GridPort portal, REU & Grads, disk • TACC: REU student, portal, Enterprise CA, cluster • UAB: beowulf cluster, CA, Pubcookie, OGCE portal • UAH: application expertise, NASA IPG Certs • UMich: KX.509 & Kerberos, MGrid, ATLAS integration • USC: CA, Pubcookie, Shibboleth, Linux cluster, KX.509 • UVa: Bridge CA model • Sites non-homogeneous – a VO challenge
Catalog of grid applications • Knowledge base is important • REU students – Nicole Geiger, Anish Shindore • Graduate Research Asst – Manish Garg • NMI Testbed Sites initially • Researchers, schools, projects • Grid specific as well as grid potential • Started as spreadsheet, now online db
Catalog of grid applications • Catalog of Grid Applications (current version) • http://art12.gsu.edu:8080/grid_cat/index5.jsp • Expanding scope beyond testbed sites • 18 schools/labs, 300 researchers & counting • Differentiated from Globus www.gpds.org • Oriented to the researcher and institutional level • Planning clustering and visualization modalities • Clustering work related to NSF #ITR-0312636
Example: genome alignment for VO (GSU – UAB) • An opportunity for the utility Grid VO • Nova Ahmed, CS grad student with Dr. Yi Pan, GSU • Dynamic programming algorithm for genome sequence alignment • Initial runs on GSU's shared-memory machine Hydra • Limited access (grad student, shared cycles) • Could the algorithm be improved using a multi-processor cluster across a grid?
The Genome Alignment Problem • Alignment of DNA sequences • Sequence X: TGATGGAGGT • Sequence Y: GATAGG • Score each character pair: 1 => match, 0 => non-match • Populate the similarity matrix with a dynamic-programming recurrence (see the sketch below) • Observations on the similarity matrix: • Many zero values • Memory can be reduced by not storing zero-valued elements
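The slide's recurrence formula is not shown, so the sketch below assumes a standard Smith-Waterman-style local-alignment fill with the slide's scoring (1 => match, 0 => non-match), a zero floor, and an assumed gap cost of 1; the zero floor is what produces the many zero entries the slide observes. The sequences are the slide's; everything else is illustrative.

    #include <stdio.h>
    #include <string.h>

    #define MAXN 64

    /* Return the largest of four candidate cell values. */
    static int max4(int a, int b, int c, int d) {
        int m = a;
        if (b > m) m = b;
        if (c > m) m = c;
        if (d > m) m = d;
        return m;
    }

    int main(void) {
        const char *x = "TGATGGAGGT";   /* Sequence X from the slide */
        const char *y = "GATAGG";       /* Sequence Y from the slide */
        int nx = strlen(x), ny = strlen(y);
        int H[MAXN][MAXN] = {{0}};      /* similarity matrix; row/col 0 stay 0 */

        for (int i = 1; i <= nx; i++) {
            for (int j = 1; j <= ny; j++) {
                int s = (x[i-1] == y[j-1]) ? 1 : 0;  /* 1 => match, 0 => non-match */
                H[i][j] = max4(0,                    /* zero floor => many zeros   */
                               H[i-1][j-1] + s,      /* extend a diagonal match    */
                               H[i-1][j] - 1,        /* gap in Y (assumed cost 1)  */
                               H[i][j-1] - 1);       /* gap in X (assumed cost 1)  */
            }
        }

        /* Print the matrix; note how sparse the non-zero entries are. */
        for (int i = 0; i <= nx; i++) {
            for (int j = 0; j <= ny; j++)
                printf("%2d ", H[i][j]);
            printf("\n");
        }
        return 0;
    }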
Improved Parallel Algorithm for Genome Alignment • The parallel method: • The similarity matrix is divided among processors • Processors compute their partial sequence matches in parallel • Processors communicate to match the whole sequence • The new data structure: • The new algorithm computes only the non-zero values of the similarity matrix • Memory is dynamically allocated as needed (see the sketch below)
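The slides don't detail the new data structure, so here is a minimal sketch of one natural realization: each matrix row keeps a dynamically grown array of (column, value) pairs, so zero cells cost no memory. All names are hypothetical. In the parallel method, each processor would own a contiguous block of rows stored this way and exchange boundary values with its neighbors.

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        int col;
        int val;
    } Cell;

    typedef struct {
        Cell *cells;    /* only the non-zero entries of this row */
        int   n;        /* number of non-zero entries            */
        int   cap;      /* allocated capacity, grown on demand   */
    } SparseRow;

    /* Append a non-zero cell, doubling the buffer when it fills up. */
    static void row_put(SparseRow *r, int col, int val) {
        if (val == 0) return;                   /* zeros are never stored */
        if (r->n == r->cap) {
            r->cap = r->cap ? 2 * r->cap : 8;
            r->cells = realloc(r->cells, r->cap * sizeof(Cell));
        }
        r->cells[r->n].col = col;
        r->cells[r->n].val = val;
        r->n++;
    }

    /* Look up a cell; anything not stored is implicitly zero. */
    static int row_get(const SparseRow *r, int col) {
        for (int k = 0; k < r->n; k++)
            if (r->cells[k].col == col) return r->cells[k].val;
        return 0;
    }

    int main(void) {
        SparseRow r = {0};
        row_put(&r, 3, 1);                       /* store row cell (3) = 1 */
        row_put(&r, 7, 2);                       /* store row cell (7) = 2 */
        printf("%d %d %d\n",
               row_get(&r, 3), row_get(&r, 5), row_get(&r, 7));  /* 1 0 2 */
        free(r.cells);
        return 0;
    }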
Results on the Shared-Memory Machine (Hydra) • Limitations • Cannot allocate memory for long sequences (largest alignment is 2000 x 2000) • The number of processors is limited (Hydra has 12 processors) • Not scalable • Performance • Computation time decreases as the number of processors increases
Results on the Beowulf Cluster at UAB • Longer genome sequences can be aligned • Maximum sequence length on the cluster is 10,000 • Limited scalability • The number of processors can be increased only up to a certain limit
Results via the GRID at UAB • Submitting the genome alignment program using Globus and MPICH (see the sketch below) • Advantages: • Scalable – new clusters can be added to the grid • Easier job submission – no account needed on every node • Easier scheduling – multiple jobs can be submitted at one time
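As a rough illustration of "submitting using Globus and MPICH" (the slide shows no actual command), this is what a Globus Toolkit 2-era GRAM submission might look like; the resource contact, paths, and processor count are hypothetical.

    # Hedged sketch: submit the alignment as an MPI job through a GT2-era
    # GRAM job manager. Host contact, paths, and count are illustrative.
    globusrun -o -r cluster.uab.edu/jobmanager-pbs \
      '&(executable=/home/nahmed/bin/genome_align)
        (arguments=seqX.txt seqY.txt)
        (count=16)
        (jobtype=mpi)'

Here -r names the GRAM contact of the cluster's job manager, -o streams the job's output back to the submitter, and jobtype=mpi asks the job manager to launch the executable under mpirun, so no per-node account or login is needed.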
Future Work – Genome Alignment • Use MPICH-G2 (instead of MPICH) – use the power of the Grid (see the sketch below) • Expand the computational resources – combine more clusters across the Grid • Develop a program to align multiple genome sequences (rather than two at a time) – requires more computational resources • Use a Georgia State certificate via the Bridge CA • Via Shibboleth-protected sector CA…?
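For the MPICH-G2 step, its documented multirequest RSL form lets a single submission span several clusters. A hedged sketch follows; the contacts, counts, and paths are hypothetical, and real runs are normally launched through MPICH-G2's own mpirun, which generates RSL along these lines.

    # Hedged sketch: one MPICH-G2 job split across two clusters.
    + ( &(resourceManagerContact="c1.uab.edu/jobmanager-pbs")
         (count=8)(jobtype=mpi)
         (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0))
         (executable=/home/nahmed/bin/genome_align) )
      ( &(resourceManagerContact="c2.gsu.edu/jobmanager-pbs")
         (count=8)(jobtype=mpi)
         (environment=(GLOBUS_DUROC_SUBJOB_INDEX 1))
         (executable=/home/nahmed/bin/genome_align) )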
Plans for May-August 2004 • More resources • Contributed from current sites (others?) • Portal for NMI Testbed GRID • Cf. NPACI Hotpage https://hotpage.npaci.edu/ • Integration of campus authN • UVa Bridge CA • More applications • Utility grid for grad research & education
Plans for May-August 2004… • Documentation • Web site • Application docs and demos • Catalog of Grid Applications • Provide for self service contribution • Develop clustering (SOM), visualization options (“find researchers or projects like X”) • Auto-discovery of Grid researchers & apps based on reference sets (core sites)?
Contact Information • Art Vandenberg • Avandenberg@gsu.edu • NMI Testbed GRID • http://www.gsu.edu/~wwwacs/GRID_Group/NMI.html