Providing development and support for TeraGrid projects such as CIMA and VLAB at Indiana University. Enhancing access, storage, and reliability of collected data. Future plans include resilient services and a virtual hosting environment.
IU TeraGrid Gateway Support
Marlon Pierce
Community Grids Lab, Indiana University
Personnel
• Marlon Pierce: project leader
• Sangmi Lee Pallickara: senior developer
  • Lead on VLAB and File Agent Service development
• Yu “Marie” Ma: senior developer
  • Lead on CIMA support
• Rishi Verma: student intern
  • Software release and testing technician
Team Strategy
• Provide general-purpose gateway support through software delivered through the NSF-funded Open Grid Computing Environments (OGCE) project.
• Provide short-term development support to new TeraGrid gateway projects to help them integrate with TeraGrid resources:
  • IU’s CIMA instrument project (R. McMullen)
  • Minnesota’s VLAB project (R. Wentzcovitch)
  • IU School of Medicine’s Docking and Scoring Portal (S. Meroueh)
  • IU/ECSU PolarGrid project
CIMA Project Overview
• Common Instrument Middleware Architecture (CIMA): an NSF-funded project to develop software and portals for instruments.
  • Dr. Rick McMullen, PI
• Flagship project: a crystallography portal and services for collecting and archiving real-time crystallography data.
  • Gateway for data collected at 10 crystallography labs in the US, UK, and Australia.
• Problems:
  • Much of the collected data is private and should be accessible only to its owners.
  • Data must be stored on large, highly available file systems.
  • Services must be highly reliable.
Gateway Team Support for CIMA (Y. Ma)
• The existing CIMA project was converted into a TeraGrid gateway.
• CIMA now provides secure access to CIMA archives:
  • Storage on the IU Data Capacitor.
  • Security through GridFTP and a TeraGrid community credential.
• Marie Ma also led the development work for CIMA high-availability testing.
  • SC07 demo.
• Future work:
  • Support the follow-on NSF-funded Crystal Grid project.
  • Use CIMA as a test case to explore virtual hosting and other data grid strategies.
Users get gateway credentials with normal login. Experiments are grouped into samples with private access. Sample data (images, metadata) are securely retrieved from TeraGrid storage.
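The retrieval step can be pictured with the Java CoG Kit (jglobus) GridFTP client. This is a minimal sketch, not the CIMA gateway code itself: the endpoint host and file paths are hypothetical, and it assumes the TeraGrid community proxy credential is already available in the default credential location.

```java
import java.io.File;

import org.globus.ftp.GridFTPClient;
import org.globus.ftp.Session;
import org.gridforum.jgss.ExtendedGSSManager;
import org.ietf.jgss.GSSCredential;

public class CimaFetchSketch {
    public static void main(String[] args) throws Exception {
        // Load the default proxy credential -- in the gateway this is the
        // TeraGrid community credential rather than a personal proxy.
        ExtendedGSSManager manager =
                (ExtendedGSSManager) ExtendedGSSManager.getInstance();
        GSSCredential cred =
                manager.createCredential(GSSCredential.INITIATE_AND_ACCEPT);

        // Hypothetical Data Capacitor GridFTP endpoint and sample path.
        GridFTPClient ftp =
                new GridFTPClient("gridftp.example.teragrid.org", 2811);
        ftp.authenticate(cred);
        ftp.setType(Session.TYPE_IMAGE); // binary mode for detector images

        ftp.get("/cima/samples/sample42/frame001.img",
                new File("frame001.img"));
        ftp.close();
    }
}
```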
High Availability CIMA
• Prototypes fail-over services and portals for TeraGrid gateways.
• Demonstrated resilient services for multiple failure scenarios:
  • Application (Web service) failures
  • Operating system failures
  • Partial and complete network failures
  • WAN file system failures
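One ingredient of fail-over can be sketched as client-side retry across replicated service endpoints. This is only an illustration of the idea (the actual prototype involved replicated services and portals, not just client logic), and the endpoint URLs are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class FailoverClientSketch {
    // Hypothetical primary and backup deployments of the same CIMA service.
    private static final String[] ENDPOINTS = {
        "http://cima-a.example.org/services/data",
        "http://cima-b.example.org/services/data"
    };

    /** Try each replica in turn; fail only if every replica is down. */
    public static InputStream fetch(String query) throws IOException {
        IOException last = null;
        for (String base : ENDPOINTS) {
            try {
                HttpURLConnection conn = (HttpURLConnection)
                        new URL(base + "?q=" + query).openConnection();
                conn.setConnectTimeout(5000);   // fail fast on dead hosts
                conn.setReadTimeout(10000);
                return conn.getInputStream();
            } catch (IOException e) {
                last = e;  // fall through to the next replica
            }
        }
        throw last;
    }
}
```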
Future CIMA/Crystal Grid Support
• IU is setting up a virtual hosting environment for gateways and TeraGrid Web services.
  • Dave Hancock will describe this in an upcoming talk.
• We are prototyping this for CIMA.
  • We provide the gateway perspective; Dave will provide the integrator perspective.
VLAB Project Overview
• The University of Minnesota VLAB project is an NSF ITR-funded project for investigating properties of planetary materials under extreme conditions.
  • Prof. Renata Wentzcovitch, PI
• Very computationally intensive (determining phase diagrams of materials).
  • Potentially thousands of medium-to-large parallel jobs.
• VLAB also develops services and portals for managing the complicated runs.
• Problem: existing VLAB services for task management needed to be integrated with the TeraGrid.
  • The service also needed to be more easily extensible to many different scheduling/queuing systems.
Gateway Team Support for VLAB (S. Pallickara)
• Modified VLAB’s Task Executor Web service to work with TeraGrid GRAM servers.
  • New Task Executor code built around Condor-G and the Condor Birdbath Web service Java clients (see the sketch below).
• Tested with both serial and parallel versions of VLAB’s workhorse code (“PWSCF”) on TACC’s Lonestar, NCSA’s various metal machines, and ORNL’s cluster.
• This code also formed the basis of support for the NASA-funded QuakeSim project and will be packaged and released for general use.
• Next step: integrate the more complicated of VLAB’s major codes (“Phonon”).
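As a rough illustration of the submission mechanism, the sketch below writes a Condor-G submit description for a GRAM-managed PWSCF run and hands it to condor_submit. The production Task Executor drives Condor through the Birdbath SOAP API rather than the command line, and the GRAM contact string and file names here are hypothetical.

```java
import java.io.FileWriter;

public class CondorGSubmitSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical GRAM contact string for an LSF-fronted TeraGrid cluster.
        String gram = "tg-login.lonestar.tacc.teragrid.org/jobmanager-lsf";

        // Standard Condor-G submit commands: grid universe plus a
        // pre-WS GRAM (gt2) resource specification.
        String submit =
              "universe = grid\n"
            + "grid_resource = gt2 " + gram + "\n"
            + "executable = pw.x\n"
            + "transfer_executable = false\n"
            + "input = scf.in\n"
            + "output = scf.out\n"
            + "error = scf.err\n"
            + "log = pwscf.log\n"
            + "queue\n";

        try (FileWriter w = new FileWriter("pwscf.submit")) {
            w.write(submit);
        }

        // Hand the description to Condor-G; the Birdbath Web service call
        // replaces this command-line step in the production Task Executor.
        new ProcessBuilder("condor_submit", "pwscf.submit")
                .inheritIO().start().waitFor();
    }
}
```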
[Architecture diagram: the portal’s Project Executor and Project Interaction components talk to a Task Dispatcher, which fans out to Task Executors; each Task Executor reaches a resource through a TeraGrid Task Interface, consulting the TeraGrid Information Service and backed by auxiliary services (Phonon input prep, high-T post-processing, etc.) and databases (metadata, session registry, etc.). Target resources: Lonestar (TACC, Dell PowerEdge Linux cluster, 5,840 CPUs, 62.6 peak TFLOPS), Tungsten (NCSA, Dell Xeon IA-32 Linux cluster, 2,560 CPUs, 16.38 peak TFLOPS), Cobalt (NCSA, SGI Altix, 1,024 CPUs, 6.55 peak TFLOPS), and NSTG (ORNL, IBM IA-32, 0.34 peak TFLOPS).]
[Job submission diagram: the Task Executor uses the Condor Birdbath Web service API to drive Condor-G job submission; Condor-G contacts GRAM JobManagers, which hand jobs to the LSF and PBS batch systems on Lonestar (TACC), Tungsten (NCSA), Cobalt (NCSA), and NSTG (ORNL).]
Lessons from the VLAB Job Submission Example
• The VLAB application required multiple input files and multiple output files to be transferred between the TeraGrid clusters and the Task Executor service.
• Condor-G gave us a reasonably unified submission mechanism. However, each TeraGrid cluster fronts a different batch system, which requires different setups for the executables.
• Some of the system environments were not set up properly.
  • Scripts generated by jobmanager-lsf on Lonestar, for example, override a custom $PATH (one generic workaround is sketched below).
• Tackling each of these problems was not trivial, but we got enthusiastic support from all of the TeraGrid sites we dealt with.
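One generic workaround for the $PATH problem, not necessarily the fix the sites applied, is to stop depending on the remote login environment altogether: submit a thin wrapper script as the executable, and have the wrapper re-assert PATH before launching the real code. A minimal sketch, with a hypothetical install prefix:

```java
import java.io.FileWriter;

public class PathWrapperSketch {
    public static void main(String[] args) throws Exception {
        // Generate a wrapper that sets its own PATH, so the job is
        // insulated from whatever the jobmanager-generated script exports.
        // Submit "run_pwscf.sh" as the executable in place of pw.x.
        try (FileWriter w = new FileWriter("run_pwscf.sh")) {
            w.write("#!/bin/sh\n"
                  + "export PATH=/opt/pwscf/bin:/usr/bin:/bin\n"
                  + "exec pw.x \"$@\"\n");
        }
    }
}
```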
Scoring and Docking Gateway (Samy Meroueh, IU School of Medicine)
• Users develop scoring functions for the ability of drug-like molecules to dock to proteins.
• They then need quantum chemistry techniques (AMBER) to refine the scoring functions.
• We are adapting our Condor-G based Web services to build an AMBER Grid service.
General Purpose Gateway Software (S. Pallickara)
• TeraGrid community credentials are used with GridFTP to access community archives.
  • Examples: Data Capacitor, HPSS mass storage.
• Problem: we need a way to enforce additional community restrictions on these files.
  • Users should have restricted file spaces.
• Solution: express and enforce access restrictions on community files through the Web gateway.
• Software: the File Agent Service and an updated File Manager portlet, developed and released through the OGCE web site.
  • Targeted at the Data Capacitor and HPSS.
The portlet (modified from the OGCE code base) provides file system views of the Data Capacitor, HPSS, and other GridFTP-accessible resources. It enforces additional restrictions on community users to keep each user’s data separate and private from other users.
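The kind of restriction involved can be sketched as path containment: each gateway user is confined to a per-user subtree of the community area, and any request that escapes that subtree is rejected. This is a simplified illustration of the policy, not the released File Agent Service code; the base path and class names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class FileSpaceGuardSketch {
    // Hypothetical community area on the Data Capacitor.
    private static final Path BASE = Paths.get("/dc/projects/cima/users");

    /**
     * Resolve a user-supplied path inside the user's own subtree and
     * reject anything (e.g. "../other-user/...") that escapes it.
     */
    public static Path resolveForUser(String user, String requested) {
        Path userRoot = BASE.resolve(user).normalize();
        Path target = userRoot.resolve(requested).normalize();
        if (!target.startsWith(userRoot)) {
            throw new SecurityException(
                    "Access outside user file space: " + requested);
        }
        return target;
    }

    public static void main(String[] args) {
        System.out.println(resolveForUser("alice", "sample42/frame001.img"));
        // resolveForUser("alice", "../bob/secret.img") would throw.
    }
}
```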
PolarGrid: Microformats, KML, and GeoRSS feeds used to deliver SAR data to multiple clients.
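As an illustration of the feed formats, the sketch below emits a minimal Atom entry carrying a GeoRSS-Simple point for a SAR data product. The URL, identifier, and coordinates are hypothetical; only the Atom and GeoRSS namespaces and the "lat lon" point convention are standard.

```java
public class GeoRssSketch {
    public static void main(String[] args) {
        // Minimal Atom entry with a GeoRSS-Simple point (latitude longitude).
        // All concrete values below are made up for illustration.
        String entry =
              "<entry xmlns=\"http://www.w3.org/2005/Atom\"\n"
            + "       xmlns:georss=\"http://www.georss.org/georss\">\n"
            + "  <title>SAR swath 2008-06-12</title>\n"
            + "  <link href=\"http://polargrid.example.org/sar/swath-0612.kml\"/>\n"
            + "  <id>urn:polargrid:sar:swath-0612</id>\n"
            + "  <updated>2008-06-12T00:00:00Z</updated>\n"
            + "  <georss:point>72.58 -38.46</georss:point>\n"
            + "</entry>\n";
        System.out.println(entry);
    }
}
```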
Out of Scope Items
• We do not currently develop, deploy, or maintain general-purpose services for TeraGrid resource providers.
  • The TeraGrid Information Services (J. P. Navarro) and the TeraGrid User Portal do this.
  • We do collaborate with these groups through the OGCE project.
  • This could change if we see clear requirements for it.
• We rely on existing resource provider infrastructure such as Globus GRAM and GridFTP.
  • We don’t install or maintain these.
Project Blogs
Get a snapshot of what we are working on:
• Sangmi: http://sangpall.blogspot.com/
• Marie: http://tethealla.blogspot.com/
• Rishi: http://gridportal-lab.blogspot.com/
• Marlon: http://communitygrids.blogspot.com/