The LDCM Grid Prototype Jeff Lubelczyk & Beth Weinstein January 4, 2005
Prototype Introduction • A Grid infrastructure gives scientists at resource-poor sites access to remote, resource-rich sites • Enables greater scientific research • Maximizes existing resources • Limits the expense of building new facilities • The objective of the LDCM Grid Prototype (LGP) is to assess the applicability and effectiveness of a data grid as the infrastructure for research scientists to generate virtual Landsat-like data products
LGP Key POCs • Sponsors: LDCM - Bill Ochs, Matt Schwaller; Code 500/580 - Peter Hughes, Julie Loftis • LGP Team members: Jeff Lubelczyk (Lead), Gail McConaughy (SDS Lead Technologist), Beth Weinstein (Software Lead), Ben Kobler (Hardware, Networks), Eunice Eng (Software Dev, Data), Valerie Ward (Software Dev, Apps), Ananth Rao ([SGT] Software Arch/Dev, Grid Expertise), Brooks Davis ([Aerospace Corp] Globus/Grid Admin Expert), Glenn Zenker ([QSS] System Admin) • USGS: Stu Doescher (Mgmt), Chris Doescher (POC), Mike Neiers (Systems Support) • Science Input: Jeff Masek, 923 (Blender); Robert Wolfe, 922 (Blender, Data); Ed Masuoka, 922 (MODIS, Grid) • LDCM Prototype Liaison: Harper Prior (SAIC) • CEOS grid working group (CA): Ken McDonald, Yonsook Enloe [SGT]
Grid - A Layer of Abstraction • (Diagram: a user client application sits on grid middleware - Security (Authentication, Authorization), Resource Discovery, Storage Management, Scheduling and Job Management - which spans compute and storage resources at multiple sites: East Coast/Platform C, West Coast/Platform A, On Campus/Platform A) • Grid middleware packages the underlying infrastructure into defined APIs • A common package is the Globus Toolkit • Open source, low-cost, flexible solution
What the current data grid provides • Security Infrastructure • Globus Gatekeeper • Authentication (PKI) • Authorization • Resource Discovery • Monitoring and Discovery Service (MDS) [LDAP-like] • Storage Management and Brokering • Metadata catalogs • Replica Location Service • Allows use of logical file names • Physical locations are hidden • Storage Resource Management • GridFTP • Retrieves data using physical file names • Data formats and subsetting • Job Scheduling and Resource Allocation • GRAM (Globus Resource Allocation Manager) -- provides a single common API for requesting and using remote system resources • Note: Capability 1 uses these portions of the Globus Toolkit 2.4.2: Globus Gatekeeper, GridFTP, GRAM (a hedged usage sketch follows)
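To make these services concrete, the sketch below shows how a client program might submit a remote job through GRAM and then pull a result file back with GridFTP using the Java CoG Kit. The class and method names (GramJob, GridFTPClient, ExtendedGSSManager) follow the CoG 1.1-era API as best we recall and may differ between versions; the host name, port, and file names are hypothetical, and a valid proxy certificate (created with grid-proxy-init) is assumed.

import java.io.File;
import org.globus.gram.GramJob;                 // GRAM job submission (Java CoG Kit)
import org.globus.ftp.GridFTPClient;            // GridFTP client (Java CoG Kit)
import org.gridforum.jgss.ExtendedGSSManager;   // loads the user's default proxy credential
import org.ietf.jgss.GSSCredential;

// Minimal sketch (not the prototype's actual code): run a command on a remote
// grid node via the Gatekeeper/GRAM, then retrieve its output over GridFTP.
public class GridServicesSketch {
    public static void main(String[] args) throws Exception {
        // 1) Describe the remote job in RSL (Resource Specification Language).
        String rsl = "&(executable=/bin/hostname)(stdout=hostname.out)";

        // 2) Submit it to the GRAM service on a (hypothetical) remote host.
        GramJob job = new GramJob(rsl);
        job.request("lgp23.gsfc.nasa.gov");      // contact string is hypothetical

        // ... wait for completion, e.g. with a GramJobListener (omitted) ...

        // 3) Pull the output file back over GridFTP (default port 2811),
        //    authenticating with the user's default proxy credential.
        GSSCredential proxy = ExtendedGSSManager.getInstance()
                .createCredential(GSSCredential.INITIATE_AND_ACCEPT);
        GridFTPClient ftp = new GridFTPClient("lgp23.gsfc.nasa.gov", 2811);
        ftp.authenticate(proxy);
        ftp.get("hostname.out", new File("hostname.out"));
        ftp.close();
    }
}

In the prototype these calls are not made directly from user code; the LGP Driver described later wraps them behind its own high-level services.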
High Level Schedule • Major Milestones • 12/03 - Prototype start • 6/04 - Demo of Capability 1 grid infrastructure • Demonstrate simple file transfers and remote application execution at multiple GSFC labs and USGS EDC • Ready to build application on top of basic infrastructure • 12/04 - Demo of Capability 1 • Provide and demonstrate a grid infrastructure that enables a user program to access and process remote heterogeneous instrument data at multiple GSFC labs and USGS EDC • 3/05 - Demo of Capability 2 grid infrastructure • Demonstrate file transfers and remote application execution at multiple GSFC labs, USGS EDC, and ARC/GSFC commodity resources to assess scalability • 6/05 - Demo of Capability 2 • Enable the data fusion (blender) algorithm to obtain datasets, execute, and store the results on any resource within the Virtual Organization (GSFC labs, USGS EDC, ARC/GSFC)
The LDCM Demonstration … • Prepares two heterogeneous data sets, stored at different remote locations, for comparison over a common "footprint", driven from the science user's home site • The MODIS Reprojection Tool (MRT) serves as our "typical science application" developed at the science user's site (Building 32 in the demo) • mrtmosaic and resample (subset and reproject) • Operates on MODIS and LEDAPS (Landsat) surface reflectance scenes • Data distributed at remote facilities • Building 23 (MODIS scenes) • USGS/EDC (LEDAPS scenes) • Solves a realistic scientific scenario using grid-enabled resources
Capability 1 Virtual Organization (installed equipment)
• edclxs66 - USGS EDC, Sioux Falls, SD: Dell/Linux server, dual Xeon processors, 8 GB memory, 438 GB disk storage
• LGP23 - GSFC B23/W316: Dell/Linux server, quad Xeon processors, 16 GB memory, 438 GB disk storage
• LGP32 (Science User_1) - GSFC B32/C101: Dell/Linux server, dual Xeon processors, 8 GB memory, 438 GB disk storage
• Network path (diagram): GSFC SEN 1 Gbps backbone - MAX (College Park) OC48, 2.4 Gbps backbone - vBNS+ (Chicago) OC48, 2.4 Gbps backbone - OC12, 622 Mbps link shared with DREN - USGS/EDC 1 Gbps backbone
• SEN: Science and Engineering Network; MAX: Mid-Atlantic Crossroads; DREN: Defense Research and Engineering Network; vBNS+: very high-speed Backbone Network Service
A "Typical Science Application" • MODIS Reprojection Tool (MRT) • Software suite distributed by the LP DAAC • Applications used include • mrtmosaic.exe • Creates one scene from adjacent scenes • resample.exe (Subset) • Geographic • Band/Channel • Projection • Each operates on MODIS and LEDAPS scene data • Visualization Tool -- software to display scenes • HDFLook • (A hedged example of describing an MRT run as a grid job follows)
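Under GT2, a job such as a resample run is described to GRAM with an RSL string naming the executable, arguments, working directory, and output files. The fragment below is a hypothetical illustration: the paths, the working directory, and the -p parameter-file argument are assumptions about how the MRT would be invoked, not details taken from the prototype.

// Hypothetical RSL description of a remote resample.exe run (GT2-style RSL).
// All paths and the argument string are illustrative assumptions.
public class MrtJobDescriptionSketch {
    public static void main(String[] args) {
        String rsl = "&(executable=/lgp/bin/resample.exe)"
                   + "(arguments=-p subset.prm)"          // assumed MRT parameter-file flag
                   + "(directory=/lgp/work)"
                   + "(stdout=resample.log)(stderr=resample.err)";
        System.out.println(rsl);                          // would be handed to GRAM, e.g. via GramJob
    }
}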
Data • MODIS - MOD09GHK • MODIS/Terra Surface Reflectance Daily L2G Global 500m SIN Grid V004 • Sinusoidal projection • 7 Scenes • Washington D.C. (H = 11,12, V = 5) • Pacific NW (H = 9, V = 4) • Obtained from LP DAAC ECS Data Pool • LEDAPS - L7ESR • LEDAPS Landsat-7 Corrected Surface Reflectance • UTM projection • 2 Scenes • Washington D.C. (Path = 15, Row = 33) • Pacific NW areas (Path = 48, Row = 26) • Obtained from LEDAPS website • Both compatible with the MRT • All like-area scenes are as temporally coincident as possible
4 Scenarios to Illustrate Grid Flexibility • Data Services (move application to data) • Transfer the MRT to the remote hosts and process the data remotely, sending the results back to the science facility • Batch Execution (parallel computing) • Demonstrate the execution of the MRT in a parallel batch environment • Local Processing (user prefers to process locally) • Transfer the selected data sets to the science user site for processing • Third Party Processing (no local resource usage) • Perform a third-party data transfer and process the data remotely (see the transfer sketch below) • Grid flexibility maximizes science resources
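The third-party scenario relies on GridFTP's ability to move a file directly between two remote servers, with the client only coordinating the transfer. The sketch below uses the Java CoG Kit; the transfer(...) method name and signature are our recollection of the CoG FTPClient API and may differ by version, and the host names and paths are hypothetical.

import org.globus.ftp.GridFTPClient;
import org.gridforum.jgss.ExtendedGSSManager;
import org.ietf.jgss.GSSCredential;

// Sketch of a third-party transfer: the client (say, at B32) tells the EDC
// server to send a file directly to the B23 server, so no data passes through
// the client machine.  Not the prototype's actual code.
public class ThirdPartyTransferSketch {
    public static void main(String[] args) throws Exception {
        GSSCredential proxy = ExtendedGSSManager.getInstance()
                .createCredential(GSSCredential.INITIATE_AND_ACCEPT);

        GridFTPClient source = new GridFTPClient("edclxs66.cr.usgs.gov", 2811);     // hypothetical host names
        source.authenticate(proxy);
        GridFTPClient destination = new GridFTPClient("lgp23.gsfc.nasa.gov", 2811);
        destination.authenticate(proxy);

        // Assumed signature: transfer(srcPath, destClient, destPath, append, markerListener)
        source.transfer("/data/ledaps/scene.hdf", destination, "/data/incoming/scene.hdf", false, null);

        source.close();
        destination.close();
    }
}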
How we make this happen • Command-line interface to execute the LDCM Grid Prototype (LGP) driver program • The LGP Driver • Manages the execution of a specified application • Transfers the application and data as needed • Uses configuration files as inputs to describe: • The executable and its location • The data sets and their location • The location of the resulting output file(s) • (A hypothetical configuration example follows)
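The presentation does not show the actual LGP configuration format, so the example below is purely hypothetical: a Java-properties-style file describing one processing step, and the standard-library code a driver could use to read it. All keys, URLs, and host names are invented for illustration.

import java.io.FileInputStream;
import java.util.Properties;

// Hypothetical LGP driver configuration (illustrative keys and values only):
//
//   executable.name     = resample.exe
//   executable.location = gsiftp://lgp32.gsfc.nasa.gov/lgp/bin/resample.exe
//   execution.host      = edclxs66
//   input.files         = gsiftp://edclxs66/data/ledaps/dc_scene.hdf
//   output.location     = gsiftp://lgp32.gsfc.nasa.gov/lgp/results/
//
// The driver would read such a file and hand the pieces to its data- and
// job-management services.
public class LgpConfigSketch {
    public static void main(String[] args) throws Exception {
        Properties config = new Properties();
        config.load(new FileInputStream(args[0]));   // e.g. java LgpConfigSketch scenario1.properties

        System.out.println("Executable: " + config.getProperty("executable.name"));
        System.out.println("Runs on   : " + config.getProperty("execution.host"));
        System.out.println("Inputs    : " + config.getProperty("input.files"));
        System.out.println("Output to : " + config.getProperty("output.location"));
    }
}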
LDCM Grid Prototype (LGP) Driver • Provides a generic software system architecture based on Globus services • LGP Driver high-level services • Session Manager - grid session initiation and user authentication using proxy certificates • Data Manager - file transfer using GridFTP • Job Manager - job submission and status in a grid environment • Utilizes the Java Commodity Grid Kits (CoGs) • Supplies a layer of abstraction from the underlying Globus services • Simplifies the programming interface • Capability 1 Software Framework (diagram): LGP Driver (Java 1.4.2) with Session, Data, and Job services, layered on Java CoG 1.1 and the Globus Toolkit 2.4.2 (Globus Gatekeeper, GridFTP, GRAM) • (A sketch of this layering follows)
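As one way to picture the driver's structure, the sketch below models the three high-level services as plain Java interfaces and shows how a single run could tie them together. It reflects the layering described on this slide, not the prototype's actual source; the CoG classes named in comments and all URLs and hosts are assumptions.

// Sketch (not actual LGP source) of the driver's high-level services.
public class LgpDriverSketch {

    interface SessionManager {                 // grid session initiation and proxy-based authentication
        void startSession() throws Exception;
    }

    interface DataManager {                    // file transfer, assumed to wrap GridFTP (CoG GridFTPClient)
        void transfer(String sourceUrl, String destinationUrl) throws Exception;
    }

    interface JobManager {                     // job submission and status, assumed to wrap GRAM (CoG GramJob)
        void submit(String contactHost, String rslJobDescription) throws Exception;
    }

    // One driver run: authenticate, stage the application, execute remotely, retrieve the result.
    static void run(SessionManager session, DataManager data, JobManager jobs) throws Exception {
        session.startSession();
        data.transfer("gsiftp://lgp32/lgp/bin/resample.exe",          // hypothetical URLs throughout
                      "gsiftp://edclxs66/lgp/bin/resample.exe");
        jobs.submit("edclxs66", "&(executable=/lgp/bin/resample.exe)");
        data.transfer("gsiftp://edclxs66/lgp/out/result.hdf",
                      "gsiftp://lgp32/lgp/results/result.hdf");
    }
}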
Demo Scenario 1: Input Data • LEDAPS - L7ESR • MODIS - MOD09GHK
Demo Scenario 1: Data Services
(Diagram: LDCM VO grid nodes running GT 2.4.3 with GridFTP servers - GSFC B32 science user site holding the MRT; GSFC B23 holding MOD09GHK scenes and data.txt; USGS EDC holding LEDAPS scenes)
1) Move resample.exe from B32 to EDC
2) Run resample.exe on 1 LEDAPS file
3) Move the LEDAPS resample output from EDC to B32
4) Move mrtmosaic.exe from B32 to B23
5) Run mrtmosaic.exe with 2 MOD09GHK files
6) Move resample.exe from B32 to B23
7) Run resample.exe on the mrtmosaic output
8) Move the mrtmosaic resample output from B23 to B32
9) Display the LEDAPS and MODIS resampled output using HDFLook
Capability 1 Task Requirements - Completed • Science user is at B32 and the data is at EDC and B23 • 2 - 3 instrument types • 10 - 20 scenes • Spatially and temporally coincident data • Algorithm must run on B23, B32, and EDC • Command-line invocation from the client side • Perform distributed computation • Share distributed data • Verified by executing the 4 scenarios
Next Steps -- Capability 2 • Capability 2 (C2) • Integrate with the Blender team • Collaborate to identify meaningful C2 data sets • Demonstrate blender algorithm • Assess Grid performance • Expand the VO to include ARC supercomputing if available • Performance Goals • Demonstrate the processing of 1 day’s worth of data in the grid environment (~250 scenes) • Grid Workflow -- increase automation
Grid Workflow • Our current capabilities allow us to submit jobs only to a specified resource • The goal of the next phase is to provide the ability to submit a job to the "Grid" Virtual Organization • Grid resource management • Scheduling policy • Maximize grid resources • Manage subtasks • Reliable job completion • Checkpointing and job migration • Leverage wasted CPU cycles • Next step: examine Condor and Pegasus, open-source workflow extensions to the Globus Toolkit • (A toy illustration of VO-level scheduling follows)
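As a thought experiment only, the fragment below sketches what submitting "to the Grid" rather than to a named host could look like: a trivial scheduler that tries the least-loaded resource in the VO and migrates a failed job to the next candidate. It illustrates the scheduling and reliable-completion ideas above; it is not how Condor or Pegasus work internally, and the GridResource interface is invented for the sketch.

import java.util.Arrays;

// Toy illustration of VO-level scheduling with retry; everything here is hypothetical.
public class VoSchedulerSketch {

    interface GridResource {
        String name();
        double currentLoad();                       // e.g. derived from MDS monitoring data
        boolean run(String rslJobDescription);      // true if the job completed successfully
    }

    // Submit "to the Grid": pick the least-loaded resource, and if the job fails
    // there, migrate it to the next candidate until it completes or the VO is exhausted.
    static boolean submitToGrid(GridResource[] vo, String rsl) {
        Arrays.sort(vo, (a, b) -> Double.compare(a.currentLoad(), b.currentLoad()));
        for (GridResource resource : vo) {
            System.out.println("Submitting to " + resource.name());
            if (resource.run(rsl)) {
                return true;
            }
        }
        return false;   // no resource in the VO could complete the job
    }
}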
Concept of a Future Grid Architecture - LDCM example
(Diagram labels, grouped)
• Actors and data sites: scientist, research community, USGS/EDC DAAC (Landsat data), NASA/GSFC (MODIS, VIRS & other reflectance products), Univ. of MD research community data server, research archive, data nodes/managers
• Existing C1 grid infrastructure: Session Manager, Data Manager, Job Manager
• Proposed C2 grid infrastructure: Grid Workflow Engine, Grid Resource Manager, overall VO grid management (VO grid config, VO grid resource status, failure recovery), VO grid operator's interface
• Future grid components: science product interface, product definition <Blender>, product status & recovery, data product distribution
Acronym List • FTP File Transfer Protocol • LDCM Landsat Data Continuity Mission • LEDAPS Landsat Ecosystem Disturbance Adaptive Processing System • LGP LDCM Grid Prototype • LP DAAC Land Processes Distributed Active Archive Center • MODIS Moderate Resolution Imaging Spectroradiometer • MRT MODIS Reprojection Tool
Condor, Condor-G, DAGman • Condor addresses many workflow challenges for Grid applications • Managing sets of subtasks • Getting the tasks done reliably and efficiently • Managing computational resources • Similar to a distributed batch processing system, but with some interesting twists • Scheduling policy • ClassAds • DAGman • Checkpointing and migration • Grid-aware & grid-enabled • Flocking (linking pools of resources) & glide-ins • See http://www.cs.wisc.edu/condor/ for more details • Chart author: Lee Liming, Argonne National Laboratory
Pegasus Workflow Transformation • Converts an Abstract Workflow (AW) into a Concrete Workflow (CW) • Uses metadata to convert the user request to logical data sources • Obtains the AW from Chimera • Uses replication data to locate physical files • Delivers the CW to DAGman • Executes using Condor • Publishes new replication and derivation data in RLS and Chimera (optional) • See http://pegasus.isi.edu/ for details • (Diagram: Chimera Virtual Data Catalog, Metadata Catalog, Replica Location Service, DAGman, Condor, compute servers, and storage systems) • Chart author: Lee Liming, Argonne National Laboratory