Purdue Multidisciplinary Data Management Framework Using SRB • Lan Zhao, Taezoon Park, Rajesh Kalyanam, Wonjun Lee, Sebastien Goasguen
Outline • Motivation • Data Management System • Data Access • Data Collections • Metadata Design • Future Work
Motivation • Challenges: • Rapidly growing volume and number of sources of scientific data • Real-time streaming data from sensors, satellites, and radars • Goal: • Make data easy to discover, access, and share in a timely manner • Build an extensible system that applies to data across disciplines
Data Management System • TeraGrid Network • SRB Middleware developed at SDSC • Provides uniform access to distributed heterogeneous data resources • Advanced data management features: data replication, access control, fault tolerance, high-performance data movement, metadata catalog, etc. • Components: MCAT-enabled SRB Server (SDSC), Non-MCAT-enabled SRB Server (Purdue), Unix File System Storage (Purdue), Centera Storage System (Purdue)
Data Management System (II) • OPeNDAP Data Server • Originated as the Distributed Oceanographic Data System (DODS) • Makes remote scientific data accessible over the internet • Supported client applications: IDV, MATLAB, Excel, and more • THREDDS Data Server • Web server providing metadata and data access for scientific datasets • Dynamically generated dataset catalogs • Data Portal • GridSphere-based, JSR-168 compliant, SRB portlets
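As a rough illustration of how a client addresses data on an OPeNDAP server, the sketch below builds DAP2-style request URLs. The host, dataset path, and variable name `TS` are hypothetical placeholders and do not describe the Purdue server's actual layout. Only the `.dds`, `.das`, and `.ascii` suffixes and the bracketed `[start:stride:stop]` subsetting syntax are taken from the DAP2 convention.

```python
from urllib.parse import quote

# Hypothetical dataset URL, used only for illustration.
BASE_URL = "http://example.org/opendap/ccsm/sample.nc"


def dap_url(base, suffix, variable=None, slices=()):
    """Build a DAP2 request URL.

    suffix   -- "dds" (structure), "das" (attributes) or "ascii" (values as text)
    variable -- optional variable name for a constraint expression
    slices   -- optional (start, stride, stop) tuples, one per dimension
    """
    url = f"{base}.{suffix}"
    if variable is not None:
        hyperslab = "".join(f"[{a}:{b}:{c}]" for a, b, c in slices)
        # Keep brackets and colons readable; escape anything unusual.
        url += "?" + quote(variable + hyperslab, safe="[]:,")
    return url


if __name__ == "__main__":
    print(dap_url(BASE_URL, "dds"))
    print(dap_url(BASE_URL, "ascii", "TS", [(0, 1, 0), (0, 1, 9), (0, 1, 9)]))
```

Requesting only a subset through the constraint expression is what lets desktop clients work with large remote files without downloading them in full.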
Data Access • Command Line Interface: S-commands • Configure .MdasEnv and .MdasAuth //login information • Sinit //open connection • Sls -rl -c "ATTRCONDD Title like 'Landsat*'" //search for all Landsat images • Sget $file //download data to local disk • Sexit //close connection • Web Interface: MySRB • Windows GUI Client: inQ • Customized client using SRB libraries
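The command-line session on this slide could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, the `Sls` flags and the condition string are copied from the slide without being checked against the SRB documentation, and the file name is just an example. By default nothing is executed and the commands are only printed, so the sketch runs without an SRB client installed.

```python
import shlex
import subprocess


def session_commands(condition, filename):
    """Return the S-command sequence from the slide as argv lists."""
    return [
        ["Sinit"],                        # open the connection
        ["Sls", "-rl", "-c", condition],  # flags copied from the slide, unverified
        ["Sget", filename],               # download the file to the local disk
        ["Sexit"],                        # close the connection
    ]


def run_session(condition, filename, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in session_commands(condition, filename):
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Condition string as shown on the slide; the file name is an example.
    run_session("ATTRCONDD Title like 'Landsat*'", "L7_20000325_022_032.tif")
```

Passing each command as an argument list rather than a single shell string avoids quoting problems with the wildcard and the nested quotes in the search condition.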
Data Access (II) • Purdue Environmental Data Portal • JSR-168 compliant • GridSphere platform • Customized SRB portlets • Easy access to data collections • Browse • Search • Download • Quickview • OPeNDAP and THREDDS clients
LARS Dataset (Laboratory for Applications of Remote Sensing) • Multispectral and Hyperspectral remote sensing images for Indiana • ERDAS LAN, Leica Geosystems Imagine, GeoTIFF, and HDF formats • 1972 to 2004 • IndianaView Glovis web access • Part of the AmericaView initiative • Funded through USGS • Graphical Interface for viewing and downloading remote sensing image data • http://indianaview.envision.purdue.edu/glovis/index.htm
PTO Satellite Data (Purdue Terrestrial Observatory) • GOES-GVAR sensor (L-band), 3.7 m fixed antenna, Feb. 2005 • Terra-MODIS, Aqua-MODIS, NOAA-AVHRR, and FY1-MVISR sensors (L- and X-band), 4.27 m tracking antenna, Feb. 2006 • 10-node cluster as data processing and visualization server, more than 25 different products
PTO Satellite Data (II) • Potential Data Volume • Aqua and Terra MODIS sensor: • 10 - 20 GB per day • NOAA AVHRR sensor: • 0.5 GB per day • GOES GVAR sensor: • 20 GB per day • The most recently acquired GOES-12 images for Indiana region are available at: http://indianaview.envision.purdue.edu/pto/example_images.html
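To get a feel for what these rates mean for storage, the sketch below adds up the per-sensor figures from this slide. It assumes that the MODIS figure covers Aqua and Terra combined, that the three streams are additive, that the potential rate is sustained every day of the year, and that 1 TB = 1000 GB. None of these assumptions is stated on the slide.

```python
# Daily volumes in GB as (low, high), taken from the slide.
DAILY_GB = {
    "MODIS (Aqua and Terra)": (10.0, 20.0),
    "NOAA AVHRR": (0.5, 0.5),
    "GOES GVAR": (20.0, 20.0),
}


def total_range(volumes):
    """Sum the low and high ends of the daily volume across all sensors."""
    low = sum(lo for lo, _ in volumes.values())
    high = sum(hi for _, hi in volumes.values())
    return low, high


def per_year_tb(gb_per_day, days=365, gb_per_tb=1000):
    """Convert a daily volume in GB to a yearly volume in TB (decimal units)."""
    return gb_per_day * days / gb_per_tb


if __name__ == "__main__":
    low, high = total_range(DAILY_GB)
    print(f"Per day:  {low:.1f} to {high:.1f} GB")
    print(f"Per year: {per_year_tb(low):.1f} to {per_year_tb(high):.1f} TB")
```

Under these assumptions the potential total is 30.5 to 40.5 GB per day, or roughly 11 to 15 TB per year.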
PTO Satellite Data (III) • MODIS sensor (36 channels) • Spatial resolution by channel: 1-2: 250 m; 3-7: 500 m; 8-36: 1000 m
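The channel-to-resolution grouping on this slide can be written as a small lookup, which is handy when choosing channels for a product. The function name is hypothetical; the values are the ones listed on the slide.

```python
def modis_resolution_m(channel):
    """Return the nominal spatial resolution in meters for a MODIS channel (1-36)."""
    if not 1 <= channel <= 36:
        raise ValueError(f"MODIS has channels 1-36, got {channel}")
    if channel <= 2:
        return 250
    if channel <= 7:
        return 500
    return 1000


if __name__ == "__main__":
    for channel in (1, 3, 8, 36):
        print(channel, modis_resolution_m(channel), "m")
```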
National Weather Service Data • Next Generation Radar (NEXRAD) Level II data • 159 Weather Surveillance Radar-1988 Doppler (WSR-88D) sites • Real-time streaming, high-resolution data from the national network • Reflectivity, mean radial velocity, and spectrum width • Purdue is one of the four top-level distributors • THREDDS data server provides metadata and data access
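A THREDDS server exposes its holdings as XML catalogs, which a client can walk to find datasets. The sketch below parses a small made-up catalog with the Python standard library. The station and file names are placeholders and do not reflect the actual Purdue NEXRAD catalog. The namespace URI is the one used by THREDDS InvCatalog 1.0.

```python
import xml.etree.ElementTree as ET

# THREDDS InvCatalog 1.0 namespace.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

# Made-up catalog; names and paths are placeholders.
SAMPLE_CATALOG = """\
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Hypothetical NEXRAD Level II catalog">
  <dataset name="Station A">
    <dataset name="volume_0001" urlPath="nexrad/level2/A/volume_0001"/>
    <dataset name="volume_0002" urlPath="nexrad/level2/A/volume_0002"/>
  </dataset>
</catalog>
"""


def list_datasets(catalog_xml):
    """Return (name, urlPath) for every dataset element that has a urlPath."""
    root = ET.fromstring(catalog_xml)
    return [
        (ds.get("name"), ds.get("urlPath"))
        for ds in root.iterfind(".//t:dataset", NS)
        if ds.get("urlPath") is not None
    ]


if __name__ == "__main__":
    for name, path in list_datasets(SAMPLE_CATALOG):
        print(name, path)
```

Container datasets such as "Station A" carry no `urlPath`, so the filter keeps only the entries that point at retrievable data.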
CCSM Climate Simulation Data • Community Climate System Model (CCSM) to simulate climate change on Earth • Ocean, land, and atmospheric models • NetCDF format • OPeNDAP server provides post-processing functions • EMC Centera storage system, 32 TB, fast and easy access to online data
Metadata Design • Metadata files follow the Federal Geographic Data Committee (FGDC) standard for digital geospatial metadata, including the extensions for remote sensing metadata • Includes the following: • Identification Information • Data Quality Information • Spatial Data Organization Information • Spatial Reference Information • Entity and Attribute Information • Distribution Information • Metadata Reference Information
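In FGDC XML records, each of the seven sections above appears as a child element of `<metadata>` with a short tag name. As a minimal sketch, assuming the record is available as an XML string, the function below reports which sections are absent. The function name and the test record are hypothetical.

```python
import xml.etree.ElementTree as ET

# FGDC short tag names for the seven sections listed on the slide.
SECTIONS = {
    "idinfo": "Identification Information",
    "dataqual": "Data Quality Information",
    "spdoinfo": "Spatial Data Organization Information",
    "spref": "Spatial Reference Information",
    "eainfo": "Entity and Attribute Information",
    "distinfo": "Distribution Information",
    "metainfo": "Metadata Reference Information",
}


def missing_sections(record_xml):
    """Return the titles of the sections not present directly under the root."""
    root = ET.fromstring(record_xml)
    present = {child.tag for child in root}
    return [title for tag, title in SECTIONS.items() if tag not in present]


if __name__ == "__main__":
    # Tiny made-up record containing only two of the seven sections.
    record = "<metadata><idinfo/><metainfo/></metadata>"
    for title in missing_sections(record):
        print("missing:", title)
```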
Example of LARS Metadata (in XML format)
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE metadata SYSTEM "http://www.fgdc.gov/metadata/fgdc-std-001-1998.dtd">
<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Purdue University/LARS</origin>
        <pubdate>2001</pubdate>
        <title>L7_20000325_022_032.tif</title>
        <geoform>remote-sensing image</geoform>
      </citeinfo>
    </citation>
    <descript>
      <abstract>This Landsat 7 data set contains 8 channels, the 6 reflective channels plus the 2 thermal channels. The 60 meter thermal channel data have been spatially replicated so that each pixel represents 30 meters to match the 30 meter spatial sampling interval for each pixel in the reflective data. This product is a precision geocorrection data set projected to the UTM projection using nearest neighbor resampling from the original data.</abstract>
      …
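A client that indexes such records mostly needs the citation fields. The sketch below pulls them out with the Python standard library. The sample is a shortened copy of the record on this slide: the DOCTYPE and abstract are left out, and the closing tags are added here because the slide shows only the start of the record.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's record; closing tags added for this example.
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Purdue University/LARS</origin>
        <pubdate>2001</pubdate>
        <title>L7_20000325_022_032.tif</title>
        <geoform>remote-sensing image</geoform>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>
"""


def citation_fields(record_xml):
    """Return the citeinfo children as a dict of tag name to text."""
    root = ET.fromstring(record_xml)
    cite = root.find("idinfo/citation/citeinfo")
    if cite is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in cite}


if __name__ == "__main__":
    for tag, text in citation_fields(SAMPLE).items():
        print(f"{tag}: {text}")
```

Fields like `title` and `pubdate` are natural candidates to register as searchable attributes in a metadata catalog, which is the kind of attribute query shown earlier with `Title like 'Landsat*'`.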
Future Work • Real-time streaming data support • Workflow management • Metadata design • 2005 statewide orthophotography data • Web services – allow other tools to access the data management system
References • IndianaView Portal: http://indianaview.envision.purdue.edu/glovis/index.htm • LARS: http://www.lars.purdue.edu/ • OPeNDAP: http://opendap.org/ • Purdue Terrestrial Observatory (PTO): http://www.itap.purdue.edu/pto/ • Purdue Environmental Data Portal: http://gridsphere.rcac.purdue.edu/gridsphere • Purdue TeraGrid: http://www.purdue.teragrid.org/ • SRB: http://www.sdsc.edu/srb/ • THREDDS: http://www.unidata.ucar.edu/projects/THREDDS/