A Cloudy View on Computing Workshop and CReSIS Field Data Accessibility
Jerome Mitchell (1), Jun Wang (1), Geoffrey Fox (1), Linda Hayden (2)
(1) Indiana University, (2) Elizabeth City State University
[Figure: architecture diagram. Labels: Online Data Distribution; Field Data Access; Cloud GIS Distribution Service; GIS Cloud Service; Field Data Service; Data Portal; GeoServer; Spatial Database; WMS; KML; Google Earth; Virtual Appliance; Virtual Storage Service; SpatiaLite SQLite Database; Matlab/GIS; Single User; Multiple Users (local network).]

Workshop
• Details
  • Who: Association of Computer/Information Sciences and Engineering Departments at Minority Institutions (ADMI) faculty and students
  • Where: Elizabeth City State University (ECSU)
  • When: June 7 - July 5, 2011
  • What: A Teach-One-Teach-Many approach to cloud computing
• Purpose
  • Introduce ADMI to the basics of the emerging cloud computing paradigm
  • Understand the computer systems constraints, tradeoffs, and techniques of setting up and using a cloud
  • Understand how different algorithms can be implemented and executed on cloud frameworks
  • Evaluate performance and identify bottlenecks when mapping applications to clouds
• Compute Resources
  • FutureGrid
  • Virtual machines + virtual networking to create sandboxed modules
  • Virtual "Grid" appliances: self-contained, pre-packaged execution environments
  • Group VPNs: simple management of virtual clusters by students and educators
• Schedule
  • End of 1st week: "Now I understand Cloud Computing"
  • End of 3rd week: "Now I appreciate why Cloud Computing is important"
  • End of 5th week: "Now I really understand Cloud Computing!"
  [Figure: workshop timeline. Labels: Parallel Processing; Cloud Computing Programming Model; Functional Programming; Map/Reduce; Twister (CGL's implementation); Hadoop (Apache's implementation); Algorithm; Parallelized by.]

CReSIS Field Data Accessibility
• Current CReSIS Data Organization
  • CReSIS's data products website lists direct download links for individual files
  • The data are organized by season
  • Seasons are broken into data segments
  • Data segments are arranged into frames
  • Associated data for each frame are stored in different file formats
    • CSV (flight path)
    • MAT (depth sounder data)
    • PDF (image products)
  • The file-based data system has no spatial data access support
• Spatial Data Accessibility Project
  • Two main components: a cloud distribution service and a special service for the PolarGrid field crew
  • Data are supported across multiple spatial databases
• SpatiaLite Database
  • A spatial extension to SQLite that manages both vector and raster data and supports a rich set of GIS analysis functions through SQL
  • The data can be accessed directly from GIS software and MATLAB
• SpatiaLite Database Example
  • 2009 Antarctic flight path data
  • ~4 million entries, originally stored as 828 separate files and imported into one SpatiaLite database file
  • Flight path data stored as YYYYMMDD_segID_frameID.txt

[Figures: Google Earth example, 2009 Antarctica season; Overview of 2009 Flight Paths; Data Access for Single Frame; 2009 Antarctica Season vector data, used by Visual Crossover Analysis for Quality Control (development project).]

SQL commands to create the segs table (AddGeometryColumn is a SpatiaLite function):

    CREATE TABLE segs (
        UTCTime Number,
        Thickness Number,
        Elevation Number,
        FrameID VARCHAR(12),
        Surface Number,
        Bottom Number,
        QualityLevel Integer);
    SELECT AddGeometryColumn('segs', 'geometry', 4326, 'POINT', 2);

Note: in the AddGeometryColumn call, 2 means two-dimensional xy points (longitude, latitude) and 4326 is the SRID of the WGS84 coordinate system.

SpatiaLite: MATLAB Direct Access
• mksqlite package: a MEX-DLL to access SQLite databases from MATLAB (http://mksqlite.berlios.de/)
• Add this flag to build.m to enable SQLite to load SpatiaLite as an extension: -DSQLITE_ENABLE_LOAD_EXTENSION=1
• Testing in MATLAB:

    dbid = mksqlite(0, 'open', 'test.sqlite')           % open the database
    % path_to_spatialite must hold the path to the SpatiaLite library
    sql = ['SELECT load_extension(''', path_to_spatialite, ''')'];
    mksqlite(dbid, sql)                                 % load extension
    mksqlite(dbid, 'SELECT sqlite_version()')           % check SQLite is reachable
    mksqlite(dbid, 'SELECT spatialite_version()')       % check SpatiaLite is loaded
    % longitude/latitude of every point in one frame
    mksqlite(dbid, 'SELECT X(geometry) as lon, Y(geometry) as lat from segs where FrameID=2009101601001');
    mksqlite(dbid, 'close')

References
• PolarGrid Data Products: https://www.cresis.ku.edu/data
• SpatiaLite: http://www.gaia-gis.it/spatialite/
• Quantum GIS: http://www.qgis.org/
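The single-frame access described above (one segs table, queried by FrameID) can be sketched without MATLAB or the SpatiaLite extension. The following is a minimal illustration in Python's standard sqlite3 module, not the poster's implementation. Assumptions, none of which come from the source: plain lon/lat columns stand in for the SpatiaLite geometry column, the helper names (parse_frame_filename, build_demo_db, points_for_frame) are hypothetical, the example file name 20091016_01_001.txt is only a guess at how YYYYMMDD_segID_frameID.txt is filled in, and the sample rows are placeholder values rather than CReSIS data.

```python
import sqlite3


def parse_frame_filename(name):
    """Split a 'YYYYMMDD_segID_frameID.txt' file name into its three parts."""
    stem = name.rsplit(".", 1)[0]          # drop the extension
    date, seg_id, frame_id = stem.split("_")
    return date, seg_id, frame_id


def build_demo_db(rows):
    """Create an in-memory segs table and load the given rows.

    The first seven columns follow the poster's CREATE TABLE statement.
    ASSUMPTION: lon/lat REAL columns replace the SpatiaLite geometry column
    so that the sketch runs on a stock SQLite build.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE segs (
               UTCTime Number, Thickness Number, Elevation Number,
               FrameID VARCHAR(12), Surface Number, Bottom Number,
               QualityLevel Integer,
               lon REAL, lat REAL)"""
    )
    conn.executemany("INSERT INTO segs VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


def points_for_frame(conn, frame_id):
    """Return (lon, lat) pairs for one frame, ordered by UTC time."""
    cur = conn.execute(
        "SELECT lon, lat FROM segs WHERE FrameID = ? ORDER BY UTCTime",
        (frame_id,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    # Placeholder rows for illustration only; these are not real measurements.
    demo_rows = [
        (2.0, 100.0, 500.0, "2009101601001", 10.0, 110.0, 1, -75.5, -80.25),
        (1.0, 101.0, 501.0, "2009101601001", 11.0, 112.0, 1, -75.0, -80.0),
        (3.0, 102.0, 502.0, "2009101601002", 12.0, 114.0, 2, -76.0, -80.5),
    ]
    conn = build_demo_db(demo_rows)
    # ASSUMPTION: the FrameID key is the three file-name parts concatenated.
    frame_key = "".join(parse_frame_filename("20091016_01_001.txt"))
    print(points_for_frame(conn, frame_key))
    conn.close()
```

With SpatiaLite loaded, the lon/lat columns would be replaced by the geometry column and the query would use X(geometry) and Y(geometry), as in the MATLAB test.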