GeoDA is a framework for on-the-fly visualization of large-scale scientific geospatial data. It uses the Discrete Wavelet Transform and Wavelet OLAP (WOLAP) to enable fast range-aggregate query processing. The system supports visualization based on dynamic data selection, allowing users to explore and analyze geospatial data in real time.
GeoDA On-the-fly Visualization of Scientific Geospatial Data Using Wavelets Cyrus Shahabi, Farnoush Banaei-Kashani, Kai Song
Outline Motivation and Problem Definition Our Solution: GeoDA Underlying Technology Background: Discrete Wavelet Transform WOLAP Prototype System Development Summary Future Work
USC-JPL SURP Project Cyrus Shahabi and Farnoush Banaei-Kashani Information Laboratory (InfoLab) University of Southern California (USC) Los Angeles, CA 90089 [shahabi,banaeika]@usc.edu http://infolab.usc.edu Yi Chao and Peggy Li Climate, Oceans, and Solid Earth Science Section Jet Propulsion Laboratory (JPL) Pasadena, CA 91109 [yi.chao,peggy.li]@jpl.nasa.gov http://science.jpl.nasa.gov/COSE/
Earth Science Data Visualization [Figure: Range Selection, shown without re-scaling and with re-scaling]
Earth Science Data Visualization. Range Selection and Range Re-scaling: aggregated query over latitude, longitude and/or time
Off-line vs. On-the-fly Visualization
Off-line Visualization: • Pre-selected range (and resolution) • Visualization by query pre-computation
On-the-fly Visualization: • On-the-fly range (and resolution) selection • Visualization by on-the-fly query computation to support dynamic data
Discrete Wavelet Transform
Filters: averaging {1/2, 1/2} and differencing {1/2, -1/2}*
a = (80, 70, 60, 90, 37, 67, 50, 50)
Level 1: averages (75, 75, 52, 50), details (5, -15, -15, 0)
Level 2: averages (75, 51), details (0, 1)
Level 3: average (63), detail (12)
â = DWT(a) = Wa = (63, 12, 0, 1, 5, -15, -15, 0)
Multi-resolution view: Compression!
* For simplification, assume {1/2, 1/2} and {1/2, -1/2} as filters instead of the Haar filters {1/√2, 1/√2} and {1/√2, -1/√2}.
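The averaging and differencing steps with the simplified filters {1/2, 1/2} and {1/2, -1/2} can be sketched as follows. This is a minimal illustration, not the GeoDA implementation; the function name `avg_diff_dwt` is hypothetical, and the input length is assumed to be a power of two.

```python
def avg_diff_dwt(values):
    """Full decomposition with the simplified filters {1/2, 1/2} and {1/2, -1/2}.

    Returns [overall average, coarsest detail, ..., finest details].
    Assumes len(values) is a power of two (an assumption of this sketch).
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(values)
    details = []  # coarser levels are prepended, so they end up first
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        level_details = [(x - y) / 2 for x, y in pairs]  # filter {1/2, -1/2}
        approx = [(x + y) / 2 for x, y in pairs]         # filter {1/2, 1/2}
        details = level_details + details
    return approx + details


a = [80, 70, 60, 90, 37, 67, 50, 50]
print(avg_diff_dwt(a))  # [63.0, 12.0, 0.0, 1.0, 5.0, -15.0, -15.0, 0.0]
```

The first entry is the average of the whole vector; each later entry refines it at a finer resolution, which is the multi-resolution view the slide refers to.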
Wavelets in Databases
• Others' work [1]: Data Compression
  • Reason: save space?
  • Implicit reason: queries deal with smaller datasets and hence run faster
  • Problems:
    • Only approximate results!
    • Very data-dependent
    • Different error rates for different queries
• Our work (WOLAP) [2]: Query Compression
  • Reason: fast response time
  • Define a range-sum query as the dot product of a query vector and a data vector
  • At query time, we know what is important to the pending query
  • More opportunities:
    • Progressive results
    • Data-independent approximation
[1] See Vitter-CIKM'98, Vitter-SIGMOD'99, Agrawal-CIKM'00, Garofalakis-VLDB'00
[2] See Schmidt-PODS'02, Schmidt-EDBT'02, Jahangiri-SIGMOD'05
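To make the "range-sum as dot product" formulation concrete, here is a small sketch in the original (untransformed) domain. It reuses the eight-value sample vector from the deck's wavelet examples; the helper names `range_query_vector` and `dot` are hypothetical and not part of WOLAP.

```python
def range_query_vector(n, lo, hi):
    """Query vector: 1 inside the inclusive index range [lo, hi], 0 elsewhere."""
    return [1 if lo <= i <= hi else 0 for i in range(n)]


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


data = [80, 70, 60, 90, 37, 67, 50, 50]

q = range_query_vector(len(data), 2, 6)   # selects 60, 90, 37, 67, 50
range_sum = dot(q, data)                  # 304
range_avg = range_sum / sum(q)            # 304 / 5 = 60.8

print(range_sum, range_avg)
```

Evaluated this way, the query touches every selected cell, so its cost grows with the range size. The point of query compression is to evaluate the same dot product in the wavelet domain, where the query vector has few non-zero entries.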
WOLAP Example
Original: a = (80, 70, 60, 90, 37, 67, 50, 50)
Wavelet*: â = (178.19, 33.94, 0, 2, 7.07, -21.21, -21.21, 0)
Query 1 (whole range): q = (1, 1, 1, 1, 1, 1, 1, 1), in the wavelet domain (2.83, 0, 0, 0, 0, 0, 0, 0)
Result = 504 in the original domain; Result = 178.19 × 2.83 = 504 in the wavelet domain (Parseval Theorem)
Query 2 (partial range): q = (0, 0, 1, 1, 1, 1, 1, 0), in the wavelet domain (1.77, -.35, -1, .5, 0, 0, 0, .71)
Result = 304 in the original domain; Result = 178.19 × 1.77 + 33.94 × (-.35) + 2 × .5 = 304 in the wavelet domain (Parseval Theorem). The first two terms alone give ~303 (99% accuracy!)
O(log N) << O(N)
* Here we assume the actual Haar filters: {1/√2, 1/√2} and {1/√2, -1/√2}
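The example can be checked numerically. The sketch below assumes the orthonormal Haar filters {1/√2, 1/√2} and {1/√2, -1/√2}, which is what the leading coefficient 178.19 = 504/√8 implies; `haar_dwt` is a hypothetical helper written for this illustration, not the WOLAP code.

```python
import math


def haar_dwt(values):
    """Orthonormal Haar decomposition (filters scaled by 1/sqrt(2)).

    Returns [scaling coefficient, coarsest detail, ..., finest details].
    Assumes len(values) is a power of two (an assumption of this sketch).
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    s = math.sqrt(2)
    approx = [float(v) for v in values]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(x - y) / s for x, y in pairs] + details
        approx = [(x + y) / s for x, y in pairs]
    return approx + details


data = [80, 70, 60, 90, 37, 67, 50, 50]
query = [0, 0, 1, 1, 1, 1, 1, 0]      # range-sum over positions 2..6

data_w = haar_dwt(data)
query_w = haar_dwt(query)

# The transform is orthonormal, so the dot product is preserved (Parseval).
exact = sum(q * d for q, d in zip(query_w, data_w))
# Progressive answer: use only the two coarsest coefficients.
coarse = sum(q * d for q, d in zip(query_w[:2], data_w[:2]))

print(round(exact), round(coarse))  # 304 303
```

Only four of the eight query coefficients meet a non-zero data coefficient here, and the two coarsest terms already account for 303 of the 304.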
WOLAP Query Complexity: O(log N)
[Figure: example range query vectors and their wavelet coefficients]
Assuming that the query is of size N:
• Theorem 1: Using the "lazy wavelet transform" (computing only on the boundaries of the selected range), one can transform any polynomial range-aggregate query to the wavelet domain in O(log N).
• Theorem 2: The query has O(log N) non-zero values in the wavelet domain.
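Theorem 2 can be observed empirically for plain range-sum queries. The sketch below is an illustration under stated assumptions: it uses orthonormal Haar filters, and it computes the full O(N) transform of the query vector rather than the lazy transform of Theorem 1, purely to count the non-zero coefficients. The helper names are hypothetical.

```python
import math


def haar_dwt(values):
    """Orthonormal Haar decomposition; length must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    s = math.sqrt(2)
    approx = [float(v) for v in values]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(x - y) / s for x, y in pairs] + details
        approx = [(x + y) / s for x, y in pairs]
    return approx + details


def nonzero_count(n, lo, hi, tol=1e-9):
    """Number of non-zero wavelet coefficients of the range query [lo, hi]."""
    q = [1.0 if lo <= i <= hi else 0.0 for i in range(n)]
    return sum(1 for c in haar_dwt(q) if abs(c) > tol)


for n in (8, 1024, 65536):
    k = nonzero_count(n, n // 8 + 1, n - n // 4)
    # A detail coefficient is non-zero only if a range boundary falls inside
    # its support: at most 2 per level, plus the single scaling coefficient.
    bound = 2 * int(math.log2(n)) + 1
    print(n, k, bound)
```

Because at most two detail coefficients per level straddle a range boundary, the count stays within 2·log2(N) + 1 no matter how wide the range is.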
Related Work: Agrawal-SIGMOD'97, Abbadi-ICDE'99, Abbadi-DaWaK'00 (N = domain size for each dimension, d = number of dimensions)
GeoDA Architecture
Presentation Tier: Google Map Mashup
Query Tier: WOLAP Query Engine (ProDA), Plotting Tools
Data Tier: Wavelet Datacubes, Text Files, NC Files
Helene Data
Helene Dataset: • 10+ dimensions (selected: longitude and latitude) • 100+ variables (selected: SST) • 1 km by 1 km resolution, daily samples, world-wide • 36000 × 18000 data points per sample (~1/3 of which are null)
Helene Datacube: • Dimensions: Latitude, Longitude • Variable: SST
Presentation Tier Implementation • Cross-language development: JavaScript, C#, ASP.NET AJAX • Multi-threaded programming • Progressive visualization
Summary We devised a framework for on-the-fly visualization of large-scale scientific datasets. We designed and exploited a fast range-aggregate query processing technique, WOLAP, that enables on-the-fly visualization. WOLAP supports the family of polynomial range-aggregate queries. We developed a prototype system, GeoDA, as a proof-of-concept based on the designed visualization framework and query processing technique.
Future Work • Supporting dynamic datasets by extending WOLAP to handle appends from a data stream directly in the wavelet domain. • Enhancing WOLAP with caching to enable group/batch aggregate queries.