Modeling Evolution in Spatial Datasets
Paul Amalaman
2/17/2012
Data Mining and Machine Learning Lab
Team Members: Dr. Christoph Eick, Nouhad Rizk, Zechun Cao, Sujing Wang, Anirup Dutta, Swati Goyal, Tarikul Islam, Paul Amalaman
Outline
I- Background
II- Research Goals
III- Case Study
IV- Summary
I-Background
Machine learning techniques are mostly used where
• implicit trends can be modeled (regression)
• stable patterns exist in the dataset (classification)
Simulation systems are used when
• a model is hard to establish
• the attribute values have a high degree of randomness
• there are many interactions between objects
• attributes have to be predicted recursively over many steps
Example applications of simulation systems: traffic modeling, weather forecasting, social networks, urban modeling
I-Background continued(3)
Spatial Simulation Systems (ABM)
• Cellular Automata (CA): cell-centered approach
• Continuous Agent Space or Multi-Agent System (MAS): agent-centered approach
I-Background continued(3)
Modeling with Cellular Automata
• Concept of neighborhood
• Moore neighborhood: http://en.wikipedia.org/wiki/Moore_neighborhood
• Von Neumann neighborhood: http://en.wikipedia.org/wiki/Von_Neumann_neighborhood
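The two neighborhoods named on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the choice to clip neighbors at the grid boundary (rather than wrap around) are assumptions, not something the slides specify.

```python
def moore_neighborhood(row, col, n_rows, n_cols):
    """Cells sharing an edge or a corner with (row, col): up to 8 cells.

    Assumption: neighbors that fall outside the grid are dropped.
    """
    return [
        (r, c)
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
        if (r, c) != (row, col) and 0 <= r < n_rows and 0 <= c < n_cols
    ]


def von_neumann_neighborhood(row, col, n_rows, n_cols):
    """Cells sharing an edge with (row, col): up to 4 cells.

    Assumption: neighbors that fall outside the grid are dropped.
    """
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [
        (row + dr, col + dc)
        for dr, dc in offsets
        if 0 <= row + dr < n_rows and 0 <= col + dc < n_cols
    ]
```

An interior cell has 8 Moore neighbors and 4 von Neumann neighbors; cells on an edge or corner have fewer under the clipping assumption.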
I-Background continued(4)
Modeling with Cellular Automata
Cellular automata
• give the programmer a cell-centered programming style in which the cells are regularly organized computing units
• run efficiently on parallel architectures
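The cell-centered style can be illustrated with a minimal synchronous update step. In this sketch the names `ca_step` and `mean_rule`, the von Neumann neighborhood, and the averaging rule are all hypothetical choices made for the example; the slides do not prescribe a particular rule.

```python
def ca_step(grid, rule):
    """Apply one synchronous cellular-automaton update.

    Each cell's next state depends only on the previous grid, so every
    cell could be updated independently (and hence in parallel).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    offsets = ((-1, 0), (1, 0), (0, -1), (0, 1))  # von Neumann neighborhood
    new_grid = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors = [
                grid[r + dr][c + dc]
                for dr, dc in offsets
                if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
            ]
            new_grid[r][c] = rule(grid[r][c], neighbors)
    return new_grid


def mean_rule(state, neighbors):
    """Hypothetical rule: a cell takes the mean of itself and its neighbors."""
    values = [state] + neighbors
    return sum(values) / len(values)
```

Because `ca_step` writes into a fresh grid and reads only from the old one, the order in which cells are visited does not affect the result, which is what makes this style map well onto parallel hardware.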
II-Research Goals
Using Data Mining and Machine Learning Techniques to Enhance Simulation Systems
New approach = Machine Learning Techniques + Spatial Simulation Systems
Goal 1: Grid-based Models for Progression in Spatial Datasets
Goal 2: Development of Cluster-based Bias Removal Methods
II-Research Goals continued (1)
Goal 1: Grid-based Models for Progression in Spatial Datasets
Given that at time t we know all attribute values, including the output variable Y, can we predict all attribute values at t+1?
Known at t: X1(t), X2(t), …, Xn(t), Y(t)
To predict at t+Δt: X1(t+Δt)=?, X2(t+Δt)=?, …, Xn(t+Δt)=?, Y(t+Δt)=?
$y_{i,j,t+1} = f_{ij}\left(x_{1,1,1,t}, \dots, x_{1,n,n,t}, \dots, x_{m,1,1,t}, \dots, x_{m,n,n,t},\; y_{1,1,t}, \dots, y_{n,n,t}\right)$
Here x_{k,i,j,t} is the value of attribute k (k = 1, …, m) at grid cell (i, j) of an n×n grid at time t, and y_{i,j,t} is the output variable at that cell.
Challenges:
1. Many target variables to predict; different variables have to be predicted at different locations
2. Target variables are not independent of each other (e.g. some are auto-correlated)
3. Models have to be applied over multiple steps
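The recursive use of the model over multiple steps can be sketched as follows. This is an illustration under stated assumptions, not the framework from the slides: `predict_next`, `rollout`, the dictionary-of-grids state layout, and the toy models in the usage note are all hypothetical. Each model receives the full state at time t plus the cell indices (i, j), mirroring the form of f_ij above.

```python
def predict_next(state, cell_models):
    """Predict every variable at every grid cell for time t+1.

    state: dict mapping a variable name to its n x n grid at time t.
    cell_models: dict mapping a variable name to a function
        model(state, i, j) -> predicted value at cell (i, j) at t+1.
    Both layouts are assumptions made for this sketch.
    """
    n = len(next(iter(state.values())))
    return {
        name: [[model(state, i, j) for j in range(n)] for i in range(n)]
        for name, model in cell_models.items()
    }


def rollout(state, cell_models, n_steps):
    """Apply the one-step predictor recursively.

    Each prediction becomes the input of the next step, so errors in any
    variable can propagate to all later steps.
    """
    trajectory = [state]
    for _ in range(n_steps):
        state = predict_next(state, cell_models)
        trajectory.append(state)
    return trajectory


# Toy models, purely illustrative: x persists, y decays and is driven by x.
toy_models = {
    "x": lambda s, i, j: s["x"][i][j],
    "y": lambda s, i, j: 0.5 * s["y"][i][j] + 0.1 * s["x"][i][j],
}
```

Calling `rollout({"x": [[1.0, 1.0], [1.0, 1.0]], "y": [[2.0, 2.0], [2.0, 2.0]]}, toy_models, 2)` returns the states at t, t+1 and t+2. Note that every variable needs its own predictor, since the inputs at t+1 are themselves predictions.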
II-Research Goals continued (2)
Goal 2: Development of Cluster-based Bias Removal Methods
EPA prediction models are meteorological and chemical transport models. These models are derived by solving differential equations. Over time, the model bias grows larger.
http://www.epa.gov/AMD/CMAQ/ch06.pdf
Conventional bias removal: Input x → Model → Output + bias b(x) → Correction (bias removal) → Output
Bias removal based on weather pattern recognition: Input x → Model → b(x); Input x → Weather pattern recognition → group(x); Output = h(b(x), group(x))
Our model h learns from group(x) and b(x) to make a better prediction.
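One very simple way to realize a correction of the form h(b(x), group(x)) is to learn a mean error per weather group and subtract it. The sketch below is only that: a per-group mean-bias correction chosen for illustration, not the method developed in this work. The function names, the record layout, and the assumption that group labels are already available (for example from a clustering step) are all hypothetical.

```python
from collections import defaultdict


def fit_group_bias(records):
    """Estimate the mean model error for each weather group.

    records: iterable of (model_output, group, observed) triples, where
    model_output plays the role of b(x) and group the role of group(x).
    Returns a dict: group -> mean(model_output - observed).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for model_output, group, observed in records:
        sums[group] += model_output - observed
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


def h(model_output, group, group_bias):
    """Corrected prediction: subtract the bias learned for this group.

    Groups not seen during fitting are left uncorrected (an assumption).
    """
    return model_output - group_bias.get(group, 0.0)
```

For example, if the model over-predicted by 9 units on average in one group of historical records, `h` lowers new outputs in that group by 9. A learned regressor could replace the per-group mean without changing the interface.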
III-Case Study
Improving Ozone Forecasting for the Houston-Galveston Area
Goal 1: Development of a Grid-based Prediction Framework
Goal 2: Development of Cluster-based Bias Removal Methods
In collaboration with UH-IMAQS, the Institute for Multidimensional Air Quality Studies (UH Department of Earth and Atmospheric Science)
- Dr. Bernhard Rappenglueck
- Dr. Xiangshang Li
III-Case Study Continued(1)
Ozone Prediction Goal 1: Improving Prediction for Spatial Progression
Given what happened at t, can we predict what happens at t+Δt, t+2Δt, …?
III-Case Study Continued(2)
Ozone Prediction Goal 2: Improving Forecast Accuracy
III-Case Study Continued(2)
Status of Dissertation
• Methods to collect ozone data and store it in a relational database have been developed.
• The necessary background knowledge on simulation-based prediction systems in general, and ozone prediction in particular, has been acquired.
• Work has started on different modeling approaches for grid-based prediction.