Inference Engine
Mobile Bay, as seen from space. Image Science and Analysis Laboratory, NASA-Johnson Space Center. "The Gateway to Astronaut Photography of Earth." <http://eol.jsc.nasa.gov/sseop/EFS/printinfo.pl?PHOTO=STS040-88-BQ> 04/16/2012 15:28:04.
• The Sulis Framework
Sulis Informatics Services includes an architecture to collect, store, and present environmental data, so that resource managers can make informed decisions within their jurisdictions. A portion of these data come from sophisticated, multi-dimensional numerical models that predict hydrologic and hydrodynamic behavior as it varies in time and space.

• Inference Engine for Sulis
Unfortunately, running sophisticated numerical models can require significant time and effort, so the number of simulations must be limited. The Sulis Inference Engine leverages a limited number of necessary model simulations by generating pseudo-simulations, so that new information can be produced quickly and easily. The Inference Engine offers three tasks: interpolation within a data set or model simulation, extrapolation from a data set or model simulation, and interpolation between multiple model simulations.

• EFDC Tests
These initial tests are based on EFDC model outputs for the Mobile Bay area in Alabama. Both tests predict the temperature at the bottom of Mobile Bay from time step and position (x, y, and z). The x and y positions of the data points are distributed irregularly across a rectangular grid, but remain the same between time steps.
All data are scaled between -1 and 1 before being used in either the training set or the test set.

• ADCIRC Test
This initial test is based on ADCIRC model outputs for Hurricane Ivan, produced by Louisiana State University. The test predicts the surface elevation from time step and position (x, y, and z). The x and y positions of the data points are distributed irregularly across a rectangular grid, but remain the same between time steps.
All data are scaled between 0 and 1 before being used in either the training set or the test set.

• Extrapolation test: a model is trained on a number of consecutive time steps, starting at a randomly selected position. The test file is composed of a number of time steps immediately after the training set.
• Interpolation test: the test file is composed of a number of time steps. The training set contains a certain number of consecutive time steps both immediately before and immediately after the test set.

• Project Goals
While the Inference Engine is designed to avoid the steep time costs of additional model runs, it must also be sufficiently accurate to meet the needs of the Sulis project. Initially, the project used linear regression as a lower bound on acceptable prediction accuracy. Now, however, a stricter definition of acceptable accuracy is needed, because more than one prediction method satisfies the initial requirement. The current goal of the Inference Engine project is therefore to determine whether any of the tested prediction methods is significantly more accurate than the others, and to determine the conditions under which that accuracy remains stable.

• Analysis Results
Both support vector regression and spline regression are more accurate than linear regression under current conditions. For extrapolation with large test set sizes, support vector regression also appears to be notably more accurate than spline regression. However, additional testing is needed to determine the relative limitations of both prediction methods more accurately and more completely.
This analysis does indicate that interpolation and extrapolation using a single data set is practicable. Further testing will be done to determine whether interpolation between multiple data sets is feasible.

• Support Vector Regression
Support vector regression is implemented using the LIBSVM software package. This implementation has kernel parameters that, when optimized, increase prediction accuracy.

• Spline Regression
Spline regression is implemented using the earth software package for the R programming language. This package is based on the multivariate adaptive regression splines (MARS, also called MARSPLINES) technique created by Jerome Friedman.

• Additional Considerations
Initial testing indicates that spline regression is roughly 65% faster than support vector regression. In addition, using the optimum tuning parameters for support vector regression increases prediction accuracy by a factor of roughly 2.3. Unfortunately, calculating these tuning parameters takes a significant amount of time.
Each of these factors will be taken into account when selecting the best prediction method for the Inference Engine.

Nate Phillips • ncp38@msstate.edu • Mississippi State University • CI Strategy 2
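
The data preparation described for the tests (scaling to a fixed range, then choosing training and test windows of consecutive time steps) can be illustrated with a minimal sketch. This is not the project's code: the function names `scale_to_range`, `extrapolation_split`, and `interpolation_split` are hypothetical, min-max scaling is assumed as the scaling method, and the random start position for the interpolation window is an assumption (the poster states a random start only for the extrapolation test).

```python
import random


def scale_to_range(values, lo=-1.0, hi=1.0):
    """Min-max scale a sequence of numbers onto [lo, hi] (assumed scaling method)."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Constant input: map every value to the midpoint of the target range.
        return [(lo + hi) / 2.0 for _ in values]
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]


def extrapolation_split(n_steps, n_train, n_test, rng=None):
    """Train on n_train consecutive time steps from a random start;
    test on the n_test time steps immediately after the training window."""
    rng = rng or random
    total = n_train + n_test
    if total > n_steps:
        raise ValueError("training and test windows exceed the available time steps")
    start = rng.randrange(n_steps - total + 1)
    train = list(range(start, start + n_train))
    test = list(range(start + n_train, start + total))
    return train, test


def interpolation_split(n_steps, n_side, n_test, rng=None):
    """Test on n_test consecutive time steps; train on the n_side time steps
    immediately before and the n_side time steps immediately after them."""
    rng = rng or random
    total = 2 * n_side + n_test
    if total > n_steps:
        raise ValueError("training and test windows exceed the available time steps")
    start = rng.randrange(n_steps - total + 1)  # random start is an assumption
    before = list(range(start, start + n_side))
    test = list(range(start + n_side, start + n_side + n_test))
    after = list(range(start + n_side + n_test, start + total))
    return before + after, test


if __name__ == "__main__":
    # EFDC-style scaling to [-1, 1] and ADCIRC-style scaling to [0, 1].
    print(scale_to_range([12.0, 18.5, 25.0]))
    print(scale_to_range([12.0, 18.5, 25.0], lo=0.0, hi=1.0))
    print(extrapolation_split(n_steps=100, n_train=10, n_test=5))
    print(interpolation_split(n_steps=100, n_side=5, n_test=3))
```

The split functions return time-step indices rather than data, so the same windows can be applied to any regression method under comparison. In practice the scaling bounds would be computed once and applied consistently to both the training and test sets.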