Comparison of Gap Interpolation Methodologies for Water Level Time Series Using Perl/PDL
By Aimee Mostella, Alexey Sadovski, Scott Duff, Patrick Michaud, Philippe Tissot, Carl Steidley
Necessity of Interpolation
• Gaps in time series limit the methods that can be used to study and understand water level patterns
• The Texas A&M University-Corpus Christi Division of Nearshore Research (TAMUCC-DNR) and the Texas Coastal Ocean Observation Network (TCOON) collect, archive, and analyze various types of time series, including water level time series
lrwlfill
• TAMUCC-DNR and TCOON have developed lrwlfill, a Perl script designed to interpolate gaps in water level time series
• Perl offers ease of use and expressive power
• The Perl Data Language (PDL) module provides efficient computation and data storage
Basic Algorithm
• Retrieve data according to user-provided parameters
• Search the data for missing values
• Perform linear regression to obtain two sets of coefficients
• Calculate the missing values with each set of coefficients
• Combine the two sets of calculated values into one
• Insert the new values in place of the missing data
Retrieving the Data
• Retrieve water level values corresponding to these parameters:
  • Time frame
    • Retrieve from one month before to one month after the time provided by the user
  • Station identifier
  • Number of coefficients
  • Method of fitting the resulting data
Water Level
• RWL = AWL - HWL
  • RWL => Residual Water Level
  • AWL => Actual Water Level
  • HWL => Harmonic Water Level
• Record the location of gaps in the AWL
• Record the difference between AWL and HWL as the RWL
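The two recording steps above can be sketched as follows. This is a minimal illustration in Python, not the Perl/PDL code of lrwlfill. The function names, the use of `None` to mark a missing reading, and the sample values are all assumptions made for the example.

```python
def residual_water_level(awl, hwl):
    """RWL = AWL - HWL, with None wherever the actual reading is missing."""
    return [None if a is None else a - h for a, h in zip(awl, hwl)]


def find_gaps(series):
    """Return (start, end) index pairs, end exclusive, for each run of None."""
    gaps = []
    start = None
    for i, value in enumerate(series):
        if value is None and start is None:
            start = i                      # a gap opens here
        elif value is not None and start is not None:
            gaps.append((start, i))        # the gap closed just before i
            start = None
    if start is not None:                  # series ends inside a gap
        gaps.append((start, len(series)))
    return gaps


# Hypothetical hourly values in metres; None marks a missing AWL reading.
awl = [1.20, 1.25, None, None, 1.10, 1.05]
hwl = [1.00, 1.10, 1.15, 1.10, 1.00, 0.95]

rwl = residual_water_level(awl, hwl)
gaps = find_gaps(awl)
```

Here `gaps` holds one entry covering the two missing hours, and `rwl` carries `None` at the same positions, so later steps know which residuals need to be filled.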
Linear Regression
• For each gap in the data:
  • Perform forward and backward linear regression (FLR and BLR, respectively) using hourly data to obtain coefficients
  • Calculate the missing data points with these coefficients
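The slides do not spell out the form of the regression model. One plausible reading, sketched below in Python rather than Perl/PDL, treats each value as a least-squares linear combination of the previous n hourly values, where n is the number of coefficients. The forward pass fits on the data before the gap. The backward pass fits on the time-reversed data after the gap. All function names are hypothetical, and the model form is an assumption, not the confirmed lrwlfill method.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_coefficients(history, n):
    """Least-squares fit of x[t] ~ sum_j c[j] * x[t-1-j] for j = 0..n-1."""
    rows = [[history[t - 1 - j] for j in range(n)]
            for t in range(n, len(history))]
    targets = [history[t] for t in range(n, len(history))]
    # Normal equations: (X^T X) c = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(n)]
    return solve(xtx, xty)


def predict(history, coeffs, steps):
    """Extend history by `steps` values, feeding each prediction back in."""
    buf = list(history)
    for _ in range(steps):
        buf.append(sum(c * buf[-1 - j] for j, c in enumerate(coeffs)))
    return buf[len(history):]


def fill_both_ways(before, after, gap_len, n):
    """Return (forward, backward) estimates for the gap, both in time order."""
    forward = predict(before, fit_coefficients(before, n), gap_len)
    reversed_after = after[::-1]           # run time backwards toward the gap
    backward = predict(reversed_after,
                       fit_coefficients(reversed_after, n), gap_len)[::-1]
    return forward, backward


# Synthetic damped oscillation that obeys an exact two-term recurrence.
x = [1.0, 1.0]
for _ in range(38):
    x.append(1.6 * x[-1] - 0.89 * x[-2])

fwd, bwd = fill_both_ways(x[:20], x[24:], gap_len=4, n=2)
```

On this synthetic series both estimates reproduce the withheld values `x[20:24]` closely, because the data follow a two-term recurrence exactly. Real residual water levels would not, which is why the two estimates generally differ and need to be combined.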
Methods of Combination
• Combine the results of FLR and BLR using one of the following methods:
  • Convex linear combination
    • Based on weighted proportion
  • Convex trigonometric combination
    • Based on trigonometrically weighted proportion
  • Combination at intersection
    • Fuse together at the intersection
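The three combination methods might look like the following Python sketch. The slides give no weighting formulas, so the specific choices here are assumptions: a weight that grows linearly across the gap, a sine-squared weight for the trigonometric variant, and a switch from the forward to the backward estimate at the first crossing point (falling back to the midpoint if the curves never cross). The function names are hypothetical.

```python
import math


def convex_linear(fwd, bwd):
    """Blend with a weight that moves linearly from forward to backward."""
    n = len(fwd)
    out = []
    for k in range(n):
        w = (k + 1) / (n + 1)              # near 0 at gap start, near 1 at end
        out.append((1 - w) * fwd[k] + w * bwd[k])
    return out


def convex_trig(fwd, bwd):
    """Blend with a sine-squared weight (assumed form of the trig weighting)."""
    n = len(fwd)
    out = []
    for k in range(n):
        t = (k + 1) / (n + 1)
        w = math.sin(0.5 * math.pi * t) ** 2   # smooth rise from 0 to 1
        out.append((1 - w) * fwd[k] + w * bwd[k])
    return out


def at_intersection(fwd, bwd):
    """Use forward values until the two curves cross, then backward values."""
    n = len(fwd)
    switch = n // 2                        # assumed fallback: no crossing found
    for k in range(1, n):
        if (fwd[k - 1] - bwd[k - 1]) * (fwd[k] - bwd[k]) <= 0:
            switch = k
            break
    return fwd[:switch] + bwd[switch:]


linear = convex_linear([1.0, 1.0, 1.0], [3.0, 3.0, 3.0])
trig = convex_trig([1.0, 1.0, 1.0], [3.0, 3.0, 3.0])
fused = at_intersection([0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0])
```

Both convex methods trust the forward estimate most at the start of the gap and the backward estimate most at the end. They differ only in how quickly the weight shifts between the two.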
Testing Conditions
• Full sets of existing data
• Two station locations:
  • Embayment station
  • Open coast station
• Three typical weather conditions:
  • Periods of calm weather
  • Periods of frequent frontal passages
  • Extreme weather
• Varying gap sizes and numbers of coefficients
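Testing on full data sets suggests a simple harness: withhold a stretch of known values, fill it, and compare against the truth. The Python sketch below assumes that setup. It uses a straight line between the bounding values as a stand-in fill method, which is NOT the lrwlfill regression, purely so the example is self-contained. The function names and the synthetic 24-hour sine series are illustrative assumptions.

```python
import math


def linear_fill(before, after, gap_len):
    """Stand-in fill: straight line between the values bounding the gap."""
    a, b = before[-1], after[0]
    return [a + (b - a) * (k + 1) / (gap_len + 1) for k in range(gap_len)]


def rmse_for_artificial_gap(series, start, gap_len, fill=linear_fill):
    """Withhold series[start:start+gap_len], fill it, and score the result."""
    truth = series[start:start + gap_len]
    filled = fill(series[:start], series[start + gap_len:], gap_len)
    return math.sqrt(sum((f - t) ** 2 for f, t in zip(filled, truth)) / gap_len)


# Synthetic hourly series with a 24-hour period, standing in for real data.
series = [math.sin(2 * math.pi * h / 24.0) for h in range(96)]

# Score the same starting hour with progressively longer artificial gaps.
scores = {g: rmse_for_artificial_gap(series, 40, g) for g in (1, 3, 6, 12)}
```

Any fill method with the same `(before, after, gap_len)` signature can be passed as `fill`, so the same harness can compare methods, gap sizes, or numbers of coefficients.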
Testing Standards
• United States National Ocean Service (NOS) standards:
  • Root mean square error (RMSE)
  • Central Frequency (CF)
• Other statistical measures:
  • Standard deviation (SD)
  • Maximum error (ME)
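The four measures can be computed from the interpolation errors as in the Python sketch below. The slides do not give formulas, so some choices are assumptions: CF is taken as the percentage of errors within a fixed limit (0.15 m is used here as a default), SD is the sample standard deviation of the errors, and ME is the largest absolute error. The function name and sample values are hypothetical.

```python
import math


def error_stats(predicted, observed, cf_limit=0.15):
    """Return RMSE, CF (percent), SD, and ME for predicted vs observed values.

    cf_limit is the error tolerance for Central Frequency, in the same
    units as the data; 0.15 is an assumed default, not taken from the slides.
    """
    errors = [p - o for p, o in zip(predicted, observed)]
    n = len(errors)
    mean = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Sample standard deviation of the errors (n - 1 in the denominator).
    sd = math.sqrt(sum((e - mean) ** 2 for e in errors) / (n - 1)) if n > 1 else 0.0
    # Percentage of errors whose magnitude is within the limit.
    cf = 100.0 * sum(1 for e in errors if abs(e) <= cf_limit) / n
    me = max(abs(e) for e in errors)
    return {"rmse": rmse, "cf": cf, "sd": sd, "me": me}


# Hypothetical interpolated and observed water levels in metres.
stats = error_stats([1.0, 1.1, 1.4], [1.0, 1.0, 1.0])
```

RMSE and SD summarize the typical size of the error, while CF and ME describe how often and how badly the interpolation misses.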
Results Analyzed
• Timing and accuracy for varying numbers of coefficients
• Best procedure to fit FLR and BLR
• Timing and accuracy as gap size increased
• Precision as weather conditions changed
• Accuracy for embayment stations in contrast to open coast stations
Effect of Number of Coefficients
• Timing was negligible
• Accuracy peaked and then declined depending upon weather conditions
• RMSE was used to determine the optimal number of coefficients
Coefficients
• Figure 1 displays our chosen coefficients according to weather condition
• Although these coefficients are optimal, the accuracy of interpolation still declines as the weather becomes more extreme
Best Fit
• Convex linear combination demonstrated the highest level of accuracy in fitting the data, as shown in Figure 2
Effects of Gap Size and Weather Conditions
• Generally, timing was negligible
• Precision decreases steadily as gap size increases, as shown in Figure 3
• Precision also decreases as the weather becomes more extreme
Embayment vs. Open Coast
• Overall, data from embayment stations produced better results than data from stations along the open coast (Figure 4)
• The flow of water into the bay dampens the change in water level
Future Direction
• Convert lrwlfill to a real-time, web-based implementation
• Experiment with the amount and orientation of previous data used to calculate coefficients
• Study the effect of using more frequently sampled time series values in the linear regression as the weather becomes more erratic, to compensate for rapid changes in weather