Composite Training Sets: Enhancing the Learning Power of Artificial Neural Networks for Water Level Forecasts. Z. Bowles, P. Tissot, P. Michaud, A. Sadovski, S. Duff, G. Jeffress. Texas A&M University – Corpus Christi, Division of Nearshore Research (DNR). http://lighthouse.tamucc.edu
Texas Coastal Ocean Observation Network (TCOON) • Started 1988 • Over 50 stations • Source of study data • Primary sponsors: General Land Office, Texas Water Development Board, US Army Corps of Engineers, National Ocean Service (photo: Morgan’s Point)
Typical TCOON station • Wind Anemometer • Radio Antenna • Satellite Transmitter • Solar Panels • Data Collector • Water Level Sensor • Water Quality Sensor • Current Meter
Tides and water levels • Tide: The periodic rise and fall of a body of water resulting from gravitational interactions between Sun, Moon, and Earth (Tide and Current Glossary, National Ocean Service, 2000) • Water levels = astronomical forcing + meteorological forcing + other effects
Harmonic analysis • Standard method for tide predictions • Tide represented by constituent cosine waves with known frequencies based on gravitational (periodic) forces • Elevation of water is modeled as $h(t) = H_0 + \sum_c H_c f_{y,c} \cos(a_c t + e_{y,c} - k_c)$ • $h(t)$ = elevation of water at time $t$ • $H_0$ = datum offset • $a_c$ = frequency (speed) of constituent $c$ • $f_{y,c}$, $e_{y,c}$ = node factors/equilibrium arguments • $H_c$ = amplitude of constituent $c$ • $k_c$ = phase offset for constituent $c$ • Maximum number of constituents = 37
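The harmonic model on this slide can be evaluated directly as a sum over constituents. The following is a minimal sketch, assuming angles are expressed in degrees and constituent speeds in degrees per hour; the function name, the dictionary keys, and the amplitudes are illustrative assumptions and are not fitted to any TCOON station.

```python
import math

def harmonic_water_level(t_hours, datum_offset, constituents):
    """Evaluate h(t) = H0 + sum_c H_c * f_c * cos(a_c * t + e_c - k_c).

    Assumption: speeds a_c are in degrees per hour and e_c, k_c in degrees.
    """
    total = datum_offset
    for c in constituents:
        angle_deg = c["speed"] * t_hours + c["equilibrium"] - c["phase"]
        total += c["amplitude"] * c["node_factor"] * math.cos(math.radians(angle_deg))
    return total

# Two constituents with the standard M2 and K1 speeds; the amplitudes,
# node factors, and phases are made-up illustrative values.
example_constituents = [
    {"amplitude": 0.10, "node_factor": 1.0, "speed": 28.9841042,
     "equilibrium": 0.0, "phase": 0.0},  # M2 (principal lunar semidiurnal)
    {"amplitude": 0.15, "node_factor": 1.0, "speed": 15.0410686,
     "equilibrium": 0.0, "phase": 0.0},  # K1 (lunisolar diurnal)
]

h_at_zero = harmonic_water_level(0.0, 0.5, example_constituents)
```

With all phases set to zero, every cosine equals 1 at t = 0, so the result is simply the datum offset plus the sum of the amplitudes.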
What we are trying to do... • We know what happened in the past... what will happen next?
Harmonic vs. actual (when it works) (coastal station) Summertime
Harmonic vs. actual (when it fails) • Two panels: shallow bay, deep bay • Periods marked in each panel: Tropical Storm Season, Frontal Passages, Summer
Standard Suite Used by U.S. National Ocean Service (NOS) • Central Frequency (15cm) >= 90% • Positive Outlier Frequency(30cm) <= 1% • Negative Outlier Frequency(30cm) <= 1% • Maximum Duration of Positive Outliers (30cm) - user based • Maximum Duration of Negative Outliers (30cm) - user based
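The criteria in this suite can be computed from a series of forecast errors. The sketch below assumes the error is forecast minus observation in centimeters, that samples are evenly spaced (hourly), and that the central band includes its limits; these conventions and all function names are assumptions for illustration, not the official NOS definitions.

```python
def central_frequency(errors_cm, limit_cm=15.0):
    """Percentage of errors whose magnitude is within the limit."""
    return 100.0 * sum(abs(e) <= limit_cm for e in errors_cm) / len(errors_cm)

def positive_outlier_frequency(errors_cm, limit_cm=30.0):
    """Percentage of errors above +limit."""
    return 100.0 * sum(e > limit_cm for e in errors_cm) / len(errors_cm)

def negative_outlier_frequency(errors_cm, limit_cm=30.0):
    """Percentage of errors below -limit."""
    return 100.0 * sum(e < -limit_cm for e in errors_cm) / len(errors_cm)

def max_outlier_duration(errors_cm, limit_cm=30.0, positive=True):
    """Longest run of consecutive outliers, in samples (hours for hourly data)."""
    longest = current = 0
    for e in errors_cm:
        is_outlier = e > limit_cm if positive else e < -limit_cm
        current = current + 1 if is_outlier else 0
        longest = max(longest, current)
    return longest

# Made-up error series, for illustration only.
example_errors_cm = [5, -10, 20, 35, 40, -31, 0, 14, -15, 3]
cf = central_frequency(example_errors_cm)
pof = positive_outlier_frequency(example_errors_cm)
nof = negative_outlier_frequency(example_errors_cm)
```

A forecast would meet the first three criteria on this slide when `cf >= 90`, `pof <= 1`, and `nof <= 1`.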
Tide performance along the Texas coast (1997-2001), per-station values from the map • RMSE=0.16, CF=70.09 • RMSE=0.16, CF=71.65 • RMSE=0.15, CF=74.37 • RMSE=0.12, CF=82.71 • RMSE=0.12, CF=81.7 • RMSE=0.10, CF=89.1
Importance of the problem • Gulf Coast ports account for 52.3% of total US tonnage (1995) • 1240 ship groundings from 1986 to 1991 in Galveston Bay • Large number of barge groundings along the Texas Intracoastal Waterways • Worldwide increases in vessel draft • Galveston is the 2nd largest port in US
Artificial Neural Network (ANN) modeling • Started in the 60’s • Key innovation in the late 80’s: backpropagation learning algorithms • Number of applications grew rapidly in the 90’s, especially financial applications • Growing number of publications presenting environmental applications
ANN schematic (Philippe Tissot, 2000) • Input Layer: Water Level History, Wind Squared History, Tidal Forecasts • Hidden Layer: weighted sums (a1,i xi) and (a2,i xi) with biases b1 and b2, giving (X1+b1) and (X2+b2) • Output Layer: weighted sum (a3,i xi) with bias b3, giving (X3+b3) • Output: Water Level Forecast H(t+i)
Why ANNs? • Modeled after the human brain • Neurons compute outputs (forecasts) based on inputs, weights and biases • Able to model non-linear systems
Hypothesis… • If the human brain learns best when faced with many situations and challenges, so should an Artificial Neural Network • Therefore, create many challenging training sets to optimize learning patterns and situations
Composite Training Sets • Past models were trained on averaged yearly data sets • The models in this study were trained on 30-day sets covering specific weather events and patterns • The goal was to see the effects of specialized sets on learning and performance of the ANN
Artificial Neural Network setup • ANN models developed within the Matlab and Matlab NN Toolbox environment • Simple ANNs were found to perform best • Use of ‘tansig’ and ‘purelin’ transfer functions • Use of Levenberg-Marquardt training algorithm • ANN trained over fourteen 30-day sets of hourly data
Transform Functions • Purelin: $y = x$ • Tansig: $y = (e^{x} - e^{-x})/(e^{x} + e^{-x})$
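The two transfer functions and the way they combine in a small network can be sketched in a few lines. This is an illustration in Python rather than the Matlab NN Toolbox used in the study; the `forward` function, its argument names, and the single-hidden-layer layout are assumptions for illustration.

```python
import math

def tansig(x):
    """Hyperbolic tangent sigmoid, written as on the slide."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def purelin(x):
    """Linear transfer function: output equals input."""
    return x

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer of tansig neurons feeding a single purelin output."""
    hidden = [
        tansig(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return purelin(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

# One hidden neuron with zero weights: its output is tansig(0) = 0,
# so the network output reduces to the output bias.
example_output = forward([1.0, 2.0], [[0.0, 0.0]], [0.0], [1.0], 0.25)
```

The explicit exponential form mirrors the slide but overflows for large |x|; `math.tanh` computes the same function and is the safer choice in practice.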
Research Location Primary Station Secondary Stations
Optimization (training) process • Used all data sets in training to find the best combination of previous water levels and wind data • Ranked the data sets by individual performance • Successively added data sets, from best to worst performing, to investigate the effect on performance • Changed forecast hours to assess trend
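The ranking and successive-addition steps above can be sketched as a small loop. This is only an outline under stated assumptions: `train_and_score` is a hypothetical callable standing in for "train an ANN on these sets and return a skill score where higher is better", and the toy scorer below is a placeholder, not a neural network.

```python
def rank_then_accumulate(training_sets, train_and_score):
    """Rank sets by individual score, then add them one at a time, best first.

    training_sets: dict mapping a set name to its data.
    train_and_score: hypothetical callable(list_of_data) -> score (higher is better).
    Returns (ranked_names, [(number_of_sets, score), ...]).
    """
    individual = {name: train_and_score([data]) for name, data in training_sets.items()}
    ranked = sorted(individual, key=individual.get, reverse=True)
    cumulative = []
    for n in range(1, len(ranked) + 1):
        subset = [training_sets[name] for name in ranked[:n]]
        cumulative.append((n, train_and_score(subset)))
    return ranked, cumulative

# Placeholder scorer and data so the sketch runs; a real study would train
# a model on the given sets and evaluate it on held-out data.
def toy_score(list_of_data):
    return sum(sum(data) for data in list_of_data)

toy_sets = {"set_a": [1], "set_b": [3], "set_c": [2]}
ranked_names, cumulative_scores = rank_then_accumulate(toy_sets, toy_score)
```

Plotting the second return value against the number of sets gives the kind of curve used to judge whether adding more training sets keeps helping.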
ANN Model • Primary Station: Morgan’s Point • 48 Hours of previous WL • 36 Hours of previous winds • Secondary Station: Point Bolivar • 24 Hours of previous WL • 24 Hours of previous winds
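The input configuration above can be illustrated by assembling one input vector from hourly series. The sketch assumes each series is a plain list indexed by hour and that wind is a single value per hour; the function name and layout are assumptions, and other inputs shown elsewhere in the deck (such as tidal forecasts) are left out.

```python
def build_input_vector(primary_wl, primary_wind, secondary_wl, secondary_wind, t):
    """Collect the most recent hourly values of each series, ending at hour index t.

    Window lengths follow the slide: 48 h and 36 h at the primary station,
    24 h and 24 h at the secondary station.
    """
    windows = [(primary_wl, 48), (primary_wind, 36), (secondary_wl, 24), (secondary_wind, 24)]
    if t + 1 < max(n for _, n in windows):
        raise ValueError("not enough history before hour index t")
    vector = []
    for series, n in windows:
        vector.extend(series[t - n + 1 : t + 1])
    return vector

# Placeholder data: the hour index itself, so the window edges are easy to check.
hourly = list(range(100))
example_vector = build_input_vector(hourly, hourly, hourly, hourly, 60)
```

Under these assumptions the network sees 48 + 36 + 24 + 24 = 132 inputs per forecast; treating wind as two components would change that count.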
Example data set (Julian Days) 2003265 - 2003295
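The codes on this slide read as a four-digit year followed by a three-digit day of year. Assuming that interpretation, they can be converted to calendar dates with the standard library; the helper name is an illustrative choice.

```python
from datetime import datetime

def parse_year_day(code):
    """Convert a YYYYDDD code (year plus day of year) to a calendar date."""
    return datetime.strptime(str(code), "%Y%j").date()

start = parse_year_day(2003265)
end = parse_year_day(2003295)
span_days = (end - start).days
```

Under this reading the example set runs from 22 September 2003 to 22 October 2003, a span of 30 days.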
Training with one set (X = 15cm) Morgan’s Point
Effects of increasing data sets (Morgan’s Point) NOS Standard
Performance applied to 1998 (plot: water level (m) vs. hours of 1998)
Close up… (plot: water level (m) vs. hours of 1998)
Forecast trend Morgan’s Point NOS Standard
Conclusions • Large difference in performance due to training sets • Increasing the number of data sets increases performance
Future Direction • Analyze environmental factors of successful training sets • Research significance of subtle differences in ANN model training • Web-based predictions
The End! • Acknowledgements: • General Land Office • Texas Water Devel. Board • US Corps of Eng • Nat'l Ocean Service • NASA Grant # NCC5-517 • Division of Nearshore Research (DNR) • http://lighthouse.tamucc.edu