Z. Bowles, P. Tissot, P. Michaud, A. Sadovski, S. Duff, G. Jeffress

Composite Training Sets: Enhancing the Learning Power of Artificial Neural Networks for Water Level Forecasts • Texas A&M University-Corpus Christi, Division of Nearshore Research

Division of Nearshore Research (DNR) • http://lighthouse.tamucc.edu

Texas Coastal Ocean Observation Network (TCOON) • Started 1988 • Over 50 stations • Source of study data • Primary sponsors: General Land Office, Water Development Board, US Army Corps of Engineers, National Ocean Service • Map label: Morgan’s Point

Typical TCOON station • Wind Anemometer • Radio Antenna • Satellite Transmitter • Solar Panels • Data Collector • Water Level Sensor • Water Quality Sensor • Current Meter

Tides and water levels • Tide: "The periodic rise and fall of a body of water resulting from gravitational interactions between Sun, Moon, and Earth." (Tide and Current Glossary, National Ocean Service, 2000) • Water levels: astronomical forcing + meteorological forcing + other effects

Harmonic analysis • Standard method for tide predictions • Represented by constituent cosine waves with known frequencies based on gravitational (periodic) forces • Elevation of water is modeled as
$$h(t) = H_0 + \sum_c H_c \, f_{y,c} \cos(a_c t + e_{y,c} - k_c)$$
where $h(t)$ = elevation of water at time $t$; $H_0$ = datum offset; $a_c$ = frequency (speed) of constituent $c$; $f_{y,c}$, $e_{y,c}$ = node factors / equilibrium arguments; $H_c$ = amplitude of constituent $c$; $k_c$ = phase offset for constituent $c$ • Maximum number of constituents = 37
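The harmonic formula on this slide can be evaluated directly. The following is a minimal sketch, assuming angles are expressed in degrees and speeds in degrees per hour (a common convention, but not stated on the slide). The function name, dictionary keys, and the two example constituents are hypothetical and are not fitted to any TCOON station.

```python
import math


def harmonic_water_level(t_hours, datum_offset, constituents):
    """Evaluate h(t) = H0 + sum_c Hc * f_c * cos(a_c * t + e_c - k_c).

    Assumption: all angles are in degrees and speeds in degrees per hour.
    """
    total = datum_offset
    for c in constituents:
        angle_deg = (c["speed_deg_per_hr"] * t_hours
                     + c["equilibrium_deg"] - c["phase_deg"])
        total += (c["amplitude_m"] * c["node_factor"]
                  * math.cos(math.radians(angle_deg)))
    return total


# Illustrative constituents only: amplitudes and phases are made up.
example_constituents = [
    {"amplitude_m": 0.10, "node_factor": 1.0, "speed_deg_per_hr": 28.9841042,
     "equilibrium_deg": 0.0, "phase_deg": 0.0},   # M2-like speed
    {"amplitude_m": 0.15, "node_factor": 1.0, "speed_deg_per_hr": 15.0410686,
     "equilibrium_deg": 0.0, "phase_deg": 90.0},  # K1-like speed
]

# Hourly harmonic prediction for one day, with a made-up datum offset of 0.2 m.
prediction = [harmonic_water_level(t, 0.2, example_constituents)
              for t in range(24)]
```

A real prediction would use up to 37 fitted constituents per station, with node factors and equilibrium arguments that depend on the year.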

What we are trying to do... We know what happened in the past... what will happen next?

Harmonic vs. actual (when it works) • Coastal station, summertime

Harmonic vs. actual (when it fails) • Panels: tropical storm season, frontal passages, and summer, each shown for a shallow bay and a deep bay

Standard Suite Used by U.S. National Ocean Service (NOS) • Central Frequency (15 cm) >= 90% • Positive Outlier Frequency (30 cm) <= 1% • Negative Outlier Frequency (30 cm) <= 1% • Maximum Duration of Positive Outliers (30 cm): user based • Maximum Duration of Negative Outliers (30 cm): user based
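The criteria above can be computed from a series of forecast errors. This is a simplified sketch, assuming errors are forecast minus observation in centimeters on a uniform time step. The function names are hypothetical, the choice of inclusive or strict inequalities is an assumption, and the duration measures are reported as a count of consecutive samples rather than following any formal NOS event definition.

```python
def longest_run(values, predicate):
    """Length of the longest run of consecutive values satisfying predicate."""
    best = current = 0
    for v in values:
        current = current + 1 if predicate(v) else 0
        best = max(best, current)
    return best


def nos_skill(errors_cm, central_cm=15.0, outlier_cm=30.0):
    """Central frequency, outlier frequencies (percent), and outlier durations."""
    n = len(errors_cm)
    if n == 0:
        raise ValueError("errors_cm must not be empty")
    return {
        # Percentage of errors within +/- central_cm.
        "CF": 100.0 * sum(abs(e) <= central_cm for e in errors_cm) / n,
        # Percentage of errors above +outlier_cm and below -outlier_cm.
        "POF": 100.0 * sum(e > outlier_cm for e in errors_cm) / n,
        "NOF": 100.0 * sum(e < -outlier_cm for e in errors_cm) / n,
        # Longest consecutive stretch of outliers, in samples.
        "MDPO": longest_run(errors_cm, lambda e: e > outlier_cm),
        "MDNO": longest_run(errors_cm, lambda e: e < -outlier_cm),
    }
```

With hourly data, the run lengths returned for `MDPO` and `MDNO` can be read as hours.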

Tide performance along the Texas coast (1997-2001), per-station values from the map • RMSE=0.16, CF=70.09 • RMSE=0.16, CF=71.65 • RMSE=0.15, CF=74.37 • RMSE=0.12, CF=82.71 • RMSE=0.12, CF=81.7 • RMSE=0.10, CF=89.1
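For reference, the RMSE figures quoted on this slide correspond to the usual root-mean-square error between predicted and observed water levels. A minimal sketch, with a hypothetical function name and assuming both series are aligned in time and share the same units:

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two aligned, equal-length series."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```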

Importance of the problem • Gulf Coast ports account for 52.3% of total US tonnage (1995) • 1240 ship groundings from 1986 to 1991 in Galveston Bay • Large number of barge groundings along the Texas Intracoastal Waterways • Worldwide increases in vessel draft • Galveston is the 2nd largest port in US

Artificial Neural Network (ANN) modeling • Started in the 1960s • Key innovation in the late 1980s: backpropagation learning algorithms • Number of applications grew rapidly in the 1990s, especially financial applications • Growing number of publications presenting environmental applications

ANN schematic • Inputs: water level history, wind squared history, tidal forecasts • Layers: input layer, hidden layer, output layer; each neuron applies a transfer function to a weighted sum of its inputs plus a bias, $\varphi(\sum_i a_{k,i} x_i + b_k)$ • Output: water level forecast $H(t+i)$ • Philippe Tissot - 2000

Why ANNs? • Modeled after the human brain • Neurons compute outputs (forecasts) based on inputs, weights and biases • Able to model non-linear systems

Hypothesis… • If the human brain learns best when faced with many situations and challenges, so should an Artificial Neural Network • Therefore, create many challenging training sets to optimize learning patterns and situations

Composite Training Sets • Past models were trained on averaged yearly data sets • These models were trained on specific weather events and patterns of 30 days • The goal was to see the effects of specialized sets on learning and performance of the ANN

Artificial Neural Network setup • ANN models developed within the Matlab and Matlab NN Toolbox environment • Found simple ANNs are optimum • Use of ‘tansig’ and ‘purelin’ functions • Use of Levenberg-Marquardt training algorithm • ANN trained over fourteen 30-day sets of hourly data
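The network structure named on this slide, a tansig hidden layer feeding a purelin output, can be sketched as a plain forward pass. This is an illustration in Python rather than the Matlab NN Toolbox used in the study. The function and parameter names are hypothetical, and Levenberg-Marquardt training is not shown.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: tanh hidden layer, then a linear output neuron.

    hidden_weights holds one row of weights per hidden neuron.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Linear ("purelin") output: weighted sum of hidden activations plus bias.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
```

Here `math.tanh` plays the role of 'tansig', since the two are mathematically the same function.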

Transform Functions • Purelin: $y = x$ • Tansig: $y = (e^{x} - e^{-x}) / (e^{x} + e^{-x})$
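The two transfer functions can be written out exactly as given on the slide. This sketch uses Python functions named after the Matlab ones for readability; they are not the toolbox implementations.

```python
import math


def purelin(x):
    """Linear transfer function: y = x."""
    return x


def tansig(x):
    """Hyperbolic tangent sigmoid: y = (e^x - e^-x) / (e^x + e^-x)."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))
```

The literal formula overflows for very large `|x|`, so `math.tanh` is the numerically safer way to compute the same value in practice.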

Research Location • Map showing the primary station and the secondary stations

Optimization (training) process • Used all data sets in training to find the best combination of previous water levels and wind data • Ranked the data sets by individual performance • Successively added data sets, from most successful to least, to investigate performance • Varied the forecast horizon (hours) to assess the trend
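The ranking and successive-addition steps described above can be sketched as a small loop. The scoring callback `score_fn` is a hypothetical stand-in for "train an ANN on these sets and return a higher-is-better score such as central frequency". The study's actual training and evaluation code is not shown in the slides.

```python
def rank_and_accumulate(set_names, score_fn):
    """Rank training sets individually, then add them from best to worst.

    score_fn(list_of_set_names) must return a higher-is-better score.
    Returns the ranked names and a list of (number_of_sets, score) pairs.
    """
    # Step 1: score each data set on its own and sort from best to worst.
    ranked = sorted(set_names, key=lambda name: score_fn([name]), reverse=True)
    # Step 2: retrain on the top-k sets for growing k and record the score.
    curve = []
    for k in range(1, len(ranked) + 1):
        curve.append((k, score_fn(ranked[:k])))
    return ranked, curve
```

The resulting curve of score against number of sets is what a plot of performance versus training-set count would display.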

ANN Model • Primary Station: Morgan’s Point • 48 Hours of previous WL • 36 Hours of previous winds • Secondary Station: Point Bolivar • 24 Hours of previous WL • 24 Hours of previous winds
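The input window sizes listed on this slide imply one input vector per forecast time. Below is a minimal sketch of how such a vector could be assembled from hourly series. The function name and argument layout are hypothetical, and the sketch does not show tidal forecast inputs or any squaring of wind values, since this slide does not specify them.

```python
def build_input_vector(primary_wl, primary_wind, secondary_wl, secondary_wind,
                       t, lags=(48, 36, 24, 24)):
    """Concatenate the most recent hourly values ending at index t (inclusive).

    lags gives the window length in hours for each series, in argument order:
    48 h of water level and 36 h of wind at the primary station, then
    24 h of water level and 24 h of wind at the secondary station.
    """
    series = (primary_wl, primary_wind, secondary_wl, secondary_wind)
    if t + 1 < max(lags) or t >= min(len(s) for s in series):
        raise ValueError("index t does not leave a full window in every series")
    vector = []
    for values, n in zip(series, lags):
        vector.extend(values[t - n + 1:t + 1])
    return vector
```

With the default lags, each vector has 48 + 36 + 24 + 24 = 132 entries.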

Example data set (Julian Days) 2003265 - 2003295
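The codes on this slide read as a four-digit year followed by a three-digit day of year. A small sketch for converting them to calendar dates, assuming that interpretation (the function name is hypothetical):

```python
from datetime import date, datetime


def parse_year_day(code):
    """Convert a YYYYDDD code (year plus day of year) to a calendar date."""
    return datetime.strptime(str(code), "%Y%j").date()


start = parse_year_day(2003265)  # 2003-09-22
end = parse_year_day(2003295)    # 2003-10-22
span_days = (end - start).days   # 30
```

Under this reading, the example set runs from 22 September to 22 October 2003, matching the 30-day length of the training sets.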

Training with one set (X = 15 cm) • Morgan’s Point

Data set ranking

Effects of increasing data sets (Morgan’s Point) • Reference line: NOS standard

Performance applied to 1998 • Axes: water level (m) vs. hours (1998)

Close up… • Axes: water level (m) vs. hours (1998)

Model Comparison

Forecast trend • Morgan’s Point • Reference line: NOS standard

Conclusions • Large difference in performance due to training sets • Increasing the number of data sets increases performance

Future Direction • Analyze environmental factors of successful training sets • Research significance of subtle differences in ANN model training • Web-based predictions

The End! • Acknowledgements: General Land Office, Texas Water Development Board, US Army Corps of Engineers, National Ocean Service, NASA Grant # NCC5-517 • Division of Nearshore Research (DNR) • http://lighthouse.tamucc.edu

