NEAR REAL TIME VISUALIZATION OF USGS INSTANTANEOUS DATA: INTEGRATION OF OPEN SOURCE DATA TURBINE IN CUAHSI HIS
Thomas Whitenack, David Ryan, David Valentine, Ilya Zaslavsky, Matt Rodriguez
USGS Instantaneous water data services
• 15-minute intervals
• 10,000+ sites (7,000+ have discharge)
• Up to 60 days of data available
• http://waterservices.usgs.gov/WOF/InstantaneousValues
• Data provided using CUAHSI WaterML
Open Source DataTurbine (Ring Buffered Network Bus)
• DataTurbine is a robust open-source streaming data middleware system, designed for sensor-based systems.
• Co-developed by our UCSD / Calit2 colleagues.
• Solution for accessing both streaming and static data, from different vendor systems, via a common interface.
• Released under the Apache 2.0 open source license.
• Provides high-performance data streaming: 10+ MB/sec, 1000 frames/sec.
Open Source DataTurbine
• Supported by NASA SBIR, 15 years in development
• Supports multiple types of streams: real-time monitoring, video and multimedia, telemetry, instant messages, etc.
• Scalable: DataTurbine servers can be interconnected to handle large streams
• Can manipulate the streams: fast-forward or slow-motion playback (TiVo-like)
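To make the "ring buffered" idea concrete, here is a minimal sketch of a fixed-capacity buffer of timestamped frames that supports time-window requests, which is the kind of structure that makes TiVo-like replay possible. It is an illustration only and not DataTurbine's implementation or API; the `RingBuffer` class and its `put` and `fetch` methods are hypothetical names.

```python
from collections import deque


class RingBuffer:
    """Fixed-capacity store of (timestamp, value) frames.

    Hypothetical illustration: once the buffer is full, each new
    frame silently evicts the oldest one.
    """

    def __init__(self, capacity):
        self.frames = deque(maxlen=capacity)

    def put(self, timestamp, value):
        """Append a frame, overwriting the oldest when full."""
        self.frames.append((timestamp, value))

    def fetch(self, start, duration):
        """Return frames with start <= timestamp < start + duration."""
        end = start + duration
        return [(t, v) for t, v in self.frames if start <= t < end]


buf = RingBuffer(capacity=4)
for i in range(6):
    buf.put(i * 900, float(i))  # one frame every 900 s (15 minutes)

oldest = buf.frames[0]                          # the first two frames were evicted
window = buf.fetch(start=2700, duration=1800)   # replay a 30-minute window
```

A viewer can step `start` forward faster or slower than wall-clock time to get fast-forward or slow-motion playback over whatever is still in the buffer.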
Goals of Integrating DataTurbine with CUAHSI HIS
• Get the two systems to work together.
• Maintain an up-to-date view of a large volume of near-real-time data, in-house.
• Store data locally beyond the 60 days for which it is made available.
• Enable viewing of the NWIS instantaneous data in the Realtime Data Viewer (RDV).
Challenges of Project
• Integrate CUAHSI HIS with DataTurbine
• CUAHSI HIS perspective:
  • Consuming WaterML from a Java environment
  • Obtaining and storing NWIS 15-minute data beyond 60 days
• DataTurbine perspective:
  • CUAHSI data presented unusual challenges
  • Pulling data
  • Timestamps have to be set for each value
  • 7,000 “channels” needed to be organized for the RDV client
  • Visualizing / navigating mass volumes of data
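The "timestamps have to be set for each value" point can be illustrated with a short sketch that pulls (timestamp, value) pairs out of a WaterML-style response. The project itself worked in Java; Python is used here only for brevity. The `SAMPLE` document is hand-written and heavily simplified, the namespace URI and element layout are assumptions rather than a copy of a real USGS response, and `parse_values` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Assumption: a hand-written, simplified WaterML-like fragment.
SAMPLE = """<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
  <timeSeries>
    <values>
      <value dateTime="2010-06-01T00:00:00">12.5</value>
      <value dateTime="2010-06-01T00:15:00">12.7</value>
    </values>
  </timeSeries>
</timeSeriesResponse>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_values(xml_text):
    """Return a list of (datetime, float) pairs, one per <value> element."""
    root = ET.fromstring(xml_text)
    points = []
    for elem in root.iter():
        if local_name(elem.tag) == "value" and "dateTime" in elem.attrib:
            timestamp = datetime.fromisoformat(elem.attrib["dateTime"])
            points.append((timestamp, float(elem.text)))
    return points


points = parse_values(SAMPLE)
```

Matching on the local tag name keeps the sketch independent of the exact namespace; each observation carries its own timestamp, so every value has to be stamped individually before it is handed to the stream.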
OSDT Custom Source
• Each source is a separate connection.
• 7,000 sources were too many for OSDT.
• Sources can have multiple channels and sub-channels.
• Sites were organized by state and county to make them navigable.
• 50 GB disk cache: ~1 year of 15-minute data for 7,000 sites.
• Cycling through 7,000+ getValues requests takes ~18 hours for the first iteration, or upon restart.
• Subsequent iterations can still complete in under 8 hours.
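One way to read the state/county organization is as a naming scheme: a small number of sources, each holding many hierarchically named channels. The sketch below assumes one source per state and channel names of the form `county/site/variable`; that layout, the `build_sources` helper, and the site codes are all hypothetical, not the project's actual scheme.

```python
from collections import defaultdict


def clean(name):
    """Make a name safe to use as a single path segment."""
    return name.strip().replace("/", "-").replace(" ", "_")


def build_sources(sites, variable="discharge"):
    """Group sites into {state: [county/site/variable, ...]}.

    Assumption: one source (connection) per state, with the county and
    site code forming the channel / sub-channel hierarchy.
    """
    sources = defaultdict(list)
    for site in sites:
        channel = "/".join((clean(site["county"]), clean(site["code"]), variable))
        sources[clean(site["state"])].append(channel)
    return {state: sorted(channels) for state, channels in sources.items()}


# Hypothetical sites; the codes are placeholders, not real USGS site numbers.
SITES = [
    {"state": "CA", "county": "San Diego", "code": "SITE-0001"},
    {"state": "CA", "county": "Los Angeles", "code": "SITE-0002"},
    {"state": "TX", "county": "Travis", "code": "SITE-0003"},
]

sources = build_sources(SITES)
```

Grouping this way turns thousands of would-be connections into a few dozen, and the slash-separated names give a viewer a tree to browse instead of a flat list.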
OSDT Custom “Sink”
• Is essentially a custom client connection to DataTurbine (RDV is a sink process).
• Pulls data and writes it to SQL batch files for batch inserts.
• Used to update the local ODM instance of NWIS instantaneous data.
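The batch-file step of the sink can be sketched as turning pulled rows into multi-row INSERT statements. This is a hedged illustration: the column list is a simplified assumption and not the full ODM `DataValues` schema, the multi-row `VALUES` syntax depends on the target database, and `build_batches` is a hypothetical helper.

```python
from datetime import datetime

# Assumption: a simplified subset of columns; a real ODM table has more.
COLUMNS = ("SiteID", "VariableID", "LocalDateTime", "DataValue")


def build_batches(rows, table="DataValues", batch_size=1000):
    """Return one multi-row INSERT statement per chunk of batch_size rows."""
    column_list = ", ".join(COLUMNS)
    statements = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Values are coerced to int/float/datetime, so no raw text is inlined.
        tuples = ",\n".join(
            f"  ({int(site_id)}, {int(var_id)}, '{ts:%Y-%m-%d %H:%M:%S}', {float(value)!r})"
            for site_id, var_id, ts, value in chunk
        )
        statements.append(f"INSERT INTO {table} ({column_list}) VALUES\n{tuples};")
    return statements


rows = [
    (1, 1, datetime(2010, 6, 1, 0, 0), 12.5),
    (1, 1, datetime(2010, 6, 1, 0, 15), 12.7),
    (1, 1, datetime(2010, 6, 1, 0, 30), 13.0),
]
statements = build_batches(rows, batch_size=2)
```

Each statement can be written to its own batch file and loaded later, which keeps the sink decoupled from database availability.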
Conclusions
• CUAHSI HIS WaterML can be used successfully in Java / non-Windows environments.
• Displaying near-real-time data in RDV is very fast, and RDV is a valuable visualization tool.
• DataTurbine is designed to ingest much more data than this.
  • Capable of 10 MB/second; we are feeding it < 1 KB/second.
• Updating 7,000+ data channels worked, but is well beyond what the OSDT developers had in mind when designing it.
• Organizing 7,000+ channels in a viewer display presents organizational challenges.
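The gap between capacity and actual load can be checked with a back-of-the-envelope calculation. The sketch assumes every one of 7,000 sites yields one value per 15-minute interval and that each value costs 16 bytes (an 8-byte timestamp plus an 8-byte double); the byte size is an assumption, not a figure from the project.

```python
SITES = 7000
INTERVAL_SECONDS = 15 * 60
BYTES_PER_VALUE = 16          # assumption: 8-byte timestamp + 8-byte double
CAPACITY_BYTES_PER_SECOND = 10 * 1024 * 1024  # the quoted 10 MB/second

values_per_second = SITES / INTERVAL_SECONDS
bytes_per_second = values_per_second * BYTES_PER_VALUE
fraction_of_capacity = bytes_per_second / CAPACITY_BYTES_PER_SECOND

print(f"{values_per_second:.2f} values/s, {bytes_per_second:.0f} bytes/s, "
      f"{fraction_of_capacity:.6%} of capacity")
```

Under these assumptions the feed averages well under 1 KB/second, consistent with the figure on the slide, so the difficulty is the number of channels rather than bandwidth.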
Questions?
• twhitenack@sdsc.edu
• http://www.dataturbine.org