Water Analytics Platform on AWS

Presentation Transcript


  1. Water Analytics Platform on AWS
     Team members: Srinivasan Vembuli, Rikio Chiba, Romeo Luka
     Under the supervision of Prof. Murlikrishna Viswanathan

  2. Background
     • The Department of Environment, Water and Natural Resources (DEWNR) leads the management of South Australia’s most valuable resource.
     • DEWNR collects water data from various sources and disseminates it to other agencies.
     • The data is currently stored in multiple systems:
       - Hydstra (legacy FoxPro DB)
       - SQL Server
     • The data is currently used by the Bureau of Meteorology (BOM) for its analytics applications and by DEWNR in the Water Connect website applications.

  3. WDTF
     • The Water Data Transfer Format (WDTF), an XML format developed in 2008, is a national standard for transferring water information.
     • Over 240 organisations are required to give specified water information to the Bureau under the Water Regulations 2008.
     • BOM uses data from the current system in WDTF.

  4. Existing System Architecture
     [Diagram. Column headings: Data Source, Storage / Application, Output. Labelled elements: Field Sensors, Other Data, Hydstra (FoxPro DB), SQL Server, Data Mart, GIS Application, Raw Data, WDTF, Analysis.]

  5. Problem Definition
     • The current architecture relies on multiple systems, including legacy software, i.e., Hydstra (FoxPro DB).
     • This leads to increased costs and inefficiency in service delivery.
     • The current architecture does not fully utilise WDTF as the universal data format standard.

  6. Project Objectives
     • Help DEWNR use data in WDTF to generate analytical data, similar to BOM's, for public consumption (Open Data: OTF is a facilitator for SA Gov.).
     • Develop a cloud-based ETL system to manage water data (in WDTF) from across Australia.
     • Provide useful analytics and insights from this data using data mining and visualization techniques.
     • Examples include time series analysis of aggregated ground-water/surface-water data and real-time mapping of water data using dashboards and mapping APIs.
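     The time series aggregation mentioned in the objectives could look roughly like the sketch below. It is an illustration only, not the project's code: the function name `daily_means` and the (timestamp, value) input shape are assumptions, and the readings in the usage note are made up.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(readings):
    """Average (timestamp, value) readings per calendar day.

    `readings` is an iterable of (ISO-8601 timestamp string, float) pairs
    (an assumed input shape, not the WDTF schema).
    Returns a list of (date string, mean value) pairs sorted by date.
    """
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Group each reading under its calendar day.
        day = datetime.fromisoformat(timestamp).date().isoformat()
        buckets[day].append(value)
    return [(day, mean(values)) for day, values in sorted(buckets.items())]
```

     For instance, calling `daily_means` on two hypothetical readings from one day and one from the next returns one averaged entry per day, which is the kind of series a dashboard could plot.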

  7. Solution
     • Host water data in the cloud.
     • Establish an integrated data analysis platform.
     • Publish water data so that third-party organizations can use it.

  8. Architecture on AWS
     [Diagram. AWS Data Pipeline runs daily tasks. Components: Local FTP Server, Amazon S3, Amazon EC2, Amazon EMR, Amazon Redshift, Analysis. Step labels: Copy WDTF files; Parse WDTF files (JSON); Store the data to Redshift.]
     • Open questions noted on the slide: Why do we need to use Redshift? Why do we need to use EMR?

  9. Current Project Status
     • Wrote two Perl parser programs:
       - The first parser unzips Zip files to produce XML files.
       - The second parser converts the XML files to JSON.
       Zipped file -> (Unzip Parser) -> XML files -> (Convert Parser) -> JSON
     • Researching how to convert the JSON files to tables using EMR and load them into Redshift for analytics.
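     The two-stage flow above (unzip, then convert XML to JSON) can be sketched as follows. The team's parsers were written in Perl and are not shown in the slides, so this Python version is only an illustration under assumptions: the function names, the `@attribute` and `#text` key conventions, and the generic element-to-dictionary mapping are all hypothetical and do not reflect the actual WDTF schema.

```python
import io
import json
import zipfile
import xml.etree.ElementTree as ET


def strip_ns(tag):
    """Drop an XML namespace prefix: '{uri}name' becomes 'name'."""
    return tag.rsplit("}", 1)[-1]


def element_to_dict(elem):
    """Recursively convert an XML element into dicts, lists and strings."""
    node = {}
    for name, value in elem.attrib.items():
        node["@" + strip_ns(name)] = value  # attributes get an '@' prefix
    for child in elem:
        key = strip_ns(child.tag)
        value = element_to_dict(child)
        if key in node:
            # Repeated child tags are collected into a list.
            if not isinstance(node[key], list):
                node[key] = [node[key]]
            node[key].append(value)
        else:
            node[key] = value
    text = (elem.text or "").strip()
    if not node:
        return text  # leaf element without attributes: just its text
    if text:
        node["#text"] = text
    return node


def zip_to_json(zip_bytes):
    """Stage 1: unzip. Stage 2: convert each XML member to a JSON string."""
    results = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue  # skip non-XML members
            root = ET.fromstring(archive.read(name))
            document = {strip_ns(root.tag): element_to_dict(root)}
            results[name] = json.dumps(document)
    return results
```

     `zip_to_json` takes the raw bytes of a Zip archive and returns a mapping from each XML member name to its JSON text. Keeping both stages in memory avoids temporary files, which may or may not suit the real file sizes.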

  10. Deliverables
      • Prototype of the proposed architecture
      • Technical document
      • Project report
