Metrics and Monitoring Capabilities for Earth Science Data Systems. ESDSWG Wilmington, Delaware October 20-22, 2009. Outline. Core and Community Capabilities Network Flow Requirements and Monitoring Science Data Production and Distribution Transitioning from Community to Core.
Metrics and Monitoring Capabilities for Earth Science Data Systems ESDSWG Wilmington, Delaware October 20-22, 2009
Outline • Core and Community Capabilities • Network Flow Requirements and Monitoring • Science Data Production and Distribution • Transitioning from Community to Core
Core and Community • Core • Data system elements needed to ensure processing, archival and distribution of data collected by EOS-designated Earth science missions in a timely and usable manner • Community • Data system elements developed and deployed largely outside the NASA core elements, characterized by ‘evolvability’ and innovation, with the potential to be integrated into the core
Metric and Monitoring Capability Comparison * Limited Capabilities. Sometimes core/community overlap.
Network Flow Requirements • Science data flow requirements are formulated to support nominal and reprocessing efforts. • Science teams, processing and archival facilities, and others are responsible for developing requirements. • Requirements are stored in documents under configuration control. • Requirements are constantly evaluated.
Network Monitoring • Combines passive and active network monitoring tools that gather network statistics, store them in a database, and support analysis of the data to: • Assist in troubleshooting performance problems • Track utilization of network resources • Verify requirements against actual performance • Help forecast required upgrades
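The "verify requirements against actual" step can be pictured as a comparison between a documented flow requirement and measured throughput samples. The following is a minimal sketch, assuming both are expressed in Mbit/s; the function name, the `margin` parameter, and the use of the median as the "typical" value are illustrative assumptions, not a description of any tool named in these slides.

```python
from statistics import median


def check_requirement(required_mbps, measured_mbps, margin=1.0):
    """Compare typical measured throughput against a flow requirement.

    required_mbps: the documented requirement (hypothetical units: Mbit/s).
    measured_mbps: a non-empty list of throughput samples in Mbit/s.
    margin: hypothetical multiplier on the requirement (1.0 = as written).

    Returns (meets, typical, ratio), where ratio is typical / target.
    """
    if not measured_mbps:
        raise ValueError("no measurements supplied")
    if required_mbps <= 0 or margin <= 0:
        raise ValueError("requirement and margin must be positive")
    typical = median(measured_mbps)
    ratio = typical / (required_mbps * margin)
    return ratio >= 1.0, typical, ratio


if __name__ == "__main__":
    # Made-up samples for a made-up 100 Mbit/s requirement.
    print(check_requirement(100.0, [90.0, 120.0, 110.0]))
```

Using the median rather than the mean keeps a single unusually slow or fast test from deciding the outcome; a real system might prefer a percentile or a time-weighted figure.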
Passive Monitoring System Overview [Diagram labels: Data Center (×4), Limited access, EBnet Engineering hosts, EBnet, ENSIGHT DB, Collector, Secured ENSIGHT Web Server, Proxy Web Server] • Passive performance data (utilization, CPU pct., NetFlow, etc.) provided to collector • Passive monitoring information (graphics, statistics, HTML) pushed to secured web server • Proxied performance web site (HTML) • Interactive live monitoring, flow graphs and reports
SNMP Object Monitoring • Similar to MRTG • Data stored and available via SQL • Web front-end permits control of collection and graphing
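As a rough illustration of MRTG-style collection with SQL storage, the sketch below converts two successive SNMP octet-counter readings into an average bit rate and stores the result in a database. It assumes a 32-bit counter such as `ifInOctets` and uses SQLite purely as a stand-in; the table name, column names, and function names are hypothetical and are not taken from ENSIGHT.

```python
import sqlite3

COUNTER32_MAX = 2 ** 32  # a 32-bit SNMP counter wraps back to 0 at this value


def counter_rate_bps(prev_octets, curr_octets, interval_s,
                     counter_max=COUNTER32_MAX):
    """Average bits per second between two octet-counter samples.

    Handles at most one counter wrap between the two samples.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    delta = curr_octets - prev_octets
    if delta < 0:  # counter wrapped since the previous poll
        delta += counter_max
    return delta * 8 / interval_s


def store_sample(conn, interface, timestamp, bps):
    """Insert one utilization sample (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS utilization "
        "(interface TEXT, ts INTEGER, bps REAL)"
    )
    conn.execute(
        "INSERT INTO utilization VALUES (?, ?, ?)",
        (interface, timestamp, bps),
    )


def peak_bps(conn, interface):
    """Return the highest stored rate for an interface, or None."""
    row = conn.execute(
        "SELECT MAX(bps) FROM utilization WHERE interface = ?",
        (interface,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_sample(conn, "wan0", 300, counter_rate_bps(1000, 4750, 300))
    print(peak_bps(conn, "wan0"))
```

Keeping the samples in SQL, as the slide describes, means that graphs and ad hoc queries can share one data source.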
Custom Flow Graph • Examination of FTP transfers between two networks • One hour time period examined • Ex.: Used to troubleshoot slow FTPs, and exonerate network
Tracking NetFlow Impact on Network Resources • NetFlow load is tracked on the local LAN • tcpdump data is collected and extrapolated • Graph indicates the load on WAN and LAN caused by NetFlow • Rarely more than 15 Kbit/s for 4 routers
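The slide does not say how the captured data is extrapolated. One simple reading is that byte counts from a short capture window are scaled to an average rate; the sketch below shows that arithmetic only. The function name and the sample numbers are made up for illustration and are not measurements from the presentation.

```python
def aggregate_load_bps(bytes_per_router, window_s):
    """Average export load in bits per second across all routers.

    bytes_per_router: bytes captured from each router during the window.
    window_s: length of the capture window in seconds.
    """
    if window_s <= 0:
        raise ValueError("window must be positive")
    total_bits = sum(bytes_per_router) * 8
    return total_bits / window_s


if __name__ == "__main__":
    # Hypothetical capture: four routers, two-minute window.
    sample = [56250, 56250, 56250, 56250]
    print(aggregate_load_bps(sample, 120))
```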
ENSIGHT Active Testing Overview • End-to-end user-level test • Little or no visibility into network internals • Purposes • Assess whether networks as implemented meet EOS requirements • Assess whether existing networks can support intended applications • Resolve user complaints: • Is the problem in the network or somewhere else? • Determine bottlenecks and seek routing alternatives • Provide a basis for allocation of additional resources
Active Testing: System Overview [Diagram labels: Test Source 1 through Test Source 30, Test Destination 1 through Test Destination 80, ENSIGHT Users, Security Perimeter, Secured ENSIGHT Web Server and B/U database, ENSIGHT Active Collectors (Primary and B/U), ENSIGHT Database (primary)] • End-to-end iperf (or other) active network performance measurement • Active performance measurement results (throughput, packet loss, etc.) provided to collector • Periodic SCP fetch of external active performance results • Performance graphics provided to secured web server • Performance web site (HTML)
Integrated Charts • The problem: Neither iperf nor MRTG alone is sufficient to characterize the performance of a circuit • MRTG utilization will be low if users are idle • But iperf results will appear low if the test competes with active user flows • Solution: Add the iperf and MRTG measurements together • But there are some difficulties • Improved solution: Add the iperf and applicable flow data • Flow data can be obtained for small time periods • But still susceptible to interference
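The basic "add the measurements together" idea can be sketched as a per-interval sum of iperf throughput and the utilization reported by MRTG (or by flow data) for the same circuit. This is a minimal sketch under assumptions: both series are keyed by the same interval start time and use the same units, and the function and variable names are hypothetical. It does not attempt to address the difficulties the slide alludes to.

```python
def integrated_series(iperf_mbps, background_mbps):
    """Sum iperf throughput and background utilization per interval.

    Both arguments map an interval start time (for example, epoch
    seconds) to a rate in Mbit/s. Only intervals present in both
    series are combined; others are dropped rather than guessed.
    """
    common = sorted(iperf_mbps.keys() & background_mbps.keys())
    return {t: iperf_mbps[t] + background_mbps[t] for t in common}


if __name__ == "__main__":
    # Made-up values: iperf looks slow at t=300 because users are busy.
    iperf = {0: 40.0, 300: 10.0}
    mrtg = {0: 50.0, 300: 85.0, 600: 5.0}
    for t, total in integrated_series(iperf, mrtg).items():
        print(t, total)
```

In the made-up data, the iperf result alone drops from 40 to 10 Mbit/s, while the combined figure stays near 90 to 95 Mbit/s, which is the kind of distinction the integrated chart is meant to expose.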
Production and Distribution Monitoring • The ESDIS Metric System (EMS) is used to track core components. • The Metric Collection Tool (MCT) is used to track community projects (e.g. MEaSUREs).
EMS Overview • Automated collection, lookup, QA and reporting • Web-based reporting interface
EMS Implementations OBPG TBD
FY08 Science Distribution *OBPG data taken from the Ocean Color web site
Distribution Trends *OBPG data taken from the Ocean Color web site
MCT Overview • Web form for capturing metrics from community projects. • Projects manually enter data into form, typically monthly. • Web-based reporting interface for Program sponsors, PIs and NASA Managers. • Community recommends modifications to metric questions annually at the ESDSWG.
MCT Transition • The Metric Collection Tool is being replaced. • Why Change • Consolidate data, support and sustaining engineering. • Ensure archival of metrics. • Single metric reporting interface. • When • Prototype now available. • Soliciting comments and testers. • Hope to have the tool available in January 2010. • Help us name it ….
Transitioning Community to Core • Some community project datasets, tools and/or services may transition to core capabilities. • Technical approaches to handling this transition (from a metrics perspective) are being implemented through the consolidation of metric-gathering tools.