Explore the benefits of NetFlow in monitoring network traffic, visualize traffic patterns, and troubleshoot issues efficiently at the ESnet Site Coordinating Committee Meeting in Columbus, OH. Learn about NetFlow characteristics, solutions, data accuracy, and user interface options. Discover how to analyze top flows, troubleshoot network problems, and enhance data accuracy with NetFlow technology. See detailed examples and snapshots of NetFlow applications in action for effective network management.
NetFlow: Digging Flows Out of the Traffic • Evandro de Souza, ESnet • ESnet Site Coordinating Committee Meeting, Columbus/OH – July/2004
Outline • Motivation • Possible Approaches • What is NetFlow • Solution Design • Snapshots • Trouble-Shooting Example • Present State ESCC Meeting - Columbus/OH
Motivation • CHALLENGE • Steve Wolf challenge: “Show me all traffic exchanged between ESnet and Abilene.” • Generalized challenge: show ingress and egress traffic exchanged with ESnet, broken down by AS. • MAIN REQUIREMENTS • ability to identify the top 100 flows involving institutions directly using ESnet • ability to identify AS-AS traffic • ability to visualize the top 10 flows and their evolution over a period of time • scalability to process data from all ESnet border routers
Solutions Available • Hardware Solutions • Dedicated Router Monitoring Board • Example: Juniper’s Monitoring Services PIC • Manufacturer dependent • Very expensive • Dedicated Link Monitoring Box • Example: BSD box using Bro • Scalability issues • Real-time information about routing tables • Software Solutions • Example: NetFlow • Adopted by several router and switch products (Cisco, Juniper, etc) • May require huge computing power to process data from large networks
NetFlow Characteristics (1) • What is a Flow? • A flow is defined as a unidirectional stream of packets. It is uniquely identified by the combination of the following seven key fields: • Source IP address • Destination IP address • Source port number • Destination port number • Layer 3 protocol type • ToS byte • Input logical interface (ifIndex) • It’s not a TCP flow.
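The flow definition above can be sketched in a few lines of Python. This is an illustration only, not how a router implements its flow cache: the `FlowKey` and `aggregate` names and the sample packets are hypothetical. It shows why a flow is "not a TCP flow": the two directions of one TCP connection have different keys, so they count as two separate flows.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven key fields that identify a flow (hypothetical helper type)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int       # IP protocol number, e.g. 6 = TCP, 17 = UDP
    tos: int            # ToS byte
    input_ifindex: int  # input logical interface


def aggregate(packets):
    """Group (key, size_in_bytes) pairs into {key: [packet_count, byte_count]}."""
    flows = defaultdict(lambda: [0, 0])
    for key, size in packets:
        flows[key][0] += 1
        flows[key][1] += size
    return dict(flows)


# Hypothetical packets from a single TCP connection, seen in both directions.
forward = FlowKey("10.0.0.1", "10.0.0.2", 40000, 80, 6, 0, 3)
reverse = FlowKey("10.0.0.2", "10.0.0.1", 80, 40000, 6, 0, 5)
flows = aggregate([(forward, 1500), (forward, 500), (reverse, 40)])

for key, (packet_count, byte_count) in flows.items():
    print(key.src_ip, "->", key.dst_ip, packet_count, "packets", byte_count, "bytes")
```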
NetFlow Characteristics (2) • NetFlow Packet Version 5, flow record fields: Packet Count, Byte Count, Source IP Address, Destination IP Address, Start sysUpTime, End sysUpTime, Source TCP/UDP Port, Destination TCP/UDP Port, Input ifIndex, Output ifIndex, Next Hop Address, Source AS Number, Destination AS Number, Source Prefix Mask, Destination Prefix Mask, Type of Service, TCP Flags, Protocol
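As an illustration of the record fields listed above, the following minimal sketch unpacks one 48-byte NetFlow version 5 flow record with Python's standard `struct` module. The field order follows the published v5 record layout; the `parse_v5_record` helper, the dictionary keys, and the sample values are hypothetical, and the 24-byte export packet header is not handled here.

```python
import socket
import struct

# One NetFlow v5 flow record: 48 bytes, network byte order.
# "x" marks the pad bytes in the record layout.
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")


def parse_v5_record(data: bytes) -> dict:
    """Decode a single v5 flow record into a dictionary (hypothetical helper)."""
    (src, dst, nexthop, in_if, out_if, packets, octets, first, last,
     sport, dport, tcp_flags, proto, tos, src_as, dst_as,
     src_mask, dst_mask) = V5_RECORD.unpack(data)
    return {
        "src_ip": socket.inet_ntoa(src),
        "dst_ip": socket.inet_ntoa(dst),
        "next_hop": socket.inet_ntoa(nexthop),
        "input_ifindex": in_if,
        "output_ifindex": out_if,
        "packets": packets,
        "octets": octets,
        "start_sysuptime_ms": first,
        "end_sysuptime_ms": last,
        "src_port": sport,
        "dst_port": dport,
        "tcp_flags": tcp_flags,
        "protocol": proto,
        "tos": tos,
        "src_as": src_as,
        "dst_as": dst_as,
        "src_mask": src_mask,
        "dst_mask": dst_mask,
    }


# Build a made-up record (documentation addresses, private AS numbers) and decode it.
sample = V5_RECORD.pack(
    socket.inet_aton("192.0.2.1"), socket.inet_aton("198.51.100.7"),
    socket.inet_aton("203.0.113.254"), 3, 5, 10, 15000, 1000, 61000,
    40000, 80, 0x1B, 6, 0, 64512, 64513, 24, 16)
record = parse_v5_record(sample)
print(record["src_ip"], "->", record["dst_ip"], record["octets"], "octets")
```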
System Architecture • Network Statistics System (Linux Cluster) • Collectors • Web Servers • Computing Nodes • Disk Storage • Software Tools • Flow-Tools (OSU) • Perl • MySQL • Data Flow Processing • Router sends NetFlow • Collectors scale up and store raw data • Cluster performs: intercloud filtering, aggregation, sorting, truncation (Top 100) • SQL Store • Display Data
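The cluster steps (filtering, aggregation, sorting, truncation) can be sketched as follows. This is not the Flow-Tools/Perl implementation used in the system; it is a Python illustration with hypothetical names and data. It also assumes that "intercloud filtering" means keeping only flows whose source and destination AS differ, which the slides do not spell out.

```python
from collections import Counter


def top_as_pairs(records, n=100):
    """Return the n largest (src_as, dst_as) pairs by total octets.

    records: iterable of (src_as, dst_as, octets) tuples.
    """
    totals = Counter()
    for src_as, dst_as, octets in records:
        if src_as == dst_as:
            # Assumed meaning of "intercloud filtering": drop intra-AS traffic.
            continue
        totals[(src_as, dst_as)] += octets  # aggregation
    # most_common() sorts in descending order and truncates to n entries.
    return totals.most_common(n)


# Hypothetical flow summaries using private AS numbers.
records = [
    (64512, 64513, 100),
    (64512, 64513, 50),
    (64513, 64512, 500),
    (64512, 64512, 999),  # same AS on both ends: filtered out
]
for (src_as, dst_as), octets in top_as_pairs(records):
    print(f"AS{src_as} -> AS{dst_as}: {octets} octets")
```

The resulting rows are what would then be written to the SQL store for long-term analysis.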
Data Accuracy • ESnet has a variety of router models from Cisco and Juniper. The two vendors take different approaches to generating NetFlow information. • Cisco • Conditions for end of a flow • end of TCP connection (FIN/RST) • no traffic seen on a flow for 15 seconds • 30 minutes after the flow starts • when the flow table fills • No sampling for models lower than 12000 • Juniper • Statistical sampling per interface • We compared the information obtained from NetFlow data against SNMP data
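The Cisco flow-expiry conditions listed above can be expressed as a small decision function. This is a sketch, not router code: the `FlowEntry` type, the `should_expire` helper, and the reason strings are hypothetical. A TCP connection end is taken to mean a FIN or RST flag was seen, and the "table full" case is simplified to a boolean (a real router would expire only some entries to make room).

```python
from dataclasses import dataclass
from typing import Optional

INACTIVE_TIMEOUT = 15     # seconds without traffic on the flow
ACTIVE_TIMEOUT = 30 * 60  # seconds since the flow started
TCP_FIN = 0x01
TCP_RST = 0x04


@dataclass
class FlowEntry:
    first_seen: float   # time of the first packet, in seconds
    last_seen: float    # time of the most recent packet, in seconds
    tcp_flags: int = 0  # OR of all TCP flags seen on the flow


def should_expire(flow: FlowEntry, now: float, table_full: bool = False) -> Optional[str]:
    """Return the reason the flow should be exported, or None to keep it."""
    if flow.tcp_flags & (TCP_FIN | TCP_RST):
        return "tcp_end"
    if now - flow.last_seen >= INACTIVE_TIMEOUT:
        return "inactive"
    if now - flow.first_seen >= ACTIVE_TIMEOUT:
        return "active"
    if table_full:
        return "table_full"
    return None


print(should_expire(FlowEntry(first_seen=0, last_seen=10), now=26))
```

One consequence of the 30-minute limit is that a single long-lived transfer appears as several flow records, which matters when comparing per-flow counts against SNMP interface counters.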
SNMP Comparison (Juniper)
SNMP Comparison (Cisco)
User Interface • Long Term Analysis • Use data stored in SQL database • Trend analysis • Short Term Analysis • Use raw data collected from routers • Network troubleshooting
Top Flows Screenshot - 1
Top Flows Screenshot - 2
Top Flows Screenshot - 3
Top Flows Screenshot - 4
Trouble-Shooting Example (1) • Topology • FNAL CE -> (GE) -> FNAL-RT1 -> (OC12 POS) -> CHI-CR1 • Issue • Regular egress discards on OC12 POS between FNAL-RT1 router and CHI-CR1 router. • Hypothesis • Traffic from FNAL GE connection (FNAL CE -> FNAL-RT1) was over-running OC12 POS (FNAL-RT1 -> CHI-CR1)
Trouble-Shooting Example (2) • Flow Analysis • Isolate flows within discard time window • Mark time window by referencing “originating file” • Sort by “octets” field
#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 3
# Name:      Source/Destination IP
#
# Args:      flow-stat -f10 -S3
#
#
# src IPaddr     dst IPaddr       flows  octets      packets  originating file
#
129.105.21.229   198.49.208.10    193    1140264700  1014000  fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
129.105.21.229   198.49.208.10    174    1138227500  1014600  fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
198.49.208.10    129.105.21.229   196    1106719500  1114000  fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
129.105.21.229   198.49.208.10    175    1086035800  980500   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
198.49.208.10    128.100.190.11   182    1085264900  980500   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
198.49.208.10    128.100.190.11   213    1062479100  960000   fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
198.49.208.10    129.105.21.229   180    1051220800  1093500  fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
128.100.190.11   198.49.208.10    242    1012027800  842100   fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
198.49.208.10    128.100.190.11   206    1007483100  916300   fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
128.100.190.11   198.49.208.10    200    1001671900  842300   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
128.100.190.11   198.49.208.10    231    989225200   817700   fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
198.49.208.10    129.105.21.229   211    957567200   1050100  fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
131.215.144.227  198.49.208.10    198    946292400   876500   fnal-rt1.burst.2004-06-23.2050-2004-06-23.2055
131.215.144.227  198.49.208.10    209    936021800   882900   fnal-rt1.burst.2004-06-24.0850-2004-06-24.0855
131.215.144.227  198.49.208.10    196    932688300   857700   fnal-rt1.burst.2004-06-24.0250-2004-06-24.0255
131.215.144.227  198.49.208.10    206    904774900   848500   fnal-rt1.burst.2004-06-24.0650-2004-06-24.0655
…
• Verification • Reroute 198.49.208.10 (dmzmon0.deemz.net) via an alternate route
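The flow analysis step can also be automated. The sketch below is not part of the original tooling. It assumes report rows in the whitespace-separated layout shown in the flow-stat output (source, destination, flows, octets, packets, originating file) and credits each row's octets to both endpoints to find the host common to the heaviest flows. The `octets_per_host` helper is hypothetical; the four sample rows are copied from the report.

```python
from collections import Counter

REPORT = """\
# src IPaddr  dst IPaddr  flows  octets  packets  originating file
129.105.21.229 198.49.208.10 193 1140264700 1014000 fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
198.49.208.10 129.105.21.229 196 1106719500 1114000 fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
198.49.208.10 128.100.190.11 182 1085264900 980500 fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
131.215.144.227 198.49.208.10 198 946292400 876500 fnal-rt1.burst.2004-06-23.2050-2004-06-23.2055
"""


def octets_per_host(report: str) -> Counter:
    """Sum octets per IP address, counting each row for both of its endpoints."""
    totals = Counter()
    for line in report.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and report header comments
        src, dst, _flows, octets, _packets, _origin = line.split()
        totals[src] += int(octets)
        totals[dst] += int(octets)
    return totals


totals = octets_per_host(REPORT)
busiest_host, busiest_octets = totals.most_common(1)[0]
print(busiest_host, busiest_octets)
```

On these rows the busiest endpoint is 198.49.208.10, the host that the verification step reroutes.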
Present State of Development • Porting application to Cluster • Some problems with the OS and Disk Array • Testing Scalability of the System • Amount of disk space necessary per day to store data for all border routers • CPU and Memory necessary to process data • Other issues • Developing a Web Interface to display the stored data
Small Flows Percentage
Flow Rate