Sharing the knowledge behind the network. Network Troubleshooting Methods, Tools and Techniques. Presented by Scott Hogg shogg@lucent.com Rick Blum rickblum@lucent.com. Lucent Enterprise Professional Services. Network consulting services for enterprises
Lucent Enterprise Professional Services • Network consulting services for enterprises • Premier provider of multivendor data networking services • Software solutions to manage and optimize network performance • 15,000+ engagements over 10+ years • Spin-off in early 2002 • Seminar objectives • Explore the strengths of various troubleshooting methods and tools • Provide advice for troubleshooting Ethernet and routers
Troubleshoot with OSI Model in Mind • [Figure: outgoing frame construction and incoming frame reduction for app "X" across the OSI layers: 7 Application, 6 Presentation, 5 Session, 4 Transport, 3 Network, 2 Data link, 1 Physical. Each layer adds its header (AH, PH, SH, TH, NH) to the data; the result is carried as the data unit (I field) of the transmitted frame, framed by the F, A, C, FC, and F fields.]
Troubleshooting Tools • Network Management Systems (NMS) • Element Management Systems (EMS) • Vendor-specific • Network application aware monitoring/trending – proactive • FCAPS model • Logs, event correlation, time of day • Time-synchronized logs (NTP) • Server stats, CPU, memory, disk, I/O
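Time-synchronized logs matter because events from different devices can then be correlated on one timeline. A minimal sketch of that idea follows; the log line layout (ISO timestamp, device name, message) and the sample entries are hypothetical, and each per-device log is assumed to be in time order already.

```python
import heapq
from datetime import datetime

def parse(line):
    """Split 'ISO-timestamp device message' into (datetime, device, message)."""
    stamp, device, message = line.split(" ", 2)
    return datetime.fromisoformat(stamp), device, message

def merge_logs(*logs):
    """Merge per-device logs (each already in time order) into one timeline."""
    parsed = ([parse(line) for line in log] for log in logs)
    return list(heapq.merge(*parsed))

# Hypothetical sample entries from a router and a switch.
router_log = [
    "2002-03-01T10:00:05 rtr1 Serial0 line protocol down",
    "2002-03-01T10:00:09 rtr1 OSPF neighbor 10.0.0.2 down",
]
switch_log = [
    "2002-03-01T10:00:04 sw1 port 3/1 link down",
    "2002-03-01T10:00:30 sw1 port 3/1 link up",
]

timeline = merge_logs(router_log, switch_log)
for when, device, message in timeline:
    print(when.isoformat(), device, message)
```

If the device clocks are not synchronized (for example through NTP), the merged order is only as trustworthy as the worst clock.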
Troubleshooting Tools • Upper OSI Layers • Protocol analyzers - Type of protocols • Middle OSI Layers • SNMP, RMON probes, remote probes • Protocol analyzers • Lower OSI Layers • Wire scopes, cable testers, optical power level meters, OTDRs, BERT • Voice convergence • Voice quality tester • Tools for testing echo, delay, jitter, etc.
Troubleshooting Impact • In-Band techniques • Ping – ICMP echo request/reply • Traceroute – TTL and high UDP port • Telnet – TCP port 23 connection • Mock/synthetic transactions • Loopback interfaces – BERT • Protocol analysis • Out-of-Band techniques • Isolates management network from production network • A network with problems and congestion can’t be used to troubleshoot itself • Access to manage equipment is critical
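The Telnet-style check (can a TCP connection be opened to a given port?) is easy to script as a simple synthetic transaction. The sketch below is one possible way to do it with the Python standard library; the function name `tcp_probe` is hypothetical, and the demonstration uses a throwaway listener on the loopback address rather than a real device.

```python
import socket
import time

def tcp_probe(host, port, timeout=2.0):
    """Return the TCP connect time in seconds, or None if the connection fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None

# Demonstration against a temporary local listener (not a real network device).
listener = socket.socket()
listener.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

open_result = tcp_probe("127.0.0.1", port)    # listener is up
listener.close()
closed_result = tcp_probe("127.0.0.1", port)  # nothing listening any more

print("open port:", open_result)
print("closed port:", closed_result)
```

Against real equipment the same call would target port 23 for Telnet or port 179 for a BGP peer. Because this is an in-band test, a failure may reflect the path as much as the device.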
Ethernet Cabling Problems • Troubleshooting fiber cables • Multi-mode vs. single-mode • SC connectors are popular • Dirt and grime – polishing • Re-terminate fiber if necessary • Make sure GBIC matches • 62.5-µm vs. 50-µm vs. 10-µm (SM) fiber cables • Cat 5 certification and testing • Pairs, impedance, etc. • Cable length • Bends and kinks • 5-4-3 rule for maximum repeater topology
Media Layer Ethernet Problems • Collisions (remote, local, late) • Runts, giants, jabbers, long • FCS errors • Alignment errors • Frame misalignment errors • Capturing these bad frames • Software-based protocol analyzers • Hardware-based protocol analyzers
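The frame categories above can be made concrete with a small classifier. This is a sketch under stated assumptions: frames are whole bytes including the 4-byte FCS, size limits are the untagged 64 and 1518 bytes, and the labels follow one common convention (tools differ on the exact definitions of runt, giant, and jabber). Alignment errors cannot be represented here because they involve frames that do not end on a byte boundary.

```python
import struct
import zlib

MIN_FRAME = 64      # bytes, destination address through FCS
MAX_FRAME = 1518    # bytes, untagged frame including FCS

def add_fcs(frame_without_fcs):
    """Append the CRC-32 frame check sequence, least significant byte first."""
    return frame_without_fcs + struct.pack("<I", zlib.crc32(frame_without_fcs))

def classify(frame):
    """Label a captured frame (bytes, including FCS) by size and FCS validity."""
    fcs_ok = len(frame) > 4 and add_fcs(frame[:-4]) == frame
    if len(frame) < MIN_FRAME:
        return "runt"
    if len(frame) > MAX_FRAME:
        return "giant" if fcs_ok else "jabber"
    return "ok" if fcs_ok else "FCS error"

good = add_fcs(bytes(60))                       # 64-byte frame with valid FCS
corrupt = good[:-1] + bytes([good[-1] ^ 0xFF])  # same frame, last FCS byte flipped

for name, frame in [("good", good), ("corrupt", corrupt),
                    ("short", add_fcs(bytes(20))), ("long", add_fcs(bytes(1600)))]:
    print(name, len(frame), classify(frame))
```

Many network adapters drop bad frames before software sees them, which is one reason hardware-based analyzers are listed separately from software-based ones.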
Troubleshooting Ethernet • No link light!? • Auto-Negotiation (Speed/Duplex) • Packet capture effect – wire hogging • Faster system collides, then chooses lower back-off time • Increased secondary collisions • Phantom MAC addresses • 55555555 or AAAAAAAA from jam bits • 01010101 = 55 or 10101010 = AA (10Mbps) • 10110000 = D0 or 01010011 = 53 (100Mbps)
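The capture effect follows from truncated binary exponential backoff: after its n-th successive collision a station waits a random number of slot times between 0 and 2^min(n, 10) - 1. A station that has just transmitted successfully starts again at n = 1, while the station it beat keeps its higher count and therefore draws from a wider range. The sketch below illustrates this; the function names are hypothetical and the model ignores everything except the backoff draw.

```python
import random

def backoff_slots(collisions, rng=random):
    """Truncated binary exponential backoff: pick 0 .. 2**min(n, 10) - 1 slots."""
    return rng.randrange(2 ** min(collisions, 10))

def win_probability(first_collisions, second_collisions):
    """Chance that the first station draws a strictly shorter delay than the second."""
    a = 2 ** min(first_collisions, 10)
    b = 2 ** min(second_collisions, 10)
    wins = sum(1 for x in range(a) for y in range(b) if x < y)
    return wins / (a * b)

# The station that just won is on its 1st collision for the new frame;
# the station that lost is on its 2nd, 3rd, ... collision for the old one.
for loser_count in (1, 2, 3, 4):
    print(loser_count, win_probability(1, loser_count))
```

The printed probabilities rise with each loss, which is the "wire hogging" behavior: the more often a station loses, the less likely it is to win the next round.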
Troubleshooting Ethernet • Hubs (multiport repeater) • Entire device is one collision domain • Switches (multiport bridge) • Each port is its own collision domain • Ethernet switches don’t forward bad frames • Integrated Layer 2-3 and Layer 4-7 devices • View Ethernet switch’s MAC address table • View spanning tree (IEEE 802.1D) parameters • Check ISL or 802.1Q trunking • Look at switch port stats • Traffic load, error rates, # of broadcasts and multicasts, # of discards
“Sniffing” Ethernet • Capturing Ethernet frames • Port mirroring, port aliasing, SPAN, tap, pass-through analyzer • Capture traffic flowing in both directions • Put analyzer “in-line” • Use a hub – temporarily • Port-level security can make this difficult • VLAN decode is helpful (ISL or 802.1Q) • Network-based intrusion detection competition
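A VLAN decode for 802.1Q amounts to reading the 4-byte tag that follows the source MAC address: a tag protocol identifier of 0x8100, then 3 bits of priority, 1 drop-eligible bit, and a 12-bit VLAN ID. The sketch below shows that decode on a hand-built frame; the function name and sample frame are hypothetical, and ISL (a different, Cisco-specific encapsulation) is not covered.

```python
import struct

def parse_vlan_tag(frame):
    """Return (priority, dei, vlan_id, ethertype) for an 802.1Q frame, else None."""
    if len(frame) < 18:
        return None
    # Bytes 12-13: TPID, 14-15: tag control information, 16-17: inner EtherType.
    tpid, tci, ethertype = struct.unpack("!HHH", frame[12:18])
    if tpid != 0x8100:
        return None
    return tci >> 13, (tci >> 12) & 1, tci & 0x0FFF, ethertype

# Hand-built example: priority 5, VLAN 100, carrying IPv4 (EtherType 0x0800).
dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("00a0c9000001")
tagged = dst + src + struct.pack("!HHH", 0x8100, (5 << 13) | 100, 0x0800) + bytes(46)
untagged = dst + src + struct.pack("!H", 0x0800) + bytes(46)

print(parse_vlan_tag(tagged))
print(parse_vlan_tag(untagged))
```

Whether the tag is visible at all depends on the capture point: a mirrored access port typically delivers untagged frames, while a mirrored trunk carries the tags.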
Troubleshooting Routers – Checking the Basics • Check Layers 1 & 2 • View interface statistics/status • Clear counters, then watch them increment • Check Layer 2 to Layer 3 mappings – ARP caches • Ping/traceroute in both directions • Check last router to reply to traceroute • Try extended ping options • Make sure hosts configured properly • Check addresses, numbers, and name resolution • Operational techniques • Router logs and SNMP traps • Use router debugging • Change/Configuration management
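When counters cannot be cleared, the same "watch them increment" check can be done by taking two snapshots and comparing them. The sketch below assumes 32-bit counters that may wrap once between snapshots; the counter names and values are hypothetical.

```python
COUNTER_MAX = 2 ** 32  # assumption: 32-bit counters, as with SNMP Counter32

def counter_deltas(before, after):
    """Return the increase of each counter between two snapshots, allowing one wrap."""
    return {name: (after[name] - before[name]) % COUNTER_MAX for name in before}

def error_rate(deltas, errors="in_errors", packets="in_packets"):
    """Errors as a fraction of packets over the interval, or None if no traffic."""
    if deltas[packets] == 0:
        return None
    return deltas[errors] / deltas[packets]

# Hypothetical snapshots taken a few minutes apart; in_packets wraps in between.
before = {"in_packets": 4294967000, "in_errors": 10}
after = {"in_packets": 704, "in_errors": 25}

deltas = counter_deltas(before, after)
print(deltas, error_rate(deltas))
```

The rate over a known interval is usually more informative than the raw totals, which may have been accumulating since the last reload.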
Routing Table Problems • Routing protocol table and forwarding table • “Show ip route” vs. “show ip ospf route” • Inactive or flapping routes • Check routing table and routing metrics for specific routes – check in both directions • Clear out specific route or entire routing table and let it build again – last resort • Check route summarization and redistribution • Administrative distance (believability/favorability) • Equal-cost load balancing • Asymmetrical routing • Open jaw routes • Black hole routes
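The difference between a routing protocol's table and the forwarding table can be sketched in a few lines. Candidate routes for the same prefix compete first on administrative distance and then on metric, ties are kept as equal-cost paths, and lookups use the longest matching prefix. The distances below are the Cisco IOS defaults (other vendors differ); the routes and function names are hypothetical.

```python
import ipaddress

# Cisco IOS default administrative distances; other vendors use other values.
ADMIN_DISTANCE = {"connected": 0, "static": 1, "ebgp": 20,
                  "ospf": 110, "rip": 120, "ibgp": 200}

def install(candidates):
    """Build {network: [next hops]} from (prefix, protocol, metric, next_hop) tuples."""
    table = {}
    for prefix, protocol, metric, next_hop in candidates:
        net = ipaddress.ip_network(prefix)
        key = (ADMIN_DISTANCE[protocol], metric)
        best = table.get(net)
        if best is None or key < best[0]:
            table[net] = (key, [next_hop])      # more believable or cheaper route
        elif key == best[0]:
            best[1].append(next_hop)            # equal cost: load balance
    return {net: hops for net, (key, hops) in table.items()}

def lookup(table, address):
    """Longest-prefix match; returns (network, next hops) or None (black hole)."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return best, table[best]

candidates = [
    ("10.1.0.0/16", "rip", 3, "192.168.1.1"),
    ("10.1.0.0/16", "ospf", 20, "192.168.2.1"),
    ("10.1.0.0/16", "ospf", 20, "192.168.3.1"),
    ("10.1.5.0/24", "static", 0, "192.168.4.1"),
]
table = install(candidates)
for target in ("10.1.5.9", "10.1.9.9", "172.16.0.1"):
    print(target, lookup(table, target))
```

The RIP route never reaches the table even though its metric is numerically smaller, because metrics from different protocols are not comparable; only administrative distance decides between them.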
Distance-Vector Protocols • Hop-by-hop updates get lost • Check RIPv1 and RIPv2 compatibility • Problems caused by summarization • Automatic redistribution into other routing protocols • Discontiguous subnet mask problems • Is split-horizon enabled? • Convergence times may be longer than ever thought possible (~10-15 minutes)
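The summarization and discontiguous-subnet problems are easiest to see with classful behavior such as RIPv1, which carries no mask and summarizes at the major network boundary. In the sketch below, two subnets of the same major network at different sites collapse to the same advertisement, so a router between them cannot tell them apart. The function name and addresses are hypothetical.

```python
import ipaddress

def classful_network(address):
    """Return the class A, B, or C major network that contains an IPv4 address."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        prefix = 8       # class A
    elif first_octet < 192:
        prefix = 16      # class B
    elif first_octet < 224:
        prefix = 24      # class C
    else:
        raise ValueError("class D and E addresses have no classful network")
    return ipaddress.ip_network(f"{address}/{prefix}", strict=False)

# Two subnets of 172.16.0.0 separated by a different major network.
site_a = classful_network("172.16.1.0")
site_b = classful_network("172.16.2.0")
print(site_a, site_b, site_a == site_b)
```

A classless protocol that carries the mask, or disabling automatic summarization where the protocol allows it, avoids the ambiguity.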
Link-State Protocols • Link-state metrics • Neighbor adjacencies • Understand state table for protocol • Timers should be same on both neighbors • Router authentication • Check redistribution with distance-vector protocols • Summarization and discontiguous networks • View topology/route database • What routes made it into forwarding table? • Think about storing copies of the routing tables in a file for future reference – anomaly checking • Save the router’s state – on any “normal day” when the network is fully converged
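Storing a copy of the routing table on a "normal day" pays off when it can be compared mechanically with the current table. A minimal sketch of that comparison follows; it assumes each snapshot has already been reduced to a mapping of prefix to next hop, and the sample data is hypothetical.

```python
def diff_routes(baseline, current):
    """Compare two {prefix: next_hop} snapshots and report the differences."""
    shared = set(baseline) & set(current)
    return {
        "missing": sorted(set(baseline) - set(current)),
        "added": sorted(set(current) - set(baseline)),
        "changed": sorted(p for p in shared if baseline[p] != current[p]),
    }

# Hypothetical snapshots: one saved when the network was fully converged, one from today.
baseline = {"10.1.0.0/16": "192.168.2.1", "10.2.0.0/16": "192.168.2.1",
            "10.3.0.0/16": "192.168.3.1"}
current = {"10.1.0.0/16": "192.168.2.1", "10.2.0.0/16": "192.168.3.1",
           "10.9.0.0/16": "192.168.3.1"}

report = diff_routes(baseline, current)
print(report)
```

How the snapshots are collected (saved command output, SNMP, or otherwise) is left open here; the comparison only needs the two mappings.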
Path-Vector Protocols (BGP-4) • Peering takes place on TCP port 179 • Neighbors should be in “established” state • Reset peer – soft reconfiguration • After initial peering, BGP exchanges only keepalives and incremental updates • Synchronization with IGP • Transit AS – use synchronization • Check BGP table & decision algorithm • View routes in BGP table to see which ones make it into forwarding table • Know BGP’s attributes – well-known, mandatory, optional, transitive, non-transitive • Route flap dampening
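To reason about why one BGP path beats another, it can help to model a few steps of the decision algorithm. The sketch below is deliberately incomplete: it compares only local preference (higher wins), AS path length (shorter wins), origin (IGP before EGP before incomplete), and MED (lower wins). Real implementations add further steps, such as vendor-specific weights, eBGP over iBGP, IGP metric to the next hop, and router ID tie-breakers, and by default compare MED only between paths from the same neighboring AS. The class and field names are hypothetical.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}

@dataclass
class Path:
    next_hop: str
    local_pref: int = 100
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0

def preference_key(path):
    """Sort key for a simplified subset of the BGP decision process."""
    return (-path.local_pref, len(path.as_path), ORIGIN_RANK[path.origin], path.med)

def best_path(paths):
    """Return the most preferred path under the simplified rules above."""
    return min(paths, key=preference_key)

short = Path("192.0.2.1", as_path=(65001,))
long_preferred = Path("192.0.2.2", local_pref=200, as_path=(65002, 65003, 65004))
medium = Path("192.0.2.3", as_path=(65005, 65006))

print(best_path([short, long_preferred, medium]).next_hop)
print(best_path([short, medium]).next_hop)
```

Local preference outranks AS path length, so a longer path can still win when policy sets its preference higher.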
The Bottom Line • Use good methodology • Document actions and results • Leverage all tools to gather information • Use protocol analyzer to help troubleshoot problems • Understand protocols you are troubleshooting
Be a Super Sleuth • Sherlock Holmes, Quincy, Columbo, Matlock, Hercule Poirot, Miss Marple • Start at the scene of the crime • Find evidence, pay attention to details • Follow leads even if they’re weak • Dig deeper, use forensic techniques • Never give up the Quest!
Lucent Professional Consulting Services for Enterprises • IP Data Networking • Network Management • Performance Engineering • Network Security • Microsoft Technologies • Business Consulting
Glossary • ARP – Address Resolution Protocol • AS – Autonomous System • BERT – Bit Error Rate Test • BGP – Border Gateway Protocol • Cat – Category • FCAPS – Fault, Configuration, Accounting, Performance, Security • FCS – Frame Check Sequence • GBIC – GigaBit Interface Converter • ICMP – Internet Control Message Protocol • IGP – Interior Gateway Protocol • IP – Internet Protocol • ISL – Inter-Switch Link • MAC – Media Access Control • NTP – Network Time Protocol • OSI – Open System Interconnection • OSPF – Open Shortest Path First • OTDR – Optical Time Domain Reflectometer • RIP – Routing Information Protocol • RMON – Remote MONitoring extensions to SNMP • SC connector – Fiber optic cable connector • SM – Single mode • SNMP – Simple Network Management Protocol • SPAN – Switch Port Analyzer • TCP – Transmission Control Protocol • TTL – Time To Live • UDP – User Datagram Protocol • VLAN – Virtual Local Area Network
Resources • Knowledge Web Seminar • Network Fault Management Tools and Techniques http://www.lucent.com/knowledge/documentdetail/0,1494,inContentId+12680-inLocaleId+1,00.html • Other • Telecommunications Management Network Roadmap http://www.itu.int/TMN/ • Ethernet Web sites http://www.faqs.org/faqs/LANs/ethernet-faq/ http://wwwhost.ots.utexas.edu/ethernet/ http://www.alumni.caltech.edu/~dank/fe/ http://www.cavebear.com/CaveBear/Ethernet/ http://grouper.ieee.org/groups/802/3/index.html http://www.10gea.org • General Network Troubleshooting Web site http://www.networktroubleshooting.com/