Chartis: Path Inference in Data Center Networks
• Aman Shaikh (ashaikh@research.att.com)
• Kyriaki Levanti
• Vijay Gopalakrishnan
• Hyong S. Kim
• Seungjoon Lee
• Emmanuil Mavrogiorgis
• CNSM 2012 (October 23)
Joe the Operator*
• Joe runs a big, complicated data center network
• Customer complains:
  • “Performance of a service was patchy over the weekend”
• Joe’s troubleshooting steps:
  • Any problem with the servers?
    • No
  • Any problem along the path from customer to servers?
    • Hmm… what was the path???
* Resemblance to any person (dead or alive) is coincidental
Determining Path
• Traceroute
  • No traceroute data was collected at the time of the problem
  • Running traceroute on all possible paths is prohibitively expensive
• NetFlow
  • Data center collects NetFlow on an ongoing basis
  • Customer’s traffic was not sampled
• Configuration + routing protocol messages
  • Joe has access to router configurations and archives of messages exchanged between routers
Why Configuration Data?
• Routing messages:
  • Exchanged by routers to decide how traffic should flow through a network
  • Capturing these messages allows path determination
  • And shows how paths changed in response to network dynamics
• Configuration:
  • Routing rules are often coded only in the configuration files
  • Example: static routes, policy-based routing
• Since Joe’s data center uses plenty of configuration-based tricks for routing, Joe starts analyzing configuration files
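To illustrate why routes that live only in configuration files matter, here is a minimal sketch of extracting static routes from a configuration and resolving a next hop by longest-prefix match. It assumes Cisco IOS-style `ip route <prefix> <mask> <next-hop>` lines; the sample configuration, the addresses, and the helper names `parse_static_routes` and `lookup` are hypothetical and are not taken from Chartis.

```python
import ipaddress
import re

# Assumption: IOS-style static routes, e.g. "ip route 10.1.0.0 255.255.0.0 192.168.1.1".
# Other vendors and route options (interfaces, VRFs, distances) are ignored here.
STATIC_ROUTE = re.compile(
    r"^\s*ip route (\d+\.\d+\.\d+\.\d+) (\d+\.\d+\.\d+\.\d+) (\d+\.\d+\.\d+\.\d+)\s*$"
)

def parse_static_routes(config_text):
    """Return (network, next_hop) pairs for every static route line found."""
    routes = []
    for line in config_text.splitlines():
        match = STATIC_ROUTE.match(line)
        if match:
            prefix, mask, next_hop = match.groups()
            network = ipaddress.ip_network(f"{prefix}/{mask}", strict=False)
            routes.append((network, ipaddress.ip_address(next_hop)))
    return routes

def lookup(routes, destination):
    """Longest-prefix match: next hop of the most specific matching route."""
    dest = ipaddress.ip_address(destination)
    candidates = [route for route in routes if dest in route[0]]
    if not candidates:
        return None
    return max(candidates, key=lambda route: route[0].prefixlen)[1]

# Hypothetical configuration fragment used only for illustration.
SAMPLE_CONFIG = """\
hostname edge-1
ip route 10.1.0.0 255.255.0.0 192.168.1.1
ip route 10.1.2.0 255.255.255.0 192.168.1.2
ip route 10.0.0.0 255.0.0.0 192.168.1.254
"""

routes = parse_static_routes(SAMPLE_CONFIG)
print(lookup(routes, "10.1.2.7"))
```

None of these three routes would show up in captured routing protocol messages, which is the reason the configuration has to be analyzed as well.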
Challenges with Configuration Analysis
• Scale and heterogeneity of devices
• Multiple layers involved in routing
  • Layer-2, layer-3 and overlays
• Virtualization
  • Example: VRFs and VLANs
• Middleboxes
  • Example: NATs
• Joe realizes it’s going to be very time-consuming and error-prone to manually analyze configurations
Chartis
• Cross-layer path inference system
[Architecture diagram. Labels: User Input (source router, destination, time); Router Configurations; Layer-3 Routing Engine (Layer-3 hops); Layer-2 Routing Engine (Layer-2 Topology, Per-VLAN spanning tree); Merge layer-2/layer-3 info; Path]
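The merge step in the diagram can be sketched roughly as follows: given the layer-3 hops and a spanning tree per VLAN, each pair of consecutive layer-3 hops is expanded into the layer-2 devices between them. This is an illustrative sketch, not the Chartis implementation. It assumes the tree is a plain breadth-first tree from a given root bridge (real spanning tree protocols elect the root and weigh port costs), and all device names, VLAN numbers, and function names are hypothetical.

```python
from collections import deque

def spanning_tree(topology, root):
    """Breadth-first tree over a layer-2 topology; maps each node to its parent."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(topology[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent

def tree_path(parent, src, dst):
    """Unique path from src to dst that stays on tree links."""
    def chain_to_root(node):
        chain = []
        while node is not None:
            chain.append(node)
            node = parent[node]
        return chain
    up, down = chain_to_root(src), chain_to_root(dst)
    # Trim the shared segment above the lowest common ancestor.
    while len(up) > 1 and len(down) > 1 and up[-2] == down[-2]:
        up.pop()
        down.pop()
    return up + list(reversed(down[:-1]))

def merge_layers(l3_hops, hop_vlans, trees):
    """Expand each consecutive layer-3 hop pair into its layer-2 path."""
    path = [l3_hops[0]]
    for (src, dst), vlan in zip(zip(l3_hops, l3_hops[1:]), hop_vlans):
        path.extend(tree_path(trees[vlan], src, dst)[1:])
    return path

# Hypothetical per-VLAN layer-2 topologies (adjacency sets).
vlan_topology = {
    10: {"r1": {"s1"}, "s1": {"r1", "s2", "s3"}, "s2": {"s1", "s3", "r2"},
         "s3": {"s1", "s2"}, "r2": {"s2"}},
    20: {"r2": {"s4"}, "s4": {"r2", "r3"}, "r3": {"s4"}},
}
# Assumed root bridge per VLAN.
trees = {10: spanning_tree(vlan_topology[10], "s3"),
         20: spanning_tree(vlan_topology[20], "s4")}

merged = merge_layers(["r1", "r2", "r3"], [10, 20], trees)
print(merged)  # ['r1', 's1', 's3', 's2', 'r2', 's4', 'r3']
```

In VLAN 10 the direct s1–s2 link is not part of the tree rooted at s3, so the inferred path detours through s3. This is the kind of detail that a layer-3-only view would miss.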
Verification
• Four data centers of a US cellular provider
  • Determined paths for 4,000 packets captured from the data centers in February 2011
  • Ground truth:
    • Forwarding tables collected from a subset of devices on the path
    • Network operators (for a few sample paths)
• Campus network
  • Determined paths to virtual machines running in specific locations in the network
  • Ground truth:
    • Traceroute and NetFlow
Chartis in Real Life
• Used in data centers of the US cellular provider
  • Component of a system used for visualizing end-to-end paths from cell tower to the Internet
• Case study
  • Increased latency for traffic going through a particular SGSN
  • Troubleshooting steps:
    • Did the SGSN start using a different GGSN?
      • No
    • Did the SGSN-to-GGSN path change?
      • Yes
      • An interface was shut down on a router along the (old) path
[Diagram: SGSN, GGSN, Internet]
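The second troubleshooting question in the case study amounts to comparing the path inferred at two points in time. A minimal sketch of that comparison is below. The hop names and both path snapshots are invented for illustration, and `compare_paths` is a hypothetical helper rather than part of Chartis.

```python
def compare_paths(old_path, new_path):
    """Summarize how new_path differs from old_path, or return None if equal."""
    if old_path == new_path:
        return None
    # Index of the first hop where the two paths disagree.
    split = 0
    while (split < min(len(old_path), len(new_path))
           and old_path[split] == new_path[split]):
        split += 1
    return {
        "last_common_hop": old_path[split - 1] if split > 0 else None,
        "hops_removed": [hop for hop in old_path if hop not in new_path],
        "hops_added": [hop for hop in new_path if hop not in old_path],
    }

# Hypothetical snapshots of the SGSN-to-GGSN path before and after the change.
before = ["sgsn-1", "r1", "r2", "r4", "ggsn-1"]
after = ["sgsn-1", "r1", "r3", "r5", "r4", "ggsn-1"]

change = compare_paths(before, after)
print(change)
```

Here the paths split after `r1`, which points the operator at the link or interface between `r1` and `r2` as the place to look for the event that triggered the change.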
Summary
• Chartis: path inference system for data centers
  • Fundamental building block for several network and service management tasks
• Used in data centers of a major US cellular provider
  • Component of a system used for visualizing end-to-end paths from cell tower to the Internet
• Expanded version of the paper available at http://www2.research.att.com/~ashaikh/papers/chartis-tech-report12.pdf
• Joe has been living happily ever since he started using Chartis