Routing problems are easy to cause, and hard to diagnose (“Happy operators make happy packets”)
Jennifer Rexford
AT&T Labs—Research
http://www.research.att.com/~jrex
How Can You Sleep at Night When…
• A single typo can bring down your network
• Someone else’s typo can bring you down
• Changing config makes your heart skip a beat
  • … ’cause you can’t tell what might happen
  • … and whether it’s a career-limiting move
• The routing system discards your packets
  • And you can’t even figure out why
  • Or whose fault it is
  • Or how to fix it
  • Or if it might just go away on its own
Configuring Routing Protocols is Hard
• Primitive configuration languages
  • Thousands of assembly-language commands
• Many protocols and tunable options
  • Weights, areas, timers, filters, policies, …
• Subtle interactions between protocols
  • Hot potato, route injection, routing control traffic, …
• Complex techniques for achieving scalability
  • Route reflectors, route aggregation, summarization, …
• Network configured at the element level
  • Configuring individual boxes, not the entire network
• Indirect ways of achieving operations goals
  • E.g., TE by tweaking IGP weights and BGP policies
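The last bullet (TE by tweaking IGP weights) can be illustrated with a minimal sketch. It assumes a hypothetical four-router topology (A, B, C, D) with made-up link weights and a plain Dijkstra shortest-path computation; none of these names or numbers come from the talk. The point is only that the operator never says "send A-to-D traffic via C". They change one weight and the path moves as a side effect.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over a dict-of-dicts graph; returns (cost, path) or None."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None

def set_weight(graph, a, b, weight):
    """Set a symmetric link weight, as an operator would on both ends of a link."""
    graph[a][b] = weight
    graph[b][a] = weight

# Hypothetical topology: two paths from A to D, via B or via C.
topology = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 2},
    "D": {"B": 1, "C": 2},
}

before = shortest_path(topology, "A", "D")   # path via B
set_weight(topology, "B", "D", 5)            # the "tweak": raise one link weight
after = shortest_path(topology, "A", "D")    # path shifts to C

print("before:", before)
print("after: ", after)
```

In a real network the same weight change would also shift every other path that crossed the B-D link, which is part of why this indirect style of control is hard to reason about.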
Troubleshooting Routing Problems is Hard
• Problems can arise from outside of your network
  • E.g., bogus advertisements, weird filtering, etc.
• Route filtering and aggregation are tricky
  • … leading to black holes, forwarding loops, …
• We don’t know the Internet topology
  • … and perhaps we’ll never, ever know (sigh)
• We don’t have good tools for probing the paths
  • E.g., traceroute has many known limitations
• Routing protocols aren’t all that chatty
  • … they don’t say why a router changed its mind
• The routing system isn’t always the one to blame
  • E.g., MTU mismatch, packet filters, congestion
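As one illustration of how aggregation can lead to a black hole, here is a small sketch using Python's standard `ipaddress` module. The prefixes, the next-hop names (`customer-1`, `customer-2`), and the helper functions `lookup` and `fate` are all hypothetical, chosen only for this example: a router advertises a /16 aggregate but holds more-specific routes for just two /24s inside it, so traffic for the rest of the aggregate is attracted and then dropped.

```python
import ipaddress

def lookup(table, address):
    """Longest-prefix match over a {network: next_hop} table; None if no match."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Hypothetical: what the router tells the outside world it can reach.
advertised = ipaddress.ip_network("10.0.0.0/16")

# Hypothetical: what the router can actually deliver internally.
internal_routes = {
    ipaddress.ip_network("10.0.0.0/24"): "customer-1",
    ipaddress.ip_network("10.0.1.0/24"): "customer-2",
}

def fate(address):
    """Describe what happens to a packet sent toward this router."""
    addr = ipaddress.ip_address(address)
    if addr not in advertised:
        return "not attracted"      # the aggregate does not cover it
    hop = lookup(internal_routes, address)
    return hop if hop is not None else "black hole"

for dest in ("10.0.1.7", "10.0.2.5", "192.0.2.1"):
    print(dest, "->", fate(dest))
```

From the sender's side nothing looks wrong: the aggregate is a valid route, so the drop is only visible at the router doing the aggregation.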
Fixing the Problem?
• Better router configuration languages
  • Higher level of abstraction, vendor independent
• Joining data together to aid detection
  • Multiple vantage points, multiple data types
• Good anomaly-detection algorithms
  • Based on good underlying models of routing
• Better router support for routing measurement
  • Forwarding path, routing protocol messages, etc.
• Distributed platform for debugging problems
  • Partial diagnosis with scalability and information hiding
• Routing protocol extensions (and “do overs”)
  • Design for diagnosability and verifiability
My Position: This is Really Pathetic!
• Two problems needing attention
  • Configuring the routing protocols
  • Debugging the routing problems
• This moves us beyond
  • Characterizing lots of measurement data
  • Bottom-up solutions to various problems
• … toward the holy grail of
  • Greater abstraction of the network design
  • Routing protocol design for manageability
  • A well-behaved communication infrastructure