NSIS – an overview and its interaction with Route Change IRT Group 06/10/04
Why signaling?
• NSIS stands for Next Steps in Signaling
• Signaling is about manipulating state in network nodes
• State management examples:
  • QoS resource allocations
  • NAT and firewall configuration
  • State used in active networking
• State monitoring examples:
  • Instantaneous path property discovery (available bandwidth, queuing delay, etc.)
• Need for a general-purpose signaling protocol
How about RSVP?
• Original Resource reSerVation Protocol:
  • Standardized in 1997
  • Designed for end-to-end resource reservations based on the Integrated Services architecture
• RSVP extensions:
  • RSVP refresh reduction extensions
  • RSVP operation over IP tunnels
  • RSVP-Traffic Engineering over MPLS networks
  • RSVP-Aggregation
  • RSVP for NAT and firewall traversal
  • RSVP mobility
Why not RSVP?
• Not designed to be a generic signaling protocol in the first place
• Unclear whether independently designed RSVP extensions would work together in a single implementation
• RSVP has also been criticized as a QoS protocol: the built-in multicast support adds considerable overhead but is often unnecessary
Here Comes NSIS …
• Working group chartered in Nov. 2001 to investigate the architecture and protocol design of the next (generic) signaling protocol for the Internet, based on existing experience
• Creating requirements, frameworks, and specifications for a signaling protocol envisioned to support various signaling applications that need to install and/or manipulate state in the network
Current NSIS Assumptions - Path-coupled and Unicast
• Signaling messages are routed, and flow state is installed and maintained, only through NSIS Entities (NEs) that are on the data path
• Not every node along the path has to be an NE. There could be proxies distinct from the sender and receiver, or intermediate signaling-unaware nodes
• Only unicast flows are considered
NSIS Two-layered Model
• Several aspects of the protocol support needed to exchange messages along the data path are common to all or a large number of applications, and hence should be developed as a common protocol
• A 'signaling transport' layer: moves signaling messages around
  • Independent of any particular signaling application
  • NSIS Transport Layer Protocol (NTLP)
• A 'signaling application' layer: a set of state management rules, message formats and sequences
  • Specific to a particular signaling application
  • NSIS Signaling Layer Protocol (NSLP)
More about NTLP - GIMPS
• The NTLP is based on existing transport and security protocols under a common messaging layer, the General Internet Messaging Protocol for Signaling (GIMPS)
• Different signaling applications make use of different GIMPS services, but GIMPS does not keep state specific to signaling applications
• GIMPS manages its own internal state and configures underlying transport and security protocols to deliver signaling messages on behalf of signaling applications in both directions along the flow path
GIMPS Operations
• "Routing": determine how to reach the adjacent GIMPS peer along the data path
  • Downstream signaling: either use explicit local state information to determine the GIMPS peer, or just send the signaling message towards the flow destination address and rely on the peer to intercept it
  • Upstream signaling: has to rely on explicit local state information
• "Transport": deliver signaling information to the peer
  • Datagram mode and connection mode
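The routing rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GIMPS itself: `RoutingState`, `FlowId` and `next_signaling_target` are hypothetical names, and a flow is reduced to a (source, destination) address pair.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical, simplified flow identifier: (source address, destination address).
FlowId = Tuple[str, str]


@dataclass
class RoutingState:
    """Per-flow peer addresses learned earlier (assumed data structure)."""
    downstream_peer: Dict[FlowId, str] = field(default_factory=dict)
    upstream_peer: Dict[FlowId, str] = field(default_factory=dict)


def next_signaling_target(state: RoutingState, flow: FlowId, direction: str) -> Tuple[str, bool]:
    """Return (address to send to, whether we rely on interception by the peer)."""
    if direction == "downstream":
        peer = state.downstream_peer.get(flow)
        if peer is not None:
            return peer, False       # explicit local state is available
        return flow[1], True         # address the flow destination; peer must intercept
    if direction == "upstream":
        peer = state.upstream_peer.get(flow)
        if peer is None:
            # Upstream signaling cannot fall back to destination-addressed delivery.
            raise LookupError("no upstream routing state for this flow")
        return peer, False
    raise ValueError("direction must be 'downstream' or 'upstream'")
```

The asymmetry is the point of the sketch: only the downstream direction has a fallback that works without stored peer state.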
GIMPS Datagram Mode
• A mode of sending GIMPS messages between nodes without using any transport layer state or security protection
• For small, infrequent messages with modest delay constraints
• UDP as the initial choice
• Upstream messages: UDP-encapsulated and sent directly to the signaling destination
• Downstream messages: sent towards the flow receiver with the router alert option set
GIMPS Connection Mode
• A mode of sending GIMPS messages directly between nodes using point-to-point "messaging associations", i.e. transport protocols and security associations
• For larger data objects, or where fast setup in the face of packet loss is desirable, or where channel security is required
• May use any stream or message-oriented transport protocol
• May employ specific network layer security associations (e.g. IPsec), or an internal transport layer security association (e.g. TLS)
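The choice between the two modes can be written as a small decision function. This is a sketch under assumptions: the slides only distinguish "small" from "larger" messages, so the 512-byte threshold, the `SignalingMessage` type and its field names are invented for illustration.

```python
from dataclasses import dataclass

# Assumed threshold; the slides do not give a number for "small" messages.
MAX_DATAGRAM_PAYLOAD = 512  # bytes


@dataclass
class SignalingMessage:
    payload: bytes
    needs_channel_security: bool = False    # e.g. IPsec or TLS protection required
    needs_fast_loss_recovery: bool = False  # fast setup despite packet loss


def choose_mode(msg: SignalingMessage) -> str:
    """Return 'datagram' or 'connection' following the criteria on the slides."""
    if msg.needs_channel_security or msg.needs_fast_loss_recovery:
        return "connection"
    if len(msg.payload) > MAX_DATAGRAM_PAYLOAD:
        return "connection"
    return "datagram"
```

A short refresh message with no security requirement would go out in datagram mode, while the same message with `needs_channel_security=True` would use a messaging association.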
Route Change Problem
• In a connectionless network each packet is routed independently
  • The packet route may change in the middle of a session
• Route changes can happen for various reasons:
  • Load sharing or load balancing
  • Policy-based routing
  • Network reconfiguration
  • Routing protocol adaptation
• Route changes cause a divergence between the data path and the path on which signaling state has been installed
NSIS and Route Change
• GIMPS (NTLP) and signaling application (NSLP) state needs to be updated
  • Local repair if possible
  • Not simply a matter of moving state from the old path to the new one; it is application dependent (e.g. QoS path characteristics)
• GIMPS is responsible for:
  • Detecting the route change
  • Updating its own routing state
  • Informing relevant signaling applications at affected nodes
• A core issue is route change detection
  • The NTLP and NSLP soft-state mechanism is not sufficient, especially when NSLP refresh times are extended to reduce signaling load
Route Change Detection - Summary of Methods
• Routing monitoring:
  • Local trigger delivered by the local routing table
  • Extended trigger by checking a link-state routing table (OSPF, BGP with AS_PATH)
• Packet monitoring:
  • GIMPS C-mode monitoring: TTL/interface change
  • Data plane monitoring: TTL/interface change or loss of the flow altogether
• GIMPS D-mode probing:
  • Each GIMPS node periodically repeats the discovery operation
  • Depending on the likely stability of routes, the discovery period can be set to a large value
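The packet monitoring idea (a change in the TTL or arrival interface of packets belonging to a flow hints at a route change upstream) can be sketched as follows. The class and method names are hypothetical, and the sketch does not cover loss of the flow, which would need a timer.

```python
from typing import Dict, Hashable, Tuple


class PacketMonitor:
    """Remembers the last (TTL, interface) seen per flow and flags any change."""

    def __init__(self) -> None:
        self._last_seen: Dict[Hashable, Tuple[int, str]] = {}

    def observe(self, flow: Hashable, ttl: int, interface: str) -> bool:
        """Record one packet; return True if it suggests a possible route change."""
        previous = self._last_seen.get(flow)
        self._last_seen[flow] = (ttl, interface)
        # The first packet of a flow only sets the baseline.
        return previous is not None and previous != (ttl, interface)


monitor = PacketMonitor()
monitor.observe("flow-1", ttl=57, interface="eth0")            # baseline
changed = monitor.observe("flow-1", ttl=55, interface="eth0")  # TTL dropped by 2
```

A TTL change is only a hint: a new path with the same hop count arriving on the same interface would go unnoticed, which is one reason the slides list several complementary methods.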
Route Change Detectability - Upstream and Downstream Path
• For a single node, a route change may happen in its upstream path or its downstream path
  • A change in the downstream path at a node means a change in the upstream path for its downstream peer
• Data packet monitoring can only be used to detect route changes in the upstream path
• Routing monitoring is usually used to detect route changes in the downstream path
• C-mode signaling monitoring can be used to detect route changes in both directions
Route Change Detectability - Case Study
[Figure: an OSPF area with nodes A to G, showing the data flow direction, the old path and the new path after a route change. Squares are NEs, circles are non-NEs.]
• B: divergence node
• E: convergence node
• Upstream detection: F (data packet TTL monitoring), D (data absence)
• Downstream detection: A (extended trigger)
Route Change Detectability - Some Recommendations
• By similarly studying other route change cases (OSPF intra-area and inter-area, RIP intra-AS, BGP inter-AS), we found that:
  • To increase the chance of route change detection, it is recommended to implement NEs at least in all "special" routers, e.g. AS border routers, OSPF area border routers and OSPF backbone area routers
Route Change in the Real World?
• We want to know, statistically:
  • How frequently does a route change occur?
  • How long does it last?
  • How many hops does it affect?
  • How is it associated with TTL change? Or AS change?
  • …
Network Measurement with Traceroute
• Traceroute is used to record the path between two sites
• Traceroute sends probe datagrams with incrementing Time To Live (TTL) values and records the ICMP replies returned by each hop. It continues to send probes until it reaches the final destination
• Internet routes may change between two successive probe packets, so there is no guarantee that probes for different hops take the same route as those for previous hops
• If a record is self-consistent and shows no multiple routing for any of its hops, we treat it as a valid measurement
Traceroute Example
Location: www.debug.net --to-- lava.net
Status: ok
Time: Fri Apr 09 07:12:23 EDT 2004 --to-- Fri Apr 09 07:12:29 EDT 2004
traceroute to lava.net (64.65.64.17), 30 hops max, 40 byte packets
 1 fe0-0r0.ffm1.de.carpe.net (212.96.130.129) 0.907 ms 0.846 ms 0.775 ms
 2 w1-0r0.ffm0.de.carpe.net (212.96.129.5) 2.938 ms 7.893 ms 6.443 ms
 3 ge-3-1-0-4.fra20.ip.tiscali.net (213.200.64.37) 3.038 ms 7.679 ms 5.33 ms
 4 so-3-0-0.was21.ip.tiscali.net (213.200.81.50) 99.954 ms 96.689 ms 94.49 ms
 5 so-4-2-0.edge2.Washington1.Level3.net (4.68.127.61) 93.562 ms 97.801 ms 101.415 ms
 6 so-1-1-0.bbr2.Washington1.Level3.net (64.159.3.65) 93.641 ms 93.743 ms 94.06 ms
 7 ge-0-0-0.mpls1.Honolulu2.Level3.net (4.68.128.13) 221.634 ms 217.193 ms 221.168 ms
 8 so-7-0.hsa1.Honolulu2.Level3.net (4.68.112.90) 218.791 ms 224.496 ms 224.18 ms
 9 s1.lavanet.bbnplanet.net (4.24.134.18) 216.187 ms 216.262 ms 222.657 ms
10 malasada.lava.net (64.65.64.17) 221.802 ms 224.026 ms 220.563 ms
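A record like the one above can be reduced to a list of responding addresses per hop. The parser below is a minimal sketch, not the authors' tooling: it assumes one hop per line, as traceroute prints it, with IPv4 addresses in parentheses, and the names `parse_hops` and `SAMPLE` are made up for this example.

```python
import re
from typing import List

# A hop line starts with the hop number; header and status lines do not.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
IPV4_IN_PARENS = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")


def parse_hops(output: str) -> List[List[str]]:
    """Return, for each hop line in order, the distinct IPv4 addresses that answered."""
    hops: List[List[str]] = []
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # e.g. the "traceroute to ..." header
        ips: List[str] = []
        for ip in IPV4_IN_PARENS.findall(match.group(2)):
            if ip not in ips:
                ips.append(ip)
        hops.append(ips)  # an unanswered hop ("* * *") yields an empty list
    return hops


SAMPLE = """traceroute to lava.net (64.65.64.17), 30 hops max, 40 byte packets
 1 fe0-0r0.ffm1.de.carpe.net (212.96.130.129) 0.907 ms 0.846 ms 0.775 ms
 2 w1-0r0.ffm0.de.carpe.net (212.96.129.5) 2.938 ms 7.893 ms 6.443 ms
"""
hops = parse_hops(SAMPLE)
```

Keeping every address per hop, rather than only the first, preserves the cases where a single hop answers from more than one router.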
Measurement Methodology
• Obtained permission from 39 public traceroute servers located in the US, Canada, Iceland, Netherlands, Finland, Austria, France, UK, Germany, Switzerland, Ireland, Bulgaria, Sweden, South Africa, New Zealand, Taiwan, Australia, Japan and Thailand
• Data collection program written in Tcl
• At an exponentially distributed interval, the program:
  • Randomly selects two sites as source and destination and spawns a thread for each of them
  • Each thread sends a request to the selected source or destination and invokes a traceroute to the other party
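The sampling scheme can be illustrated with a short Python sketch (the original collector was written in Tcl; this is not that program). It only generates a measurement schedule and does not contact any server. `build_schedule` and its parameters are hypothetical names, and the site list in the usage lines is shortened for the example.

```python
import random
from typing import List, Sequence, Tuple


def build_schedule(
    sites: Sequence[str],
    mean_interval_s: float,
    count: int,
    rng: random.Random,
) -> List[Tuple[float, str, str]]:
    """Return (start time in seconds, source site, destination site) tuples."""
    schedule = []
    now = 0.0
    for _ in range(count):
        # Exponentially distributed gap with the requested mean.
        now += rng.expovariate(1.0 / mean_interval_s)
        # Two distinct sites chosen uniformly at random.
        source, destination = rng.sample(list(sites), 2)
        schedule.append((now, source, destination))
    return schedule


rng = random.Random(2004)  # fixed seed so the example is repeatable
plan = build_schedule(["www.debug.net", "lava.net", "www.lf.net"], 900.0, 5, rng)
```

For each entry the collector would then ask both `source` and `destination` to trace towards the other party, giving a forward and a reverse record for the same pair.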
Two Data Sets
• Data Set I covers 12 of these 39 sites with a measurement interval of 15 minutes, collected from Apr. 9 to Apr. 24. It contains about 15,000 traceroute records
• Data Set II covers all 39 sites but with a relatively long interval of 2 hours. It was collected from Apr. 22 to May 18
More on Data Set I
• 12 sites that allowed the 15-minute interval:
  • www.valkaryn.net (LA, CA)
  • www.slac.stanford.edu (Stanford, CA)
  • lava.net (Honolulu, HI)
  • www.fh-friedberg.de (Friedberg, Germany)
  • www.lf.net (Germany)
  • swiCE2.switch.ch (Geneva, Switzerland)
  • Backbone.acad.bg (Sofia, Bulgaria)
  • Stockholm1.sunet.se (Stockholm, Sweden)
  • Traceroute.teragen.com.au (Melbourne, Australia)
  • www.hafey.org (Sydney, Australia)
  • www.megamirror.com (Sydney, Australia)
  • www.debug.net (Frankfurt, Germany)
AS Mapping on Data Set I
• We mapped each IP to its corresponding AS number by looking it up in RADB (Routing Assets Database) and its mirrored databases
• For IPs that do not have a mapping entry:
  • We checked whether the IP has a hostname that identifies it as part of a site with a known AS
  • If no hostname was found, we looked at the IP range and at the hops before and after it in the same record, and made a best guess
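The fallback order described on this slide can be sketched as below. Everything concrete here is an assumption made for illustration: the prefix table stands in for a registry lookup, the prefixes are documentation ranges, the AS numbers are from the private range, and the "neighbors agree" rule is just one possible way to make the best guess the slide mentions.

```python
import ipaddress
from typing import Dict, Optional, Tuple


def lookup_prefix_table(ip: str, table: Dict[str, int]) -> Optional[int]:
    """Longest-prefix match of ip against a {prefix: AS number} table."""
    addr = ipaddress.ip_address(ip)
    best: Optional[Tuple[int, int]] = None  # (prefix length, AS number)
    for prefix, asn in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None


def map_hop_to_as(
    ip: str,
    hostname: Optional[str],
    table: Dict[str, int],
    domain_as: Dict[str, int],
    prev_as: Optional[int],
    next_as: Optional[int],
) -> Tuple[Optional[int], str]:
    """Return (AS number or None, which step produced it)."""
    asn = lookup_prefix_table(ip, table)
    if asn is not None:
        return asn, "registry"
    if hostname:
        for domain, known_as in domain_as.items():
            if hostname == domain or hostname.endswith("." + domain):
                return known_as, "hostname"
    # Assumed heuristic: trust the neighbors only when both agree.
    if prev_as is not None and prev_as == next_as:
        return prev_as, "neighbors"
    return None, "unknown"


TABLE = {"192.0.2.0/24": 64512, "192.0.2.128/25": 64513}  # made-up entries
DOMAINS = {"example.net": 64514}                            # made-up entry
```

Returning the step that produced each answer keeps the guessed mappings separable from the registry-backed ones in later analysis.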
Preliminary Processing on Data Set I
• We extracted from Data Set I the following records:
  • Records in which one IP address appears in multiple hops
  • Records in which a single hop contains multiple different IP addresses
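Both filters are easy to state in code once a record is represented as a list of hops, each hop being the list of addresses that answered at that hop. The sketch below assumes that representation; the function names and the sample record are hypothetical.

```python
from typing import Dict, List


def repeated_ips(hops: List[List[str]]) -> Dict[str, List[int]]:
    """Map each IP seen at more than one hop to the 1-based hop numbers where it appears."""
    seen: Dict[str, List[int]] = {}
    for number, hop in enumerate(hops, start=1):
        for ip in sorted(set(hop)):  # count an IP once per hop
            seen.setdefault(ip, []).append(number)
    return {ip: numbers for ip, numbers in seen.items() if len(numbers) > 1}


def multi_ip_hops(hops: List[List[str]]) -> List[int]:
    """Return the 1-based hop numbers that contain more than one distinct IP."""
    return [n for n, hop in enumerate(hops, start=1) if len(set(hop)) > 1]


def is_clean(hops: List[List[str]]) -> bool:
    """True if the record shows neither anomaly."""
    return not repeated_ips(hops) and not multi_ip_hops(hops)


record = [
    ["10.0.0.1"],
    ["10.0.0.2", "10.0.0.3"],  # two routers answered at hop 2
    ["10.0.0.4"],
    ["10.0.0.4"],              # same router seen again at hop 4
]
```

For `record`, `multi_ip_hops` reports hop 2 and `repeated_ips` reports 10.0.0.4 at hops 3 and 4, so the record would be set aside for closer inspection.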
Route Change Dynamics - One IP appears in multiple hops
• Captured the middle of a routing change
  • E.g., based on the preceding and subsequent records, we can tell that one specific IP appears in two successive hops because one extra router has just been added upstream
  • Other cases are more complicated, but usually a route change can be found. The duration of the dynamics varies from less than half an hour to several hours
• Loss of connectivity to the hop following the repeating router (lasting several hours?)
• Skipping: packets originating from Switch.ch (Swiss Education and Research Network) seem to occasionally skip the first outgoing router within this site (reason unknown)
Route Change Dynamics - Multiple IPs in a single hop
• 1194 instances
• 1 unique triple, caused by a route change
• 20 unique router pairs, among them:
  • 3 pairs (19 instances) due to Switch.ch skipping
  • 7 pairs (7 instances) in the middle of route changes
  • 2 pairs (2 instances) during a temporary network outage
  • The remaining 8 pairs account for the other 1165 instances
Route Change Dynamics - Fluttering Behavior
• The rapidly variable routing exhibited by a few routers is called "fluttering" by Vern Paxson in his 1997 thesis
• It occurs on a very small time scale, essentially the time between successive traceroute probes
• One possible mechanism that could cause this: a router alternates between multiple next-hop routers in order to split load among the links to those routers
• As Paxson points out, such behavior is explicitly allowed as load splitting in RFC 1812, "Requirements for IP Version 4 Routers" (p. 79), though the same document cautions that there are situations in which it is inappropriate. So it should at most be a configurable option for a router
Ongoing Work
• Statistical analysis of:
  • More routing pathologies
  • Types of route changes
  • Route change durations
  • AS changes
  • TTL changes
  • Route symmetry
• Route change detection algorithms
Main References:
• http://www.ietf.org/html.charters/nsis-charter.html
• H. Schulzrinne, R. Hancock, GIMPS: General Internet Messaging Protocol for Signaling, IETF Internet Draft, 2004
• R. Hancock et al., Next Steps in Signaling: Framework, IETF Internet Draft, 2003
• H. Schulzrinne et al., Design of CASP – a Technology Independent Lightweight Signaling Protocol, 2003
• V. Paxson, Measurement and Analysis of End-to-end Internet Dynamics, Ph.D. thesis, UC Berkeley, 1997
The End
Thank you!