NIRA: A New Internet Routing Architecture Xiaowei Yang MIT CSAIL yxw@mit.edu
Why NIRA? • Problems with today’s routing system • No user choice • Does not scale well • Continuing growth of global routing state • No fault isolation • NIRA solves these problems.
No User Choice! • Unlike in the telephone system, users cannot choose wide-area providers separately from local providers. • Pricing, quality of service, security… • Wide-area providers cannot offer differentiated services directly to users. • Quality of service [Figure: Backbones; Comcast, Verizon; 21 million broadband subscribers in June 2003; duopoly / monopoly]
We Want to Let Users Choose Domain-Level Routes • Our hypothesis: • User choice stimulates competition. • Competition fosters innovation. • Validation requires market deployment. • NIRA: the technical foundation. [Figure labels: AT&T, UUNET, Local ISP]
Problems with Current Inter-domain Routing • No user choice • Does not scale well • Continuing growth of global routing state • No fault isolation
Continuing Growth of Global Routing State • Real-world requirements such as multi-homing are not well supported. [Figure courtesy of http://bgp.potaroo.net/]
No Fault Isolation • A local failure causes global routing updates. • Routing loops and packet drops occur during transient states. • Routing convergence takes on the order of minutes.
Overview of NIRA • A scalable architecture that gives users the ability to select routes. • “User” is an abstract entity, e.g., a software agent • “Domain-level” choices • Encourage ISP competition • Each domain decides individually whether to offer “router-level” choices
Design Overview (1): Route Discovery [Figure: domain-level topology with a core (B1–B4), providers R1–R10, and networks N1–N18; users Alice, Bob, and Cindy; one link marked X]
Design Overview (2): Route Representation [Figure: domain-level topology with a core (B1–B4), providers R1–R10, and networks N1–N18; users Alice, Bob, and Cindy]
Design Overview (3): Failure Handling • Provider compensation will not be discussed in detail. [Figure: domain-level topology with a core (B1–B4), providers R1–R10, and networks N1–N18; users Alice, Bob, and Cindy]
System Components of NIRA • Addressing • Route discovery • Name-to-Route mapping • Route failure handling
NIRA’s Addressing • Strict provider-rooted hierarchical addressing • An address represents a valid route to the core. [Figure: core providers B1 (1::/16) and B2 (2::/16); prefixes 1:1::/32, 1:2::/32, 1:3::/32, 2:1::/32 at R1–R3; prefixes 1:1:1::/48, 1:2:1::/48, 1:2:2::/48, 1:3:1::/48, 2:1:1::/48 at N1–N3; host addresses 1:1:1::1000, 1:2:1::1000, 1:3:1::2000, 2:1:1::2000 for Alice and Bob]
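The allocation pattern in the figure can be sketched in a few lines of Python. This is only an illustration: the helper names `allocate` and `route_to_core` are hypothetical, and the 16-bit width of each hierarchy level (/16, /32, /48) is an assumption read off the prefixes in the figure, not something the slides state as a rule.

```python
import ipaddress

def allocate(provider_prefix, index, new_prefixlen):
    """Carve the sub-prefix numbered `index` out of a provider's prefix.

    Hypothetical helper: `index` is written into the bits just below the
    provider's own prefix, so the customer prefix extends the provider's.
    """
    net = ipaddress.IPv6Network(str(provider_prefix))
    assert 0 < index < 2 ** (new_prefixlen - net.prefixlen)
    base = int(net.network_address) | (index << (128 - new_prefixlen))
    return ipaddress.IPv6Network((base, new_prefixlen))

def route_to_core(address, levels=(48, 32, 16)):
    """List the enclosing prefixes of an address, from its network up to the core.

    The level boundaries are an assumption taken from the figure.
    """
    return [str(ipaddress.IPv6Network(f"{address}/{plen}", strict=False))
            for plen in levels]

# B1 owns 1::/16 and hands 1:3::/32 to a customer, which hands out 1:3:1::/48.
r_prefix = allocate("1::/16", 3, 32)
n_prefix = allocate(r_prefix, 1, 48)
print(r_prefix, n_prefix)
print(route_to_core("1:3:1::2000"))
```

Reading the chain from left to right gives the sequence of prefixes, and therefore providers, that the address traverses on its way to the core, which is the sense in which an address represents a route.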
Why is NIRA’s Addressing Scalable? • Financial factors limit the size of the core. • Provider hierarchy is shallow. • A domain has a limited number of providers. [Figure: same addressed topology as on the addressing slide]
Efficient Route Representation • A source and a destination address unambiguously represent a common type of route. • General routes may use source routing headers. [Figure: same addressed topology as on the addressing slide]
Routers’ Forwarding Tables • Uphill table: providers • Downhill table: customers, self • Bridge table: all others [Figure: same addressed topology as on the addressing slide, with an uphill table and a downhill table marked]
Basic Forwarding Algorithm • Look up the destination address in the downhill table. • If there is no match, look up the source address in the uphill table. [Figure: same addressed topology as on the addressing slide; a packet from 1:1:1::1000 to 1:3:1::2000 travels up, then down]
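The two-step lookup can be sketched as follows. The table contents are an assumption: they place the source network N1 under R1 under B1, and the destination network N3 under R3 under B1, which is one reading of the figure. The names `ROUTERS`, `longest_match`, `next_hop`, and `trace` are hypothetical, and the bridge table is left out, so only routes through a single core provider are covered.

```python
import ipaddress

# Assumed per-domain tables (prefix -> next hop); "local" means deliver here.
ROUTERS = {
    "N1": {"downhill": {"1:1:1::/48": "local"}, "uphill": {"1:1:1::/48": "R1"}},
    "R1": {"downhill": {"1:1:1::/48": "N1"}, "uphill": {"1:1::/32": "B1"}},
    "B1": {"downhill": {"1:1::/32": "R1", "1:3::/32": "R3"}, "uphill": {}},
    "R3": {"downhill": {"1:3:1::/48": "N3"}, "uphill": {"1:3::/32": "B1"}},
    "N3": {"downhill": {"1:3:1::/48": "local"}, "uphill": {"1:3:1::/48": "R3"}},
}

def longest_match(table, address):
    """Return the next hop of the longest matching prefix, or None."""
    ip = ipaddress.ip_address(address)
    best_len, best_hop = -1, None
    for prefix, hop in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and net.prefixlen > best_len:
            best_len, best_hop = net.prefixlen, hop
    return best_hop

def next_hop(router, src, dst):
    """Destination in the downhill table first; otherwise source in the uphill table."""
    hop = longest_match(router["downhill"], dst)
    if hop is None:
        hop = longest_match(router["uphill"], src)
    return hop

def trace(start, src, dst, max_hops=16):
    """Follow next hops from `start` until the packet is delivered locally."""
    path, node = [start], start
    for _ in range(max_hops):
        hop = next_hop(ROUTERS[node], src, dst)
        if hop == "local":
            return path
        if hop is None:
            raise LookupError(f"no route at {node} (bridge table not modeled)")
        path.append(hop)
        node = hop
    raise RuntimeError("hop limit exceeded")

print(trace("N1", "1:1:1::1000", "1:3:1::2000"))
```

The source address steers the uphill half of the path and the destination address steers the downhill half, so no router needs a table entry for every remote prefix.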
System Components of NIRA • Addressing • Route discovery • Topology Information Propagation Protocol (TIPP) • Name-to-Route mapping • Failure handling
What does TIPP Do? • Propagates addresses • Propagates the “up-graph”: providers and their interconnections on a user’s routes to the core. [Figure: core providers B1 and B2, providers R1–R3, networks N1–N3, users Alice and Bob; one link marked X; addresses 1:1:1::1000, 1:2:1::1000, 2:2:1::1000, with one marked “temporarily unusable”]
What is TIPP Like? • Link-state style protocol • Supports scoped propagation • Provides a consistent view of the network • A simple, provably correct algorithm [SG89] • No sequence numbers, no periodic refreshes, no timestamps • Fast convergence
Why is TIPP Scalable? (1) • Up-graph is small. [Figure: domain-level topology with a core (B1–B4), providers R1–R10, and networks N1–N18; users Alice, Bob, and Cindy]
Why is TIPP Scalable? (2) • Scoped propagation → fault isolation [Figure: domain-level topology with a core (B1–B4), providers R1–R10, and networks N1–N18; users Alice, Bob, and Cindy; one link marked X]
System Components of NIRA • Addressing • Route discovery • Name-to-Route mapping • Failure handling
Name-to-Route Lookup Service (NRLS) • An enhanced DNS service [Figure: the Foo.com server holds the record Alice.foo.com → 1:3:1::2000, 2:1:1::2000, which is returned on lookup; core providers B1 and B2, providers R1–R3, networks N1–N3; host addresses 1:1:1::1000, 1:2:1::1000, 1:3:1::2000, 2:1:1::2000]
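A minimal sketch of how a lookup result turns into route choices, assuming the record shown in the figure. The in-memory `NRLS` dictionary and the functions `lookup` and `candidate_routes` are hypothetical stand-ins for the real service, and the caller's own addresses are taken to be the other two host addresses in the figure.

```python
from itertools import product

# Hypothetical in-memory stand-in for the lookup service.
NRLS = {"alice.foo.com": ["1:3:1::2000", "2:1:1::2000"]}

def lookup(name):
    """Return every address registered for `name` (empty list if unknown)."""
    return list(NRLS.get(name.lower(), []))

def candidate_routes(own_addresses, name):
    """Each (source, destination) address pair names one candidate route."""
    return list(product(own_addresses, lookup(name)))

for src, dst in candidate_routes(["1:1:1::1000", "1:2:1::1000"], "Alice.foo.com"):
    print(src, "->", dst)
```

With two addresses on each side the sender has four candidate pairs to choose from.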
System Components of NIRA • Addressing • Route discovery protocol • Name-to-Route mapping • Failure handling
How Route Failures are Handled • A combination of TIPP notifications and router feedback or timeouts. • Switching addresses to switch to a different route • HIP [Moskowitz04], SCTP [RFC 2960], TCP migrate [Snoeren00] [Figure: core providers B1 and B2, providers R1–R3, networks N1–N3; two links marked X; Foo.com server record Alice.foo.com → 1:3:1::2000, 2:1:1::2000; addresses 1:1:1::1000, 1:2:1::1000]
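Switching routes by switching addresses can be sketched as a simple selection over address pairs. The function `pick_route` and the `unusable` set are hypothetical: the set stands for whatever a host has learned from TIPP notifications, router feedback, or timeouts, and the first-fit preference order is an assumption.

```python
def pick_route(src_addresses, dst_addresses, unusable):
    """Return the first (source, destination) pair that avoids unusable addresses."""
    for src in src_addresses:
        for dst in dst_addresses:
            if src not in unusable and dst not in unusable:
                return (src, dst)
    return None  # no usable route is known

sources = ["1:1:1::1000", "1:2:1::1000"]
destinations = ["1:3:1::2000", "2:1:1::2000"]

print(pick_route(sources, destinations, unusable=set()))
# After a failure on the route named by the first source address:
print(pick_route(sources, destinations, unusable={"1:1:1::1000"}))
```

Keeping an established connection alive across such a switch is the job of the mechanisms the slide cites (HIP, SCTP, TCP migrate) and is not modeled here.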
NIRA Solves These Problems: • User choice • Choosing addresses → choosing routes → choosing providers • Scalability • Modularized route discovery. • Constrained failure propagation.
Data Sets • Domain-level topologies from BGP routing tables • Inferred domain relationships [Gao00] [Subramanian02] • Not completely accurate, but best practice. • Data from 2001 to 2004 [Agarwal]
Evaluation of Scalability • Methodology • Measure the amount of state each domain would keep under NIRA • Number of providers in the core • Number of address prefixes • Size of up-graphs • Size of forwarding tables • Conclusions • Scalable in practice
Provider-rooted Hierarchical Addressing is Practical. • In practice: • Level of provider hierarchy (h) is shallow. • A domain has a limited number (p) of providers. • In theory: $p^h$
Up-graphs are Small. • Analysis • Level of provider hierarchy: h; • Number of a domain’s providers: p; • Number of a domain’s peers: q; • $\sum_{i=1}^{h} p^{i-1}(p+q)$
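To get a feel for the bound above, read as the sum over i from 1 to h of p^(i-1)(p+q), and for the p^h term on the addressing slide, both can be evaluated directly. The function names and the sample values p = 2, q = 3, h = 3 are illustrative assumptions, not figures from the evaluation.

```python
def max_prefixes(p, h):
    """Theoretical count from the addressing slide: p ** h."""
    return p ** h

def upgraph_bound(p, q, h):
    """Sum over hierarchy levels i = 1..h of p**(i-1) * (p + q)."""
    return sum(p ** (i - 1) * (p + q) for i in range(1, h + 1))

# Illustrative values only: 2 providers and 3 peers per domain, 3 levels.
print(max_prefixes(p=2, h=3))        # 2**3 = 8
print(upgraph_bound(p=2, q=3, h=3))  # 5 * (1 + 2 + 4) = 35
```

Both quantities grow exponentially in h, which is why the argument rests on h and p being small in practice.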
Evaluation of TIPP in Dynamic Networks • Methodology • Packet-level simulations using sampled topologies • Conclusion • Low communication cost • Fast convergence
Communication Cost of TIPP is Low. • Average: total messages (bytes) / link / failure • Maximum: max seen messages (bytes) over one link / failure • Convergence: no message churning • Scalable: scoped propagation
TIPP Converges Fast • Link delays are uniformly distributed in [10 ms, 110 ms]. • Single-failure convergence time is proportional to the shortest-path propagation delay.
Conclusion • User choice • Choosing addresses → choosing routes → choosing providers • Scalability • Modularized route discovery. • Constrained failure propagation. • Evaluation shows NIRA is practical. • Looking forward • New provider compensation model • Stable routing with user choice • Deployment of NIRA
Core Routing Region is Scalable • Financial factors limit the size of core.
Forwarding Tables • No need to dynamically compute paths to reach a prefix. • Common case: • Small • Analysis: • Number of prefixes: r • Number of customers: c • Number of peers: q • r + r + r*c + q
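The size expression on this slide is easy to evaluate for a given domain. The function name and the sample values below are illustrative assumptions; the slide does not say which term belongs to which table, so the sketch evaluates only the expression as written.

```python
def forwarding_entries(r, c, q):
    """Evaluate the slide's expression r + r + r*c + q.

    r: number of prefixes, c: number of customers, q: number of peers.
    """
    return r + r + r * c + q

# Illustrative values only: 4 prefixes, 10 customers, 3 peers.
print(forwarding_entries(r=4, c=10, q=3))  # 4 + 4 + 40 + 3 = 51
```

The product r*c dominates, so table size tracks a domain's own prefixes and customers rather than the size of the whole Internet.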
TIPP in Dynamic Networks • Simulation topologies • Pick random leaf domains • Recursively include their providers and peers
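One way to read the sampling procedure is as a closure over provider and peer links starting from randomly chosen leaves. The sketch below takes that reading; the function name, the dictionary-based graph encoding, and the toy topology are assumptions, and the deck does not say whether peers' own providers are followed (here they are).

```python
import random

def sample_topology(providers, peers, leaves, k, seed=0):
    """Pick k random leaf domains, then recursively add providers and peers.

    `providers` and `peers` map a domain to a list of neighbor domains.
    Customers are never followed, so the sample stays an upward slice.
    """
    rng = random.Random(seed)
    stack = rng.sample(sorted(leaves), k)
    included = set()
    while stack:
        domain = stack.pop()
        if domain in included:
            continue
        included.add(domain)
        stack.extend(providers.get(domain, ()))
        stack.extend(peers.get(domain, ()))
    return included

# Toy topology (hypothetical): N* are leaves, R* regionals, B* core providers.
providers = {"N1": ["R1"], "N2": ["R2"], "N3": ["R3"],
             "R1": ["B1"], "R2": ["B1"], "R3": ["B2"]}
peers = {"R1": ["R2"], "B1": ["B2"]}

print(sorted(sample_topology(providers, peers, leaves=["N1"], k=1)))
```

Starting from N1, the sample pulls in R1, its peer R2, their provider B1, and B1's peer B2, while N2, N3, and R3 stay out because they are only reachable downward.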
Workload Analysis of NRLS Update • A fundamental tradeoff: • Topology change will cause address change • Root servers reside in top-level providers. • Route record updates: mimic a renumbering event • How often can route updates happen? • Route server processing time: 1 ms per update • 1000 updates / second / server • 100,000 updates → 100 seconds ≈ 2 minutes • Bandwidth: 5% of 100 Mb/s for 100,000 users, 100 bytes per update, → 625 updates / second → 160 seconds ≈ 3 minutes to update all users • Route update causes: contractual or physical topology change • Could be scheduled • Allow for a grace period, say 30 minutes • Conclusion: manageable
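The timing estimates on this slide can be replayed with a few lines of arithmetic. The function name is hypothetical, and the bandwidth-limited rate of 625 updates per second is taken from the slide as stated rather than re-derived from the link figures.

```python
def seconds_to_update(total_updates, updates_per_second):
    """Time to push `total_updates` at a fixed sustained rate."""
    return total_updates / updates_per_second

USERS = 100_000                  # updates to deliver, one per user
PROCESSING_MS_PER_UPDATE = 1     # route server processing time
cpu_rate = 1000 // PROCESSING_MS_PER_UPDATE   # 1000 updates / second / server
bandwidth_rate = 625             # updates / second, as stated on the slide
GRACE_PERIOD_S = 30 * 60         # suggested grace period

cpu_time = seconds_to_update(USERS, cpu_rate)             # 100 s, about 2 minutes
bandwidth_time = seconds_to_update(USERS, bandwidth_rate)  # 160 s, about 3 minutes
print(cpu_time, bandwidth_time, max(cpu_time, bandwidth_time) <= GRACE_PERIOD_S)
```

Both estimates sit well inside the 30-minute grace period, which is the basis for the slide's "manageable" conclusion.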
Architectural Problem in the Internet • Aggregation • Adds one entry into BGP tables [Figure: B1 and B2 above R1–R3 above N1–N3; table entries 12.0.0.0/16 R1, 10.0.0.0/16 R3, 12.0.0.0/24 R2, 12.0.0.0/24 N1, 10.0.0.0/24 N3]
[Figure: core providers B1 and B2 with prefixes a000::/16 and b000::/16; prefixes a000:1::/32, a000:2::/32, a000:3::/32, b000:1::/32 at R1–R3; prefixes a000:1:1::/48, a000:2:1::/48, a000:2:2::/48, a000:3:1::/48, b000:1:1::/48 at N1–N3; /96 prefixes a000:1:1::/96, a000:2:1::/96, a000:3:1::/96, b000:1:1::/96. Hierarchical addresses: a000:1:1::1000, a000:2:1::1000, a000:3:1::1000, b000:1:1::1000. Non-hierarchical addresses: 1111:N1::1000, 1111:N3::1000]
Address Allocation • A hierarchical address prefix is a leased resource. • Survives connection breakdown and node failures. • Periodic lease renewal. • De-allocated when the lease expires. [Figure: message timeline among B (a000::/16), R (a000:1::/32), and N (a000:1:1::/48): open, request 216, request 0, address allocation decision making, add (a000:1::/32, 3 days), add (a000:1:1::/48, 1 day), add ( ), Established]
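The lease semantics in the bullets can be sketched with a small table keyed by prefix. The class `LeaseTable` and its methods are hypothetical, and the prefixes and lifetimes are the ones shown in the timeline (3 days and 1 day). Because entries are keyed by prefix rather than by connection, a lease outlives a broken connection until it is renewed or expires.

```python
DAY = 24 * 60 * 60  # seconds

class LeaseTable:
    """Prefix leases with explicit expiry times (all times in seconds)."""

    def __init__(self):
        self._expiry = {}

    def add(self, prefix, lifetime_s, now):
        """Grant or renew a lease; renewal simply pushes the expiry out."""
        self._expiry[prefix] = now + lifetime_s

    def active(self, now):
        """Prefixes whose lease has not yet expired."""
        return sorted(p for p, t in self._expiry.items() if now < t)

    def expire(self, now):
        """De-allocate and return every prefix whose lease has run out."""
        dead = sorted(p for p, t in self._expiry.items() if now >= t)
        for prefix in dead:
            del self._expiry[prefix]
        return dead

table = LeaseTable()
table.add("a000:1::/32", 3 * DAY, now=0)
table.add("a000:1:1::/48", 1 * DAY, now=0)
table.add("a000:1:1::/48", 1 * DAY, now=DAY - 1)  # renewed just before expiry
print(table.expire(now=DAY))      # nothing has expired yet
print(table.expire(now=2 * DAY))  # the /48 lease has now run out
```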
Edge Record Origination and Distribution • One-directional attributes are exchanged during connection setup. • The topology update procedure distributes edge records to neighbors. [Figure: message timeline among B (a000::/16), R (a000:1::/32), and N (a000:1:1::/48): open, request 216, request 0, edge (R, B), edge (B, R), add (a000:1::/32, 3 days), add (a000:1:1::/48, 1 day), add ( ), topology update procedure. Edge attributes: internal reachable B → R: a000:1::/32; external reachable B → R: ε; internal reachable R → B: a000::/32; external reachable R → B: *]
Ensuring Edge Record Consistency • Common techniques • Sequence numbers (OSPF, IS-IS) • Flooding • Timestamps • Loosely synchronized clocks • Modified Shortest Path Topology Algorithm (SPTA) [SG89] • Pros • No sequence numbers or timestamps • Simple, proven correct • Cons • Computation cost per update. • Communication cost in a theoretical worst case is unbounded. [Figure: R1, R2, and N, with edge records (P1, P2, down) and (P1, P2, up)]
Scalable Route Discovery with Transit Policies • Addressing must take policies into consideration. • A natural solution: provider-rooted hierarchical addressing [Tsuchiya91] [Francis94] [Figure: Backbone1 and Backbone2; ISP1 and ISP2; Net1–Net4; links labeled provider, customer, and peer; one link marked X]