Experience in Black-box OSPF Measurement

Presentation Transcript


  1. Experience in Black-box OSPF Measurement
     • Aman Shaikh, UCSC
     • Albert Greenberg, AT&T Labs-Research
     • Sigcomm IMW – November 2001

  2. Why Measure OSPF?
     • OSPF behavior in large ISPs not well understood, yet
       • any meaningful performance assurance depends on routing stability
       • an internal network change (OSPF event) can have major impact on flows and customers, during which
         • intra-domain routing reconverges
         • inter-domain routing reconverges (BGP uses OSPF metrics)
     • Internal OSPF processing delays matter!
       • message processing, routing calculation, table update
       • add up to impact convergence, instabilities
     • OSPF measurements also needed for
       • guidance in tuning configurable parameters
       • head-to-head vendor comparisons

  3. How to Measure OSPF?
     • Problem: instrumenting routing code for measuring delays is challenging
       • commercial implementations are proprietary
       • may involve grappling with numerous code versions, hardware platforms, and developers
     • Solution: black-box measurements
       • measure the timing delays using external observations
     • Contribution: black-box measurements for internal OSPF delays
       • applied to Cisco and GateD OSPF implementations
     • Key prior work:
       • IS-IS measurements by Packet Design [draft-alaettinoglu-ISIS-convergence-00]

  4. Black-box Techniques are Effective
     • Works across wide range of timing delays
       • 100 µsec for packet processing
       • 10s of msec for routing calculation
     • Works even for the purely CPU-bound tasks
       • packet processing subtasks, Dijkstra's shortest path calculation
     • Captures scaling
       • O(n²) time for shortest path calculation, for full n × n mesh topologies

  5. OSPF Background
     • Link-state routing protocol
       • all routers in the domain come to a consistent view of the topology by exchange of Link State Advertisements (LSAs)
       • set of LSAs (self-originated + received) at a router = topology
     • SPF Calculation
       • each router calculates a single-source shortest path tree
     • Forwarding Information Base (FIB)
       • each router uses the tree to build its FIB, which governs packet forwarding
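The SPF and FIB steps on this slide can be sketched as follows. This is only an illustration, not the code of any measured implementation: the `lsdb` dictionary (router to neighbor-cost map) is a hypothetical stand-in for the set of LSAs, and `spf_tree` and `build_fib` are names invented here.

```python
import heapq

def spf_tree(lsdb, root):
    """Dijkstra's algorithm: return {router: (cost, parent)} for the tree rooted at root.

    lsdb is a hypothetical stand-in for the LSA set: {router: {neighbor: link_cost}}.
    """
    tree = {}
    heap = [(0, root, None)]  # (path cost, router, parent on the path)
    while heap:
        cost, node, parent = heapq.heappop(heap)
        if node in tree:
            continue  # already settled via a path that was no more expensive
        tree[node] = (cost, parent)
        for nbr, weight in lsdb.get(node, {}).items():
            if nbr not in tree:
                heapq.heappush(heap, (cost + weight, nbr, node))
    return tree

def build_fib(tree, root):
    """Map each destination to the first hop on its shortest path from root."""
    fib = {}
    for dest in tree:
        if dest == root:
            continue
        hop = dest
        while tree[hop][1] != root:  # walk back toward the root
            hop = tree[hop][1]
        fib[dest] = hop
    return fib

lsdb = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "C": 2}, "C": {"A": 4, "B": 2}}
tree = spf_tree(lsdb, "A")
fib = build_fib(tree, "A")
```

In this three-router example, A reaches C through B because the two-hop path costs 3 and the direct link costs 4.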

  6. Link-state Advertisement (LSA)
     • LSA propagation: each router
       • describes local connectivity in an LSA
       • floods LSA to other routers in the domain
       • acknowledges LSA in an LS Ack packet
     • Duplicate LSAs: each router
       • can receive multiple copies of a given LSA
       • first copy received is termed "new"
       • copies received later are termed "duplicate"
     • Duplicate LSAs MUST be acknowledged immediately (RFC 2328)
       • allows us to build a timestamp
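A minimal sketch of the new/duplicate distinction, under a simplifying assumption: an LSA instance is identified here only by its advertising router, link-state ID, and sequence number, whereas a real OSPF implementation also compares checksum and age. The `Lsa` and `Lsdb` classes are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lsa:
    adv_router: str
    ls_id: str
    seq: int  # simplified instance identifier (assumption: checksum and age ignored)

class Lsdb:
    """Tracks the newest sequence number seen for each LSA."""

    def __init__(self):
        self._seq = {}

    def receive(self, lsa):
        key = (lsa.adv_router, lsa.ls_id)
        known = self._seq.get(key)
        if known is None or lsa.seq > known:
            self._seq[key] = lsa.seq
            return "new"        # first copy: update topology view, flood onward
        if lsa.seq == known:
            return "duplicate"  # later copy: acknowledged immediately
        return "older"          # stale instance; not discussed on the slide

db = Lsdb()
first = db.receive(Lsa("10.0.0.1", "10.0.0.1", 1))
second = db.receive(Lsa("10.0.0.1", "10.0.0.1", 1))
```

The immediate acknowledgment of the second copy is the externally visible event the measurements use as a timestamp.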

  7. Router Model
     • [Figure: router block diagram. Labels: Route Processor (CPU), OSPF Process, LSA Processing, LSA Flooding, Topology View, SPF Calculation, FIB Update, FIB, Forwarding, Switching Fabric, Interface card. Arrows: LSA, LS Ack, Data packet.]

  8. Methodology
     • [Figure: testbed with TopTracker, the emulated topology, LSAs, and the target router]
     • Load emulated topology on target router
     • Initiate task of interest
     • Measure the time for task

  9. Measuring Task Time
     • Use a black-box method to bracket task start and finish times
     • Subtract out intervals that precede and exceed these times
     • [Figure: timeline marking the top bracket event, task start time, task finish time, and bottom bracket event, with intervals A, B, C, and X]
     • X = A - (B + C)
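The bracket arithmetic can be sketched as below. The mapping of symbols to arguments is an assumption read off the slide: A is taken as the interval between the two bracket events, B as the interval before the task starts, and C as the interval after it finishes. The function name and the sample timestamps are illustrative only.

```python
def task_time(top_bracket, bottom_bracket, lead_in, lead_out):
    """Return X = A - (B + C), with all values in seconds.

    top_bracket, bottom_bracket: timestamps of the externally observed bracket events
    lead_in (B): interval between the top bracket event and the task start
    lead_out (C): interval between the task finish and the bottom bracket event
    """
    a = bottom_bracket - top_bracket
    x = a - (lead_in + lead_out)
    if x < 0:
        raise ValueError("overhead exceeds the bracketed interval")
    return x

# Illustrative numbers only: a 50 ms bracket with 4 ms + 6 ms of overhead.
x = task_time(10.000, 10.050, 0.004, 0.006)
```

With these made-up numbers the task accounts for roughly 40 ms of the 50 ms bracket.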

  10. Methodology for SPF Calculation
      • [Figure: time sequence between TopTracker and the target router, with intervals A, B, C, D, E, and X. TopTracker events: load desired topology, send initiator LSA, send duplicate LSA, ack for duplicate LSA arrives. Target router events: initiator LSA arrives, SPF calculation starts, SPF calculation ends, send ack for duplicate LSA.]
      • X = A - (B + C + D + E)
      • Estimate the overhead = B + C + D + E

  11. Estimating the Overhead
      • Remove SPF calculation from bracket
        • spf_delay = 60 seconds
      • [Figure: time sequence between TopTracker and the target router, with intervals B, C, D, and E. TopTracker events: send initiator LSA, send duplicate LSA, ack for duplicate LSA arrives. Target router events: initiator LSA arrives, initiator LSA processing done, duplicate LSA arrives, duplicate LSA processing done; send ack, SPF calculation starts.]
      • overhead = B + C + D + E
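Combining the two runs might look like the sketch below. It assumes each run is repeated several times and summarized by its median. The slides do not say how samples were aggregated, so the median, the function name, and the sample values are all assumptions made for illustration.

```python
from statistics import median

def spf_time(bracket_samples, overhead_samples):
    """Estimate X = A - (B + C + D + E).

    bracket_samples: values of A, measured with the SPF calculation inside the bracket
    overhead_samples: values of B + C + D + E, measured with the SPF calculation
        pushed outside the bracket (for example by a large spf_delay)
    """
    return median(bracket_samples) - median(overhead_samples)

# Made-up samples in milliseconds; one outlier in the bracket run.
estimate = spf_time([12, 13, 12, 30, 12], [2, 2, 3, 2, 2])
```

Using the median keeps a single slow sample, such as the 30 ms value here, from skewing the estimate.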

  12. Results
      • Results for Cisco GSR, 7513 and GateD
        • for GateD, comparison of black-box results with those obtained using instrumentation (white-box)
      • Route processors
        • Cisco: 200 MHz R5000 processor
        • GateD: 500 MHz AMD-K6 processor
      • Topology used is a full n × n mesh with random OSPF edge weights
        • vary n in the range 10, 20, …, 100
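A topology of this kind could be generated as in the sketch below, reading "full n × n mesh" as n routers each adjacent to every other router. The weight range, the symmetric link costs, the router names, and the fixed seed are assumptions; the slides only state that edge weights were random.

```python
import random
from itertools import combinations

def full_mesh(n, max_weight=10, seed=0):
    """Return {router: {neighbor: cost}} for a fully connected n-router topology."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    lsdb = {f"r{i}": {} for i in range(n)}
    for a, b in combinations(lsdb, 2):
        weight = rng.randint(1, max_weight)  # assumed range; not given in the slides
        lsdb[a][b] = weight
        lsdb[b][a] = weight  # assumption: same cost in both directions
    return lsdb

# One topology per test size, matching n = 10, 20, ..., 100.
topologies = {n: full_mesh(n) for n in range(10, 101, 10)}
```

A mesh of n routers has n(n - 1)/2 links, so the number of links grows quadratically with n.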

  13. Results for Cisco Routers
      • Similar results for two models
      • SPF calculation time is O(n²)
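One way to check a scaling claim like this against measurements is to fit a line to log(time) versus log(n): a slope near 2 is consistent with quadratic growth. The sketch below uses synthetic timings generated from a made-up constant, not the measured results, and `loglog_slope` is a name chosen here.

```python
import math

def loglog_slope(ns, times):
    """Least-squares slope of log(time) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

ns = list(range(10, 101, 10))
synthetic_ms = [0.004 * n * n for n in ns]  # synthetic data, not from the talk
slope = loglog_slope(ns, synthetic_ms)
```

On real timings the slope will not be exactly 2, since constant overheads flatten the curve at small n.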

  14. Results for GateD
      • Black-box over-estimates white-box measurement
      • Black-box captures the characteristics very well

  15. OSPF Task Delays (Cisco)
      • LSA processing
        • 100-800 microseconds
      • LSA flooding
        • 30-40 milliseconds
        • pacing timer is the determining factor
      • SPF calculation
        • 1-40 milliseconds
        • O(n²) behavior for full n × n mesh
      • FIB update time
        • 100-300 milliseconds
        • no dependence on the size of the topology

  16. Toolkit
      • Use of topology emulator
        • loads topologies
        • generates specific patterns of LSAs
      • Use of protocol dynamics mandated by standards
        • duplicate LSA mechanism: OSPF is required to ack a duplicate LSA immediately
        • useful for estimating end-point of tasks like SPF calculation
      • Use of vendor-specific parameters:
        • spf_delay
        • spf_holdtime
        • pacing timer

  17. Conclusions
      • Black-box methods for estimating OSPF processing delays:
        • LSA processing and flooding
        • SPF calculation and FIB update
      • Applied techniques to Cisco GSR and 7513 routers as well as GateD
      • Black-box methods worked
      • Future work
        • develop techniques for other protocols, in particular BGP

  18. Backup

  19. OSPF Overview: Example
      • [Figure: an OSPF domain (single area) of routers A through J with link weights, shown alongside the shortest path tree (SPT) at G]

  20. LSA Processing
      • [Figure: flowchart. Boxes: receive an LSA; new/duplicate?; update topology view; flood the LSA out; schedule SPF calc. if reqd.; acknowledge LSA immed.; send LS Ack packet back; LSA processing over; SPF calculation, paced by hold-down timer (spf_delay).]

  21. SPF Calculation
      • [Figure: timeline with the events LSA processing over, SPF calculation starts, SPF calculation ends, and FIB is updated]

  22. Internal OSPF Tasks to Measure
      • Processing Link State Advertisements (LSAs)
      • Flooding LSAs
      • Performing SPF calculation
        • described in this talk
      • Updating the Forwarding Information Base (FIB)
