This project overview motivates the need to design networks for worst-case conditions and introduces three extreme network services: Lightweight Flow Setup (LFS), Network Access Service (NAS), and Reserved Tree Service (RTS). It also highlights the key router technology components used to achieve nonstop network operation under extreme traffic conditions: Super-Scalable Packet Scheduling (SPS), Dynamic Queues with Auto-aggregation (DQA), and Scalable Distributed Queueing (SDQ).
Extreme Networking: Achieving Nonstop Network Operation Under Extreme Operating Conditions
DARPA PI Meeting, January 27-29, 2003
Jon Turner
jst@cse.wustl.edu
http://www.arl.wustl.edu/arl
Project Overview
• Motivation
  • data networks have become a mission-critical resource
  • networks often subject to extreme traffic conditions
  • need to design networks for worst-case conditions
  • technology advances making extreme defenses practical
• Extreme network services
  • Lightweight Flow Setup (LFS)
  • Network Access Service (NAS)
  • Reserved Tree Service (RTS)
• Key router technology components
  • Super-Scalable Packet Scheduling (SPS)
  • Dynamic Queues with Auto-aggregation (DQA)
  • Scalable Distributed Queueing (SDQ)
Prototype Extreme Router
[block diagram: a control processor and a switch fabric with IPP/OPP port pairs; each of six line cards attaches to the fabric through an FPX and an SPC]
Prototype Extreme Router: Field Programmable Port Extenders
[diagram detail: each FPX pairs a Network Interface Device with a Reprogrammable Application Device, backed by 128 MB SDRAM and 4 MB SRAM, between its line card and the ATM switch core]
Prototype Extreme Router: Embedded Processors
[diagram detail: each Smart Port Card 2 carries a Pentium with cache, 128 MB of memory, a flash disk, a north bridge, an FPGA, and an APIC network interface]
Prototype Extreme Router: Gigabit Ethernet Line Cards
[diagram detail: each line card combines a GBIC optical module, a framer, and an FPGA]
Performance of SPC-2
[chart] Largest gain at small packet sizes; the PCI bus limits performance for large packets.
More SPC-2 Performance
[chart] Throughput loss at high loads due to PCI bus contention and input priority.
Field Programmable Port Extender (FPX)
[block diagram: Reprogrammable Application Device (400K gates + 80 KB) and Network Interface Device, each RAD memory bank pairing 1 MB SRAM with 64 MB SDRAM at 100 MHz over 36- and 64-bit interfaces; a 6.4 Gb/s path between NID and RAD and two 2 Gb/s external interfaces]
• Network Interface Device (NID) routes cells to/from the RAD.
• Reprogrammable Application Device (RAD) functions:
  • will implement core router functions in the extensible router
  • may also implement arbitrary packet processing functions
• Functions for the extreme router:
  • high speed packet storage manager
  • packet classification & route lookup
    • fast route lookup
    • exact match filters
    • 32 general filters
  • flexible queue manager
    • per-flow queues for reserved flows
  • route packets to/from the SPC
Logical Port Architecture
[diagram: input side processing in the FPX (packet classification & route lookup, reassembly contexts, special flow queues, DQ virtual output queues) with an SPC running a PCU and plugins; output side processing in the FPX (packet classification, reassembly contexts, special flow queues, output queues) with an SPC running a PCU and plugins]
FPX Packet Processor Block Diagram
[diagram: a data path from the line card and switch through ISAR and OSAR blocks; a packet storage manager (including the free space list) over two SDRAMs; classification and route lookup with header processing over SRAM; a queue manager with discard logic, header/pointer handling, and an SRAM pointer store; and a control cell processor handling route & filter updates, register set updates & status, and DQ status & rate control]
Classification and Route Lookup (CARL)
[diagram: an input demux feeds packet headers to the route lookup, flow filter, and general filter engines, whose outputs merge in a result processing & priority resolution stage; payloads bypass the engines]
• Three lookup engines:
  • route lookup for routing datagrams (best prefix)
  • flow filters for multicast & reserved flows (exact match)
  • general filters (32) for management (exhaustive)
• Input processing:
  • parallel check of all three engines
  • return highest priority exclusive and highest priority non-exclusive results (merge rule sketched below)
  • general filters have unique priorities
  • all flow filters share a single priority; likewise for routes
• Output processing:
  • no route lookup on output
• Route lookup & flow filters share off-chip SRAM; general filters are processed on-chip.
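As a rough illustration of the priority resolution rule above, here is a minimal C sketch. It is not the FPX hardware logic: the struct fields, the lower-value-wins priority convention, and the sequential fold over the three engine results are all assumptions.

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    bool valid;      /* did this engine produce a match? */
    bool exclusive;  /* exclusive vs. non-exclusive filter */
    int  priority;   /* assumption: lower value = higher priority */
    int  action;     /* hypothetical: output port / queue id */
} LookupResult;

typedef struct {
    LookupResult best_exclusive;
    LookupResult best_nonexclusive;
} Resolved;

static void consider(Resolved *r, LookupResult c) {
    if (!c.valid) return;
    LookupResult *slot = c.exclusive ? &r->best_exclusive
                                     : &r->best_nonexclusive;
    if (!slot->valid || c.priority < slot->priority)
        *slot = c;
}

/* In hardware the three engines run in parallel; this sequential fold
   only models the merge rule: keep the highest priority exclusive and
   the highest priority non-exclusive result. */
Resolved resolve(LookupResult route, LookupResult flow, LookupResult general) {
    Resolved r = {0};
    consider(&r, route);    /* all routes share one priority level */
    consider(&r, flow);     /* all flow filters share one priority level */
    consider(&r, general);  /* each general filter has a unique priority */
    return r;
}

int main(void) {
    LookupResult route   = { true, false, 10, 3 };
    LookupResult flow    = { true, false,  5, 7 };
    LookupResult general = { true, true,   1, 0 };
    Resolved r = resolve(route, flow, general);
    printf("exclusive action=%d, non-exclusive action=%d\n",
           r.best_exclusive.action, r.best_nonexclusive.action);
    return 0;
}
```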
Exact Match Lookup
[diagram: a simple hash over the tag selects an on-chip SRAM bucket holding valid bits; the matching tag+data entries live in off-chip SRAM]
• tag = [src, dst, sport, dport, proto]
• data includes:
  • 2 outputs + 2 QIDs
  • LFS rates
  • packet and byte counters
  • ingress valid / egress valid flags
• Exact match lookup table used for reserved flows:
  • includes LFS, signaled QoS flows and multicast, plus flows requiring processing by SPCs
  • each of these flows has a separate queue in the QM
  • multicast flows have two queues (recycling multicast)
• Implemented using hashing, with separate memory areas for ingress and egress packets.
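To make the hashing scheme concrete, here is a minimal C sketch of an exact-match table keyed on the 5-tuple tag. The bucket count, hash function, and collision chaining are illustrative assumptions; the hardware design pairs on-chip valid bits with off-chip SRAM entries rather than pointer chains.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t src, dst;
    uint16_t sport, dport;
    uint8_t  proto;
} FlowTag;

typedef struct Entry {
    FlowTag  tag;
    uint32_t qid;        /* per-flow queue id in the queue manager */
    uint64_t pkt_count;  /* slide: packet and byte counters */
    struct Entry *next;
} Entry;

#define NBUCKETS 1024
static Entry *table[NBUCKETS];

static unsigned hash_tag(const FlowTag *t) {
    /* simple hash, standing in for whatever the hardware uses */
    uint32_t h = t->src ^ (t->dst << 1) ^ ((uint32_t)t->sport << 16)
               ^ t->dport ^ t->proto;
    return (h ^ (h >> 10)) % NBUCKETS;
}

static int tag_eq(const FlowTag *a, const FlowTag *b) {
    return a->src == b->src && a->dst == b->dst && a->sport == b->sport
        && a->dport == b->dport && a->proto == b->proto;
}

void insert(const FlowTag *t, uint32_t qid) {
    Entry *e = malloc(sizeof *e);
    e->tag = *t; e->qid = qid; e->pkt_count = 0;
    e->next = table[hash_tag(t)];
    table[hash_tag(t)] = e;
}

/* NULL means "no reserved-flow entry": fall back to the route lookup. */
Entry *lookup(const FlowTag *t) {
    for (Entry *e = table[hash_tag(t)]; e; e = e->next)
        if (tag_eq(&e->tag, t))
            return e;
    return NULL;
}
```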
General Filter Match
[diagram: filter memory feeding a bank of parallel matchers]
• General filter match considers the full 5-tuple:
  • prefix match on source and destination addresses
  • range match on source and destination ports
  • exact or wildcard match on protocol
  • each filter has a priority and may be exclusive or non-exclusive
• Intended primarily for management filters:
  • firewall filters
  • class-based monitoring
  • class-based special processing
• Implemented using parallel exhaustive search (sketched below), with a limit of 32 filters.
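A software rendering of that exhaustive search might look like the C sketch below. In the FPX the 32 matchers run concurrently against filter memory; here a loop stands in for them, and the field layout and lower-value-wins priority convention are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t src_prefix, dst_prefix;
    uint8_t  src_plen, dst_plen;   /* prefix lengths; 0 = wildcard */
    uint16_t sport_lo, sport_hi;   /* inclusive port ranges */
    uint16_t dport_lo, dport_hi;
    int16_t  proto;                /* -1 = wildcard */
    uint8_t  priority;             /* unique per filter */
    bool     exclusive;
} Filter;

static bool prefix_match(uint32_t addr, uint32_t prefix, uint8_t plen) {
    uint32_t mask = plen ? ~0u << (32 - plen) : 0;
    return (addr & mask) == (prefix & mask);
}

static bool filter_match(const Filter *f, uint32_t src, uint32_t dst,
                         uint16_t sport, uint16_t dport, uint8_t proto) {
    return prefix_match(src, f->src_prefix, f->src_plen)
        && prefix_match(dst, f->dst_prefix, f->dst_plen)
        && sport >= f->sport_lo && sport <= f->sport_hi
        && dport >= f->dport_lo && dport <= f->dport_hi
        && (f->proto < 0 || f->proto == proto);
}

/* Check every filter (the "exhaustive" part) and return the index of the
   best-priority match, or -1 if nothing matched. */
int match_all(const Filter fs[], int n, uint32_t src, uint32_t dst,
              uint16_t sport, uint16_t dport, uint8_t proto) {
    int best = -1;
    for (int i = 0; i < n && i < 32; i++)
        if (filter_match(&fs[i], src, dst, sport, dport, proto)
            && (best < 0 || fs[i].priority < fs[best].priority))
            best = i;
    return best;
}
```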
Fast IP Lookup (Eatherton & Dittia)
[diagram: multibit trie walk for the example address 101 100 101 000, with per-node internal and external bit vectors]
• Multibit trie with clever data encoding (one node step sketched below):
  • small memory requirements (<7 bytes per prefix)
  • small memory bandwidth and a simple lookup yield fast lookup rates
  • updates have negligible impact on lookup performance
• Avoid the impact of external memory latency on throughput by interleaving several concurrent lookups.
• An 8-lookup-engine configuration uses about 6% of Virtex 2000E logic cells.
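For flavor, here is a much-simplified C sketch of one Tree Bitmap node step with a 3-bit stride, matching the example address above. The heap-style slot numbering of the internal bit vector, the field names, and the result indexing are our assumptions; the published encoding packs nodes far more tightly, which is where the <7 bytes per prefix figure comes from.

```c
#include <stdint.h>

typedef struct {
    uint8_t  internal;    /* 7 bits: prefixes of length 0..2 within the node */
    uint8_t  external;    /* 8 bits: which of the 8 children exist */
    uint32_t child_base;  /* index of first child in a contiguous array */
    uint32_t result_base; /* index of first next-hop result for this node */
} TBNode;

/* gcc/clang builtin; a small lookup table works just as well portably */
static int popcount8(uint8_t x) { return __builtin_popcount(x); }

/* Index of the child for 3 address bits `bits`, or -1 if absent.
   Counting the set bits below position `bits` replaces an explicit
   pointer per child: that is the heart of the encoding. */
int child_index(const TBNode *n, unsigned bits) {
    if (!(n->external & (1u << bits))) return -1;
    uint8_t before = n->external & ((1u << bits) - 1);
    return (int)(n->child_base + popcount8(before));
}

/* Longest prefix stored inside the node for these 3 bits, returned as a
   result index, or -1. Slot numbering (an assumption): slot 1 = "*",
   slots 2-3 = 1-bit prefixes, slots 4-7 = 2-bit prefixes. */
int longest_internal(const TBNode *n, unsigned bits) {
    unsigned slot = 4 + (bits >> 1);   /* start at the 2-bit prefix */
    while (slot >= 1) {
        if (n->internal & (1u << (slot - 1))) {
            uint8_t before = n->internal & ((1u << (slot - 1)) - 1);
            return (int)(n->result_base + popcount8(before));
        }
        slot >>= 1;                    /* shorten the prefix by one bit */
    }
    return -1;
}
```

A full lookup repeats these two steps once per stride, remembering the most recent internal match as the current best prefix until no child exists.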
Lookup Throughput
[chart: lookup throughput vs. number of engines at 450 MB/s SRAM bandwidth; the throughput gain is linear in the number of engines, and a split tree cuts storage by 30%]
Update Performance
[chart: reasonable update rates (1 update per ms) have little impact on lookup throughput]
Queue Manager (QM) Logical View
[diagram: arriving packets enter a separate queue set for each output (0 through 8), each set holding per-reserved-flow queues plus 64 hashed datagram queues for traffic isolation; a VOQ packet scheduler feeds the switch, a link packet scheduler feeds the link, and separate per-flow queues feed an SPC packet scheduler]
• separate queues for each reserved flow
• separate queue set for each output
• separate queue for each SPC flow
• 64 hashed datagram queues for traffic isolation
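A tiny C sketch of the queue-selection rule this implies; the hash constant and the flat queue-id numbering are invented for illustration.

```c
#include <stdint.h>

#define DATAGRAM_QUEUES 64   /* hashed queues per output, per the slide */

/* Pick the queue id for a packet headed to `output`. A reserved flow
   already carries its own queue id (from the exact-match table);
   flow_qid < 0 means "no reservation". */
int select_queue(int output, int flow_qid, uint32_t src, uint32_t dst) {
    if (flow_qid >= 0)
        return flow_qid;                     /* per-flow reserved queue */
    uint32_t h = (src * 2654435761u) ^ dst;  /* illustrative hash */
    return output * DATAGRAM_QUEUES + (int)(h % DATAGRAM_QUEUES);
}
```

Hashing unreserved traffic across 64 queues per output bounds how much one misbehaving datagram flow can delay the others, which is the traffic isolation the slide refers to.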
Backlogged TCP Flows with Tail Discard
[chart: with large buffers, flows see large delay variance; with small buffers, queues underflow and throughput is low]
DRR with Discard from Longest Queue
[chart] • Smaller fluctuations, but still significant.
Queue State DRR
[chart: low variation, even with small queues; low delay; no tuning]
• Add hysteresis to the packet discard policy (sketched below):
  • keep discarding from the same queue until it becomes the shortest non-empty queue.
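Here is one reading of that hysteresis rule as a C sketch, not the exact FPX algorithm: remember the current discard victim and re-select the longest queue only once the victim has drained to become the shortest non-empty queue.

```c
#define NQUEUES 64
static int qlen[NQUEUES];   /* current length of each datagram queue */
static int victim = -1;     /* queue currently targeted for discards */

static int longest(void) {
    int best = -1;
    for (int i = 0; i < NQUEUES; i++)
        if (qlen[i] > 0 && (best < 0 || qlen[i] > qlen[best]))
            best = i;
    return best;
}

static int shortest_nonempty(void) {
    int best = -1;
    for (int i = 0; i < NQUEUES; i++)
        if (qlen[i] > 0 && (best < 0 || qlen[i] < qlen[best]))
            best = i;
    return best;
}

/* Called when the buffer is full and some packet must be dropped. */
int pick_discard_queue(void) {
    if (victim < 0 || qlen[victim] == 0 || victim == shortest_nonempty())
        victim = longest();   /* hysteresis expired: re-select */
    return victim;
}
```

Concentrating consecutive losses on one queue at a time plausibly avoids the synchronized losses across flows that drive the oscillations seen with tail discard and plain longest-queue discard.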
Packet Scheduling with Approx. Radix Sorting
[diagram: three sorting wheels of increasing granularity, with fast-forward bits per wheel and an output list]
• To implement virtual time schedulers, need to quickly find the queue whose "lead packet" has the smallest virtual finish time.
  • using a priority queue, this requires O(log n) time for n queues
• Use approximate radix sorting, with compensation: O(1) (sketched below).
  • timing wheels with increasing granularity and range
  • approximate sorting produces inter-packet timing errors
  • observe errors & compensate when the next packet is scheduled
• Fast-forward bits are used to skip over empty slots.
• The scheduler puts no limit on the number of queues.
• Two copies of the data structure are needed for the approximate version of WF2Q+.
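The C sketch below shows the wheel-plus-fast-forward-bits idea under invented parameters (three 32-slot wheels with granularities of 1, 32, and 1024 virtual-time units). A real scheduler chains multiple queues per slot and rotates the wheels as virtual time advances; both are omitted here.

```c
#include <stdint.h>
#include <strings.h>   /* ffs(): find first set bit */

#define SLOTS 32

typedef struct {
    uint32_t occupied;     /* fast-forward bits: bit i set = slot i non-empty */
    int      head[SLOTS];  /* queue id parked in each slot (real code chains a list) */
    unsigned granularity;  /* virtual-time units per slot */
} Wheel;

static Wheel wheels[3] = {
    { 0, {0}, 1 },             /* fine */
    { 0, {0}, SLOTS },         /* medium */
    { 0, {0}, SLOTS * SLOTS }, /* coarse */
};

/* Insert queue `qid` whose lead packet finishes `vft_offset` virtual-time
   units from now. A deadline beyond a wheel's range falls through to a
   coarser wheel, losing precision: that is the "approximate" part. */
void insert(unsigned vft_offset, int qid) {
    for (int w = 0; w < 3; w++) {
        unsigned slot = vft_offset / wheels[w].granularity;
        if (slot < SLOTS) {
            wheels[w].head[slot] = qid;
            wheels[w].occupied |= 1u << slot;
            return;
        }
    }
    wheels[2].head[SLOTS - 1] = qid;        /* park beyond-range deadlines */
    wheels[2].occupied |= 1u << (SLOTS - 1);
}

/* Next queue to serve: by construction the nearest deadlines sit in the
   finest non-empty wheel, and ffs() on the occupied bitmap is the O(1)
   "fast-forward" over empty slots. */
int next_queue(void) {
    for (int w = 0; w < 3; w++) {
        if (wheels[w].occupied) {
            int slot = ffs((int)wheels[w].occupied) - 1;
            int qid = wheels[w].head[slot];
            wheels[w].occupied &= ~(1u << slot);
            return qid;
        }
    }
    return -1;   /* all queues empty */
}
```

Because a coarser wheel rounds a finish time to its slot granularity, a packet can depart slightly off its ideal time; recording that error and folding it into the flow's next finish time is the compensation step the slide describes.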
Resource Usage Estimates
• Key resources in Xilinx FPGAs:
  • flip flops: 38,400
  • lookup tables (LUTs): 38,400 (each can implement any 4-input Boolean function)
  • block RAMs (4 Kbits each): 160
Summary
• Version 1 hardware status:
  • hardware operating in the lab, passing packets
  • but still some bugs to correct; a typical test-diagnose-correct cycle takes one day
  • version 1 has a simplified queue manager
• Planning several system demos in the next month:
  • system level throughput testing, focused on lookup processing
  • verifying basic fair queueing behavior
  • TCP SYN attack suppressor (sketched below):
    • an SPC-resident plugin monitors new TCP connections going to a server
    • when there are too many "half-open" connections, the oldest are reset
    • flow filters are inserted for stable connections, enabling hardware forwarding
• Expect to complete version 2 hardware in the next six months.
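To make the SYN attack suppressor concrete, here is a hedged C sketch of the plugin's bookkeeping as we read it from the bullets above. The table size, the oldest-first aging, and the function names are all invented; on the real system the flow filter insertion would go through the exact-match flow table.

```c
#include <stdint.h>

#define MAX_HALF_OPEN 1024   /* invented threshold */

typedef struct {
    uint32_t src;
    uint16_t sport;
    uint64_t seq;            /* arrival order; 0 = slot free */
} HalfOpen;

static HalfOpen tbl[MAX_HALF_OPEN];
static uint64_t next_seq = 1;

/* Stubs for the actions the slide describes. */
static void send_reset(const HalfOpen *h)          { (void)h; /* emit RST */ }
static void install_flow_filter(const HalfOpen *h) { (void)h; /* exact-match entry */ }

/* New SYN toward the protected server: when the table is full (too many
   half-open connections), reset the oldest one to make room. */
void on_syn(uint32_t src, uint16_t sport) {
    int free_slot = -1, oldest = 0;
    for (int i = 0; i < MAX_HALF_OPEN; i++) {
        if (tbl[i].seq == 0) { free_slot = i; break; }
        if (tbl[i].seq < tbl[oldest].seq) oldest = i;
    }
    if (free_slot < 0) {
        send_reset(&tbl[oldest]);
        free_slot = oldest;
    }
    tbl[free_slot] = (HalfOpen){ src, sport, next_seq++ };
}

/* Handshake completed: the connection is stable, so stop tracking it and
   insert a flow filter that lets the hardware forward it directly. */
void on_established(uint32_t src, uint16_t sport) {
    for (int i = 0; i < MAX_HALF_OPEN; i++)
        if (tbl[i].seq && tbl[i].src == src && tbl[i].sport == sport) {
            install_flow_filter(&tbl[i]);
            tbl[i].seq = 0;
            return;
        }
}
```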