
Block Design Review: Line Card Key Extract (Ingress and Egress)

Michael Wilson, mlw2@arl.wustl.edu, http://www.arl.wustl.edu/projects/techX. Revision history: 10/10/06 (MLW), released.




  1. Block Design Review: Line Card Key Extract (Ingress and Egress)
     Michael Wilson
     mlw2@arl.wustl.edu
     http://www.arl.wustl.edu/projects/techX

  2. Revision History
     • 10/10/06 (MLW): Released

  3. Contents
     [Diagram: line card data path, slide taken from PlanetLab_Design.ppt. Ingress: Phy Int Rx -> Key Extract -> Lookup -> Hdr Format -> QM/Schd -> Switch Tx. Egress: Switch Rx -> Key Extract -> Lookup -> Hdr Format -> QM/Schd -> Phy Int Tx.]
     • For both Ingress and Egress Key Extract:
        • overview
        • block diagram
        • code locations
        • test procedures
        • implementation status
        • performance analysis

  4. Key Extract Overview (Both)
     • For both Ingress and Egress Key Extract, we follow the same general method to lower the cycle budget:
        • Issue a DRAM read of the first portion of the packet
        • Issue a second DRAM read that encompasses all possible IP Options
        • Process any outputs that do not rely on the packet
        • Wait for the first DRAM read
        • Process everything before the IP Options
        • Set up the index register for anything after the IP Options
        • Wait for the second DRAM read
        • Process the remaining data
     • Both are written in microcode, not C
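The two-stage method above can be illustrated with a minimal Python sketch. This is not the microcode from the review: the function names, the `make_frame` helper, and the assumption of an untagged 14-byte Ethernet header are all hypothetical and chosen only for illustration. Stage 1 reads the fixed-offset fields that sit before the IP Options and computes an index; stage 2 uses that index to read a field that sits after the options.

```python
import struct

ETH_HDR_LEN = 14   # assumption: untagged Ethernet II header
IPPROTO_UDP = 17


def parse_fixed_fields(frame: bytes) -> dict:
    """Stage 1: read the fields that sit before the IP Options."""
    ip = ETH_HDR_LEN
    ihl_words = frame[ip] & 0x0F                  # IP header length in 32-bit words
    (ip_len,) = struct.unpack_from("!H", frame, ip + 2)
    ip_proto = frame[ip + 9]
    (ip_daddr,) = struct.unpack_from("!I", frame, ip + 16)
    # Analogue of setting up the index register: where the UDP header starts.
    udp_index = ip + 4 * ihl_words
    return {"ip_len": ip_len, "ip_proto": ip_proto,
            "ip_daddr": ip_daddr, "udp_index": udp_index}


def read_udp_sport(frame: bytes, udp_index: int) -> int:
    """Stage 2: indexed read of the UDP source port, past any IP Options."""
    (sport,) = struct.unpack_from("!H", frame, udp_index)
    return sport


def make_frame(options: bytes, sport: int, dport: int) -> bytes:
    """Hypothetical test helper: Ethernet + IPv4 (+ options) + UDP header."""
    if len(options) % 4 or len(options) > 40:
        raise ValueError("IP options must be 0-40 bytes, multiple of 4")
    ihl_words = 5 + len(options) // 4
    udp = struct.pack("!HHHH", sport, dport, 8, 0)
    total_len = 4 * ihl_words + len(udp)
    ip = struct.pack("!BBHHHBBH4s4s", (4 << 4) | ihl_words, 0, total_len,
                     0, 0, 64, IPPROTO_UDP, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    eth = bytes(12) + b"\x08\x00"
    return eth + ip + options + udp
```

With no options the UDP header starts at byte 34 of the frame; with 8 bytes of options it starts at byte 42, which is why the second read has to cover every possible options length.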

  5. Egress Key Extract

  6. Egress Key Extract
     [Diagram: Switch Rx -> Key Extract -> Lookup -> Hdr Format -> QM/Schd -> Phy Int Tx; slide taken from PlanetLab_Design.ppt]
     • Main functions:
        • Extract lookup keys from packet payload and pass to Lookup block
        • Note: no validation! Outbound packets are assumed to be correct.
     • Single code path
     • NN communication
        • May need scratch ring communication for exception path
     • Uses 8 threads

  7. LC Egress: Functional Blocks
     [Diagram: Switch Rx -> Key Extract -> Lookup -> Hdr Format -> QM/Schd -> Phy Int Tx; slide taken from PlanetLab_Design.ppt]
     • Key Extract input ring data:
        • Buf Handle (32b)
        • Eth. Frame Len (16b), Reserved (12b), Port (4b)
     • Key Extract output ring data:
        • Buf Handle (32b)
        • IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
        • Lookup Key: IP DAddr (32b)
        • Lookup Key: IP Proto (8b), UDP SPort (16b), Reserved (8b)
     • Frame layout (the diagram marks the fields that need to be read, and the 8-byte boundaries assuming no IP Options):
        • Ethernet Header: DstAddr (6B), SrcAddr (6B), Type=802.1Q (2B), VLAN (2B), Type=IP (2B)
        • IP Header: Ver/HLen/Tos/Len (4B), ID/Flags/FragOff (4B), TTL (1B), Protocol = UDP (1B), Hdr Cksum (2B), Src Addr (4B), Dst Addr (4B), IP Options (0-40B)
        • UDP Header: Src Port (2B), Dst Port (2B), UDP length (2B), UDP checksum (2B)
        • UDP Payload (MN Packet)
        • Ethernet Trailer: PAD (nB), CRC (4B)

  8. Egress KE Block Diagram
     • dl_source_1ME_NN_2words(): 3 cycles (once NN ring ready)
     • egress_key_extract()
        • Start DRAM Reads; Set Eth Hdr Len (10 cycles until ctx_arb)
        • Wait for first Read
        • Load IP Len, IP Dest Addr, IP Proto, and IHL (9 cycles, counting branch)
        • Branch on IP Options:
           • No IP Options: Read UDP Sport; Wait for second Read
           • IP Options: Prepare Indexed Read; Wait for second Read; Read UDP Sport
        • Further cycle annotations on the diagram: 4; 12; 19 cycles (in thread); 1; 1
     • dl_sink_1ME_NN_4words(); Set BlockID

  9. File locations (in …/LC_Egress/)
     • Code
        • src/key_extractor/PL/key_extract.uc
     • Includes
        • ../dispatch_loop/dl_source_WU.uc
           • dl_source() and dl_sink() functions
        • Stubs/PL/dispatch_loop/dl_system.h
           • functions for ordered thread synchronization

  10. Egress Key Extract Validation
     • All validation tests done with 8 threads
     • Because the Egress can assume all outgoing packets are valid, no testing needs to be done for invalid packets
     • Unit Testing flow is …/src/key_extractor/PL/test/eth_top.flw
        • Sends VLAN frames only, with UDP/IP payload, incrementing payload size
        • No IP Options in this set
        • Verified all fields of output ring data were as expected
     • No full speed test in simulation yet

  11. Egress Key Extractor Other
     • Initialization
        • The PL Egress Key Extractor needs no initialization
     • Bugs
        • None known!
     • Untested
        • Hardware: this block has never been tested on hardware.
     • Optimizations still available
        • Many minor re-ordering optimizations are still available, but these are unnecessary (see performance!)
     • To be done
        • The source still needs cleanup and commentary.
        • The T_INDEX code needs factoring for clarity.
        • Some local constants (e.g., the DRAM packet offset of 384) should be moved to a shared file.

  12. Egress Key Extractor Other
     • Performance
        • KE has CPU time of roughly 54 cycles. (Some time spent in other threads during an unnecessary context yield is not counted.) No CPU optimization is necessary.
        • I/O latency for a single packet (no contention) was 562 cycles.
        • Total time for a single packet (from simulation) was 606 cycles.
        • Egress KE is under budget on paper; still need to do tests for contention and RX synchronization.
        • Expect Egress Key Extract to be extremely sensitive to DRAM contention!

  13. Ingress Key Extractor

  14. LC Ingress Key Extractor
     [Diagram: Phy Int Rx -> Key Extract -> Lookup -> Hdr Format -> QM/Schd -> Switch Tx; slide taken from PlanetLab_Design.ppt]
     • Key Extract input ring data:
        • Buf Handle (32b)
        • Eth. Frame Len (16b), Reserved (12b), Port (4b)
     • Key Extract output ring data:
        • Buf Handle (32b)
        • IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
        • Lookup Key[63-32] (32b)
        • Lookup Key[31-0] (32b)
     • Main function
        • Extracts lookup key.
        • Lookup Key (64b):
           • SL Type (4b): 0101b
           • Port (4b): from RX
           • IP DAddr (32b)
           • IP Proto (8b)
           • UDP DPort (16b)
     • Separate code path for each Substrate Link type (including VLAN)
     • NN communication
        • Will use scratch ring communication for exception path
     • Uses 8 threads
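As a rough illustration of the 64-bit ingress lookup key listed above, the sketch below packs the five fields into one integer and splits it into the two 32-bit ring words. It assumes the fields are laid out most-significant-first in the order the slide lists them (SL Type in bits 63-60, UDP DPort in bits 15-0); the slide does not state the bit positions explicitly, and the function names are hypothetical.

```python
SL_TYPE_PLANETLAB = 0b0101  # SL Type value given on the slide


def pack_ingress_key(port: int, ip_daddr: int, ip_proto: int,
                     udp_dport: int, sl_type: int = SL_TYPE_PLANETLAB) -> int:
    """Pack the lookup key fields into a 64-bit integer (assumed MSB-first order)."""
    fields = ((sl_type, 4), (port, 4), (ip_daddr, 32), (ip_proto, 8), (udp_dport, 16))
    key = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value:#x} does not fit in {width} bits")
        key = (key << width) | value
    return key


def split_words(key: int) -> tuple:
    """Split the key into (Lookup Key[63-32], Lookup Key[31-0])."""
    return (key >> 32) & 0xFFFFFFFF, key & 0xFFFFFFFF
```

For example, port 3, destination address 10.0.0.2, protocol 17 (UDP), and destination port 5000 pack to 0x530A000002111388 under this assumed layout.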

  15. PlanetLab Ingress LC Input Frame
     • New PlanetLab Substrate Link Type:
        • Configured SL Type
           • LC is told at boot/init time that this is its one and only SL Type.
           • Similar to the way P2P-DC is handled.
        • SL Type: 0101b
        • Port: may be a don’t care
        • IP DAddr: verifies that the packet is for our node
        • IP Proto = UDP
           • Could be a UDP tunnel to a slice
        • UDP DPort: indicates which slice
     • Default route is to the GPE
        • Key =
           • SL = 0101b
           • Port: may be a don’t care
           • IP DAddr = our node address
     • PlanetLab IPv4 Key (0x5) (64 bits): SL (4b) = 0101, Port (4b), IP DAddr (32b), IP Proto (8b), UDP DPort (16b)
     • Input frame layouts (the diagram shows one frame with an 802.1Q tag and one without; all other fields are identical):
        • Ethernet Header: DstAddr (6B), SrcAddr (6B), [Type=802.1Q (2B), VLAN (2B), VLAN frames only], Type=IP (2B)
        • IP Header: Ver/HLen/Tos/Len (4B), ID/Flags/FragOff (4B), TTL (1B), Protocol = UDP (1B), Hdr Cksum (2B), Src Addr (4B), Dst Addr (4B), IP Options (0-40B)
        • UDP Header: Src Port (2B), Dst Port (2B), UDP length (2B), UDP checksum (2B)
        • UDP Payload (MN Packet)
        • Ethernet Trailer: PAD (nB), CRC (4B)
     • slide taken from PlanetLab_Design.ppt

  16. Ingress KE Block Diagram
     • dl_source_1ME_NN_2words()
     • ingress_key_extract()
        • Start DRAM Reads; Set SL type & port (13 cycles until ctx_arb)
        • Wait for first Read
        • Check Ethernet Type (4 cycles, discounting branch)
           • VLAN: Get IP Len, Eth Hdr Len, IP Daddr, IP Proto; set up for UDP dport read; Wait for second Read; Read UDP Sport
           • IPv4: Get IP Len, Eth Hdr Len, IP Daddr, IP Proto; set up for UDP dport read; Wait for second Read; Read UDP Sport
           • Other: Drop; Wait for second Read
        • Further cycle annotations on the diagram: 20 cycles; 1; 1
     • dl_sink_1ME_NN_4words(); Set Next BlockID
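The three-way Ethernet type check in the diagram can be sketched as below. The EtherType values (0x0800 for IPv4, 0x8100 for 802.1Q) and the resulting header lengths (14 and 18 bytes) are standard; the function name and return convention are hypothetical, and treating a VLAN frame with a non-IP inner type as a drop is an assumption made for this sketch.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100   # 802.1Q tag


def classify_frame(frame: bytes) -> tuple:
    """Return (path, eth_hdr_len); eth_hdr_len is None when the frame is dropped."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_IPV4:
        return "ipv4", 14                     # DstAddr + SrcAddr + Type
    if ethertype == ETHERTYPE_VLAN:
        (inner,) = struct.unpack_from("!H", frame, 16)
        if inner == ETHERTYPE_IPV4:
            return "vlan", 18                 # 4 extra bytes for the 802.1Q tag
    return "drop", None                       # everything else, including ARP
```

The returned header length is what the later stages would add to the IP header length to locate the UDP ports.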

  17. File locations (in …/LC_Ingress/)
     • Code
        • src/key_extractor/PL/key_extract.uc
     • Includes
        • ../dispatch_loop/dl_source_WU.uc
           • dl_source() and dl_sink() functions
        • Stubs/PL/dispatch_loop/dl_system.h
           • functions for ordered thread synchronization

  18. Ingress Key Extract Validation
     • All validation tests done with 8 threads
     • Unit Testing flow is …/src/key_extractor/PL/test/eth_top.flw
        • Sends VLAN frames alternating with non-VLAN frames, with UDP/IP payload, incrementing payload size
        • Verified all fields of output ring data were as expected
        • No drops in this set
           • ARP
           • Invalid packets
        • No IP Options in this set
     • No full speed test in simulation yet

  19. Ingress Key Extractor Other
     • Initialization
        • The PL Ingress Key Extractor needs no initialization
     • Bugs
        • None known!
     • Untested
        • Hardware: this block has never been tested on hardware.
     • Optimizations still available
        • Many minor re-ordering optimizations are still available, but these are unnecessary (see performance!)
     • To be done
        • The source still needs cleanup and commentary.
        • The T_INDEX code needs factoring for clarity.
        • Some local constants (e.g., the DRAM packet offset of 384) should be moved to a shared file.
        • RFC 1812 packet validation
        • ARP packets are currently dropped instead of being forwarded to the XScale

  20. Ingress Key Extractor Other
     • Performance
        • KE has CPU time of roughly 60 cycles. (Some time spent in other threads during an unnecessary context yield is not counted.) No CPU optimization is necessary.
        • I/O latency for a single packet (no contention) was 422 cycles.
        • Total time for a single packet (from simulation) was 599 cycles.
        • Ingress KE is under budget on paper; still need to do tests for contention and RX synchronization.
        • Expect Ingress Key Extract to be extremely sensitive to DRAM contention!


  23. Extra Slides

  24. Cycle Budget (min eth packets)
     • To hit 5 Gb/s rate:
        • 76B per min Ethernet packet (64B min Eth + 12B IFS)
        • 1.4 GHz clock rate
        • 5 Gb/sec * 1B/8b * packet/76B = 8.224 Mp/sec
        • 1.4 Gcycle/sec * 1 sec/8.224 Mp = 170.2 cycles per packet
     • compute budget: 170 cycles
     • latency budget: (threads * 170)
        • 4 threads: 680 cycles
        • 8 threads: 1360 cycles

  25. Cycle Budget (IPv4 MN packets)
     • To hit 5 Gb/s rate:
        • 90B per min IPv4 MN packet (78B min IPv4 MN + 12B IFS)
        • 1.4 GHz clock rate
        • 5 Gb/sec * 1B/8b * packet/90B = 6.944 Mp/sec
        • 1.4 Gcycle/sec * 1 sec/6.944 Mp = 201.6 cycles per packet
     • compute budget: 201 cycles
     • latency budget: (threads * 201)
        • 4 threads: 804 cycles
        • 8 threads: 1608 cycles

  26. RFC 1812 5.2.2 IP Header Validation
     (1) The packet length reported by the Link Layer must be large enough to hold the minimum length legal IP datagram (20 bytes).
     (2) The IP checksum must be correct.
     (3) The IP version number must be 4. If the version number is not 4 then the packet may be another version of IP, such as IPng or ST-II.
     (4) The IP header length field must be large enough to hold the minimum length legal IP datagram (20 bytes = 5 words).
     (5) The IP total length field must be large enough to hold the IP datagram header, whose length is specified in the IP header length field.
     from http://www.faqs.org/rfcs/rfc1812.html, by way of Brandon Heller’s Block Review
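A minimal sketch of the five checks follows, since this validation is listed as still to be done for the Ingress Key Extractor. It is an illustration in Python, not the planned microcode. The function names and the `make_header` test helper are hypothetical. The checks run in a practical order (the checksum is verified after the header length is known), and a buffer shorter than the declared header length is reported as check 1, which is an assumption of this sketch.

```python
import struct


def ip_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented (header length is even)."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return (~total) & 0xFFFF


def validate_ip_header(ip: bytes, link_len: int):
    """Return None if the header passes, else the number of the failed check."""
    if link_len < 20 or len(ip) < 20:
        return 1                                  # (1) too short for any IP header
    version, ihl_words = ip[0] >> 4, ip[0] & 0x0F
    if version != 4:
        return 3                                  # (3) not IPv4
    if ihl_words < 5:
        return 4                                  # (4) header length below 5 words
    hdr_len = 4 * ihl_words
    if len(ip) < hdr_len:
        return 1                                  # assumption: treat as a length failure
    if ip_checksum(ip[:hdr_len]) != 0:
        return 2                                  # (2) checksum over the header must verify
    (total_len,) = struct.unpack_from("!H", ip, 2)
    if total_len < hdr_len:
        return 5                                  # (5) total length smaller than header
    return None


def make_header(version: int = 4, ihl_words: int = 5, total_len: int = 28) -> bytes:
    """Hypothetical test helper: IPv4 header with zeroed options and a valid checksum."""
    hdr = bytearray(struct.pack("!BBHHHBBH4s4s", (version << 4) | ihl_words, 0,
                                total_len, 0, 0, 64, 17, 0,
                                bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])))
    hdr += bytes(4 * max(0, ihl_words - 5))
    struct.pack_into("!H", hdr, 10, ip_checksum(bytes(hdr)))
    return bytes(hdr)
```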
