SPP Version 1 NAT Daemon (natd)
Mart Haitjema
NATD Overview
• Manages NAT connections for a Linecard (LC) in SPP
• Creates NAT connections:
  • Manages UDP and TCP ports and ICMP IDs on a per-interface basis
  • Translates a board’s (GPE or CP) UDP/TCP port number or ICMP ID to an interface’s externally visible port or ICMP ID
  • Enables a connection by installing an ingress and an egress filter in the LC’s TCAM
• Tracks connection state:
  • UDP/ICMP: by hardware activity monitoring using TCAM aging bits (see “Aging”)
  • TCP: by tracking connection state (see “TCP State Machine”)
• Removes connections:
  • Removes inactive UDP/ICMP connections whose filters have timed out
  • Removes stale TCP connections that have timed out in a particular state
  • Disables a connection by removing its ingress and egress filters
• Supported NAT connections:
  • Connections initiated from a board in SPP
  • UDP: identified by a 2-tuple, maps to a public UDP port
    • [board MAC, board port] -> public port
  • TCP: identified by a 4-tuple, maps to a public TCP port
    • [board MAC, board port, remote IP, remote port] -> public port
  • ICMP echo-request (ping): identified by a 2-tuple, maps to a public ICMP ID
    • [board MAC, board ICMP ID] -> public ID
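The three connection keys listed above could be represented as ordered map keys, one map per protocol in each per-interface table. This is a minimal sketch, not natd's actual data structures: the names MacAddr, UdpKey, TcpKey, IcmpKey, and NatTableSketch are hypothetical.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical types for illustration; natd's real classes live in
// connections.{cc,h} and tables.{cc,h} and may look different.
using MacAddr = std::array<std::uint8_t, 6>;

struct UdpKey {                      // [board MAC, board port]
    MacAddr boardMac;
    std::uint16_t boardPort;
    bool operator<(const UdpKey& o) const {
        return std::tie(boardMac, boardPort) < std::tie(o.boardMac, o.boardPort);
    }
};

struct TcpKey {                      // [board MAC, board port, remote IP, remote port]
    MacAddr boardMac;
    std::uint16_t boardPort;
    std::uint32_t remoteIp;
    std::uint16_t remotePort;
    bool operator<(const TcpKey& o) const {
        return std::tie(boardMac, boardPort, remoteIp, remotePort) <
               std::tie(o.boardMac, o.boardPort, o.remoteIp, o.remotePort);
    }
};

struct IcmpKey {                     // [board MAC, board ICMP ID]
    MacAddr boardMac;
    std::uint16_t boardIcmpId;
    bool operator<(const IcmpKey& o) const {
        return std::tie(boardMac, boardIcmpId) < std::tie(o.boardMac, o.boardIcmpId);
    }
};

// One table per LC interface: each key maps to the public port or ICMP ID.
struct NatTableSketch {
    std::map<UdpKey, std::uint16_t> udp;
    std::map<TcpKey, std::uint16_t> tcp;
    std::map<IcmpKey, std::uint16_t> icmp;
};
```

Because the TCP key includes the remote endpoint, two TCP connections from the same board port to different remote hosts occupy separate entries, while all UDP traffic from one board port shares a single entry.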
NATD Overview (continued)
• Daemon can reside anywhere
  • Intended to run on the LC Ingress XScale for performance
• Interacts with:
  • SCD
    • Sends packet meta-data for NAT from the datapath to natd
    • natd sends back updated meta-data and instructs the SCD to forward, drop, or ignore the packet
    • Receives write and remove filter instructions from natd
    • Ingress SCD:
      • Polls TCAM for filters that have timed out (see “Aging”)
      • Informs natd of timed out filters
  • SRM
    • Determines the queue/scheduler for NAT to use on each link (board-interface mapping) (see “Links”)
    • natd queries for this information at startup
  • Flow stats
    • natd informs flow stats of new/removed NAT connections
NAT Message Exchange
[Figure: line card with ingress and egress XScale, each with an SCD and TCAM; NATD on the ingress XScale; the egress SCD reaches NATD over the PCI bus; the SRM runs on the Control Processor (CP)]
• Egress SCD to NATD
  • nat_egress: process packet requiring NAT from egress
• Ingress SCD to NATD
  • nat_ingress: process packet requiring NAT from ingress
  • timed_out_filters: the following filter IDs have timed out through aging
• NATD to Ingress SCD
  • nat_filters: tells the SCD which filter IDs to use aging with
  • write_fltr: install a filter for NAT in the LC’s TCAM
  • rem_fltr_by_fid: remove a NAT filter from the LC’s TCAM
• NATD to SRM
  • get_sched_map: get queue/scheduler information for use by NAT connections
NATD Interface
• result egress_natd(meta-data)
    valBuf_t meta-data {
      dw4_t words[8];   // the meta-data as defined on meta-data slides
    }
    valBuf_t result {
      dw4_t retCode;    // code to scd to drop, forward, or ignore packet
      dw4_t words[7];   // updated meta-data as defined on meta-data slides
    }
  • Sends packet meta-data to natd so natd can manage state for the packet’s connection. natd returns updated meta-data with an instruction for the SCD to drop, forward, or ignore the packet
• result ingress_natd(meta-data)
    valBuf_t meta-data {
      dw4_t words[8];   // the meta-data as defined on meta-data slides
    }
    valBuf_t result {
      dw4_t retCode;    // code to scd to drop, forward, or ignore packet
      dw4_t words[6];   // updated meta-data as defined on meta-data slides
    }
  • Sends packet meta-data to natd so natd can manage state for the packet’s connection. natd returns updated meta-data with an instruction for the SCD to drop, forward, or ignore the packet
NATD Interface (continued)
• status timed_out_filters(ingStartFid, numIngFids, egrStartFid, numEgrFids, ingFids, egrFids)
    dw4_t ingStartFid   // start of range of ingress filters
    dw4_t egrStartFid   // start of range of egress filters
    dw4_t numIngFids    // number of filters polled in ingress DB
    dw4_t numEgrFids    // number of filters polled in egress DB
    valBuf_t ingFids {
      dw4_t fids[]      // list of timed out filter IDs in ingress DB
    }
    valBuf_t egrFids {
      dw4_t fids[]      // list of timed out filter IDs in egress DB
    }
  • Sets/clears the timeout flag for all the filters that natd has state for in the range of filters specified for each database
  • See “Aging” for how the call is used
Links
• NAT traffic:
  • Routed across links
  • One link between each SPP board and LC interface
  • A link specifies which queue manager, scheduler, queue, and VLAN should be used to route traffic both in and out of the LC
  • Mappings are retrieved at startup by querying the SRM using the get_sched_map(...) call
  • See http://www.arl.wustl.edu/projects/TeN/ppt/srm.ppt
SCD Changes
• Both SCDs:
  • New thread
    • Periodically (every 10 ms) polls the datapath-to-XScale scratch ring for packets
    • Sends packet meta-data to natd to process
      • nat_ingress(...) call for ingress
      • nat_egress(...) call for egress
    • natd returns
      • updated meta-data if the packet needs to be forwarded
      • an instruction to drop, forward, or ignore the packet
    • If the hit bit is not set, the XScale has a copy of the packet and must either drop or forward it
• Ingress only:
  • Starts when natd calls nat_filters(…) on the ingress SCD
  • Periodically checks TCAM activity bits for nat filters (see “Aging”)
  • Uses timed_out_filters(...) to inform natd which filters have timed out and which have not
SCD to NATD: Packet meta-data
[Figure: word layouts of the egress and ingress packet meta-data. Field labels in the figure: Flags (8b), Buf Handle (24b), Reserved (8b), IP Pkt Length (16b), Eth Hdr Len (8b), SrcMAC (8b), Rsv (4b), Intf (4b), IP_SAddr (32b), IP_DAddr (32b), IP Proto (8b), TCP/UDP SPort or ICMP ID (16b), TCP/UDP DPort or ICMP ID (16b), ICMP Type (8b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), TCP/UDP SPort (16b), TCP/UDP DPort (16b), TCAM Hit Index (32b)]
• Flags in both directions: H (Hit, 1b), reserved bits, and TCP Flags (6b): URG, ACK, PSH, RST, SYN, FIN
• TCP state on XScale uses the full 5-tuple
• TCP state updates include the TCAM Hit Index
From: http://www.arl.wustl.edu/projects/techX/design/SPP/SPP_V1_NAT_design.ppt
NATD to SCD: updated meta-data
[Figure: word layouts of the egress and ingress updated meta-data. Field labels in the figure: Flags (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b), IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Reserved (16b), VLAN (12b), QM (2b), Sch (3b), PerSchedQID (15b), Translated SPort (16b), Translated DPort/ID (16b), Stats Index (16b). Flags byte: Reserved (3b), T, U, I, N, H (1b each)]
• Natd updates the fields shown in dark blue
• Flags:
  • H: HIT - Lookup was a valid hit
  • N: NAT - NAT translation is required
  • I: ICMP - ICMP pkt
  • U: UDP - UDP pkt
  • T: TCP - TCP pkt
  • At most one of I/U/T should be set at any time
  • If N is 0, then I/U/T will be ignored
    • HF does not need to do any protocol-specific operations for packets that do not require NAT translation
  • No need to send any H=0 pkts to HF
From: http://www.arl.wustl.edu/projects/techX/design/SPP/SPP_V1_NAT_design.ppt
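The flag rules above (at most one of I/U/T set, and I/U/T ignored when N is 0) can be sketched as a small decoder. The bit positions below are an assumption read off the order of the labels in the figure (Reserved, T, U, I, N, H from most to least significant bit); the names FLAG_*, NatProto, and natProtocol are hypothetical.

```cpp
#include <cstdint>

// ASSUMED bit positions: the slides give only the field order, not bit values.
enum : std::uint8_t {
    FLAG_H = 0x01,  // HIT: lookup was a valid hit
    FLAG_N = 0x02,  // NAT: translation required
    FLAG_I = 0x04,  // ICMP packet
    FLAG_U = 0x08,  // UDP packet
    FLAG_T = 0x10   // TCP packet
};

enum class NatProto { NotNat, Unspecified, Icmp, Udp, Tcp, Invalid };

// Hypothetical helper: classify a flags byte according to the rules above.
NatProto natProtocol(std::uint8_t flags) {
    if (!(flags & FLAG_N)) {
        return NatProto::NotNat;            // N == 0: I/U/T are ignored
    }
    const std::uint8_t proto = flags & (FLAG_I | FLAG_U | FLAG_T);
    switch (proto) {
        case 0:      return NatProto::Unspecified;  // N set, no protocol bit
        case FLAG_I: return NatProto::Icmp;
        case FLAG_U: return NatProto::Udp;
        case FLAG_T: return NatProto::Tcp;
        default:     return NatProto::Invalid;      // more than one of I/U/T
    }
}
```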
NATD - Top Level
• Single threaded, uses an event queue for timed events
• On startup, retrieves scheduler information for board/interface mappings from the SRM using the get_sched_map(...) call
• Main loop:
  • Process messages from SCDs until the next scheduled timeout event
    • i.e. nat_ingress(...), nat_egress(...), and timed_out_filters(...)
    • Installs and removes connections by calling write_fltr(...) and rem_fltr_by_fid(...) on the Ingress SCD
  • Service timeout events
    • Events to remove UDP/ICMP connections with timed out filters
    • Events to remove stale TCP connections
    • See slides on Timeout Events
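The "service timeout events" half of the main loop can be sketched with a priority queue ordered by deadline. This is an illustration only, not natd's eventManager: the names Event, Later, and EventQueueSketch are hypothetical, and time is passed in as a plain number of seconds so the behaviour is easy to follow.

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical timed event: an action to run at or after fireAt (seconds).
struct Event {
    std::uint64_t fireAt;
    std::function<void()> action;
};

// Orders the priority queue so the earliest deadline is on top.
struct Later {
    bool operator()(const Event& a, const Event& b) const {
        return a.fireAt > b.fireAt;
    }
};

class EventQueueSketch {
public:
    void schedule(std::uint64_t fireAt, std::function<void()> action) {
        events_.push(Event{fireAt, std::move(action)});
    }

    bool empty() const { return events_.empty(); }

    // Deadline the main loop would process SCD messages until.
    // Only valid when the queue is not empty.
    std::uint64_t nextDeadline() const { return events_.top().fireAt; }

    // Runs every event whose deadline is <= now; returns how many fired.
    int serviceDue(std::uint64_t now) {
        int fired = 0;
        while (!events_.empty() && events_.top().fireAt <= now) {
            std::function<void()> action = events_.top().action;  // copy before pop
            events_.pop();
            action();
            ++fired;
        }
        return fired;
    }

private:
    std::priority_queue<Event, std::vector<Event>, Later> events_;
};
```

In a loop of this shape, the daemon would handle SCD messages until nextDeadline() and then call serviceDue() with the current time, which keeps everything on one thread.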
New NAT connection example
[Figure: message flow between the XScale datapath (Lookup, Hdr Format, scratch and next-neighbor rings), the SCD, NATD, and the TCAM. Labels in the figure: poll for packets; packet meta-data; nat_ingress/egress; install egress filter, write_fltr(...); install ingress filter, write_fltr(...); install filter; natd response drop/forward/ignore; updated meta-data]
Table Structure
[Figure: a natTable (IP Address: XXX.XXX.XXX.XXX, Ifn: X) holding a tcpTable, udpTable, and icmpTable; each tcpConnection, udpConnection, and icmpConnection points to an ingressFilter and an egressFilter in the filterTable]
• One NAT table per interface
• All NAT tables share a pool of filters from the filterTable
TCP State Machine
[Figure: state diagram. States: NULL, SYN-WAIT, ESTABLISHED, FIN-WAIT, INGRESS CLOSED, EGRESS CLOSED. Transition labels: syn, syn ack, fin (ingress), fin (egress), rst; transitions are numbered 1 to 5 in the figure]
Timeout Events: TCP
• TCP timeouts
  • All timeouts remove the connection when they fire
  • tcpSynTout:
    • Period: 5 minutes
    • Installed when the connection transitions to the SYN-WAIT state
    • Removed when the connection transitions to the ESTABLISHED state
  • tcpIdleTout:
    • Period: 24 hours
    • Installed when the connection transitions to the ESTABLISHED state
    • Removed when the connection transitions to the FIN-WAIT state
  • tcpFinTout:
    • Period: 5 minutes
    • Installed when the connection transitions to the FIN-WAIT state
    • Removed if the connection is closed
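The mapping from TCP state to the timeout that is armed in that state can be sketched as a lookup. Only the three states named on this slide carry a timeout here; the enum, struct, and function names are hypothetical, and the default periods are the 5 minute / 24 hour / 5 minute values from the slide expressed in seconds.

```cpp
#include <cstdint>

// Hypothetical state names; only the states with a timeout on this slide
// are distinguished, everything else is treated as "no timeout armed".
enum class TcpState { Null, SynWait, Established, FinWait, Closed };

struct TcpTimeouts {
    std::uint32_t synSeconds;   // tcpSynTout
    std::uint32_t idleSeconds;  // tcpIdleTout
    std::uint32_t finSeconds;   // tcpFinTout
};

// Periods from the slide: 5 minutes, 24 hours, 5 minutes.
const TcpTimeouts kDefaultTimeouts = {300, 86400, 300};

// Returns the period of the timeout armed on entering a state,
// or 0 when no timeout applies.
std::uint32_t timeoutForState(TcpState s, const TcpTimeouts& t) {
    switch (s) {
        case TcpState::SynWait:     return t.synSeconds;
        case TcpState::Established: return t.idleSeconds;
        case TcpState::FinWait:     return t.finSeconds;
        default:                    return 0;
    }
}
```

On each state transition, a daemon following this scheme would cancel the previous state's event and schedule a new one with the returned period.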
Timeout Events: UDP/ICMP
• UDP & ICMP timeouts
  • udpAgeTout / icmpAgeTout
    • Period: 5 minutes
    • Remove the connection if both the ingress and the egress filter for the connection have timed out
Aging
• Hardware aging:
  • Uses the TCAM’s hardware activity bits
  • See “TCAM and Aging” in http://www.arl.wustl.edu/projects/techX/design/SPP/SPP_V1_NAT_design.ppt
• Algorithm:
  • SCD
    • Polls the TCAM for filters that have timed out
      • Uses the range of filter IDs specified by the nat_filters(…) call; the range must be a multiple of 32
      • Calls IdtSearchDatabaseSwAgeAndGetAgedEntries(...) to get the timed out filters in a subset of the range of filter IDs in each database
      • Checks the entire range of nat filters every 5 minutes
      • Checks the same range of filter IDs in the ingress and egress databases at the same time
    • Informs natd which filters have timed out in each range via the timed_out_filters(…) call
  • Natd
    • Updates the state of each filter in the range specified in timed_out_filters(...)
      • For each filter in the specified range that natd has state for:
        • Sets the timed out flag if the SCD reported the filter as timed out
        • Clears the timed out flag otherwise
    • Each UDP/ICMP connection has a timeout event that fires every 5 minutes
      • If both filters have timed out, the connection is removed
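The natd side of the aging algorithm can be sketched in two steps: update the per-filter flags from a timed_out_filters(...) report, then let the connection's periodic event check both flags. This is an illustrative sketch under assumptions: FlagMap, ConnectionFilters, applyTimedOut, and shouldRemove are hypothetical names, and one flag map per TCAM database is assumed.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical per-database state: filter ID -> timed out flag.
// Only filters natd has state for appear in the map.
using FlagMap = std::map<std::uint32_t, bool>;

// Applies one report: within [startFid, startFid + numFids), set the flag
// for reported filters and clear it for all others natd knows about.
void applyTimedOut(FlagMap& flags, std::uint32_t startFid, std::uint32_t numFids,
                   const std::vector<std::uint32_t>& timedOutFids) {
    const std::set<std::uint32_t> reported(timedOutFids.begin(), timedOutFids.end());
    for (auto& entry : flags) {
        const std::uint32_t fid = entry.first;
        if (fid < startFid || fid - startFid >= numFids) {
            continue;  // outside the polled range: leave the flag untouched
        }
        entry.second = reported.count(fid) != 0;
    }
}

// Hypothetical view of a UDP/ICMP connection: its two filter IDs.
struct ConnectionFilters {
    std::uint32_t ingressFid;
    std::uint32_t egressFid;
};

// Decision made by the connection's 5 minute event:
// remove only if both the ingress and the egress filter have timed out.
bool shouldRemove(const ConnectionFilters& c, const FlagMap& ingress,
                  const FlagMap& egress) {
    const auto in = ingress.find(c.ingressFid);
    const auto eg = egress.find(c.egressFid);
    return in != ingress.end() && eg != egress.end() && in->second && eg->second;
}
```

Clearing the flag for filters that were polled but not reported is what keeps a connection alive when traffic resumes between two polls.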
Status
• To do:
  • Finish TCAM aging (need to debug IDT call): FINISHED
  • Fix eventManager to allow events on the queue to be removed
  • Send connection information to flow stats
  • Implement hash functions for faster connection state lookup
• Open issues
  • Bursts of UDP packets are not handled well
File Structure
• techX repository: wu_arl/dnet/npe/natd
• Files:
  • bitmap.{cc,h}
    • bitmap/portmap class used for managing the freelist of available ports/IDs
  • boards.{cc,h}
    • defines board & link classes
  • connections.{cc,h}
    • defines ICMP, UDP, and TCP connection data structures
  • events.{cc,h}
    • all timeout events
  • filters.{cc,h}
    • filter code and filter table
    • includes calls to SCD to install/uninstall filters
  • natd.{cc,h}
    • reads configuration file, gets scheduler mappings from SRM, includes main processing loop
  • statOp.{cc,h}
    • code for natd interface calls [egress_nat(...), ingress_nat(...), and timed_out_filters(...)]
  • tables.{cc,h}
    • defines all table data structures [natTable, icmpTable, udpTable, and tcpTable]
    • manages all connection state (e.g. open/close connection, TCP state transitions, etc.)
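A freelist of ports or ICMP IDs of the kind bitmap.{cc,h} is described as managing can be sketched with one bit per port in a configured range. This is not the actual bitmap/portmap class: PortMapSketch and its methods are hypothetical, and the linear scan is chosen for clarity rather than speed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical port freelist over the inclusive range [first, last].
class PortMapSketch {
public:
    PortMapSketch(std::uint16_t first, std::uint16_t last)
        : first_(first),
          used_(static_cast<std::size_t>(last) - first + 1, false) {}

    // Returns the lowest free port and marks it used, or -1 if none is free.
    int allocate() {
        for (std::size_t i = 0; i < used_.size(); ++i) {
            if (!used_[i]) {
                used_[i] = true;
                return first_ + static_cast<int>(i);
            }
        }
        return -1;
    }

    // Frees a port; returns false if it is out of range or was not in use.
    bool release(int port) {
        if (port < first_ || port >= first_ + static_cast<int>(used_.size())) {
            return false;
        }
        const std::size_t i = static_cast<std::size_t>(port - first_);
        if (!used_[i]) {
            return false;
        }
        used_[i] = false;
        return true;
    }

private:
    int first_;
    std::vector<bool> used_;  // one bit per port in the range
};
```

With a range such as 30000 to 30499, one such map per protocol and interface would hand out the public port when a connection is created and take it back when the connection is removed.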
Configuration File Format
myAddr = 0                        natd’s address
myPort = 5050                     natd’s port
scdAddr = 0                       scd’s address
scdPort = 7070                    scd’s port
srmAddr = 192.168.32.2            srm’s address
srmPort = 6060                    srm’s port
loglvl = Loud                     logging verbosity

[GeneralParameters]
tcpSynTimeOut = 300               timeout in syn-wait state
tcpFinTimeOut = 300               timeout in fin-wait state
tcpIdleTimeOut = 86400            timeout in established state
agingPollInterval = 300           period for udp/icmp timeout
ingressStartFid = 0               first filter ID reserved for nat in ingress DB
ingressEndFid = 8191              last filter ID reserved for nat in ingress DB (range must be a multiple of 32)
egressStartFid = 0                first filter ID reserved for nat in egress DB
egressEndFid = 8191               last filter ID reserved for nat in egress DB (currently range must be same as ingress)

[ Interface ]                     defined for each interface
# Link name drn05
ifn = 0                           interface number
IPAddress = 0x80fc99d1            interface’s IP address
udpStartPort = 30000              first udp port reserved for nat
udpEndPort = 30499                last udp port reserved for nat
tcpStartPort = 30000              first tcp port reserved for nat
tcpEndPort = 30499                last tcp port reserved for nat
icmpStartID = 0                   first icmp ID reserved for nat
icmpEndID = 65535                 last icmp ID reserved for nat

[ Board ]                         defined for each board
# cp1, Slot 0
type=cp                           CP or GPE (not currently used)
MACAddress = 00:1E:C9:FE:76:23    board’s MAC address