1 / 36

OS support for (flexible) high-speed networking

uspace. u. k. n. kspace. nspace. OS support for (flexible) high-speed networking. Herbert Bos. Vrije Universiteit Amsterdam. CPU. MEM. NIC. we’re hurting because of. hardware problems software problems language problems lack of flexibility network speed may grow too fast anyway

Download Presentation

OS support for (flexible) high-speed networking

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. uspace u k n kspace nspace OS support for (flexible) high-speed networking Herbert Bos Vrije Universiteit Amsterdam

  2. CPU MEM NIC we’re hurting because of • hardware problems • software problems • language problems • lack of flexibility • network speed may grow too fast anyway • multiple apps problem

  3. What are we building? different barriers flows

  4. uspace kernel What are we building? if (pkt.proto == TCP) then if (http) then scan webattacks; else if (smtp) scan spam;scan viruses fi else if (pkt.proto == UDP) then if (NFS traffic) mem = statistics (pkt); else scan SLAMMER; fi • reconfigure when load changes • dynamic optical infrastructure: • - configure at startup time- construct datapaths

  5. - no ‘vertical’ copies - no ‘horizontal’ copies within flow group - more than ‘just filtering’ in kernel (e.g.,statistics) Application B ‘filter’ reduce copying • FFPF avoids both ‘horizontal’ and ‘vertical’ copies Application A U K

  6. minimise copying, context switching flowgraph = DAG different copying regimes endianness new languages combine multiple requests control  dataplane different buffer regimes

  7. Software structure applications FFPF toolkit MAPI FFPF userspace libpcap userspace-kernel boundary FFPF kernelspace host- NIC boundary FFPF ixp2xxx

  8. current status • working for Linux PC with • NICs • IXP1200s and IXP2400s (plugged in) • remote IXP2850 (with poetic license) • rudimentary support for distributed packet processor • speed comparable to tailored solutions

  9. Thank you! http://ffpf.sourceforge.net/

  10. bridging barriers

  11. Three flavours • Three flavours of packet processing on IXP & host • Regular • copy only when needed • may be slow depending on access pattern • Zero copy • never copy • may be slow depending on access pattern • Copy once • copy always 11

  12. combining filters (dev,expr=/dev/ixp0) >[ [ (BPF,expr=“…”) >(PKTCOUNT) ] | (FPL2,expr=“…”) ]> (PKTCOUNT)

  13. Example 1: writing applications • int main(int argc, char** argv) { • void *opaque_handle; • intcount, returnval; • /** pass the string to the subsystem. It initializes and returns an opaque pointer */ • opaque_handle = ffpf_open(argv[1], strlen(argv[1])); • if(start_listener(asynchronous_listener, NULL)) • { • printf (WARNING,“spawn failed\n"); • ffpf_close (opaque_handle); return -1; • } • count = ffpf_process(FFPF_PROCESS_FOREVER); • stop_listener(1); • returnval = ffpf_close(opaque_handle); • printf(,"processed %d packets\n",count); • return returnval; • }

  14. Example 1: writing applications • void* asynchronous_listener(void* arg) { • int i; • while (!should_stop()) { • sleep(5); • for (i=0;i<export_len;i++) { • printf("flowgrabber %d: read %d, written%d\n", i, get_stats_read(exported[i]), get_stats_processed(exported[i])); • while ((pkt =get_packet_next(exported[i]))) { • dump_pkt(pkt); • } • } • } • return NULL; • }

  15. Example 2: FPL-2 • new pkt processing language: FPL-2 • for IXP, kernel and userspace • simple, efficient and flexible • simple example: filter all webtraffic • IF ( (PKT.IP_PROTO == PROTO_TCP) && (PKT.TCP_PORT == 80)) THEN RETURN 1; • more complex example: count pkts in all TCP flows • IF (PKT.IP_PROTO == PROTO_TCP) THEN R[0] = Hash[ 14, 12, 1024]; M[ R[0] ]++;FI

  16. FPL-2 • all common arithmetic and bitwise operations • all common logical ops • all common integer types • for packet • for buffer ( useful for keeping state!) • statements • Hash • External functions • to call hardware implementations • to call fast C implementations • If … then … else • For … break; … Rof • Return

  17. Example application: dynamic ports 1. // R[0] stores no. of dynports found (initially 0) 2. IF (PKT.IP_PROTO==PROTO_TCP) THEN 3. IF (PKT.TCP_DPORT==554) THEN 4. M[R[0]]=EXTERN("GetDynTCPDPortFromRTSP",0,0); 5. R[0]++; 6. ELSE // compare pkt’s dst port to all ports in array – if match, return pkt 7. FOR (R[1]=0; R[1] < R[0]; R[1]++) 8. IF (PKT.TCP_DPORT == M[ R[1] ] ) THEN 9. RETURN TRUE; 10. FI 11. ROF 11. FI 12. FI 12. RETURN FALSE;

  18. efficient userspace ? • reduced copying and context switches • sharing data • flowgraphs: sharing computations kernel ? network card ? ? x “push filtering tasks as far down the processing hierarchy as possible”

  19. spread of SAPPHIRE in 30 minutes -process at lowest possible level-minimise copying -minimise context switching -freedom at the bottom Network Monitoring • Increasingly important • traffic characterisation, securitytraffic engineering, SLAs, billing, etc. • Existing solutions: • designed for slow networksor traffic engineering/QoS • not very flexible • We’re hurting because of • hardware (bus, memory) • software (copies, context switches)  demand for solution:-scales to high link rates - scales in no. of apps - flexible

  20. UDP with CodeRed eth0 TCP SYN bytecount HTTP U RTSP TCP IP “contains worm” UDP RTP UID 0 generalised notion of flow Flow: “a stream of packets that match arbitraryuser criteria”  Flowgraph

  21. Application B ‘filter’ reduce copying • FFPF avoids both ‘horizontal’ and ‘vertical’ copies Application A U K

  22. Extensible • modular framework • language agnostic • plug-in filters (device,eth0) -> (sampler,2) -> (BPF,”..”) -> (packetcount) (device,eth0) | (device,eth1) -> (sampler,2) -> (FPL-2,”..”) | (BPF,”..”) -> (bytecount)

  23. R O O O O O O O W Buffers • PacketBuf • circular buffer with N fixed-size slots • large enough to hold packet • IndexBuf • circular buffer with N slots • contains classification result + pointer

  24. Buffers O O O • PacketBuf • circular buffer with N fixed-size slots • large enough to hold packet • IndexBuf • circular buffer with N slots • contains classification result + pointer O O O O W R

  25. Buffers X X X • PacketBuf • circular buffer with N fixed-size slots • large enough to hold packet • IndexBuf • circular buffer with N slots • contains classification result + pointer X X O O W R

  26. R1 Buffer management O O O O O O O O O O  what to do if writer catches up with slowest reader? • slow reader preference • drop new packets (traditional way of dealing with this) • overall speed determined by slowest reader • fast reader preference • overwrite existing packets • application responsible for keeping up • can check that packets have been overwritten • different drop rates for different apps O O O O O O R1 W

  27. IF (PKT.IP_PROTO == PROTO_TCP) THEN // reg.0 = hash over flow fields R[0] = Hash (14,12,1024) // increment pkt counter at this // location in MBuf MEM[ R[0] ]++FI • simple to use • compiles to optimised native code • resource limited (e.g., restricted FOR loop) • access to persistent storage (scratch memory) • calls to external functions (e.g., fast C functions or hardware assists) • compiler for uspace, kspace, and nspace (ixp1200) Languages • FFPF is language neutral • Currently we support: • BPF • C • OKE Cyclone • FPL

  28. packet sources • currently three kinds implemented -netfilter -net_if_rx() -IXP1200 • implementation on IXPs: NIC-FIX -bottom of the processing hierarchy -eliminates mem & bus bottlenecks uspace kspace nspace

  29. zero copy copy once on-demand copy Network Processors “programmable NIC”

  30. Performance results pkt loss: FFPF: < 0.5% LSF: 2-3%

  31. Performance results pkt loss: LSF:64-75% FFPF: 10-15%

  32. Performance

  33. Summary concept of ‘flow’  generalised copying and context switching minimised processing in kernel/NIC  complex programs + ‘pipes’ FPL: FFPF Packet Languages  fast + flexible persistent storage  flow-specific state authorisation + third-party code any user flow groupsapplications sharing packet buffers

  34. Thank you! http://ffpf.sourceforge.net/

  35. microbenchmarks

  36. CPU MEM NIC we’re hurting because of • kernel: too little, too often • vertical + horizontal copies • context switching • popular ones: not expressive • expressive ones: not popular • same for speed • no way to mix and match • hardware problems • software problems • language problems • lack of flexibility • network speed may grow too fast anyway • multiple apps problem • heterogeneous hardware • programmable cards, routers, etc. • no easy way to combine • build on abstractions that are too hi-level • hard to use as ‘generic’ packet processor • 10  40Gbps: what will you do in 10 ns? • beyond capacity of single processor

More Related