
Transport Layer: UDP

Transport Layer: UDP. COMS W6998 Spring 2010 Erich Nahum. Outline. UDP Layer Architecture Receive Path Send Path. Recall what UDP Does. RFC 768 IP Proto 17 Connectionless Unreliable Datagram Supports multicast Optional checksum Nice and simple. Yet still 2187 lines of code!.


Presentation Transcript


  1. Transport Layer: UDP COMS W6998 Spring 2010 Erich Nahum

  2. Outline • UDP Layer Architecture • Receive Path • Send Path

  3. Recall what UDP Does • RFC 768 • IP Proto 17 • Connectionless • Unreliable • Datagram • Supports multicast • Optional checksum • Nice and simple. • Yet still 2187 lines of code! • UDP packet format (32-bit rows, bits 0-15 and 16-31): Source Port (16) | Destination Port (16); Length (16) | Checksum (16); Data
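The "optional checksum" bullet can be made concrete with a small userspace sketch of the RFC 768 checksum over the IPv4 pseudo-header, UDP header, and data. The function names (udp4_checksum, udp4_checksum_ok) are hypothetical and this is not the kernel's implementation; addresses are assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte range to a running one's-complement sum, read as 16-bit
 * big-endian words; an odd trailing byte is padded with a zero byte. */
static uint32_t sum16(const uint8_t *p, size_t n, uint32_t acc)
{
    assert(p != NULL || n == 0);
    for (; n > 1; p += 2, n -= 2)
        acc += ((uint32_t)p[0] << 8) | p[1];
    if (n)
        acc += (uint32_t)p[0] << 8;
    return acc;
}

/* Fold the carries back in until only 16 bits remain. */
static uint16_t fold16(uint32_t acc)
{
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)acc;
}

/* Sum of the IPv4 pseudo-header: saddr, daddr, zero, protocol 17, length. */
static uint32_t pseudo_sum(uint32_t saddr, uint32_t daddr, uint16_t seg_len)
{
    uint8_t ph[12] = {
        (uint8_t)(saddr >> 24), (uint8_t)(saddr >> 16),
        (uint8_t)(saddr >> 8),  (uint8_t)saddr,
        (uint8_t)(daddr >> 24), (uint8_t)(daddr >> 16),
        (uint8_t)(daddr >> 8),  (uint8_t)daddr,
        0, 17,
        (uint8_t)(seg_len >> 8), (uint8_t)seg_len
    };
    return sum16(ph, sizeof ph, 0);
}

/* Sender side: seg points at the UDP header, whose checksum field
 * (bytes 6 and 7) must be zero while the checksum is computed. */
uint16_t udp4_checksum(uint32_t saddr, uint32_t daddr,
                       const uint8_t *seg, uint16_t seg_len)
{
    uint32_t acc = sum16(seg, seg_len, pseudo_sum(saddr, daddr, seg_len));
    uint16_t csum = (uint16_t)~fold16(acc);
    return csum ? csum : 0xffff; /* 0 on the wire means "no checksum" */
}

/* Receiver side: returns 1 if the checksum is absent (0) or verifies. */
int udp4_checksum_ok(uint32_t saddr, uint32_t daddr,
                     const uint8_t *seg, uint16_t seg_len)
{
    if (seg_len < 8)
        return 0;
    if (seg[6] == 0 && seg[7] == 0)
        return 1;
    return fold16(sum16(seg, seg_len,
                        pseudo_sum(saddr, daddr, seg_len))) == 0xffff;
}
```

A sender would store the result of udp4_checksum() big-endian in bytes 6 and 7 of the header; a receiver calls udp4_checksum_ok() on the segment as received.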

  4. UDP Header The udp header: include/linux/udp.h struct udphdr { __be16 source; __be16 dest; __be16 len; __sum16 check; };
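As a rough userspace illustration of the header layout, the sketch below packs and parses the four big-endian 16-bit fields by hand. The names udp_fields, udp_hdr_pack, and udp_hdr_parse are hypothetical helpers, not kernel APIs; the kernel simply overlays struct udphdr on the packet data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { UDP_HDR_LEN = 8 };

/* Host-byte-order view of the four fields of struct udphdr. */
struct udp_fields {
    uint16_t source;
    uint16_t dest;
    uint16_t len;
    uint16_t check;
};

static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

/* Write the header in network byte order into out[0..7]. */
void udp_hdr_pack(uint8_t out[UDP_HDR_LEN], const struct udp_fields *f)
{
    assert(out != NULL && f != NULL);
    put_be16(out + 0, f->source);
    put_be16(out + 2, f->dest);
    put_be16(out + 4, f->len);
    put_be16(out + 6, f->check);
}

/* Read the header back; returns -1 if fewer than 8 bytes are available. */
int udp_hdr_parse(const uint8_t *buf, size_t buflen, struct udp_fields *f)
{
    assert(buf != NULL && f != NULL);
    if (buflen < UDP_HDR_LEN)
        return -1;
    f->source = get_be16(buf + 0);
    f->dest   = get_be16(buf + 2);
    f->len    = get_be16(buf + 4);
    f->check  = get_be16(buf + 6);
    return 0;
}
```

Reading the bytes individually avoids alignment and endianness assumptions, at the cost of a few shifts per field.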

  5. Sidebar: UDP-Lite • RFC 3828 • Very similar to UDP • Difference is checksum covers part of packet rather than all • Checksum coverage says how many bytes (starting from header) are covered by checksum • Idea is certain apps would rather have a damaged packet than none • Examples are audio, video codecs • IP Protocol 136 • Linux UDP-Lite implementation shares most code with UDP • UDP-Lite packet format (32-bit rows, bits 0-15 and 16-31): Source Port (16) | Destination Port (16); Checksum Coverage (16) | Checksum (16); Data
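To illustrate partial coverage, here is a simplified sketch: damage beyond the covered prefix leaves the checksum unchanged, while damage inside it is detected. The helper names are hypothetical, and the pseudo-header that real UDP-Lite also sums is left out for brevity. The coverage rules encoded here (0 means the whole datagram, 1 to 7 is illegal) follow RFC 3828.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes the checksum covers for a given coverage field.
 * 0 means "entire datagram"; values 1..7 are illegal because the 8-byte
 * header must always be covered; values past the end are illegal too.
 * Returns 0 for an illegal coverage value. */
size_t udplite_covered_len(uint16_t coverage, size_t datagram_len)
{
    if (coverage == 0)
        return datagram_len;
    if (coverage < 8 || coverage > datagram_len)
        return 0;
    return coverage;
}

/* Internet checksum over the first n bytes of p
 * (simplification: no pseudo-header). */
uint16_t inet_csum(const uint8_t *p, size_t n)
{
    uint32_t acc = 0;

    assert(p != NULL || n == 0);
    for (; n > 1; p += 2, n -= 2)
        acc += ((uint32_t)p[0] << 8) | p[1];
    if (n)
        acc += (uint32_t)p[0] << 8;
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)~acc;
}
```

With a coverage of 8, only the header is protected, so a codec receiving the datagram can still use a payload that was damaged in transit.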

  6. Sources of UDP Packets • Packets arrive on an interface and are passed to the udp_rcv() function. • UDP packets are packed into an IP packet and passed down to IP via ip_append_data() and ip_push_pending_frames()

  7. UDP Implementation Design • [Figure: call graph between the higher layers (socket.c, sock.c), udp.c, and IP (ip_input.c, ip_output.c)] • Receive: ip_local_deliver_finish -> udp_rcv -> __udp4_lib_rcv -> __udp4_lib_lookup_skb (unicast) or __udp4_lib_mcast_deliver (multicast) -> __udp_queue_rcv_skb -> sock_queue_rcv_skb; icmp_send on the error path • Send: sock_sendmsg -> udp_sendmsg -> ip_route_output_flow (routing), ip_append_data -> udp_push_pending_frames -> ip_push_pending_frames

  8. UDP Proto struct proto udp_prot = { .name = "UDP", .owner = THIS_MODULE, .close = udp_lib_close, .connect = ip4_datagram_connect, .disconnect = udp_disconnect, .ioctl = udp_ioctl, .destroy = udp_destroy_sock, .setsockopt = udp_setsockopt, .getsockopt = udp_getsockopt, .sendmsg = udp_sendmsg, .recvmsg = udp_recvmsg, .sendpage = udp_sendpage, .backlog_rcv = __udp_queue_rcv_skb, .hash = udp_lib_hash, .unhash = udp_lib_unhash, .get_port = udp_v4_get_port, .memory_allocated = &udp_memory_allocated, .sysctl_mem = sysctl_udp_mem, .sysctl_wmem = &sysctl_udp_wmem_min, .sysctl_rmem = &sysctl_udp_rmem_min, .obj_size = sizeof(struct udp_sock), .slab_flags = SLAB_DESTROY_BY_RCU, .h.udp_table = &udp_table, };

  9. udp_table /** * struct udp_table - UDP table * * @hash: hash table, sockets are hashed on (local port) * @hash2: hash table, sockets are hashed on (local port, local address) * @mask: number of slots in hash tables, minus 1 * @log: log2(number of slots in hash table) */ struct udp_table { struct udp_hslot *hash; struct udp_hslot *hash2; unsigned int mask; unsigned int log; }; udp_table_init() allocates the hash tables, initializes them: for (i = 0; i <= table->mask; i++) { INIT_HLIST_NULLS_HEAD(&table->hash[i].head, i); table->hash[i].count = 0; spin_lock_init(&table->hash[i].lock); }
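The port-hashed table can be mimicked in userspace. The sketch below is a toy, not the kernel code: the toy_* names are hypothetical, the hash is simply port & mask (the kernel also mixes in a per-namespace value), and there is no locking or RCU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_sock {
    uint16_t port;            /* local port, host byte order */
    struct toy_sock *next;    /* chain within one hash slot */
};

struct toy_slot {
    struct toy_sock *head;
    unsigned int count;
};

struct toy_table {
    struct toy_slot *hash;
    unsigned int mask;        /* number of slots minus 1 */
};

/* Allocate a zeroed table; slots must be a power of two so that
 * "port & mask" selects a valid slot. */
int toy_table_init(struct toy_table *t, unsigned int slots)
{
    assert(t != NULL && slots != 0 && (slots & (slots - 1)) == 0);
    t->hash = calloc(slots, sizeof *t->hash);
    if (t->hash == NULL)
        return -1;
    t->mask = slots - 1;
    return 0;
}

static unsigned int toy_hashfn(uint16_t port, unsigned int mask)
{
    return port & mask;
}

/* Insert at the head of the slot's chain. */
void toy_hash(struct toy_table *t, struct toy_sock *sk)
{
    struct toy_slot *slot = &t->hash[toy_hashfn(sk->port, t->mask)];

    sk->next = slot->head;
    slot->head = sk;
    slot->count++;
}

/* Walk one chain looking for an exact port match. */
struct toy_sock *toy_lookup(const struct toy_table *t, uint16_t port)
{
    struct toy_sock *sk = t->hash[toy_hashfn(port, t->mask)].head;

    while (sk != NULL && sk->port != port)
        sk = sk->next;
    return sk;
}

void toy_table_free(struct toy_table *t)
{
    free(t->hash);
    t->hash = NULL;
}
```

Ports that differ by a multiple of the slot count land in the same chain, which is why each slot keeps a count as well as a list head.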

  10. Outline • UDP Layer Architecture • Receive Path • Send Path

  11. Receiving packets in UDP • From user space, you can receive UDP traffic with three system calls: • recv() (when the socket is connected). • recvfrom() • recvmsg() • All three are handled by udp_recvmsg() in the kernel (the .recvmsg entry of udp_prot); udp_rcv() is the handler that IP calls for incoming packets.

  12. Recall IP’s inet_protos • [Figure: inet_protos[MAX_INET_PROTOS], an array of net_protocol entries] • Each net_protocol holds: handler, err_handler, gso_send_check, gso_segment, gro_receive, gro_complete • UDP entry: handler = udp_rcv(), err_handler = udp_err() • IGMP entry: handler = igmp_rcv(), err_handler = NULL
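The dispatch idea behind inet_protos can be sketched as a table of handler pointers indexed by IP protocol number. Everything named toy_* below is hypothetical and only mirrors the shape of the kernel structures; the GSO/GRO callbacks are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PROTOS 256

struct toy_net_protocol {
    int  (*handler)(const uint8_t *pkt, size_t len);
    void (*err_handler)(const uint8_t *pkt, size_t len, uint32_t info);
};

static const struct toy_net_protocol *toy_protos[TOY_MAX_PROTOS];

/* Register a protocol; fails with -1 if the slot is already taken. */
int toy_add_protocol(const struct toy_net_protocol *p, uint8_t proto)
{
    assert(p != NULL);
    if (toy_protos[proto] != NULL)
        return -1;
    toy_protos[proto] = p;
    return 0;
}

/* What the IP layer does on local delivery: look up the transport
 * protocol and hand the packet to its handler. Returns -1 if no
 * handler is registered for this protocol number. */
int toy_local_deliver(uint8_t proto, const uint8_t *pkt, size_t len)
{
    const struct toy_net_protocol *p = toy_protos[proto];

    if (p == NULL || p->handler == NULL)
        return -1;
    return p->handler(pkt, len);
}

/* Stand-in for udp_rcv(): here it just reports how many bytes it got. */
static int toy_udp_rcv(const uint8_t *pkt, size_t len)
{
    (void)pkt;
    return (int)len;
}

const struct toy_net_protocol toy_udp_protocol = {
    .handler = toy_udp_rcv,
    .err_handler = NULL,
};
```

Registering toy_udp_protocol under protocol number 17 and then calling toy_local_deliver(17, ...) follows the same path a received UDP packet takes from IP into udp_rcv().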

  13. Receive Path: udp_rcv • Calls __udp4_lib_rcv(skb, &udp_table, IPPROTO_UDP); • __udp4_lib_rcv() is shared by UDP and UDP-Lite

  14. Receive: __udp4_lib_rcv • Gets the route entry (rtable) attached to the skb • Checks that skb has a header • Checks that length is good • Calcs the checksum • Pulls out saddr, daddr • Checks if address is multicast • If so, calls __udp4_lib_mcast_deliver()
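The length and multicast checks above can be sketched in userspace as follows. This is an assumption-laden toy: the names are hypothetical, the checksum step is skipped, and the multicast test looks at the destination address directly, whereas the kernel consults flags on the route entry (which also cover broadcast).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_verdict {
    TOY_SHORT_PACKET,   /* drop: header missing or length field bad */
    TOY_MULTICAST,      /* would go to the multicast delivery path */
    TOY_UNICAST         /* would go to the socket lookup path */
};

/* IPv4 multicast is 224.0.0.0/4; addr is in host byte order. */
static int toy_ipv4_is_multicast(uint32_t addr)
{
    return (addr & 0xf0000000u) == 0xe0000000u;
}

/* seg points at the UDP header; seg_len is how many bytes IP handed up. */
enum toy_verdict toy_udp_classify(const uint8_t *seg, size_t seg_len,
                                  uint32_t daddr)
{
    size_t ulen;

    assert(seg != NULL || seg_len == 0);
    if (seg_len < 8)
        return TOY_SHORT_PACKET;            /* no room for a header */
    ulen = ((size_t)seg[4] << 8) | seg[5];  /* UDP length field */
    if (ulen < 8 || ulen > seg_len)
        return TOY_SHORT_PACKET;            /* length field is inconsistent */
    return toy_ipv4_is_multicast(daddr) ? TOY_MULTICAST : TOY_UNICAST;
}
```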

  15. Receive: __udp4_lib_rcv (cont) • Looks up the socket in the udptable • Via __udp4_lib_lookup_skb() • Increases refcount on the sk (socket) • If socket is found • Calls __udp_queue_rcv_skb() • Decrements refcount with sock_put(sk) • If not, • Sends ICMP_DEST_UNREACH with code ICMP_PORT_UNREACH via icmp_send() • Drops the packet.

  16. Recv: __udp_queue_rcv_skb • Calls sock_queue_rcv_skb • Increments some statistics

  17. Outline • UDP Layer Architecture • Receive Path • Send Path

  18. Sending packets in UDP • From user space, you can send UDP traffic with three system calls: • send() (when the socket is connected). • sendto() • sendmsg() • All three are handled by udp_sendmsg() in the kernel. • udp_sendmsg() is much simpler than its TCP counterpart, tcp_sendmsg(). • udp_sendpage() is called when user space calls sendfile() (to copy a file into a UDP socket). • sendfile() can also be used to copy data between one file descriptor and another. • udp_sendpage() invokes udp_sendmsg().
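For reference, a minimal userspace round trip over the loopback interface might look like the sketch below: sendto() on an unconnected socket on one side, recvfrom() on the other. The function name udp_loopback_roundtrip is hypothetical, and the 2-second receive timeout is an arbitrary choice so the call cannot block forever.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg to a receiver bound on 127.0.0.1 (kernel-chosen port) and
 * read it back. Returns the number of bytes received, or -1 on error. */
ssize_t udp_loopback_roundtrip(const char *msg, size_t len,
                               char *out, size_t outlen)
{
    struct sockaddr_in addr, from;
    socklen_t alen = sizeof addr, flen = sizeof from;
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    assert(msg != NULL && out != NULL);
    if (rx < 0 || tx < 0)
        goto out;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the kernel pick a port */
    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto out;
    if (getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
        goto out;                            /* learn the chosen port */
    if (setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        goto out;

    /* Unconnected socket: the destination is given on every call. */
    if (sendto(tx, msg, len, 0,
               (struct sockaddr *)&addr, sizeof addr) != (ssize_t)len)
        goto out;

    /* recvfrom() also reports who sent the datagram (in "from"). */
    n = recvfrom(rx, out, outlen, 0, (struct sockaddr *)&from, &flen);
out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

Calling connect() on the sending socket first would allow plain send() instead of sendto(); either way the kernel entry point is the same.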

  19. UDP Socket Options • For IPPROTO_UDP/SOL_UDP level, there exists a socket option UDP_CORK • Added in Linux kernel 2.5.44. int state=1; setsockopt(s, IPPROTO_UDP, UDP_CORK, &state, sizeof(state)); for (j=0;j<1000;j++) sendto(s,buf1,...); state=0; setsockopt(s, IPPROTO_UDP, UDP_CORK, &state, sizeof(state));

  20. UDP_CORK (cont) • The above code fragment calls udp_sendmsg() 1000 times without sending anything on the wire (without UDP_CORK, the same loop would send 1000 packets). • Only after the second setsockopt() call, with UDP_CORK and state=0, is one packet sent on the wire. • Kernel implementation: when using UDP_CORK, udp_sendmsg() passes MSG_MORE to ip_append_data(). • UDP_CORK may be missing from older glibc headers; if so, define it in your program: #define UDP_CORK 1
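A small Linux-only sketch of the corking behaviour over loopback: two send() calls made while corked should arrive as a single datagram once the cork is removed. The function name udp_cork_demo is hypothetical, and the fallback #define assumes the Linux value of 1 stated above.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef UDP_CORK
#define UDP_CORK 1   /* Linux value; fallback for headers that lack it */
#endif

/* Cork a connected UDP socket, send a then b, uncork, and return the
 * size of the first datagram the receiver gets (or -1 on error). */
ssize_t udp_cork_demo(const char *a, const char *b, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof addr;
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
    int on = 1, off = 0;
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    assert(a != NULL && b != NULL && out != NULL);
    if (rx < 0 || tx < 0)
        goto out;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &alen) < 0 ||
        setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        goto out;
    if (connect(tx, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto out;

    /* Cork: data from the following send() calls is accumulated. */
    if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &on, sizeof on) < 0)
        goto out;
    if (send(tx, a, strlen(a), 0) < 0 || send(tx, b, strlen(b), 0) < 0)
        goto out;
    /* Uncork: the accumulated data goes out as one datagram. */
    if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &off, sizeof off) < 0)
        goto out;

    n = recv(rx, out, outlen, 0);
out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

The same effect can be had per call by passing MSG_MORE in the flags of send(), which is the flag the kernel uses internally for corked sockets.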

  21. Send Path: udp_sendmsg() • Checks length, MSG_OOB • Checks if there are frames pending • If so, jump to do_append_data • Gets the address • Checks if socket is connected • If so, pull routing info out of sk • Otherwise, look up via ip_route_output_flow() • Calls ip_append_data() • Handles fragmentation • Calls udp_push_pending_frames()

  22. udp_push_pending_frames() • Fetches the first pending skb from the socket's write queue via skb_peek() • If there is none, goto out and bail • Creates UDP header • Checksums if necessary (or partially for UDP-Lite) • Calls ip_push_pending_frames() • Combines all pending IP fragments on the socket as one IP datagram and sends it out

  23. UDP Backup

  24. Recall the sk_buff structure • [Figure: sk_buff layout, from linux-2.6.31/include/linux/skbuff.h] • sk_buffs are chained via next/prev on an sk_buff_head • Fields shown: sk (struct sock), tstamp, dev (net_device), transport_header, network_header, mac_header, head, data, tail, end, truesize, users • Packet data between head and end: headroom, MAC header, IP header, UDP header, UDP data, tailroom • skb_shared_info: dataref (1 in the figure), nr_frags, ..., destructor_arg

  25. Recall pkt_type in sk_buff • pkt_type: specifies the type of a packet • PACKET_HOST: a packet sent to the local host • PACKET_BROADCAST: a broadcast packet • PACKET_MULTICAST: a multicast packet • PACKET_OTHERHOST: a packet not destined for the local host, but received in promiscuous mode. • PACKET_OUTGOING: a packet leaving the host • PACKET_LOOPBACK: a packet sent by the local host to itself.
