320 likes | 500 Views
TCP-XM: Unicast-enabled Reliable Multicast. Karl Jeacle. karl.jeacle@cl.cam.ac.uk. Rationale. Would like to achieve high-speed bulk data delivery to multiple sites Multicasting would make sense Existing multicast research has focused on sending to a large number of receivers
E N D
TCP-XM: Unicast-enabled Reliable Multicast Karl Jeacle karl.jeacle@cl.cam.ac.uk
Rationale • Would like to achieve high-speed bulk data delivery to multiple sites • Multicasting would make sense • Existing multicast research has focused on sending to a large number of receivers • But Grid is an applied example where sending to a moderate number of receivers would be extremely beneficial
Multicast availability • Deployment is a problem! • Protocols have been defined and implemented • Valid concerns about scalability; much FUD • “chicken & egg” means limited coverage • Clouds of native multicast • But can’t reach all destinations via multicast • So applications abandon in favour of unicast • What if we could multicast when possible… • …but fall back to unicast when necessary?
Multicast TCP? • TCP • “single reliable stream between two hosts” • Multicast TCP • “multiple reliable streams from one to n hosts” • May seem a little odd, but there is precedent… • TCP-XMO – Liang & Cheriton • M-TCP – Mysore & Varghese • M/TCP – Visoottiviseth et al • PRMP – Barcellos et al • SCE – Talpade & Ammar
Building Multicast TCP • Want to test multicast/unicast TCP approach • But new protocol == kernel change • Widespread test deployment difficult • Build new TCP-like engine • Encapsulate packets in UDP • Run in userspace • Performance is sacrificed… • …but widespread testing now possible
Sending Application Receiving Application If natively implemented TCP TCP IP IP test deployment UDP UDP IP IP TCP/IP/UDP/IP
TCP engine • Where does initial TCP come from? • Could use BSD or Linux • Extracting from kernel could be problematic • More compact alternative • lwIP = Lightweight IP • Small but fully RFC-compliant TCP/IP stack • lwIP + multicast extensions = “TCP-XM”
TCP-XM overview • Primarily aimed at push applications • Sender initiated – advance knowledge of receivers • Opens sessions to n destination hosts simultaneously • Unicast is used when multicast not available • Options headers used to exchange multicast info • API changes • Sender incorporates multiple destination and group addresses • Receiver requires no changes • TCP friendly, by definition
TCP SYN SYNACK ACK Sender Receiver DATA ACK FIN ACK FIN ACK
TCP-XM Receiver 1 Sender Receiver 2 Receiver 3
TCP-XM connection • Connection • User connects to multiple unicast destinations • Multiple TCP PCBs created • Independent 3-way handshakes take place • SSM or random ASM group address allocated • (if not specified in advance by user/application) • Group address sent as TCP option • Ability to multicast depends on TCP option
TCP Group Option kind=50 len=6 Multicast Group Address 1 byte 1 byte 4 bytes • New group option sent in all TCP-XM SYN packets • Non TCP-XM hosts will ignore (no option in SYNACK) • Presence implies multicast capability • Sender will automatically revert to unicast
TCP-XM transmission • Data transfer • Data replicated/enqueued on all send queues • PCB variables dictate transmission mode • Data packets are multicast (if possible) • Retransmissions are unicast • Auto fall back/forward to unicast/multicast • Close • Connections closed as per unicast TCP
Fall back / fall forward • TCP-XM principle • “Multicast if possible, unicast when necessary” • Initial transmission mode is group unicast • Ensures successful initial data transfer • Fall forward to multicast on positive feedback • Typically after ~75K unicast data • Fall back to unicast on repeated mcast failure
Multicast feedback Option kind=51 len=3 % Multicast Packets Received 1 byte 1 byte 1 byte • New feedback option sent in ACKs from receiver • Only used between TCP-XM hosts • Indicates % of last n packets received via multicast • Used by sender to fall forward to multicast transmission
TCP-XM reception • Receiver • No API-level changes • Normal TCP listen • Auto-IGMP join on TCP-XM connect • Accepts data on both unicast/multicast ports • tcp_input() accepts: • packets addressed to existing unicast destination… • …but now also those addressed to multicast group • Tracks how last n segs received (u/m)
API changes • Only relevant if natively implemented! • Sender API changes • New connection type • Connect to port on array of destinations • Single write sends data to all hosts • TCP-XM in use: conn = netconn_new(NETCONN_TCPXM); netconn_connectxm(conn, remotedest, numdests, group, port); netconn_write(conn, data, len, …);
PCB changes • Every TCP connection has an associated Protocol Control Block (PCB) • TCP-XM adds: struct tcp_pcb { … struct tcp_pcb *firstpcb;/* first of the mpcbs */ struct tcp_pcb *nextm; /* next tcpxm pcb */ enum tx_mode txmode; /* unicasting or multicasting */ u8_t nrtxm; /* number of retransmits for multicast */ u32_t nrtxmtime; /* time since last retransmit */ u32_t mbytessent; /* total bytes sent via multicast */ u32_t mbytesrcvd; /* total bytes received via multicast */ u32_t ubytessent; /* total bytes sent via unicast */ u32_t ubytesrcvd; /* total bytes received via unicast */ struct segrcv msegrcv[128]; /* ismcast boolean for last n segs */ u8_t msegrcvper; /* % of last segs received via mcast */ u8_t msegrcvcnt; /* counter for segs recvd via mcast */ u8_t msegsntper; /* % of last segs delivered via mcast */ }
next next next next next U1 U2 M1 M3 M2 nextm nextm nextm nextm nextm Linking PCBs • *next points to the next TCP session • *nextm points to the next TCP session that’s part of a particular TCP-XM connection • Minimal timer and state machine changes
What happens to the cwin? • Multiple receivers • Multiple PCBs • Multiple congestion windows • Default to min(cwin) • i.e. send at rate of slowest receiver • Is this really so bad? • Compare to time taken for n unicast transfers
mcp & mcpd • Multicast file transfer application using TCP-XM • ‘mcpd &’ on servers • ‘mcp file host1 host2… hostN’ on client • http://www.cl.cam.ac.uk/~kj234/mcp/ • Full source code online • FreeBSD, Linux, Solaris
All done! • Thanks for listening! • Questions?