Reliable Sockets: A Foundation for Mobile Communications Victor C. Zandy Computer Sciences Department University of Wisconsin-Madison Paradyn/Condor Week (March 2001, Madison WI)
Motivation • Network communication is unreliable • Modems disconnect spontaneously • Computers run on batteries • Many IP addresses are not static • Assignment by DHCP • Mobile computers move across networks • Applications do not respond well to these failures
Reliable Sockets (Rocks) • Sockets that tolerate • IP address changes • Link failures • Extended periods of disconnection • Automatically detect failures and recover • No loss of in-flight data • Applications are oblivious to failures
Rocks are General Purpose • Rocks can be used for • UDP and TCP (and everything over them) • Connected sockets and listening sockets • Interoperate with plain sockets • Transparent, user-level, and portable
Applications • Remote shells • Mail, editor • Long-running builds • Remote GUI-based applications • Office apps • Mobile and reliable UDP • Streaming video and audio
Applications • Process migration • Checkpoint Condor jobs with open sockets • Migrate desktop applications
Related Work • Emphasize mobility, not reliability • No extended periods of disconnection • Lack mechanisms for failure detection and automatic reconnection • Based on kernel modifications • Must be root to install • Unportable • Protocol internals • Mobile IP, TCP Migrate, MSOCKS
TCP Sockets [Diagram: on Host A, the application calls the Sockets API; the TCP socket's send and receive buffers sit in the kernel, bound to port 10000 at IP 128.1.2.3, and attach to the network.]
TCP Data Flow [Diagram, four animation frames: Host A (IP 128.1.2.3, port 10000) writes data items 1 to 5 through the Sockets API. The items pass through the kernel send and receive buffers on their way to Host B (IP 144.0.1.1, port 22). Items that have been written but not yet read are labeled in-flight data. In the last frame Host B calls read and items 1 to 3 appear on the reading side.]
Socket Failures • New IP address • Host movement • Lease expiry • Process migration • Disconnection • Host suspension • Link failure [Diagram: the TCP connection between Host A (IP 128.1.2.3, port 10000) and Host B (IP 144.0.1.1, port 22).]
Effect on Applications • Sockets API calls (write, read) fail • In-flight data is lost [Diagram: the TCP connection between Host A (IP 128.1.2.3, port 10000) and Host B (IP 144.0.1.1, port 22).]
What Rocks Do • Detect socket failure • Hide failure from the application • Automatically reconnect • Recover in-flight data
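Reliable Sockets [Diagram: on Host A, the application calls the Sockets API presented by the rocks library, which holds the rock and its in-flight buffer. The library in turn calls the Sockets API of the kernel TCP socket (send and receive buffers, port 10000, IP 128.1.2.3), which attaches to the network.]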
Rock Data Flow • Count bytes read. • Copy data. • Count bytes sent. [Diagram: one application calls write and the other calls read. Each rock sits between its application's Sockets API and the kernel Sockets API and holds an in-flight buffer. The connection runs between 144.0.1.1 port 22 and 128.1.2.3 port 10000.]
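The counting and copying on this slide can be illustrated with a small sketch. It is not the rocks library's code: the class name `InFlightBuffer`, its methods, and the fixed `capacity` (standing in for whatever bound the real buffer uses) are assumptions made for illustration. It assumes the writing side keeps a copy of recently sent bytes and a running total, so that it can later work out which bytes the peer has not read.

```python
class InFlightBuffer:
    """Hypothetical sender-side copy of bytes the peer may not have read yet."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed upper bound on unread bytes
        self.data = bytearray()    # copy of the most recently sent bytes
        self.bytes_sent = 0        # running total of bytes written

    def record_send(self, chunk):
        """Copy data that was just sent and add it to the sent count."""
        self.data += chunk
        self.bytes_sent += len(chunk)
        # Keep only the newest `capacity` bytes; older bytes are assumed
        # to have reached the peer application already.
        overflow = len(self.data) - self.capacity
        if overflow > 0:
            del self.data[:overflow]

    def unreceived(self, peer_bytes_read):
        """Return the bytes the peer has not read, given its read count."""
        missing = self.bytes_sent - peer_bytes_read
        if missing < 0 or missing > len(self.data):
            raise ValueError("peer count is inconsistent with the buffer")
        return bytes(self.data[len(self.data) - missing:])


buf = InFlightBuffer(capacity=8)
buf.record_send(b"12345")
tail = buf.unreceived(peer_bytes_read=3)   # the peer has read 3 of 5 bytes
```

The two counters are what make recovery possible: the difference between bytes sent on one side and bytes read on the other identifies the data still in flight.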
Response to Failure • Each rock detects the failure within seconds. • Each rock suspends: • Close TCP socket • Block application • Attempt to reconnect [Diagram, five animation frames: a write is in progress when the TCP connection between 144.0.1.1 port 22 and 128.1.2.3 port 10000 fails. Both rocks keep their in-flight buffers while the kernel sockets are closed. In the last frame the end at 128.1.2.3 has a new IP address, 207.10.0.1.]
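The suspend steps listed on this slide can be read as a two-state machine. The sketch below is only an illustration under that reading; `RockState`, `Rock`, and the flag names are hypothetical and do not come from the rocks library.

```python
from enum import Enum


class RockState(Enum):
    CONNECTED = "connected"
    SUSPENDED = "suspended"


class Rock:
    """Hypothetical model of one end of a reliable socket."""

    def __init__(self):
        self.state = RockState.CONNECTED
        self.tcp_open = True          # the underlying TCP socket is usable
        self.app_blocked = False      # application I/O calls proceed normally
        self.reconnecting = False

    def on_failure_detected(self):
        """Suspend: close the TCP socket, block the app, start reconnecting."""
        self.tcp_open = False
        self.app_blocked = True
        self.reconnecting = True
        self.state = RockState.SUSPENDED

    def on_reconnected(self):
        """Resume once a new TCP connection is in place."""
        if self.state is not RockState.SUSPENDED:
            raise RuntimeError("rock is not suspended")
        self.tcp_open = True
        self.app_blocked = False
        self.reconnecting = False
        self.state = RockState.CONNECTED
```

Because the application is blocked rather than handed an error, its read and write calls simply take longer while the rock is suspended.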
Recovery • New TCP connection. • Authenticate. • Retransmit in-flight data not received by remote application. • Then resume the rock. [Diagram, four animation frames: a new TCP connection is established between 144.0.1.1 port 22 and 207.10.0.1 port 30001. Both rocks still hold their in-flight buffers. In the last frame the application on one host calls read again.]
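The recovery order above (authenticate, retransmit, resume) can be sketched as follows. The slides do not say how rocks authenticate, so the HMAC challenge-response with a pre-shared key used here is an assumption, as are the function names. For brevity the sender's copy of its data is a plain `bytes` value holding everything written so far.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Create a fresh random challenge for the reconnecting peer."""
    return secrets.token_bytes(16)


def answer_challenge(key, challenge):
    """Prove knowledge of the shared key (assumed scheme: HMAC-SHA256)."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify_answer(key, challenge, answer):
    """Check the peer's answer in constant time."""
    return hmac.compare_digest(answer_challenge(key, challenge), answer)


def recover(key, peer_key, sent, peer_bytes_read):
    """Authenticate the peer, then return the bytes to retransmit.

    `peer_key` stands in for the key held by the remote end; in a real
    exchange the answer would arrive over the new TCP connection.
    """
    challenge = make_challenge()
    answer = answer_challenge(peer_key, challenge)   # computed by the peer
    if not verify_answer(key, challenge, answer):
        raise PermissionError("peer failed authentication")
    # Everything past the peer's read count was in flight when the
    # connection failed and must be sent again before resuming.
    return sent[peer_bytes_read:]
```

Authenticating before retransmitting matters because the new connection may arrive from a different IP address, so the address alone cannot identify the peer.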
Reconnection • Initially the two ends are at 144.0.1.1 and 128.1.2.3. • A connection end moves to a new IP address (128.1.2.3 becomes 207.10.0.1). • Each end attempts to reconnect to its peer at its last known address; one of the attempts does not complete. • As long as one end does not move, they eventually reconnect. • They cannot reconnect if both ends move (144.0.1.1 also becomes 101.8.7.1): neither connection completes. • Network proxy: each end asks the proxy where its peer is ("Where is B?", "Where is A?").
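The "reconnect to the last known address" rule can be sketched as a retry loop. This shows only the connecting half; each end would also have to accept incoming connections, which is omitted. The function name, the retry parameters, and the injectable `connect` argument are assumptions for illustration, not the rocks API.

```python
import socket
import time


def reconnect(last_known_addr, attempts=5, delay=1.0, timeout=1.0,
              connect=socket.create_connection):
    """Try to reach the peer at its last known (host, port) address.

    Returns a connected socket, or None if every attempt fails, which is
    what happens when the peer has moved to an address we do not know.
    """
    for _ in range(attempts):
        try:
            return connect(last_known_addr, timeout=timeout)
        except OSError:
            time.sleep(delay)      # wait before the next attempt
    return None
```

If both ends have moved, every attempt on both sides fails and the loop returns None; that is the case the network proxy on this slide is meant to cover.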
Expanded Rocks API • API allows rocks-aware applications to control rocks behavior • Fine control of reconnection • Notification when rock is suspended • Manual control of reconnection addresses • Notification when rock is resumed
Expanded Rocks API • New socket options • Extended getsockopt and setsockopt • Policies • Which ports are excluded? • Parameters • Reconnection timeout • Sensitivity to connection failures
Performance • Reconnection latency • 1-2 seconds to reconnect • Usually less than time to acquire DHCP lease • Suspended rocks have negligible overhead
Conclusion • Rocks make sockets completely reliable • Protect from link failures and IP address changes • Use with any application • Our release is ready for download • Ready for remote shells and remote GUIs • http://www.cs.wisc.edu/~zandy/rocks • See the demo on Wednesday!
Detecting Failures • Users expect quick response to failures. • Heartbeat: • Periodically send heartbeat to peer • Watch for too many missed heartbeats • Sockets API Errors: • Too slow to rely upon • Not reported for idle connections
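A heartbeat detector of the kind described above can be sketched as a counter of missed intervals. The class name, the interval, and the threshold are illustrative assumptions; the slides do not give the values rocks uses. Times are passed in explicitly so the logic does not depend on a particular clock.

```python
class HeartbeatMonitor:
    """Hypothetical detector: declare failure after too many missed beats."""

    def __init__(self, interval, max_missed, start):
        self.interval = interval      # seconds between expected heartbeats
        self.max_missed = max_missed  # tolerated number of missed heartbeats
        self.last_seen = start        # time the last heartbeat arrived

    def heartbeat_received(self, now):
        """Record that the peer's heartbeat arrived at time `now`."""
        self.last_seen = now

    def missed(self, now):
        """Number of whole intervals that have passed without a heartbeat."""
        return int((now - self.last_seen) // self.interval)

    def peer_failed(self, now):
        """True once more than `max_missed` heartbeats are overdue."""
        return self.missed(now) > self.max_missed
```

With a one-second interval and a threshold of three, a silent peer is reported after about four seconds, which matches the goal of detecting failures within seconds even on an idle connection.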
Detecting Failures • The TCP keep-alive probe is inadequate • It waits two hours to send its first probe • User cannot change its period