DECUS Europe 2000 DHCP Failover Protocol Thursday, 13 Apr 2000 9:00 - 9:45 Jeff Schreiber schreiber@process.com
Outline
• DHCP Basic Operation
• Existing forms of Redundancy
• Requirements for Failover Redundancy
• Problems, Goals, and Limitations
• How it works
Redundancies
• DNS: both Primary and Secondary
• Hardware configurations
• But only makeshift redundancies for DHCP
Basically, how DHCP works
• Client: DHCPDISCOVER
• Server: DHCPOFFER
• Client: DHCPREQUEST
  • or DHCPDECLINE
• Server: DHCPACK
  • Lease time
  • IP address
  • DNS server & other information
Basically, how DHCP works
• Client goes into a “bound” state and starts the T1 and T2 timers
  • T1 is 1/2 of the lease time.
  • T2 is 87.5% of the lease time (the RFC 2131 default).
• At T1, the client sends a (unicast) DHCPREQUEST to renew the lease.
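The timer arithmetic can be sketched as follows. This is a minimal illustration, assuming the RFC 2131 default factors of 0.5 for T1 and 0.875 for T2; the names `LeaseTimers` and `lease_timers` are hypothetical, and a server may override both timers with explicit options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaseTimers:
    """Seconds after the lease starts at which each event fires."""
    t1: float      # client starts renewing (unicast DHCPREQUEST)
    t2: float      # client starts rebinding (broadcast DHCPREQUEST)
    expiry: float  # lease ends


def lease_timers(lease_seconds: float,
                 t1_factor: float = 0.5,
                 t2_factor: float = 0.875) -> LeaseTimers:
    """Derive T1 and T2 from the lease time (assumed default factors)."""
    if lease_seconds <= 0:
        raise ValueError("lease time must be positive")
    return LeaseTimers(t1=lease_seconds * t1_factor,
                       t2=lease_seconds * t2_factor,
                       expiry=lease_seconds)


# A one-day lease: renew after 12 hours, rebind after 21 hours.
print(lease_timers(24 * 3600))
```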
Basically, how DHCP works
• 3 things can happen when the client tries to renew:
  • Server says ‘No’ (DHCPNAK): the client gives up the address.
  • No response: the client keeps trying until T2, when it sends out a broadcast.
  • Client gets the renewal as desired: the new lease starts here.
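The three outcomes can be written as a small decision function. This is only a sketch of the client-side logic described on the slide; `Reply` and `renewal_action` are hypothetical names, and the 0.875 factor for T2 is the assumed RFC 2131 default.

```python
from enum import Enum


class Reply(Enum):
    ACK = "DHCPACK"
    NAK = "DHCPNAK"
    NONE = "no response"


def renewal_action(reply: Reply, elapsed: float, lease: float,
                   t2_factor: float = 0.875) -> str:
    """Decide what a client does after a renewal attempt.

    elapsed and lease are seconds since the lease started.
    """
    if reply is Reply.ACK:
        return "renewed: new lease starts now"
    if reply is Reply.NAK:
        return "refused: give up the address"
    # No response from the server.
    if elapsed >= lease:
        return "expired: give up the address"
    if elapsed >= lease * t2_factor:
        return "rebinding: broadcast DHCPREQUEST"
    return "renewing: retry unicast DHCPREQUEST"


day = 24 * 3600
print(renewal_action(Reply.NONE, elapsed=13 * 3600, lease=day))
print(renewal_action(Reply.NONE, elapsed=22 * 3600, lease=day))
```

The second call falls after T2 (21 hours for a one-day lease), so the client switches from unicast to broadcast; that broadcast is what lets a different server answer.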
Basically, how DHCP works
• Clients on a different network from the server use a DHCP relay, which forwards DHCP communications from one subnet to the other.
Existing forms of DHCP redundancy
• 2 DHCP servers, both active at the same time.
  • No synchronization or communication between the servers
  • 2 disjoint address pools
• Inefficient
  • Wastes addresses
  • Increases network resource usage: both servers respond to clients
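To make the "wastes addresses" point concrete, here is a minimal sketch of splitting one range into two disjoint pools, one per server. The function names and the 192.0.2.x documentation range are illustrative assumptions, not taken from the slides.

```python
import ipaddress


def split_pool(first: str, last: str):
    """Split an inclusive IPv4 range into two disjoint halves."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    if end <= start:
        raise ValueError("range must contain at least two addresses")
    mid = start + (end - start) // 2
    pool_a = (ipaddress.IPv4Address(start), ipaddress.IPv4Address(mid))
    pool_b = (ipaddress.IPv4Address(mid + 1), ipaddress.IPv4Address(end))
    return pool_a, pool_b


def pool_size(pool) -> int:
    """Number of addresses in an inclusive (first, last) pool."""
    return int(pool[1]) - int(pool[0]) + 1


pool_a, pool_b = split_pool("192.0.2.10", "192.0.2.109")
print("server A:", pool_a[0], "to", pool_a[1], "size", pool_size(pool_a))
print("server B:", pool_b[0], "to", pool_b[1], "size", pool_size(pool_b))
```

With 100 addresses split this way, each server can serve at most 50 clients, so when one server is down half of the range is unusable even though the addresses are free.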
Existing forms of DHCP redundancy
• Brute force:
  • Have a standby server and periodically save the lease database.
  • Performance problems.
  • Possibility of issuing one address to two clients.
• Proprietary primary/backup solutions
  • Do not provide “safe” failover (1 address can be given to two clients).
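The duplicate-address risk of the brute-force approach can be reproduced in a few lines. This is a toy model under stated assumptions: `LeaseDB` is a hypothetical in-memory lease database, the snapshot is a plain deep copy, and the addresses come from the 192.0.2.x documentation range.

```python
import copy


class LeaseDB:
    """Toy lease database: hands out free addresses in order."""

    def __init__(self, pool):
        self.free = list(pool)
        self.leases = {}  # address -> client id

    def allocate(self, client: str) -> str:
        addr = self.free.pop(0)
        self.leases[addr] = client
        return addr


primary = LeaseDB(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
primary.allocate("client-a")

# Periodic save: the standby receives a copy of the lease database.
standby = copy.deepcopy(primary)

# The primary keeps working after the snapshot, then crashes.
addr_b = primary.allocate("client-b")

# The standby takes over with stale data and reuses the same address.
addr_c = standby.allocate("client-c")

print(addr_b, addr_c, "duplicate!" if addr_b == addr_c else "ok")
```

Any lease granted between the last save and the crash is invisible to the standby, which is the window the failover protocol is meant to close.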
Requirements for Failover servers
• Cannot give two clients the same address.
• The secondary should be able to take over for the primary.
• Do not change the fundamental way that DHCP works.
  • Do not change the client.
  • The server can change (albeit slightly).
• The client gives up the lease when told to, or at the end of the lease if it does not get renewed.
Things to address
• How, and when, does the primary server update the secondary?
• A client in INIT-REBOOT may hold an address that the answering server has no record of. This can happen if the client gets its first address while the primary cannot talk to the secondary, and then reboots.
Things to address
• Server updates require stable storage to work reliably. We do not want to add significantly to the time this already takes.
• Clients may not be on the same network as the servers, so the DHCP relay must be able to forward requests to more than one server.
Problems to be aware of
• The primary crashes before it can update the secondary. The secondary has no record of the primary's allocation (DHCPACK).
• The primary and secondary cannot talk to each other, but clients can see both (network partitioning).
• Keepalives, inherent to TCP connections, are used to make sure that the secondary is there.
Problems to be aware of
• A TCP connection (as opposed to UDP) can take up to 9 minutes to time out. This usually cannot be changed, and it is too long for DHCP.
• Result: TCP is useful for reliable message delivery, but cannot be depended upon to detect server failure.
Goals (continued)
• Must work with existing clients
• Must work with existing BOOTP relay agents
• Must provide failover redundancy between servers that are not located on the same subnet
• Provide service to DHCP clients in the event of primary server failure
Goals (continued)
• Avoid binding (giving) an address to a client when another client already has it (no duplicate addresses).
• Minimize the need for manual administrative intervention.
• Impose no additional client delays as a result of primary-backup communications.
• Share IP pools between primary and secondary servers.
Goals (continued)
• Handle partitioned networks.
• Resynchronize without operator intervention when the primary failure is corrected.
• Enable one server to be secondary to many primary servers.
• Allow proper lease renewal from either server.
Goals (continued)
• If either server loses all of the information that it has stored in stable storage, it should be able to refresh from the other server.
Limitations
• Only one secondary server.
• Have a subset of addresses that only the secondary can hand out.
• Neither server hands out addresses while recovering from a failure.
MCLT
• Maximum Client Lead Time
• A “lease time” known to both the secondary and primary servers.
• Places an upper bound on the difference allowed between the lease time given to a client by a server and the lease time known by the other server.
• Is much less than the “real” lease time.
MCLT
• Tell the client what the other server knows, plus MCLT.
• Tell the other server what the client wanted (or what the client was supposed to get) plus 1/2 of what it got.
• Don’t give the client more than what it asked for (or what it was supposed to get).
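The three rules above can be sketched as one function. This is a hedged reading of the slide, not the protocol's reference algorithm: `failover_lease` and its parameters are hypothetical names, and "what the other server knows" is interpreted here as the time still remaining on the lease the partner has on record (zero for a new client).

```python
from datetime import timedelta


def failover_lease(desired: timedelta,
                   partner_remaining: timedelta,
                   mclt: timedelta):
    """Return (lease given to the client, lease reported to the partner).

    desired           -- what the client asked for or was supposed to get
    partner_remaining -- time left on the lease the partner knows about
                         (assumed interpretation; zero for a new client)
    mclt              -- Maximum Client Lead Time
    """
    partner_remaining = max(partner_remaining, timedelta(0))
    # Rule 1 and rule 3: partner's knowledge plus MCLT, capped at desired.
    client_lease = min(desired, partner_remaining + mclt)
    # Rule 2: what the client wanted plus half of what it got.
    partner_update = desired + client_lease / 2
    return client_lease, partner_update


# New client, 1-day configured lease, 1-hour MCLT.
client, partner = failover_lease(desired=timedelta(days=1),
                                 partner_remaining=timedelta(0),
                                 mclt=timedelta(hours=1))
print("client gets:", client)
print("partner is told:", partner)
```

For a brand-new client this yields a 1-hour lease for the client and 1 day + 1/2 hour for the partner, matching the first exchange on the "Practical Use" slide. The short first lease is what keeps the client from ever running further ahead of the partner's records than the MCLT.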
Practical Use
(Sequence diagram: Client, Primary DHCP Server, Secondary DHCP Server)
• DHCPREQUEST: client lease 1 hour (MCLT); update to secondary 1 day + 1/2 hour
• 1/2 hour later, renew request: client lease 1 day (Lease); update to secondary 1 day + 1/2 hour
• 1/2 day later, renew request: client lease 1 day (Lease); update to secondary 1 day + 1/2 hour
Practical Use
(Sequence diagram: Client, Primary DHCP Server, Secondary DHCP Server; the primary fails partway through)
• DHCPREQUEST: client lease 1 hour (MCLT); update to secondary 1 day + 1/2 hour
• 1/2 hour later, renew request: no answer
• Request broadcast: client lease 1 hour (MCLT)
• Primary returns: “I’m Back”
• Secondary replies: “Here’s what I’ve done”
Questions That’s all folks… Any Questions?
Getting the Slides Slides available via anonymous FTP: ftp://ftp.process.com/decus/europe_2000/dhcp_failover.ppt