Distributed Synchronization • In single CPU systems • Semaphores and monitors • Essentially shared memory solutions • How about distributed synchronization? • Relevant information is scattered • Processes make decisions based on local information • A single point of failure in a system should be avoided • No common clock or other precise global time source exists ICSS741 - Time and Coordination
What Is Needed • In order to coordinate events in a distributed system • We may need to know the time at which a particular event took place • We may need to determine the order in which two events took place, or should take place, without respect to the time they actually occur • Synchronization requires either • Global time • Global ordering
Time Synchronization • Is it possible to synchronize all clocks to produce a single, unambiguous time standard? • Time synchronization need not be absolute • What usually matters is that processes agree on the order in which events occur (not necessarily the time at which they occur)
Time and Coordination • Basically two problems with time • External synchronization • Synchronize clocks with an authoritative external source of time • Internal synchronization • The internal consistency of the clocks is what matters, not whether they are close to the real time
Astronomical Time • Since the 17th century time has been measured astronomically • The event of the sun reaching the highest point in the sky is called the transit of the sun • The interval between two consecutive transits of the sun is called a solar day • In the 1940s, it was established that the earth’s rotation is not constant • The earth is spinning slower • 300 million years ago there were about 400 days per year
Atomic Time • The atomic clock was invented in 1948 • One second is the time it takes the cesium 133 atom to make 9,192,631,770 transitions • Currently about 50 cesium-133 clocks exist • Periodically they are averaged to produce international atomic time (TAI) • The Bureau International de l’Heure (BIH) maintains the official clock
Leap Seconds • Currently, 86,400 TAI seconds is about 3 msec less than a mean solar day • Not a problem until noon becomes 6am • BIH solves the problem by inserting leap seconds to compensate for the difference • Leap seconds are added whenever the discrepancy grows to 800 msec • Power companies will increase their frequencies to compensate • UTC (Coordinated Universal Time) is the result
Obtaining Accurate Time • UTC is an international standard for the current time • WWV shortwave radio from Fort Collins (accuracy 0.1 – 10 milliseconds) • GOES satellites (0.1 milliseconds) • GPS satellites (1 millisecond)
Physical Clocks • Computers each contain their own physical clock • Timer might be a better word… • They use a crystal that oscillates at a known frequency • A count of the oscillations is maintained • Software typically takes this count, divides it down, and stores it as a number in a register • Most systems provide date/time from the counter • Ordering events, in a single machine, with such a clock is easy • Provided the clock resolution is fine enough
Clock Drift • Crystal-based clocks are subject to drifting • Drift rate: the change in the offset between the clock and a nominal perfect reference clock per unit of time measured by the reference clock • Typical drift rates • Quartz crystals – 10^-6 (a difference of about one second every 1,000,000 seconds, or 11.6 days) • Atomic clocks – 10^-13
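The quartz figure above can be checked with a few lines of arithmetic. This is only a sketch; the function name `seconds_until_offset` is made up for illustration and is not part of the slides.

```python
SECONDS_PER_DAY = 86_400

def seconds_until_offset(drift_rate, offset_seconds=1.0):
    """Reference-clock seconds needed for a clock with the given
    drift rate to gain or lose offset_seconds."""
    return offset_seconds / drift_rate

# Quartz crystal, drift rate 10^-6: days until the clock is off by one second.
quartz_days = seconds_until_offset(1e-6) / SECONDS_PER_DAY

# Atomic clock, drift rate 10^-13: same question, expressed in years.
atomic_years = seconds_until_offset(1e-13) / (SECONDS_PER_DAY * 365.25)

print(f"quartz: one second of drift after about {quartz_days:.1f} days")
print(f"atomic: one second of drift after about {atomic_years:,.0f} years")
```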
External Synchronization • Let's say you have access to a UTC time source • Assume the machine has a timer that causes an interrupt H times a second • Current clock value is C • When UTC time is t, the value of the clock on machine p is Cp(t) • Ideally Cp(t) = t for all p and t (dC/dt should be 1)
Maximum Drift Rate • [Figure: clock time C plotted against UTC t] • Fast clock: dC/dt > 1 • Perfect clock: dC/dt = 1 • Slow clock: dC/dt < 1
Synchronizing Physical Time • What exactly does it mean to synchronize two clocks? • Clocks inherently suffer from drifting • Assuming clocks can always be precisely synchronized is unrealistic • Define an acceptable range for the difference in time reported by two clocks (clock skew) • A distributed physical clock synchronization service defines, and maintains, a maximum skew throughout the system.
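One way to turn a maximum skew into a resynchronization schedule is sketched below. It relies on an assumption that the slides do not state: two clocks may drift in opposite directions at the maximum drift rate, so they can move apart at up to twice that rate. The name `resync_interval` is hypothetical.

```python
def resync_interval(max_skew, max_drift_rate):
    """Longest time (seconds) two clocks can go without resynchronizing
    while staying within max_skew of each other.

    Assumption: in the worst case the clocks drift in opposite
    directions, so they separate at 2 * max_drift_rate.
    """
    return max_skew / (2.0 * max_drift_rate)

# Quartz clocks (drift rate 10^-6) that must agree to within 1 millisecond.
interval = resync_interval(max_skew=0.001, max_drift_rate=1e-6)
print(f"resynchronize at least every {interval:.0f} seconds")
```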
The Basic Algorithm • A wants to read B’s clock • 1. A sends a request to B • 2. B records its current clock value • 3. The clock value is sent back to A • 4. B’s clock value is adjusted to reflect travel time • 5. B’s clock value can now be compared to A’s • Step 4 is difficult to implement accurately
Interesting Question • So you have to adjust your time • Your clock is slow – move it ahead • Your clock is fast – move it back? • Implementations • Slow down your clock so it will continually move towards the real time • Speed up your clock so it will move towards the real time • Just move your clock ahead to the real time
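The "slow down" and "speed up" options can be sketched as a clock that spreads a correction over many timer ticks instead of jumping, so the reported time never runs backwards. The class `SlewedClock`, the 10 ms tick, and the 1 ms per-tick limit are all illustrative assumptions, not values from the slides.

```python
class SlewedClock:
    """Software clock that absorbs corrections gradually (hypothetical sketch)."""

    def __init__(self, start=0.0):
        self.value = start    # current clock reading, in seconds
        self.pending = 0.0    # correction that still has to be applied

    def adjust(self, correction):
        """Request a correction: positive if the clock is slow, negative if fast."""
        self.pending += correction

    def tick(self, interval=0.010, max_slew=0.001):
        """Advance by one timer interrupt.

        Each tick normally adds `interval`. While a correction is pending,
        the tick adds slightly more or less, limited to max_slew per tick.
        Because max_slew < interval, the clock always moves forward.
        """
        step = max(-max_slew, min(max_slew, self.pending))
        self.pending -= step
        self.value += interval + step
        return self.value

clock = SlewedClock(start=100.0)
clock.adjust(-0.005)                        # clock is 5 ms fast
readings = [clock.tick() for _ in range(10)]
```

After ten ticks the clock has advanced 95 ms instead of 100 ms, and every reading is larger than the one before it.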
Cristian’s Algorithm • One machine knows the true time • Periodically each machine sends a request for the current time • [Figure: the client sends a request at T0; the time server spends I (interrupt handling time) and replies with its time CUTC; the reply arrives at T1. T0 and T1 are measured with the same clock]
Transit Time • Estimating propagation time • ( T1 – T0 ) / 2 • ( T1 – T0 – I ) / 2 • If minimum possible propagation delay is known, the estimate can be made better • Accuracy can be improved by taking several measurements • Any measurement in which T1 – T0 exceeds a threshold is discarded (congestion)
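The transit-time estimate and the threshold filter can be sketched as follows. The function names and the choice to average the surviving measurements are assumptions made for illustration; the slides only say that slow measurements are discarded.

```python
def cristian_offset(t0, t1, server_time, interrupt_time=0.0):
    """Correction to add to the local clock after one request/reply exchange.

    t0, t1      : local clock when the request left and the reply arrived
    server_time : time reported by the server (CUTC)
    """
    one_way = (t1 - t0 - interrupt_time) / 2.0    # (T1 - T0 - I) / 2
    estimated_now = server_time + one_way         # server time when the reply lands
    return estimated_now - t1

def filtered_offset(samples, threshold):
    """Average the offsets of all samples whose round trip T1 - T0 is within threshold."""
    kept = [s for s in samples if s[1] - s[0] <= threshold]
    if not kept:
        raise ValueError("every measurement exceeded the threshold")
    offsets = [cristian_offset(t0, t1, cutc) for (t0, t1, cutc) in kept]
    return sum(offsets) / len(offsets)

# Three exchanges as (T0, T1, CUTC); the second suffered congestion (0.5 s round trip).
samples = [(0.0, 0.020, 5.000), (1.0, 1.500, 6.100), (2.0, 2.020, 7.000)]
offset = filtered_offset(samples, threshold=0.1)
```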
ICMP Timestamp Request/Reply • Message fields, in order: Type (13 for a request, 14 for a reply), Code (0), Checksum, Identifier, Sequence Number, 32-bit originate timestamp, 32-bit receive timestamp, 32-bit transmit timestamp • Timestamps taken from the same clock can be subtracted to give an accurate round-trip time (rtt)
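A minimal sketch of the message layout using Python's `struct` module. It only builds and parses the 20-byte message in memory and sends nothing. RFC 792 assigns type 13 to the timestamp request and type 14 to the reply, and defines the timestamps as milliseconds since midnight UT. The function names are hypothetical.

```python
import struct

ICMP_TIMESTAMP_REQUEST = 13
ICMP_TIMESTAMP_REPLY = 14
LAYOUT = "!BBHHHIII"   # type, code, checksum, id, seq, originate, receive, transmit

def internet_checksum(data):
    """16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_timestamp_request(identifier, sequence, originate_ms):
    """Pack a timestamp request; receive and transmit fields are left at zero."""
    fields = (identifier, sequence, originate_ms, 0, 0)
    unchecked = struct.pack(LAYOUT, ICMP_TIMESTAMP_REQUEST, 0, 0, *fields)
    checksum = internet_checksum(unchecked)
    return struct.pack(LAYOUT, ICMP_TIMESTAMP_REQUEST, 0, checksum, *fields)

def parse_timestamp_message(packet):
    """Unpack the first 20 bytes of an ICMP timestamp request or reply."""
    names = ("type", "code", "checksum", "identifier", "sequence",
             "originate", "receive", "transmit")
    return dict(zip(names, struct.unpack(LAYOUT, packet[:20])))

# 10:00:00 UT expressed as milliseconds since midnight.
packet = build_timestamp_request(identifier=1, sequence=1, originate_ms=36_000_000)
message = parse_timestamp_message(packet)
```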
The Berkeley Algorithm • Time server is active, and polls each machine periodically for its time • Based on the answers, an average time is computed • A fault-tolerant average is used • Machines are then told to slow down, or speed up their clocks • Suitable for systems where no UTC source is available
Berkeley Algorithm • [Figure: worked example. The time server's current time is 740. Machine A reports 720 over a link with network delay 10, giving adjusted time A = 730. Machine B reports 737 over a link with network delay 5, giving adjusted time B = 742. The average is 737, and A is told to move its clock forward 7]
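The averaging step can be sketched with the numbers from the worked example (server at 740, A reporting 720 with delay 10, B reporting 737 with delay 5). The slides do not say how the fault-tolerant average is computed; dropping readings that lie too far from the median is one possible choice, used here purely as an assumption. The function name and the `max_deviation` parameter are hypothetical.

```python
from statistics import median

def berkeley_adjustments(server_time, reported, max_deviation=None):
    """Return (average, corrections) for one polling round.

    reported      : {machine: (clock_value, network_delay)}
    max_deviation : if given, readings further than this from the median
                    are left out of the average (assumed fault-tolerance rule)
    """
    adjusted = {"server": server_time}
    for name, (value, delay) in reported.items():
        adjusted[name] = value + delay           # compensate for travel time

    values = list(adjusted.values())
    if max_deviation is not None:
        mid = median(values)
        values = [v for v in values if abs(v - mid) <= max_deviation]

    average = sum(values) / len(values)
    corrections = {name: average - v for name, v in adjusted.items()}
    return average, corrections

average, corrections = berkeley_adjustments(740, {"A": (720, 10), "B": (737, 5)})
print(round(average), round(corrections["A"]))   # average 737, A moves forward 7
```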
Network Time Protocol • NTP is used to synchronize the time of a computer client to another server or reference time source • Client accuracies are typically within a millisecond on LANs and up to a few tens of milliseconds on WANs • NTP configurations utilize multiple redundant servers and diverse network paths in order to achieve high accuracy and reliability • Configurations can use authentication to prevent accidental or malicious protocol attacks
NTP Strata • Primary servers (stratum 1) sync to a UTC source • Secondary servers (stratum 2) sync to primary servers • Workstations
USNA NTP Time Servers
Rules of Engagement • Clients should avoid using the primary servers whenever possible • In most cases the accuracy of the NTP secondary (stratum 2) servers is only slightly degraded relative to the primary servers • As a group, the secondary servers may be just as reliable
When to Use a Primary • As a general rule, a secondary server should synchronize to a primary server only when • The secondary server provides synchronization to a sizable population of other servers and clients • The server operates with at least two and preferably three other secondary servers in a common synchronization subnet • The administration(s) that operates these servers coordinates other servers within the region, in order to reduce the resources required outside that region • In order to ensure reliability, clients should spread their use over many different servers
NTP Servers • http://www.ntp.org (home page for NTP) • List of Primary Servers (100) • http://www.eecis.udel.edu/~mills/ntp/clock1.htm • List of Secondary Servers (110) • http://www.eecis.udel.edu/~mills/ntp/clock2.htm • Our server • timehost.cs.rit.edu
Synchronization Modes • Servers synchronize in one of three modes • Multicast • Used on high speed LANs • Servers periodically broadcast their time • Low accuracies, but efficient • Procedure-call • Similar to the operation of Cristian’s algorithm • Symmetric • Used by master servers • Pairs of servers exchange information • Timing data is retained in order to improve accuracy
NTP Design Goals • The four primary design goals of NTP are • Allow accurate UTC synchronization • Enable survival despite significant losses of connectivity • Allow frequent resynchronization • Protect against malicious or accidental interference
Accurate Synchronization • NTP provides the following information relative to the primary server • Clock offset • Difference between the two clocks • Round-trip delay • Total transmission time for the messages • Dispersion • Offsets are predicted • Dispersion is a measure of how much the prediction differs from what was reported • Large dispersion values indicate inaccuracy
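Clock offset and round-trip delay can be computed from the four timestamps of one request/reply exchange. The formulas below are the standard NTP on-wire calculations rather than something given in the slides, and the function name and the labels t1 to t4 are chosen here for illustration. Dispersion is left out of the sketch.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one exchange.

    t1 : client clock when the request is sent
    t2 : server clock when the request is received
    t3 : server clock when the reply is sent
    t4 : client clock when the reply is received
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # how far the server is ahead of the client
    delay = (t4 - t1) - (t3 - t2)            # time on the wire, excluding server processing
    return offset, delay

# The server is 5 s ahead, each direction takes 10 ms, the server spends 2 ms on the request.
offset, delay = ntp_offset_delay(t1=100.000, t2=105.010, t3=105.012, t4=100.022)
```

The offset estimate is exact only when the two directions take equal time; any asymmetry between them shows up as error in the offset.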
Logical Clocks • Since physical clocks cannot be perfectly synchronized across a distributed system • Physical time cannot be used to determine the order in which events occur • Logical clocks can be used to order events within a distributed system • The essence of a logical clock is the happens-before relationship
Happens-Before • The happens-before relationship is denoted a → b • If a and b are events in the same process, and a occurs before b, then a → b • If a is the event of a message being sent by one process, and b is the event of the message being received by another process, then a → b • If a → b and b → c, then a → c • Any two events that are not in a happens-before relationship are concurrent
Events
Lamport’s Logical Clock • To obtain logical ordering, timestamps that are independent of physical clocks are used • Lamport clocks follow these rules • Each process increments its clock between every two consecutive events • If a sends a message to b, the message includes T(a). Upon receipt, b sets its clock to the greater of T(a)+1 and the current clock
Lamport’s Algorithm • [Figure: three processes whose clocks tick by 1, 8, and 10 per interval exchange messages a through f. Without correction the messages are stamped A(6), B(24), C(50), D(60), E(64), F(10). With Lamport's correction the receiving clocks jump forward (from 48 to 61 on the second process, from 8 to 70 on the first) and the last message is stamped F(71)]
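The two Lamport rules can be sketched as a small class. The class and method names are illustrative. On receipt the clock first ticks (a receive is an event like any other) and is then raised to T(a)+1 if that is larger, which matches the rule of taking the greater of T(a)+1 and the current clock.

```python
class LamportClock:
    """Logical clock for a single process (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Rule 1: increment the clock between consecutive events."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, message_time):
        """Rule 2: the receive event is stamped later than the send event."""
        self.tick()
        self.time = max(self.time, message_time + 1)
        return self.time

p, q = LamportClock(), LamportClock()
for _ in range(5):
    p.tick()                 # five local events on p
stamp = p.send()             # p's clock is now 6
arrival = q.receive(stamp)   # q jumps from 0 to 7
```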
Partial Ordering • If a → b then L(a) < L(b) • Note that • L(d) < L(e) • Does not imply that • d → e • Since d and e might be concurrent • Plus, for concurrent events a and b, L(a) might equal L(b)
Example
Total Ordering • Between every two events the clock must tick at least once • No two events may carry the same timestamp, so attach the process number to the low-order end of the time, separated by a decimal point • Now • If a happens before b in the same process, C(a) < C(b) • If a and b represent the sending and receiving of a message, C(a) < C(b) • For all distinct events a and b, C(a) is not equal to C(b)
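The tie-breaking idea can be sketched with tuples. Instead of writing the process number after a decimal point, the sketch keeps the pair (Lamport time, process number) and compares pairs, which is an equivalent representation chosen here for illustration. The event names and values are made up.

```python
def total_order_key(lamport_time, process_id):
    """Timestamp with the process number in the low-order position."""
    return (lamport_time, process_id)

# (event name, Lamport time, process number); events a and b share time 40.
events = [("b", 40, 2), ("a", 40, 1), ("c", 39, 3)]
ordered = sorted(events, key=lambda e: total_order_key(e[1], e[2]))
names = [name for name, _, _ in ordered]
```

Tuples avoid an ambiguity of the decimal notation once there are ten or more processes: written as decimals, 40.1 and 40.10 are the same number.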
Vector Clocks • Vector clocks were designed to overcome the shortcomings of Lamport’s clocks • A vector clock is an array of times • The rules: • Initially, Vi[j] = 0, for i, j = 1, 2, …, N • Just before pi timestamps an event, it increments Vi[i] • pi includes the value t = Vi in every message it sends • When pi receives a timestamp in a message, it takes the component-wise maximum of the two vector timestamps
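The four rules can be sketched as a class. Names are illustrative, and processes are numbered from 0 rather than 1 to match Python indexing. One detail is an assumption: the sketch treats the receipt itself as an event, so it merges first and then increments the receiver's own entry.

```python
class VectorClock:
    """Vector clock held by process i out of n processes (illustrative sketch)."""

    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n                  # rule 1: all entries start at 0

    def event(self):
        """Rule 2: increment the own entry just before timestamping an event."""
        self.v[self.i] += 1
        return list(self.v)

    def send(self):
        """Rule 3: the returned timestamp t = Vi is attached to the message."""
        return self.event()

    def receive(self, t):
        """Rule 4: component-wise maximum, then timestamp the receive event."""
        self.v = [max(own, other) for own, other in zip(self.v, t)]
        return self.event()

p0, p1 = VectorClock(3, 0), VectorClock(3, 1)
sent = p0.send()             # [1, 0, 0]
received = p1.receive(sent)  # [1, 1, 0]
```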
Example
Comparing Timestamps • Vector timestamps are compared as follows • V = V’ iff V[j] = V’[j] for j = 1, 2, …, N • V <= V’ iff V[j] <= V’[j] for j = 1, 2, …, N • V < V’ iff V <= V’ and V != V’ • So what? • If V(e) < V(e’) then e → e’ • c and e are concurrent since neither V(c) <= V(e) nor V(e) <= V(c)
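The comparison rules translate directly into code. Timestamps are plain Python lists of equal length; the function names and sample values are illustrative and are not taken from the example slide.

```python
def vc_leq(v, w):
    """V <= V' iff every component of V is <= the matching component of V'."""
    return all(a <= b for a, b in zip(v, w))

def vc_less(v, w):
    """V < V' iff V <= V' and V != V'."""
    return vc_leq(v, w) and v != w

def concurrent(v, w):
    """Two events are concurrent when neither timestamp is <= the other."""
    return not vc_leq(v, w) and not vc_leq(w, v)

before = vc_less([1, 0, 0], [2, 1, 0])      # first event happened before the second
unrelated = concurrent([2, 0, 0], [0, 0, 1])
```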