Data Replication CS 188 Distributed Systems February 3, 2015

Some Other Possibilities • What if the machines sharing files are portable and not always connected? • What if the machines communicate across the Internet? • What if the load on some files is too heavy for a single machine?

An Answer to These Questions • Replicate the data • Keep multiple copies of the data on different machines • Depending on details, make different copies available for different purposes

How Does This Help? • What if the machines sharing files are portable and not always connected? • Put a replica of the data on the portable machine • What if the machines communicate across the Internet? • Avoid expensive cross-Internet traffic by having replicas on both sides • What if the load on some files is too heavy for a single machine? • Share the load among multiple replicas

Other Replication Advantages • Reliability • If one machine fails, replicas of its data might be elsewhere • Flexibility • Easier to assign data workloads to storage resources

The Replication Concept [Figure: several identical copies of a document that begins “When in the course of human events it becomes necessary for one people to . . .”] • There is a conceptual object (like a file) • We keep more than one physical copy of it • Maybe several • Each copy is meant to be a full representation of the object • So accessing any should be the same as accessing any other

Replication and Caching • The two are obviously similar • Caching usually implies it’s temporary • Replication usually implies it’s permanent • Caching is usually for local use only • Replication is usually for more general use • These distinctions are not actually binary, though • Permanent isn’t always really permanent • Some caches service multiple machines

There Are Some Differences • For example, invalidation on write is feasible for cached data • It isn’t feasible for replicated data • One can always throw away a cached copy of data (modulo local needs) • One can’t always throw away a replica • Especially the only one

Replication and Reading • If the data is read-only, the replication problem is easy • IF . . . • The problems arise if the data is ever written • Life then becomes much more complicated

Read-Only Replication • Merely ensure that all copies start off the same • They never change • Accessing any copy is as good as accessing any other • Still a problem of finding and choosing replicas to access

Read-Only Data and Metadata • Usually we treat file metadata as part of the file • Maybe the data is read only • But is the metadata? • How about access permissions? • How about access time? • If metadata can be updated, you still have issues

Choosing Read-Only Replicas • Mostly a performance question • Which one is “closest?” • Which one is “least loaded?” • Initial placement might make a big difference • And what if replicas can move?
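A minimal sketch of one way to choose among read-only replicas, assuming each replica reports a round-trip time (for “closest”) and a load figure (for “least loaded”). The `ReadReplica` class, its field names, and the cost formula are hypothetical; the slides do not prescribe a selection policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadReplica:
    name: str
    rtt_ms: float  # assumed measure of "closeness"
    load: float    # assumed utilization in [0, 1]


def choose_replica(replicas, load_weight_ms=100.0):
    """Return the replica with the lowest combined cost.

    cost = rtt_ms + load_weight_ms * load  (a hypothetical formula)
    """
    if not replicas:
        raise ValueError("no replicas available")
    return min(replicas, key=lambda r: r.rtt_ms + load_weight_ms * r.load)


replicas = [
    ReadReplica("near-but-busy", rtt_ms=5.0, load=0.9),  # cost 5 + 90 = 95
    ReadReplica("far-but-idle", rtt_ms=40.0, load=0.1),  # cost 40 + 10 = 50
]
best = choose_replica(replicas)
nearest = choose_replica(replicas, load_weight_ms=0.0)  # ignore load entirely
```

Changing `load_weight_ms` shifts the policy between “closest” and “least loaded”, which is the trade-off the slide raises.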

Varying Read-Only Replication Factors • We can add or delete read-only replicas easily • Some issues regarding open files • When should we add a replica? • When should we delete a replica? • When should we move a replica to a different location?

Replication and Writing • Life becomes complicated when you write replicated data • Physically the write occurs at one copy • Logically the write should be applied to all copies • Going from the physical reality to the logical goal is challenging

Illustrating the Problem [Figure: yellow and blue replicas; the original text is “When in the course of human events it becomes necessary for one people to . . .” and the newly written text is “Four score and seven years ago, our forefathers brought forth . . .”] • We write to the yellow replica • The yellow and blue replicas should be the same, but they aren’t • What do we do? • Problem solved! But . . .

A Fly in the Ointment [Figure: the written replica now reads “Four score and seven years ago, our forefathers brought forth . . .”, while the other replica still reads “When in the course of human events it becomes necessary for one people to . . .”] • We’ve gotten ourselves into this state • What if the writer’s next access is to the other replica?

A Worse Situation [Figure: the same two replicas, one reading “When in the course of human events . . .” and the other “Four score and seven years ago . . .”] • What if someone else reads the other copy?

An Even Worse Situation [Figure: one replica reads “Four score and seven years ago, our forefathers brought forth . . .”; the other, originally “When in the course of human events . . .”, is written with “Ask not what your country can do for you, but what you can do for your country”] • What if someone else writes the other copy?

These Situations Arose Before Distributed Computing • What if there are two processes on one machine? • What if they read a file and then both choose to write it? • Or one writes without the other’s knowledge? • Still problematic, but easier to solve

Single Machine Solutions • Have only one copy of shared data • Replication advantages less on a single machine, anyway • Use locks to control access to shared data • Both solutions rely on a single piece of storage that both parties consult • So they don’t work on two machines

Cross-Machine Locking • Why can’t I just share a lock between two machines? • A lock is really a piece of data • Saying who holds it • Either you store it on one machine or on both • Storing on just one leads to performance and reliability problems • Storing on both gets us back to our original problem • But now the shared data is the lock itself

Primary Copy Options • Only allow writes to one replica • So no issue of conflicting writes to different replicas • Doesn’t solve the read/write concurrency problem • Issues if the primary copy fails • Or if its server is overloaded • Or if there are network partitions
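A minimal sketch of the primary copy idea, assuming updates are pushed from the primary to the other replicas in a separate step. The class and method names (`FileReplica`, `PrimaryCopyGroup`, `propagate`) are hypothetical. The sketch illustrates why conflicting writes cannot occur, and why a reader of a non-primary replica can still see stale data.

```python
class FileReplica:
    def __init__(self, name):
        self.name = name
        self.value = None


class PrimaryCopyGroup:
    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)
        self.pending = []  # writes accepted by the primary, not yet propagated

    def write(self, replica, value):
        # Only the primary accepts writes, so two replicas can never
        # receive conflicting writes.
        if replica is not self.primary:
            raise PermissionError("writes must go to the primary copy")
        replica.value = value
        self.pending.append(value)

    def propagate(self):
        # Push accepted writes, in order, to every secondary.
        for value in self.pending:
            for secondary in self.secondaries:
                secondary.value = value
        self.pending.clear()


p, s = FileReplica("primary"), FileReplica("secondary")
group = PrimaryCopyGroup(p, [s])
group.write(p, "v1")
stale = s.value   # read before propagation: still the old contents
group.propagate()
fresh = s.value   # read after propagation: the new contents
```

The gap between `write` and `propagate` is the read/write concurrency problem the slide says this approach does not solve.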

A Diversion Into Clocks • Ultimately, these issues relate to the question of ordering events • What order do things happen in? • In a distributed system • One form of ordering used a lot in the real world is time • Can we use time to solve our problem?

Time Services • One way to make things happen in order is to timestamp them • Read a clock and slap a timestamp on the event • As in normal life, things only happen in time order • Possible solution for ordering distributed events

Time Services and Replication • Maybe we can slap a timestamp on every write • And maybe use timestamps to control reads • The timestamps of multiple writes control the order in which they occur • Doesn’t solve all the problems, but does solve some
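A minimal sketch of timestamps controlling the order of writes, assuming every write carries a trustworthy, unique timestamp. The `TimestampedReplica` class is hypothetical, and the integer timestamps are made-up values. Each replica keeps only the write with the newest timestamp, so replicas that receive the same writes in different orders still end up identical.

```python
class TimestampedReplica:
    def __init__(self):
        self.value = None
        self.timestamp = float("-inf")

    def apply_write(self, value, timestamp):
        """Apply the write only if it is newer than what the replica holds."""
        if timestamp > self.timestamp:
            self.value, self.timestamp = value, timestamp
            return True
        return False  # an older (or equally stamped) write is ignored


writes = [("A", 315), ("B", 322), ("C", 327)]  # (value, timestamp)

r1, r2 = TimestampedReplica(), TimestampedReplica()
for value, ts in writes:            # r1 sees the writes in timestamp order
    r1.apply_write(value, ts)
for value, ts in reversed(writes):  # r2 sees them in the opposite order
    r2.apply_write(value, ts)
```

This handles only some of the problems: it says nothing about two writes with the same timestamp, or about where trustworthy timestamps come from.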

Using a Clock [Figure: Nodes 1, 2, and 3 read the clock and stamp the messages they send (“To B”, “To C”) with the times 3:15, 3:22, and 3:27] • Now B can know the proper order of writes

The Problem With Clocks • A clock is (ultimately) a physical resource • So it’s in exactly one place • We use messages to access remote places • And messages take varying amounts of time to get from one place to another • So, with a single clock, can’t guarantee proper ordering
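A small sketch of why a single remote clock cannot guarantee proper ordering, under one assumed model: an event is stamped when its request reaches the clock, so the stamp equals the real event time plus the message delay. The event names, times, and delays are made-up values.

```python
def stamp(event_time, request_delay):
    """Timestamp given by a remote clock, read when the request arrives."""
    return event_time + request_delay


# name: (real time of the event, delay of its message to the clock)
events = {"X": (0, 10), "Y": (5, 2)}

stamps = {name: stamp(t, d) for name, (t, d) in events.items()}
real_order = sorted(events, key=lambda name: events[name][0])
stamped_order = sorted(stamps, key=stamps.get)
```

X really happens first, but its slow message gets it the stamp 10, while Y gets 7, so the stamped order reverses the real one.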

Solutions to Clock Problems • Physical clocks • Logical clocks

Physical Clocks • Each node keeps its own local clock • Modern machines always have them, anyway • Stamp each synchronizable event with the local clock • Problem becomes keeping the clocks synchronized

Globally Accessible Clocks • In the general case, this usually means GPS clocks • GPS satellites broadcast highly accurate clock signals • Over the entire Earth’s surface • Anyone with a GPS receiver that’s working can hear it

Pros and Cons of Physical Clocks • Simplicity • Need constant access to clock • Transmission errors/delays damage synchronization • Requires strong knowledge of transmission delays • Never possible to reduce clock skew to zero

Logical Clocks • Don’t try to keep track of passage of actual time • Use a logical mechanism to keep track of proper order of events • Essentially, assign artificial timestamps that maintain the causality required for the computation

When Are Logical Clocks Useful? • When relative order of events is the issue • Rather than relationship to wall clock time • Often the case for operations of distributed applications • Not always the case when there is a relationship to the real world

Lamport Clocks • Fundamental logical clock system • Each process Pi has a clock Ci • Each event is assigned a time at its processor • → is the happens-before relation • a → b means a happened before b • If a → b, C(a) < C(b)

Implementing Lamport Clocks • Whenever an event occurs, increment the local clock • Assign new value to event • But how do we provide the correct global view? • Since processes live on different processors

Handling Messages in Lamport Clocks • Processes communicate only via send and receive of messages • Which are events • If Pi sends to Pj, Ci(send) < Cj(receive) • Since send must happen-before receive • How do we force that?

Rules for Lamport Clocks 1) If a → b within the same process, C(a) < C(b) 2) If a is a sending event in Pi and b is the corresponding receiving event in Pj, then C(a) < C(b) • Enforcing Rule 1 is easy, since it’s on the same processor

Enforcing Rule 2 • Timestamp outgoing messages with time of send • Receiver j adds increment d to maximum of message timestamp and local clock • Cj = max(C(a), Cj) + d • C(b) = Cj • Ensures that receive event b gets a clock value after send event a
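The two rules can be sketched as a small class. This is a minimal illustration, assuming the increment d is 1 and that a send counts as an ordinary local event; the `LamportClock` name and its methods are hypothetical.

```python
class LamportClock:
    def __init__(self, increment=1):
        self.time = 0
        self.d = increment

    def local_event(self):
        # Rule 1: every event on this process gets a larger clock value.
        self.time += self.d
        return self.time

    def send(self):
        # A send is itself an event; its timestamp travels with the message.
        return self.local_event()

    def receive(self, message_timestamp):
        # Rule 2: Cj = max(C(a), Cj) + d, so the receive is after the send.
        self.time = max(message_timestamp, self.time) + self.d
        return self.time


pi, pj = LamportClock(), LamportClock()
c_a = pi.local_event()          # event a on process i
c_send = pi.send()              # send on process i
c_receive = pj.receive(c_send)  # matching receive on process j
```

With both clocks starting at 0, this yields C(a) = 1, C(send) = 2, and C(receive) = max(2, 0) + 1 = 3.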

Lamport Clocks Example 1 [Figure: timelines for processes i and j; event a and a send occur on i, and the matching receive occurs on j] • C(a) = 1, C(send) = 2, C(receive) = 3 • C(a) < C(send) • C(send) < C(receive)

Properties of Lamport Clocks • Happens-before is transitive • If a → b and b → c, then a → c • If a → b, then C(a) < C(b) • But the converse is not true • C(a) < C(b) does not imply a → b • How can that happen?

Lamport Clock Example 2 [Figure: timelines for processes i and j; events a and b occur on i, event d occurs on j, and no messages are exchanged] • C(a) = 1, C(b) = 2, C(d) = 1 • C(a) < C(b) • C(d) < C(b) ????!!!!????!!!!
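A short sketch of how Example 2 arises, assuming both clocks start at 0, each event adds 1, and the two processes never exchange a message. The helper name `stamp_local_events` is hypothetical.

```python
def stamp_local_events(names):
    """Timestamp a sequence of purely local events, starting from clock 0."""
    clock = 0
    stamps = {}
    for name in names:
        clock += 1
        stamps[name] = clock
    return stamps


C = {}
C.update(stamp_local_events(["a", "b"]))  # process i
C.update(stamp_local_events(["d"]))       # process j, no messages with i

d_before_b_by_clock = C["d"] < C["b"]
```

The clocks say C(d) < C(b), yet nothing connects d to b: no message was sent, so neither event happened before the other. The comparison reflects only how many events each process had counted.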

The Sad Truth About Distributed Systems Concurrency • Abandon all hope ye who enter here • You’ve got to forget your godlike view • In the absence of a physical clock, • YOU CAN’T ORDER ALL EVENTS PROPERLY!!!!!!!! • But perhaps you don’t believe that . . .

Lamport Clock Example 3 [Figure: the same events a and b on process i and d on process j, occurring in a different real-time order] • C(a) = 1, C(b) = 2, C(d) = 1 • But the “order” of events was different than before

Why Do We Have This Problem? • Not really because we aren’t keeping a physical clock • It’s because we aren’t communicating enough to derive the order • If each process sent the other a message after each local event, our examples would have proper ordering

Obtaining the Proper Order for Example 2 [Figure: process i performs a, b, and a synchronizing send (clock values 1, 2, 3); process j receives it and then performs d (clock values 4, 5)] • C(a) < C(b), C(b) < C(d)

And For Example 3 [Figure: process j performs d and a synchronizing send (clock values 1, 2); process i receives it and then performs a and b (clock values 3, 4, 5)] • C(d) < C(a), C(a) < C(b)

But There’s a Problem • What if we have true concurrency? • What if an event occurs while a synchronization message is in transit?

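Lamport Clocks Example 4 [Figure: process j performs d (clock value 1) and sends a synchronization message (clock value 2); event a occurs on process i (clock value 1) while the message is in transit] • C(d) = C(a) • Because of concurrency, you can’t win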

Lamport Clocks and Partial Orders • Basic Lamport clocks only give a partial order • They don’t order events with equal times • Easy to provide a full order • Number all processes • Concatenate process number to clock

In Our Examples, • Say process i is numbered 1 and process j is numbered 2 • In example 1, no equal times • In example 2, • C(a) = 1,1 • C(b) = 2,1 • C(d) = 1,2 • So C(a) is ordered before C(d)
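The tie-breaking rule can be sketched with tuples, assuming each event is stamped with the pair (clock value, process number). Python compares tuples element by element, so the process number only matters when the clock values are equal. The values are the ones from Example 2.

```python
# event name: (Lamport clock value, process number)
events = {
    "a": (1, 1),
    "b": (2, 1),
    "d": (1, 2),
}

# Sort by clock value first, then by process number to break ties.
total_order = sorted(events, key=events.get)
```

Here `total_order` is `["a", "d", "b"]`: a precedes d only because process i has the lower number. The ordering is consistent across nodes, but arbitrary for events that are truly concurrent.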
