390 likes | 431 Views
Distributed Computing Principles. Keith Marzullo. It’s all about distributed systems now…. What do they have in common?. Independent processors that communicate via a narrow interface. Contrast with shared memory. True concurrency. Contrast with multitasking. Partial failures
E N D
Distributed Computing Principles Keith Marzullo
What do they have in common? • Independent processors that communicate via a narrow interface. • Contrast with shared memory. • True concurrency. • Contrast with multitasking. • Partial failures • Things get messy when only parts of the system stop working. • Things can get really messy when parts of the system can try to do bad things.
Partial failures and availability A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. Leslie Lamport at DEC SRC in 1980s
Partial failures Assume five computers crash and recover independently, and each crashed about 4 minutes a month (probability up = 0.9999). • Probability all up? • Probability one up?
Partial failures Assume five computers crash and recover independently, and each crashed about 4 minutes a month (probability up = 0.9999). • Probability 5 up? 0.9995 (down 20 min/month) • Probability ≥ 1 up? 1 - 1020 (down 20 femtosec/month)
Fundamental problems… • Reasoning about the state of a distributed system. • Methods and protocols for making systems mask or detect and recover from failures. • Protocols for agreeing on state, message delivery, process failures and recovery, etc. ... in the context of different assumptions about the environment (communication, clocks, failures...)
A simple distributed coordination problem Two armies are camped on separate hills surrounding a valley in which the enemy army is encamped. The enemy can vanquish each army separately, but if both armies attack at the same time, they can vanquish the enemy.
Two General's Problem Attack at dawn! due to Jim Gray
Two General's Problem due to Jim Gray
Two General's Problem I know to attack at dawn... due to Jim Gray
Two General's Problem I don't know that he knows to attack at dawn... due to Jim Gray
Two General's Problem due to Jim Gray
Two General's Problem I know he knows to attack at dawn... due to Jim Gray
Two General's Problem I don’t know he knows I know to attack at dawn... due to Jim Gray
Two General's Problem I know he knows that I know to attack at dawn... due to Jim Gray
Two General's Problem umm... umm... Zzz... due to Jim Gray
No finite protocol: proof Assume a protocol exists, and consider the shortest run that leads to coordinated attack. attack! ... attack!
No finite protocol: proof attack! ... attack!
No finite protocol: proof attack! ... attack!
Two Generals in practice Deduct $300 Issue $300 Question: what do banks do?
Muddy children • Seven school children are playing outside after it rained. • These are smart children- they’ve been studyinglogic and all reasonflawlessly. • At one point theirteacher says to them“at least one of you has a muddy forehead”.
Muddy children • The teacher asks them “raise your hand if you know you have mud on your forehead” • There were no mirrors, the students didn’t feel their foreheads, and no one pointed at anyone; a student could only determine it by thinking. • When no one raised their hand, the teacher asked again. • When no one raised their hand, the teacher asked again. Some students raised their hands. • How many did so? • How did they know?
More generals Commander (Meng Tian) Lieutenant (Julius Caesar) Lieutenant (George Patton)
More generals Attack!
More generals Attack! Attack! Attack!
More generals Attack! Attack! Retreat!
More Generals • Three generals: a commander and two lieutenants. • Communications is guaranteed. • One general could be a traitor, or all three could be loyal. • The commander will tell each lieutenant a command: attack or retreat. • If the commander is loyal, then the loyal lieutenants need to obey the commander. • If the commander is a traitor, then the lieutenants need to follow the same command (it doesn’t matter which).
attack attack attack retreat Not solvable either… attack
retreat retreat attack retreat Not solvable either… retreat
attack retreat attack retreat Not solvable either… retreat attack
Replicated sensors up to one faulty sensor up to two faulty sensors all sensors correct 4 - 9 1 - 6 2 - 7 1 - 9 4 - 6 2 - 7 Mfn(S): smallest interval that intersects at least n - f input values.
Some problems to think about • How would you solve agreement with one commander and three lieutenants? • How would you solve agreement with one commander and two lieutenants, where a general is either loyal or dead?
Principles of distributed systems • These kinds of problems come up all the time but in different circumstances. • The details are vital - can make a problem go from being easy to impossible. • Good distributed systems design requires good engineering on top of sound principles.