Introduction to locality sensitive approach to distributed systems • Outline • distributed system definition • distributed system model • issues specific to distributed computing • local sensitivity • local representation • clusters • spanners • using local representation
Distributed system (DS) definition • A distributed system is distinguished • at the architectural level by its degree of coupling • tightly coupled (parallel machines) – synchronous processors, fast and reliable communication, shared memory • loosely coupled – independent processors, (relatively) infrequent communication, limited cooperation • purpose of cooperation – provide individual users with convenient and efficient access to shared resources • via a single-system image – the user is supplied with a centralized view of the system: the system appears to be a single entity located in one place and dedicated to serving that particular user (the distributed nature of the system is hidden)
System model • message-passing model – processes do not share memory and communicate by passing messages • shared memory is not considered • point-to-point communication – direct message exchange between pairs of processes • broadcast media (wireless, buses) are not considered • point-to-point communication may be anonymous: a process is aware of the outgoing ports of its channels but not of the receiving entities
Distributed computing issues • communication – communication costs tend to dominate the execution • incomplete knowledge • a process may not know the complete input, the network topology, or the stage of the (global) execution • failures – due to loose coupling, both the occurrence and the handling of faults are specific to DS • greater incidence of individual faults • potential for fault tolerance due to processors’ autonomy
Timing, synchrony, nondeterminism • two extreme models • synchronous model – execution proceeds by pulses (cycles), each containing the following steps • send messages to neighbors • receive messages from neighbors • perform local computation • asynchronous model – execution is event-driven (the processes cannot consult a clock): local computation and message transmission may take arbitrarily long • due to the arbitrary order of message delivery, the execution of a distributed system is nondeterministic – running the same algorithm with the same inputs may produce different results
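The three steps of a synchronous pulse can be sketched as a small simulation. This is only an illustration under assumptions not in the slides: the function name `synchronous_max`, the dictionary-based graph, and the choice of "propagate the maximum value" as the local computation are all hypothetical.

```python
from typing import Dict, List


def synchronous_max(graph: Dict[int, List[int]],
                    values: Dict[int, int],
                    pulses: int) -> Dict[int, int]:
    """Simulate `pulses` synchronous pulses; each process tracks the largest value it has seen."""
    known = dict(values)
    for _ in range(pulses):
        # Step 1 (send): every process puts its current value on each outgoing channel.
        inbox: Dict[int, List[int]] = {p: [] for p in graph}
        for p, neighbors in graph.items():
            for q in neighbors:
                inbox[q].append(known[p])
        # Steps 2 and 3 (receive, local computation): take the max of own and received values.
        known = {p: max([known[p]] + inbox[p]) for p in graph}
    return known


# Hypothetical example: a path 0 - 1 - 2 - 3 where process 0 holds the largest value.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
initial = {0: 7, 1: 1, 2: 3, 3: 2}

after_one = synchronous_max(path, initial, 1)    # {0: 7, 1: 7, 2: 3, 3: 3}
after_three = synchronous_max(path, initial, 3)  # every process knows 7
```

In this sketch information travels exactly one hop per pulse, so the value 7 needs three pulses to cross the path. An asynchronous execution offers no such guarantee, because delivery order and delays are arbitrary.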
Locality sensitivity • traditionally, DS algorithms (routing, broadcast, topology update) require each process to maintain information about the whole network • this causes scalability problems • global knowledge is not always necessary • many tasks can be solved so that each process interacts only with processes in a small region around it • also, it is desirable that the cost of solving the task be proportional to the size of the region involved (not the size of the whole network)
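To make "a small region around it" concrete, the sketch below collects the processes within a fixed number of hops of a given process. It assumes the region is measured in hops and uses a centralized breadth-first search for clarity. The name `region` and the ring topology are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def region(graph: Dict[int, List[int]], source: int, radius: int) -> Set[int]:
    """Return the processes within `radius` hops of `source` (BFS cut off at depth `radius`)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        p = queue.popleft()
        if dist[p] == radius:
            continue  # do not explore beyond the region boundary
        for q in graph[p]:
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return set(dist)


# Hypothetical example: a ring of 12 processes.
ring = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
nearby = region(ring, 0, 2)  # {0, 1, 2, 10, 11}
```

The search touches only the 5 processes in the region, however large the ring is, which is the kind of cost proportionality the slide asks for.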
Locality-preserving (LP) network representations • we consider arbitrary topologies • idea – minimize (computational and storage) costs of execution by letting each process keep an approximate view of the topology of the system • an LP-representation of the topology allows each process to keep enough information about the system to accomplish the computing task • two types of LP-representations • clustered – grouping the processes of a system into connected subsets (clusters) • skeletal – maintaining information about sparse spanning subgraphs of the network
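As an illustration of a skeletal representation, the sketch below builds a sparse spanning subgraph with the standard greedy spanner construction: an edge is kept only if its endpoints are not already within `stretch` hops of each other in the subgraph built so far. This is a centralized technique chosen for brevity, not necessarily the construction the slides intend. The names `hop_distance`, `greedy_spanner`, and `stretch` are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set, Tuple


def hop_distance(adj: Dict[int, Set[int]], u: int, v: int, limit: int) -> Optional[int]:
    """BFS hop distance from u to v, or None if v is farther than `limit` hops."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        p = queue.popleft()
        if p == v:
            return dist[p]
        if dist[p] == limit:
            continue
        for q in adj[p]:
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return None


def greedy_spanner(nodes: Iterable[int],
                   edges: List[Tuple[int, int]],
                   stretch: int) -> List[Tuple[int, int]]:
    """Keep an edge only if its endpoints are more than `stretch` hops apart so far."""
    adj: Dict[int, Set[int]] = {p: set() for p in nodes}
    kept = []
    for u, v in edges:
        if hop_distance(adj, u, v, stretch) is None:
            adj[u].add(v)
            adj[v].add(u)
            kept.append((u, v))
    return kept


# Hypothetical example: the complete graph on 5 processes (10 edges).
nodes = range(5)
complete = [(u, v) for u in nodes for v in nodes if u < v]
skeleton = greedy_spanner(nodes, complete, stretch=3)  # the star around process 0
```

Here 10 edges shrink to 4, and any two processes remain at most 2 hops apart in the skeleton, so the approximate view still preserves distances up to the chosen stretch.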
Using LP-representations in computing • idea – develop applications that use the LP-representations and thus minimize the costs of computing • clusters – costs are saved if most of the communication in the application stays within a cluster • skeletal representations – ignore the non-represented edges; any application can then be applied directly. The solution may be less “accurate” (in terms of network representation) but also less “costly” (in terms of bandwidth)
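The "less accurate but less costly" trade-off of a skeletal representation can be seen by running the same application, here a simple flooding broadcast, once on the full network and once on a skeleton. This is a minimal sketch under assumptions: the function `flood`, the message-counting rule (each process forwards once to every neighbor except the one it first heard from), and the star-shaped skeleton are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set, Tuple


def flood(nodes: Iterable[int],
          edges: List[Tuple[int, int]],
          source: int) -> Tuple[int, int]:
    """Flood from `source`; return (messages sent, hops needed to reach the farthest process)."""
    adj: Dict[int, Set[int]] = {p: set() for p in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    depth = {source: 0}
    queue: deque = deque([(source, None)])
    messages = 0
    while queue:
        p, parent = queue.popleft()
        for q in adj[p]:
            if q == parent:
                continue  # never send back to the process we first heard from
            messages += 1
            if q not in depth:
                depth[q] = depth[p] + 1
                queue.append((q, p))
    return messages, max(depth.values())


# Hypothetical example: 5 processes, full network versus a star-shaped skeleton.
nodes = range(5)
complete = [(u, v) for u in nodes for v in nodes if u < v]
star = [(0, k) for k in range(1, 5)]

full_cost = flood(nodes, complete, source=1)  # (16, 1)
skeleton_cost = flood(nodes, star, source=1)  # (4, 2)
```

On the skeleton the broadcast uses 4 messages instead of 16, at the price of a 2-hop path where the full network offers a direct edge.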