Explore the core concepts of distributed systems transparency, coordination, and architecture models with a focus on efficiency, flexibility, consistency, and robustness. Learn about service primitives, scalability, and parallelism.
6306 Advanced Operating Systems Instructor : Dr. Mohan Kumar Room : 315 NH kumar@cse.uta.edu Class : TTh 7- 8:20PM Office Hours : TTh1-3 PM GTA : Byung Sung sung@cse.uta.edu Kumar CSE@UTA
Distributed Systems (R. Chow, Distributed Operating Systems) • Provide a conceptually simple model • Efficient, flexible, consistent and robust • Goals • Efficiency • Communication delays? • Scheduling, load balancing • Flexibility • Users decide how, where, and when to use the system • Environment for building additional tools and services • System’s view: scalability, portability, and interoperability • Consistency • Lack of global information, potential replication and partitioning of data, component failures and interaction among modules • Robustness
Transparency • Logical view of a physical system • Reduce the effect and awareness of the physical system • System-dependent information • Tradeoff between simplicity and effectiveness • Access transparency – local and remote objects • Location transparency – objects referenced by logical names • Migration transparency – location independence • Concurrency transparency • Replication transparency • Parallelism, Failure, Performance, Size (modularity and scalability), Revision (software)
Services • Primitive • Message passing primitives – send and receive • Synchronization • Process management • Process creation, deletion, tracking, resource allocation • System servers • Name server/directory server • Users, processes, machines, files, ports • Network server • Path selection and routing • Broadcast/multicast server • Clock synchronization
Architecture Models • Workstation-server model • Processor-pool model • Communication network protocols
Distributed Coordination • Barrier synchronization • Condition coordination • Mutual exclusion • Interprocess communication • Distributed resources • Distributed shared memory
Two Processor Model (Indurkhya and H Stone, IEEE Software, 1997)
Consider the execution of an application program that contains M tasks on a two-processor (or two-PE) system. We make the following assumptions to simplify the analysis:
1. Each task executes in R units of time; all tasks are of equal complexity.
2. Each task communicates with every other task, with an overhead cost of C units of time when the communicating tasks are on different PEs and with zero cost when the communicating tasks are co-resident.
3. Computation and communication are not overlapped.
Two PE model (Contd.) [Figure: k tasks on one PE, (M-k) tasks on the other]
Two PE model (Contd.)
* Assign all tasks to one PE, or
* Partition tasks among the two PEs
Total execution time (TET) is the sum of the total computation time and the communication overheads. Let us assign (M-k) tasks to one PE and k tasks to the other.
Computation time = $R \cdot \max(M-k,\ k)$ (linear in k)
Communication time = $C \cdot (M-k) \cdot k$ (quadratic in k)
Total execution time = $R \cdot \max(M-k,\ k) + C \cdot (M-k) \cdot k$
What is the maximum TET as a function of k?
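One way to explore the question on this slide is to evaluate the TET expression for every k. The sketch below is only an illustration: the function name `two_pe_tet` and the parameter values (M = 50, R = 1, C = 1/40) are assumptions, not taken from the lecture.

```python
def two_pe_tet(M, R, C, k):
    """Total execution time with k tasks on one PE and M - k on the other."""
    computation = R * max(M - k, k)    # linear term: the busier PE finishes last
    communication = C * (M - k) * k    # quadratic term: each cross-PE task pair costs C
    return computation + communication

# Illustrative parameters (assumed, not from the slides).
M, R, C = 50, 1.0, 1.0 / 40

times = {k: two_pe_tet(M, R, C, k) for k in range(M + 1)}
best_k = min(times, key=times.get)     # split with the smallest TET
worst_k = max(times, key=times.get)    # split with the largest TET
print("min TET:", times[best_k], "at k =", best_k)
print("max TET:", times[worst_k], "at k =", worst_k)
```

With these values the equal split gives the smallest TET, while a slightly uneven split is worse than running everything on one PE, because it pays communication cost without shortening the longest computation by much.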
Two PE model (Contd.) [Figure: plot for R/C = 10]
Two PE model (Contd.) R/C = 40
[Figure: computation time, communication time and TET plotted against k; the plotted values are tabulated below]
k         0    10    20    25      30    40    50
comput.   50   40    30    25      30    40    50
comm.     0    10    15    15.63   15    10    0
TET       50   50    45    40.63   45    50    50
Minimum execution time when R/C = M/2
Two PE model (Contd.) R/C = 25
[Figure: computation time, communication time and TET plotted against k; the plotted values are tabulated below]
k         0    10    20    25    30    40    50
comput.   50   40    30    25    30    40    50
comm.     0    16    24    25    24    16    0
TET       50   56    54    50    54    56    50
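The plotted values for R/C = 40 and R/C = 25 appear consistent with M = 50 tasks and R = 1. Those two values are inferred from the numbers, not stated on the slides. Under that assumption, a short sketch regenerates the computation, communication and TET rows (the helper name `tet_table` is hypothetical):

```python
def tet_table(M, R, C, ks):
    """Return (k, computation, communication, TET) rows for the two-PE model."""
    rows = []
    for k in ks:
        comput = R * max(M - k, k)
        comm = C * (M - k) * k
        rows.append((k, comput, comm, comput + comm))
    return rows

KS = [0, 10, 20, 25, 30, 40, 50]       # the k values shown on the plot axis
for ratio in (40, 25):                 # the two R/C values used on the slides
    print(f"R/C = {ratio}")
    for k, comput, comm, tet in tet_table(50, 1.0, 1.0 / ratio, KS):
        print(f"  k={k:>2}  comput={comput:6.3f}  comm={comm:6.3f}  TET={tet:6.3f}")
```

For R/C = 40 the communication time at k = 25 evaluates to 15.625, which the slide shows rounded to 15.63.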
Two PE model (Contd.)
Total time on one processor = $R \cdot M$ units … (1)
Divide the tasks equally among the two processors. Then TET = $R(M/2) + C(M - M/2)(M/2)$ … (2)
To get the break-even point, equate (1) and (2): $R(M/2) + C(M - M/2)(M/2) = R \cdot M$. Simplifying, we get $R/C = M/2$.
$R/C = M/2$ is the critical point for a two-processor system:
if $R/C < M/2$, assign all tasks to one processor
if $R/C > M/2$, divide tasks equally among the two processors
Two PE model (Contd.) If R/C is high, then C is low relative to R: communication overheads are low and we are justified in parallelising. If R/C is low, then C is high relative to R: communication overheads are high and it is better to execute on one processor. Speedup = Serial time / Parallel time > 1 if and only if R/C > M/2. Check with the times.
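The speedup condition can be checked numerically. This sketch assumes the equal-split TET, R(M/2) + C(M/2)(M/2), as the parallel time; the values of M, R and the three R/C ratios are illustrative choices, not from the lecture.

```python
def speedup_two_pe(M, R, C):
    """Serial time divided by parallel time for an equal split over two PEs."""
    serial = R * M
    parallel = R * (M / 2) + C * (M / 2) * (M / 2)
    return serial / parallel

M, R = 50, 1.0
for ratio in (10, 25, 40):             # R/C below, at, and above M/2 = 25
    s = speedup_two_pe(M, R, R / ratio)
    print(f"R/C = {ratio}: speedup = {s:.3f}")
```

The speedup crosses 1 exactly at R/C = M/2: below it the second processor slows the program down, above it the split pays off.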
N PE model
All processors share a common communication channel.
Assign $k_i$ tasks to $PE_i$: $k_1$ to PE1, $k_2$ to PE2, …, $k_N$ to PEN, with $k_1 + k_2 + \dots + k_N = M$
Computation time = $R \cdot \max_i(k_i)$
Communication time = $\frac{C}{2}\left[k_1(M-k_1) + k_2(M-k_2) + \dots + k_N(M-k_N)\right]$
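N PE model
Communication time = $\frac{C}{2}\left[k_1(M-k_1) + k_2(M-k_2) + \dots + k_N(M-k_N)\right]$
= $\frac{C}{2}\sum_{i=1}^{N} k_i (M-k_i)$
= $\frac{C}{2}\left[\sum_{i=1}^{N} M k_i - \sum_{i=1}^{N} k_i^2\right]$
= $\frac{C}{2}\left[M^2 - \sum_{i=1}^{N} k_i^2\right]$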
N PE model
Total execution time = $R \cdot \max_i(k_i) + \frac{C}{2}\left[M^2 - \sum_{i=1}^{N} k_i^2\right]$
Assuming all tasks to be of equal complexity, we assign M/N tasks per PE. Because $k_i = M/N$, $\sum_{i=1}^{N} k_i^2 = N(M/N)^2 = M^2/N$, so
Total execution time = $R\frac{M}{N} + \frac{CM^2}{2}\left(1 - \frac{1}{N}\right)$ … (3)
Equate equation (3) with $RM$ to get the break-even point:
$R\frac{M}{N} + \frac{CM^2}{2}\left(1 - \frac{1}{N}\right) = RM$
$RM\left(1 - \frac{1}{N}\right) = \frac{CM^2}{2}\left(1 - \frac{1}{N}\right)$
$R/C = M/2$
if $R/C < M/2$, assign all tasks to one processor
if $R/C > M/2$, divide tasks equally among N PEs
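As a rough check of the shared-channel model, the TET expression can be evaluated for an arbitrary assignment of tasks to PEs. The function name `n_pe_tet` and the numbers used (M = 50, R = 1, C = 1/40) are assumptions for illustration only.

```python
def n_pe_tet(ks, R, C):
    """TET for an assignment ks = [k_1, ..., k_N] with one shared channel."""
    M = sum(ks)
    computation = R * max(ks)
    communication = (C / 2) * (M * M - sum(k * k for k in ks))
    return computation + communication

R, C = 1.0, 1.0 / 40                   # illustrative values, R/C = 40 > M/2
print(n_pe_tet([50], R, C))            # all tasks on one PE: no communication
print(n_pe_tet([25, 25], R, C))        # equal split over two PEs
print(n_pe_tet([10] * 5, R, C))        # equal split over five PEs
```

For N = 2 the communication term reduces to C(M-k)k, so the result for `[25, 25]` agrees with the two-PE model.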
N PEs with overlapped communication
The PEs can compute and communicate at the same time.
TET = $\max\{\text{computation time},\ \text{communication time}\}$ = $\max\left\{R \cdot \max_i(k_i),\ \frac{C}{2}\left[M^2 - \sum_{i=1}^{N} k_i^2\right]\right\}$
Assuming all tasks to be of equal complexity, TET = $\max\left\{R\frac{M}{N},\ \frac{CM^2}{2}\left(1 - \frac{1}{N}\right)\right\}$
Optimum performance when computation time = communication overhead:
$R\frac{M}{N} = \frac{CM^2}{2}\left(1 - \frac{1}{N}\right)$; ignoring 1/N if N is large, $R\frac{M}{N} = \frac{CM^2}{2}$
$R/C = MN/2$; optimum number of PEs, $N = 2R/(MC)$
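The balance point can also be located numerically. The sketch below treats M/N as a real number (an assumption, since tasks are indivisible) and uses illustrative values M = 50, R = 1, C = 1/100. It compares the large-N approximation N = 2R/(MC) with the value obtained by solving the balance equation without dropping the 1/N term, which gives N = 1 + 2R/(MC).

```python
def overlapped_tet(M, N, R, C):
    """TET when computation and communication overlap, with M/N tasks per PE."""
    computation = R * M / N
    communication = (C * M * M / 2) * (1 - 1 / N)
    return max(computation, communication)

M, R, C = 50, 1.0, 1.0 / 100           # illustrative values, R/C = 100
approx_n = 2 * R / (M * C)             # large-N approximation
exact_n = 1 + 2 * R / (M * C)          # balance equation with the 1/N term kept
best_n = min(range(1, 21), key=lambda n: overlapped_tet(M, n, R, C))
print("approximate N:", approx_n)
print("exact balance N:", exact_n)
print("best integer N in 1..20:", best_n)
```

For small N the approximation can be off by one PE, which matters little when N is large but is visible in this example.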
N PEs with multiple communication links
N communication channels in the network
Total execution time = $R \cdot \max_i(k_i) + \frac{C}{2N}\left[M^2 - \sum_{i=1}^{N} k_i^2\right]$
Assuming all tasks to be of equal complexity, total execution time = $R\frac{M}{N} + \frac{CM^2}{2N}\left(1 - \frac{1}{N}\right)$
N PEs with multiple communication links
if $R/C < M/(2N)$, assign all tasks to one processor
if $R/C > M/(2N)$, divide tasks equally among N PEs
Speedup = (TET on one PE) / (TET on N PEs) = $\frac{RM}{R\frac{M}{N} + \frac{CM^2}{2N}\left(1 - \frac{1}{N}\right)}$
If $R/C \gg M/2$, then speedup increases with N; if $R/C \ll M/2$, then speedup is given by $2NR/(MC)$
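The two limiting cases can be illustrated with a small sketch. It assumes the equal-split TET for N channels, R(M/N) + (CM²/2N)(1 - 1/N), as the parallel time; the function name and all numeric values are hypothetical.

```python
def multilink_speedup(M, N, R, C):
    """Speedup of N PEs with N channels over one PE, with M/N tasks per PE."""
    serial = R * M
    parallel = R * M / N + (C * M * M / (2 * N)) * (1 - 1 / N)
    return serial / parallel

M, N, R = 50, 100, 1.0

# Computation-dominated case (R/C much larger than M/2): speedup approaches N.
print(multilink_speedup(M, N, R, C=1e-6), "vs N =", N)

# Communication-dominated case (R/C much smaller than M/2): speedup approaches 2NR/(MC).
C_big = 1000.0
print(multilink_speedup(M, N, R, C_big), "vs 2NR/(MC) =", 2 * N * R / (M * C_big))
```

The second comparison is only approximate because the closed form drops the (1 - 1/N) factor, so the agreement improves as N grows.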
Comments on Performance
Using multiple PEs produces overhead costs:
- scheduling
- contention for shared resources
- message passing
- synchronisation
- lack of load balancing
Overhead costs increase with the number of PEs. R/C is a measure of the amount of program execution per unit of overhead. If R/C is large, communication overheads are low and the problem execution is efficient. If R/C is small, communication overheads are high …..
Active Middleware Services in a Decision Support System for Managing Highly Available Distributed Resources (S A Fakhouri, W F Jerome, V K Naik, A Raina and P Varma) • The cluster-based system is called Mounties • Manages resources and applications using rule-based constraints • Mounties has four service components • Resource proxy objects • Manipulates cluster configuration • Event notification mechanism • Monitors and controls interdependent and distributed resources • Rule evaluation and decision processing mechanism • Global optimization service • Provides decision making capabilities
What’s in the paper? • Architecture and design of service components • Interaction between the components and the decision making component • General programming paradigm
Cluster? Nodes, disks, adapters, databases etc. Heterogeneous components of a cluster. Transparent, modular, scalable: achieved by providing coordination and mapping of physical distributed resources onto a set of virtual resources and their services – very complex?
Authors’ approach • Clusters and resources – 2 dimensions • Semistatic nature of resource • Type and quality of supporting services needed to enable its services • Formalized as simple rules • Dynamic state of the services provided by the cluster – captured by events • Coordination and mapping – centralized and rule-based • Events and rules are combined only when necessary
Resources • Attributes • Unique name • Type – functionality • Capacity – number of dependent resources it can serve • Priority (1 = low .. 10 = high) • State • ONLINE • OFFLINE • FAILED
Mounties Approach • Constraint-based methodology • Cluster configuration, startup and recovery • Constraints are used to build relationships among supporting and dependent resources/services • Nominal state – ONLINE, OFFLINE • Inter-resource relationships • DependsOn • CollocatedWith • Anti-CollocatedWith • Equivalency • Set of resources with similar functionality • Choose one of these
Mounties.. • Distinct: decision making, resource allocation • Monitor, control, manipulate • Example from the paper • Equivalencies – E1: DA0, DA1; E2: NA0, NA1, NA2 • Node 0 – Database 1, E1: DA0, E2: NA0 • Node 1 – Database 2, E1: DA1, E2: NA1 • Database 1 – DO (DependsOn): E1, E2; CW (CollocatedWith): E1, E2
Mounties.. • Webserver – D: E2, E3; CW: E2 • Node 2 – E2: NA2, E3: DB1
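The example above can be sketched as a tiny constraint check. This is a hypothetical data model for illustration only, not the Mounties API: the `Resource` class, its `node` field, the `place` function and the capacity/priority defaults are all assumptions. It combines DependsOn, CollocatedWith and Equivalency by asking, for each node, whether every equivalency the resource depends on has an ONLINE member on that node.

```python
from dataclasses import dataclass, field
from enum import Enum

class State(Enum):
    ONLINE = "ONLINE"
    OFFLINE = "OFFLINE"
    FAILED = "FAILED"

@dataclass
class Resource:
    name: str
    rtype: str
    node: str = ""                     # hosting node (assumed attribute)
    capacity: int = 1                  # made-up default
    priority: int = 1                  # made-up default
    state: State = State.ONLINE
    depends_on: list = field(default_factory=list)  # list of equivalencies (name lists)

def place(resource, registry, nodes):
    """Return (node, supports) for the first node where each equivalency has an
    ONLINE member on that node, or None if no node satisfies the constraints."""
    for node in nodes:
        supports = []
        for equivalency in resource.depends_on:
            candidates = [n for n in equivalency
                          if registry[n].state is State.ONLINE
                          and registry[n].node == node]
            if not candidates:
                break                  # this node cannot satisfy the constraint
            supports.append(candidates[0])
        else:
            return node, supports
    return None

E1, E2 = ["DA0", "DA1"], ["NA0", "NA1", "NA2"]
registry = {
    "DA0": Resource("DA0", "disk adapter", "Node 0"),
    "DA1": Resource("DA1", "disk adapter", "Node 1"),
    "NA0": Resource("NA0", "network adapter", "Node 0"),
    "NA1": Resource("NA1", "network adapter", "Node 1"),
    "NA2": Resource("NA2", "network adapter", "Node 2"),
}
db1 = Resource("Database 1", "database", depends_on=[E1, E2])
nodes = ["Node 0", "Node 1", "Node 2"]

print(place(db1, registry, nodes))     # placement while DA0 is healthy
registry["DA0"].state = State.FAILED
print(place(db1, registry, nodes))     # placement after DA0 fails
```

The sketch ignores capacity, priority and Anti-CollocatedWith, and the adapter types are guesses from the names; a fuller model would need them to reflect the rule set described in the paper.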
Events and Clusters • Resource related • Response to a command • End-user interactions/directives • Alerts/alarms • Resource-specific constraints: capacity, dependency, location
To subscribe to LISTSERV
Send a message to
To: listserv@listserv.uta.edu
Subject: blank (no subject heading)
--------------------------------------
subscribe cse6306-l <student’s name>
OR go to http://listserv.uta.edu/cgi-bin/wa.exe?SUBED1=cse6306-l&A=1 and fill out the form