480 likes | 762 Views
Topics in Distributed Systems. CS 5580F. Topics. Course requirements Introduction Communications Naming Synchronization Consistency & Replication Fault Tolerance Security. Course Requirements. Grading, etc. Paper 50% Presentation 15% Programming project 35% Scale:
E N D
Topics in Distributed Systems CS 5580F
Topics • Course requirements • Introduction • Communications • Naming • Synchronization • Consistency & Replication • Fault Tolerance • Security ©2005 D. J. Foreman
Course Requirements ©2005 D. J. Foreman
Grading, etc • Paper 50% • Presentation 15% • Programming project 35% • Scale: • 90-93.9=A- 94-100=A • 80-83.9=B- 84-86.9=B 87-89.9=B+ • 70-73.9=C- 74-76.9=C 77-79.9=C+ • 0-59=F 60-69=D ©2005 D. J. Foreman
Student Responsibilities • Read the suggested papers • Complete the required programming project • Select a topic for presentation • Write a paper • Present the paper ©2005 D. J. Foreman
Programming Project • Design & implement a distributed application • Project must be approved before beginning • Design criteria • Ignore: • System outages • Network failures (e.g., partitioning) • Consider: • Communication type (C/S, P-P) • Groups, multi-casting • Teams: see instructor ©2005 D. J. Foreman
Sample Projects • An "ATM machine" • A remote file access mechanism • A database • A calculator • A chat room • A computing "grid" (like SETI) • Or ???????? (you pick one) ©2005 D. J. Foreman
Semester Project – continued • System types: • Client/Server • Peer-Peer • Processor Pool • All must have: • Multi-threaded operation • User interface (need not be graphical) • Point(s) of control (any system shuts-down all) • Must be demonstrated before end of semester • allowed time depends on # of projects in class ©2005 D. J. Foreman
Introduction ©2005 D. J. Foreman
Introductory Papers • The papers listed in the following section, while dated in the early 1970s and 80's, form some of the foundations of contemporary implementations and research. ©2005 D. J. Foreman
Basic Concepts • Synchronizing Resources, G. Andrews, 1981 • Specification of Synchronizing Processes, K. Ramanathan, 1983 • Concepts and Notations for Concurrent Programming, G. Andrews, 1983 • Distributed Operating Systems, A. Tanenbaum, 1985 ©2005 D. J. Foreman
Programming & Concept Papers • The Temporal Semantics of Concurrent Programs, A. Pneuli, 1981 • Time, Clocks, and the Ordering of Events ina D.S., L. Lamport, 1978 • Determining the Last Process to Fail, D. Skeen, 1985 • Correctness Proofs of D. Termination Algorithms, K. Apt, 1986 • Monitors: AN OS Structuring Concept, C.A.R. Hoare, 1974 ©2005 D. J. Foreman
Actual System Papers • An Introduction to the V System, E. Berglund, 1986 • The Design of the Saguaro D.O.S., G. Andrews,1987 • XMS: A Rendezvous-based D. S. Software Architecture, N. Gammage, 1985 ©2005 D. J. Foreman
Definition of a Distributed System • A collection of functions implemented across a set of independent devices that appears to its users as a single coherent system. • Horizontal - all devices are peers • Vertical - there is a control hierarchy • Note: Devices may be computers or …. ©2005 D. J. Foreman
Some Problems (topics) • System Character codes (ASCII/EBCDIC) • Number representation (big- vs. little-Endian) • Size of an "int" (16/32/64 - bit) • Float • Shared Memory • Operational Compatibility • Thread control • System services • Process handling • Pointers & References • Objects ©2005 D. J. Foreman
Hardware Systems • Multiprocessors (modularized shared memory) • Bus traffic cache incoherence • Limited scalability (network cost/speed) • NUMA – Non-Uniform Memory Access • (some local, some shared via network) • Homogeneous Multicomputers • Private memory + switching circuits • Bus-based connection: ethernet routing • Switch-based: routing via h/w net (mesh, n-cube) • e.g., Clusters of Workstations • Heterogeneous Multicomputers • No commonality, no global view ©2005 D. J. Foreman
Software • Tightly coupled (true "Distributed System") • Maintains global view • Multiprocessor (one O/S) • Multicomputer (homogeneous only) • Loosely coupled (NOS) • Each heterogeneous computer runs its own O/S • Not much transparency as seen by applications • BUT: Local services are shared on the network ©2005 D. J. Foreman
Some Classical Distributed Systems • Locus – distributed UNIX • Lightweight kernel systems w/UNIX emulation • Amoeba – groups of threads (processor pool) • Mach – synch msg-based IPC, threads • Argus ('79) – "Guardians" for transactions • Clouds – object-based (like methods) • V-system (1984)– IPC via kernel (msgs only), true multi-threading (unavailable in UNIX '84) • Cedar - µcoded VM, IDE, concurrent execution ©2005 D. J. Foreman
Classical Systems, continued • XDFS - ca. 1977 - servers • Cambridge File Server – ca.1982 - indexes on servers used by OS to build its directory • Sun NFS – networked UNIX FS ©2005 D. J. Foreman
Open vs Closed Systems • Open • Extensible • Services external to kernel • Closed • Non-extensible • Services internal to kernel • Acts as server to remote requestors • Services are: • File system • Process mgmt • Network communications ©2005 D. J. Foreman
Atomicity • Semaphores • Unstructured – requires app to test/signal • Monitors • A programming construct • Shared variables • Mutexes • Procedures • lock and unlock functions (may be kernel-provided) • Threads (Pthreads, POSIX threads) • Butenhof, "Programming with POSIX threads", AW • Lewis & Berg, "Multithreaded Programming with Pthreads", PTR-PH ©2005 D. J. Foreman
Memory Problems • Little Endian (Intel): lowest # byte = highest value • Big Endian (Sun): highest # byte = highest value • Receiver must know data types 5 5*224 improper correction Original (Sun) as received (Windows) ©2005 D. J. Foreman
Number Representations • What is an "int" • 16 bits ??? • 32 bits ??? • FP number storage • Many mechanisms, depending on h/w • Less portable than integers ©2005 D. J. Foreman
An example using int's • 32-bit compiler, treats "int" as 32 bits int x; x=1; // x is 00000000 00000000 00000000 00000001 Y(x); //all 32 bits ---> 16-bit target system • Target of call compiled on 16-bit system function Y (int val) // "int" is 16 bits {print val;} // prints 0 ---- WHY???? --- • 'Y' only gets the 1st 16 bits of x • because it doesn't KNOW there are more to get !!! --- WHY??? -- • Same problem for 32-bit 64-bit calls and whatever comes next. Remember Y2K! ©2005 D. J. Foreman
Operational Compatibility • Thread controls • Pthreads on Windows (compatible w/POSIX) • POSIX threads • seen on Linux & Solaris • MS threads • Solaris threads • others (OS/2, mainframe) • System services (commands, signals, file manipulation) • Process handling (creation, deletion) • Communication (IPC: msgs vs pipes) ©2005 D. J. Foreman
Memory ©2005 D. J. Foreman
Shared Memory • Inhibits distribution (coherency lost, cost to build) • When NOT shared, problems arise: • Consistency • Access (speed, mechanisms) • Compatibility (endian) etc. • Atomicity • Program level • Compare & Swap • Test & OP (add, subtract, set-bit) • Kernel/library level • Semaphores, Mutexes, Condition variables ©2005 D. J. Foreman
Distributed Memory • Partial solutions for heterogeneous systems • Small local (non-shared) memory • Bus and caches to a main store • Bus and caches to each processor • Full solutions for multiprocessors • Interconnection (h/w switching) networks • Limited scalability • VERY expensive, but VERY fast ©2005 D. J. Foreman
Communications 2 ©2005 D. J. Foreman
Methods • Shared variables - coherence?? • Message passing • RPC – Remote Procedure Call • Very good for Client-server • Hides message passing • RMI – Remote Method Invocation (also Remote Object Invocation) • Like RPC, but with system-wide objects • Java server implements a Security Manager & registry • Very much like DCE • MOM – Message Oriented Middleware • Streams (TCP/IP) • Needed for continuous information flow (video, audio) ©2005 D. J. Foreman
RPC Steps 2 • Client procedure calls client stub in normal way • Client stub builds message, calls local OS • Client's OS sends message to remote OS • Remote OS gives message to server stub • Server stub unpacks parameters, calls server • Server does work, returns result to the stub • Server stub packs it in message, calls local OS • Server's OS sends message to client's OS • Client's OS gives message to client stub • Stub unpacks result, returns to client ©2005 D. J. Foreman
Passing Value Parameters 2 • Original message on the PC • The message after receipt on a SPARC WS • The message after being inverted. Numbers in dotted-boxes indicate the address of each byte ©2005 D. J. Foreman
Berkeley Sockets (1) 2 Socket Create an endpoint from a connection Bind Assign a "handle" to the socket Listen Signal availability for connection requests Accept Allow a specific connection request to an ethereal port, selected by the O/S, freeing the "listen" port Send Output data Receive Input data Close Delete the connection Basic TCP/IP 'commands' ©2005 D. J. Foreman
Berkeley Sockets: Client-Server 2 t0 • Connection-oriented communication pattern using sockets. • Note the ordering of Reads & Writes vs time. ©2005 D. J. Foreman
Berkeley Sockets: Peer-Peer 2 • Read & Write are independent threads ©2005 D. J. Foreman
Berkeley Sockets: Producer-Consumer 2 • Note the one-way nature of communication ©2005 D. J. Foreman
Naming 4 ©2005 D. J. Foreman
What is a name? • Basic objects • File • Value • Program • Data structures across a network • How to reference • How to modify ©2005 D. J. Foreman
Synchronization 5 ©2005 D. J. Foreman
Mechanisms • Explicit signals • Semaphores, conditions, events • Test and Set shared variables • Consider chip cycle vs. bus cycle ©2005 D. J. Foreman
Methods • Semaphores • Conditional critical sections • Messages • Event counts and sequencers • Monitors • Modules, common procedures and entries • Path expressions • Managers • I/O commands ©2005 D. J. Foreman
Problems • Race conditions - cause loss of data • Action sequences - hierarchy to be maintained • Data protection ©2005 D. J. Foreman
Consistency and Replication 6 ©2005 D. J. Foreman
World view • Users must see same data at all times • Updates must be controlled • Rollbacks must maintain consistency ©2005 D. J. Foreman
Fault Tolerance 7 ©2005 D. J. Foreman
Partitioning • Problem: when is the system partitioned? • Response time? • Keep-alive? • Other? • Solutions • Rollback? • Re-start? • Cut-off? • Other? ©2005 D. J. Foreman
Security 8 ©2005 D. J. Foreman
Whose data is it? • Must protect data and programs • Viruses • Worms • Errors • Theft ©2005 D. J. Foreman