Object-Based Distributed Shared Memory Arnshea Clayton CSC 8320 Advanced Operating Systems
Overview • Objects defined. • Object-based memory organization. • Design considerations. • Advantages vs. Disadvantages. • Implementation: Linda • Implementation: Orca • Implementation: Enterprise JavaBeans • Future Directions
Objects defined • Objects are abstract data types defined by the programmer. • Objects encapsulate both data and operations on that data. • The data stores the object's state. • The operations are called methods. • Object-oriented design allows access to state only via methods. • Object-oriented design is the most popular methodology in modern software engineering. • Objects are the fundamental unit of abstraction in object-oriented design.
Object-based memory organization • The global address space is abstracted as a collection of objects. • Address locations in the global address space are not directly accessible. • The objects in the collection, which is stored in the global address space, are accessible to all processes. • Object methods can be invoked from any process.
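To make the organization concrete, here is a minimal Java sketch (all names are hypothetical, not taken from any particular system) of a shared object whose state can only be reached through its methods; in an object-based DSM the runtime decides whether such an invocation is serviced locally or on a remote copy.

    // SharedCounter.java - a hypothetical shared object in an object-based DSM.
    // Processes never see an address for "value"; they can only call the methods.
    public class SharedCounter {
        private long value;                     // object state, hidden from direct access

        public synchronized void increment() {  // an operation (method) on the state
            value++;
        }

        public synchronized long read() {       // read-only operation
            return value;
        }
    }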
Design considerations • Non-replicated scenario: • Lower bandwidth requirement in the multicomputer scenario. • Single point of failure. • Potential performance bottleneck. • Object migration can reduce the impact of performance bottlenecks by moving an object closer to the invoking process.
Design considerations • Replicated scenario: • Memory consistency must be maintained during updates. • Update all copies. • High communication overhead during updates. • Highest access performance, since every process can invoke methods on its local copy. • Invalidate old copies. • Lower communication overhead during updates. • Potentially lower performance during access. • Can be mitigated by replicating the current copy on demand.
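The following Java sketch (hypothetical Replica interface and manager, not from the slides) contrasts the two replicated-update strategies: updating every copy versus invalidating stale copies and re-replicating on demand.

    import java.util.List;

    // Hypothetical replica of an object's state on one machine.
    interface Replica {
        void apply(String update);     // install a new version of the state
        void invalidate();             // mark the local copy stale
    }

    class ReplicatedObjectManager {
        private final List<Replica> replicas;

        ReplicatedObjectManager(List<Replica> replicas) {
            this.replicas = replicas;
        }

        // Strategy 1: update all copies. High communication cost per write,
        // but every subsequent method invocation can be satisfied locally.
        void writeWithUpdate(String update) {
            for (Replica r : replicas) {
                r.apply(update);
            }
        }

        // Strategy 2: invalidate old copies. Cheaper writes; a stale replica
        // must re-fetch (re-replicate) the current copy before its next access.
        void writeWithInvalidate(Replica writer, String update) {
            writer.apply(update);
            for (Replica r : replicas) {
                if (r != writer) {
                    r.invalidate();
                }
            }
        }
    }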
Advantages vs. Disadvantages • Advantages of object-based distributed shared memory over other distributed shared memory paradigms: • Modularity. Encapsulation of data and operations into a single entity reduces coupling with other modules. • Flexibility. Information hiding (data access through methods) centralizes control over data access. This makes modifying an object simpler and less error prone, and it presents many opportunities for optimization. • Synchronization and access can be integrated cleanly.
Advantages vs. Disadvantages • Disadvantages of object-based distributed shared memory over other distributed shared memory paradigms: • More difficult to incorporate into pre-existing programs that assume direct access to a linear, globally shared address space. • Performance penalty, since all data access is mediated through methods.
Implementation: Linda • Provides processes (possibly on multiple machines) with a highly structured distributed shared memory (DSM). • Access to the DSM is only possible through a small set of primitives that must be incorporated into the programming language. • Includes a small syntax extension. • The Linda libraries, pre-compiler and runtime environment constitute the entire DSM system. • The pre-compiler is necessary to convert the syntax extension into library calls usable by the native language. • Linda's small number of primitives simplifies its incorporation into any existing programming language. • Reduces the learning curve for programmers.
Linda: Memory Structure • Tuple: An ordered collection of elements. • (a, b, c, d) is a 4-tuple. (b, c, d, a) is a different 4-tuple composed of the same four elements in a different order. • DSM in Linda is a collection of tuples of varying sizes and types known as the tuple space. • Each tuple is distinguished by its size, the type and order of its elements and, ultimately, the value of its elements.
Linda: Memory Structure • By convention the first element in a tuple is typically a string. This string can be considered the “name” of the tuple. • (“abc”, 2, 5) • (“matrix-1”, 1, 6, 3.14) • (“family”, “is-sister”, “Carolyn”, “Elinor”)
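A tuple could be modeled in Java roughly as below (a sketch with hypothetical names; Linda itself is a language extension, not a Java library):

    import java.util.List;

    // Hypothetical immutable tuple: an ordered sequence of element values.
    final class Tuple {
        private final List<Object> fields;

        Tuple(Object... fields) {
            this.fields = List.of(fields);
        }

        Object field(int i) { return fields.get(i); }
        int size()          { return fields.size(); }

        @Override
        public String toString() { return fields.toString(); }
    }

    class TupleDemo {
        public static void main(String[] args) {
            Tuple t = new Tuple("matrix-1", 1, 6, 3.14);                       // the slide's 4-tuple
            System.out.println(t.field(0) + " has " + t.size() + " fields");   // matrix-1 has 4 fields
        }
    }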
Linda: Operations • Access to the collection of tuples is provided by four primitives: in, out, read and eval. • out – Writes the specified tuple to the tuple space. • out(“abc”, 1, 2) writes a 3-tuple to the tuple space. • in – Reads (and removes) a tuple from the tuple space. • Content-based addressing: memory is identified by a template. • Read and removal are atomic. If two processes compete for the same tuple only one will win; the other will block until a tuple satisfying the template becomes available. • in(“abc”, 1, ? i) searches the tuple space for a 3-tuple with “abc” as the first element and 1 as the second element. If a tuple matching this template is found, it is removed from the tuple space and the value of its third element is assigned to the variable i.
Linda: Operations • read – Reads (but does not remove) a tuple from the DSM. • read(“abc”, 1, ? i) does the same thing as in from the previous example but does not remove the tuple from the tuple space. • eval – Evaluates parameters in parallel and deposits resulting tuples into DSM. This is how parallel processes are created in Linda.
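A rough Java rendering of the four primitives (a hypothetical interface; in real Linda these are language primitives handled by the pre-compiler, and JavaSpaces offers a comparable model for Java). A null field in a template stands in for a “? formal” that matches any value and is bound on return:

    // Hypothetical Java-style view of the Linda primitives.
    interface TupleSpace {
        void out(Object... fields);          // deposit a tuple into the tuple space
        Object[] in(Object... template);     // remove a matching tuple; blocks until one exists
        Object[] read(Object... template);   // copy a matching tuple without removing it
        void eval(Object... fields);         // evaluate fields in parallel, then out() the result
    }

    class LindaStyleUsage {
        static void example(TupleSpace ts) {
            ts.out("abc", 1, 2);                     // out("abc", 1, 2)

            Object[] taken = ts.in("abc", 1, null);  // in("abc", 1, ? i): atomically removes the tuple
            int i = (Integer) taken[2];              // the "? i" formal receives the third field

            Object[] seen = ts.read("abc", 1, null); // read: same matching rule, tuple stays in the space
        }
    }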
Linda: Operations • Linda is not a fully fledged object-based shared memory. • Its small set of primitives does not allow user-defined operations on objects (tuples). • Linda's memory organization is more structured than page-based or shared-variable shared memory, and it is therefore considered a kind of object-based shared memory.
Linda: Subspace Organization • in() and read() operations require a search through the tuple space to locate matching tuples. • Two optimizations are employed to reduce search time: • Partition by signature: all tuples with the same signature (type, number and order of elements) are stored in the same partition. • e.g., (“abc”, 1, 3.14) is stored in the same partition as (“bcd”, 2, 2.78) but in a different partition from (“abc”, 1, 3.14, 2.78). • Sub-partition by first element. • e.g., (“abc”, 1, 3.14) is stored in the same sub-partition as (“abc”, 2, 2.78) but in a different sub-partition from (“bcd”, 2, 2.78).
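A sketch of how the two-level partition key might be derived (a hypothetical helper; the real bookkeeping is internal to the Linda runtime):

    import java.util.Arrays;
    import java.util.stream.Collectors;

    // Hypothetical two-level partition key for a tuple.
    class TuplePartitioner {
        // Level 1 - signature: the number, type and order of the elements.
        static String signature(Object... fields) {
            return Arrays.stream(fields)
                         .map(f -> f.getClass().getSimpleName())
                         .collect(Collectors.joining(","));
        }

        // Level 2 - sub-partition: signature plus the value of the first element.
        static String subPartition(Object... fields) {
            return signature(fields) + "|" + fields[0];
        }

        public static void main(String[] args) {
            System.out.println(subPartition("abc", 1, 3.14));   // String,Integer,Double|abc
            System.out.println(subPartition("bcd", 2, 2.78));   // same partition, different sub-partition
        }
    }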
Linda: Multiprocessor Implementation • A multiprocessor with shared memory (accessed directly, over a bus, a ring, etc.) supports the simplest Linda implementation. • The entire collection of tuples is stored, by subspace partition/sub-partition, in shared memory. • Access to the collection is synchronized using OS/hardware-provided locking.
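A minimal sketch of such a shared-memory tuple store in Java (a hypothetical class; tuples are kept per sub-partition and a lock/condition pair provides the blocking in() semantics):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    // Hypothetical shared-memory tuple store: one bucket of tuples per sub-partition.
    class SharedTupleStore {
        private final Map<String, List<Object[]>> buckets = new HashMap<>();
        private final Lock lock = new ReentrantLock();
        private final Condition changed = lock.newCondition();

        void out(String key, Object[] tuple) {
            lock.lock();
            try {
                buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(tuple);
                changed.signalAll();                 // wake any blocked in() callers
            } finally {
                lock.unlock();
            }
        }

        Object[] in(String key) throws InterruptedException {
            lock.lock();
            try {
                while (buckets.get(key) == null || buckets.get(key).isEmpty()) {
                    changed.await();                 // block until a matching tuple arrives
                }
                return buckets.get(key).remove(0);   // read + remove under one lock: atomic
            } finally {
                lock.unlock();
            }
        }
    }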
Linda: Multicomputer Implementations • The multicomputer scenario, where no single shared memory is available, requires a different implementation. • Implementation 1 – Local in()/Global out() (sketched below, after Implementation 2): • Every machine replicates the entire tuple space. • Writes (out(), eval()) are done to every replica. • Reads (in(), read()) use the local tuple space. • Since in() removes a tuple, a message must be broadcast to all other processes to delete the tuple from all replicas. • Implementation 1 is simple, but the broadcast required by every write adds communication overhead that may limit scalability for large tuple spaces and large numbers of machines. • The S/Net Linda system (Carriero and Gelernter, 1986) uses this approach.
Linda: Multicomputer Implementations • Implementation 2 – Global in()/Local out(): • Every machine maintains a partial collection. • Writes (out()) are performed locally (only the local collection is modified). • Reads (in(), read()) are performed by broadcast. • Processes holding a matching tuple respond with it and, in the case of in(), delete it from their local collection. • Excess responses are stored by the requester in its local collection (effectively treated like a local out()). • If no response arrives, the request is repeated at increasing intervals until one does. • Implementation 2 benefits from a lower memory requirement than Implementation 1 (each process maintains only a partial collection).
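A skeleton contrasting where the broadcasts happen in Implementations 1 and 2 (a hypothetical Network interface; matching, storage and retry details are elided):

    // Hypothetical message layer; delivery and reply handling are outside this sketch.
    interface Network {
        void broadcast(String message);       // send to every other machine
    }

    // Implementation 1 - full replication: out() is broadcast to all replicas;
    // in() matches locally, then broadcasts a delete so every replica drops the tuple.
    class FullyReplicatedSpace {
        private final Network net;
        FullyReplicatedSpace(Network net) { this.net = net; }

        void out(String tuple)   { net.broadcast("OUT " + tuple); /* and store locally */ }
        void in(String template) { /* match in the local replica */ net.broadcast("DELETE " + template); }
    }

    // Implementation 2 - partial collections: out() only touches the local collection;
    // in()/read() broadcast the template and repeat the request until someone answers.
    class PartialCollectionSpace {
        private final Network net;
        PartialCollectionSpace(Network net) { this.net = net; }

        void out(String tuple)   { /* add to the local partial collection only */ }
        void in(String template) { net.broadcast("REQUEST " + template); /* wait; retry with backoff */ }
    }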
Linda: Multicomputer Implementations • Implementation 3 – Partial replication: • All machines are logically organized into a rectangular grid-like network. • Writes (out(), eval()) are sent to all machines in the sender's row. • Read (in(), read()) requests are broadcast to all machines in the reader's column. Because the writer's row and the reader's column intersect in exactly one machine, that machine is guaranteed to hold the tuple if it exists anywhere. • Implementation 3 combines the best attributes of Implementations 1 and 2: lower communication overhead (all broadcasts are partial) and a lower memory requirement (each machine maintains only a partial replica of the tuple space).
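A small helper (hypothetical) showing how the row and column target sets could be computed when machines 0..rows*cols-1 are laid out row-major on the grid:

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical helper for the grid scheme: writes go to the writer's row,
    // read requests to the reader's column; any row and column share one machine.
    class TupleGrid {
        private final int rows, cols;

        TupleGrid(int rows, int cols) { this.rows = rows; this.cols = cols; }

        List<Integer> writeTargets(int machine) {          // all machines in this machine's row
            List<Integer> targets = new ArrayList<>();
            int row = machine / cols;
            for (int c = 0; c < cols; c++) targets.add(row * cols + c);
            return targets;
        }

        List<Integer> readTargets(int machine) {           // all machines in this machine's column
            List<Integer> targets = new ArrayList<>();
            int col = machine % cols;
            for (int r = 0; r < rows; r++) targets.add(r * cols + col);
            return targets;
        }
    }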
Linda: Multicomputer Implementations • Implementation 4 – No-broadcast solution: • Two-part tuple space partitioning: by signature, then by the value of the first element. • A tuple server assigns each partition to a processor. • Read and write requests go through the tuple server. • Implementation 4 is well suited to communication-constrained environments (e.g., widely distributed processors, or processors operating under extreme conditions).
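A sketch of the server-side assignment (hypothetical): each sub-partition key maps to a fixed owning processor, so requests can be sent point-to-point rather than broadcast:

    // Hypothetical tuple-server mapping from sub-partition keys to owning processors.
    class TupleServerMap {
        private final int numProcessors;

        TupleServerMap(int numProcessors) { this.numProcessors = numProcessors; }

        int ownerOf(String subPartitionKey) {
            return Math.floorMod(subPartitionKey.hashCode(), numProcessors);
        }
    }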
Implementation: Orca • Orca (Bal et al., 1990, 1992) is a more general object-based shared memory than Linda because it allows objects to contain user-defined operations. • Orca consists of a language, a compiler and a runtime system.
Orca: Language • Orca's syntax is loosely based on Modula-2. • An operation consists of a set of guarded statements; each guarded statement is a pair made up of a guard (a boolean expression without side effects) and a block of statements. • The guards are evaluated in an unspecified order. Once a guard evaluates to true, the corresponding block is executed.
Orca: Distributed Shared Memory • Objects become distributed in Orca through the fork operation. • fork procedure(shared data) on processor; starts a separate process on the specified processor. The new process executes the specified procedure. • The data parameter is now shared between the invoking process and the newly created process. • Example: distributing an object to all processes • for i in 1 .. n do fork foobar(s) on i; od; • Each newly created process will have access to object s and will begin execution in procedure foobar.
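For readers more familiar with Java, a rough analogy of the fork loop above (this is not Orca: Java threads share one address space, and SharedObject and foobar are hypothetical names):

    // Not Orca: a Java analogy of "for i in 1 .. n do fork foobar(s) on i; od;".
    class ForkAnalogy {
        static class SharedObject { /* state plus operations, as in Orca */ }

        static void foobar(SharedObject s) { /* each new process starts here with access to s */ }

        public static void main(String[] args) {
            SharedObject s = new SharedObject();
            int n = 4;
            for (int i = 1; i <= n; i++) {
                // Analogous to: fork foobar(s) on i;  (Orca may place the new
                // process on another processor; a thread stays in this JVM.)
                new Thread(() -> foobar(s)).start();
            }
        }
    }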
Orca: Synchronization • Orca guarantees that each object method call is atomic. • Each method call is effectively a critical region that only one process can execute at a time. • This limits the practical size of object methods in Orca. • Multiple simultaneous calls to the same method are processed as if they were executed one after another in sequence. • The evaluation of guards in Orca provides condition synchronization. • Parallel programs often need to restrict the execution of a set of instructions based on conditions that only apply to a subset of processes.
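A Java sketch of the same two guarantees (again an analogy, not Orca): synchronized methods give per-object mutual exclusion, and a wait() loop plays the role of a guard providing condition synchronization:

    // Not Orca: per-object mutual exclusion plus a guard-like condition.
    class BoundedCounter {
        private int value;

        // Mutual exclusion: concurrent calls behave as if executed one after another.
        public synchronized void increment() {
            value++;
            notifyAll();                      // let blocked callers re-evaluate their "guard"
        }

        // Condition synchronization: the guard "value > 0" must hold before the block runs.
        public synchronized void decrement() throws InterruptedException {
            while (value == 0) {
                wait();                       // block until the guard is satisfied
            }
            value--;
        }
    }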
Orca: Memory Consistency • The Orca runtime can maintain a single copy of an object or replicate it across every process that uses it. Objects can be migrated between these two states. • Memory accesses (via method calls) are classified as reads (read-only) or writes (read/write). • For local single-copy objects, operations are performed locally (lock, then read and/or write). • For remote single-copy objects, operations are performed via RPC (remote lock, then read and/or write).
Orca: Memory Consistency • For replicated objects, how a method invocation is handled depends on whether the object is modified (considered a write). • Reads (non-modifying calls) on replicated objects are handled locally. • Writes (modifying calls) on replicated objects require updating all replicated copies: • If reliable (error-corrected), totally ordered broadcasting is available, a blocking broadcast is used to update all copies (the sender blocks until all replicas acknowledge the update). • If broadcasting is unreliable or unavailable, a two-phase commit protocol is used: • A lock on the primary copy is obtained. • The primary copy is updated. • The primary copy locks each remaining copy. • When all remaining copies are locked, the primary copy sends them the update. • The primary copy unlocks the remaining copies. • The update is acknowledged and the primary copy's lock is released. • NOTE: These locks are exclusive: reads and writes block until the locks are released.
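The update sequence for the unreliable-broadcast case can be sketched as follows (a hypothetical Copy interface; error handling and the actual commit messages are elided):

    import java.util.List;

    // Hypothetical sketch of the primary-copy update sequence described above.
    interface Copy {
        void lock();
        void unlock();
        void apply(String update);
    }

    class PrimaryCopyUpdater {
        private final Copy primary;
        private final List<Copy> secondaries;

        PrimaryCopyUpdater(Copy primary, List<Copy> secondaries) {
            this.primary = primary;
            this.secondaries = secondaries;
        }

        // Reads and writes elsewhere block on these exclusive locks until the update completes.
        void write(String update) {
            primary.lock();                               // 1. lock the primary copy
            primary.apply(update);                        // 2. update the primary copy
            for (Copy c : secondaries) c.lock();          // 3. lock every remaining copy
            for (Copy c : secondaries) c.apply(update);   // 4. send them the update
            for (Copy c : secondaries) c.unlock();        // 5. unlock the remaining copies
            primary.unlock();                             // 6. acknowledge and release the primary
        }
    }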
Orca: Memory Consistency • The decision to replicate an object is based on a compiler estimate of its read-to-write ratio: • A higher read-to-write ratio increases the likelihood of replication. • The estimate is updated by the runtime on each fork() operation. • Object migration is a side effect of all runtimes using the same algorithm to determine whether or not an object should be replicated.
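A toy version of the decision rule (the threshold and counters are assumptions; Orca's real estimate comes from the compiler and is refined at run time). Because every runtime applies the same deterministic rule to the same estimate, they all reach the same replicate-or-not decision, and an object migrates as a side effect of that decision changing:

    // Hypothetical replication policy shared by all runtimes.
    class ReplicationPolicy {
        private final double threshold;                    // e.g. replicate when reads clearly dominate

        ReplicationPolicy(double threshold) { this.threshold = threshold; }

        boolean shouldReplicate(long estimatedReads, long estimatedWrites) {
            if (estimatedWrites == 0) {
                return true;                               // never-written objects are cheap to replicate
            }
            return (double) estimatedReads / estimatedWrites >= threshold;
        }
    }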
Object-based Distributed Shared Memory • Less error-prone than other forms of distributed shared memory, since synchronization is implicit. • The tradeoff is programming complexity. • Programs that expect direct memory access have to be rewritten entirely. • In the case of Orca, an entirely new language must be learned.
Implementation: Enterprise JavaBeans • Enterprise JavaBeans (EJBs). • Fully object oriented (including inheritance). • Both single-copy and replicated objects are supported. Replication is handled by the runtime (called an application server); replication algorithms vary from vendor to vendor. • Detailed security and persistence model. • Requires no new syntax in the Java language. • Beans are distinguished by how they handle state: Session beans (stateless or stateful) hold conversational state, while Entity beans represent persistent data. • EJB 2.0 introduced message-driven beans, whose messages persist until delivery and which provide a simplified model for asynchronous notification.
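As a minimal illustration in the EJB 2.x style covered by the referenced specification (names are hypothetical and the deployment descriptor is omitted), a stateless session bean consists of a remote interface, a home interface and the bean class:

    // Hello.java - remote interface: the client's view of the distributed object.
    import java.rmi.RemoteException;
    import javax.ejb.EJBObject;

    public interface Hello extends EJBObject {
        String greet(String name) throws RemoteException;
    }

    // HelloHome.java - home interface: used by clients to obtain bean references.
    import java.rmi.RemoteException;
    import javax.ejb.CreateException;
    import javax.ejb.EJBHome;

    public interface HelloHome extends EJBHome {
        Hello create() throws CreateException, RemoteException;
    }

    // HelloBean.java - the bean class; the application server manages its lifecycle.
    import javax.ejb.SessionBean;
    import javax.ejb.SessionContext;

    public class HelloBean implements SessionBean {
        public String greet(String name) { return "Hello, " + name; }

        public void ejbCreate() {}
        public void ejbRemove() {}
        public void ejbActivate() {}
        public void ejbPassivate() {}
        public void setSessionContext(SessionContext ctx) {}
    }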
Future Directions • Web-services • Coalescing around stateless distributed shared objects. • Stateless distributed shared objects allow greater scalability. • Industry standardization on protocols to support interoperability. • Security • Increased granularity, down to the level of individual method calls.
References • Andrew S. Tanenbaum, Distributed Operating Systems, Amsterdam, The Netherlands: Pearson Education, 1995, pp. 356–371. • N. Carriero and D. Gelernter, "The S/Net's Linda Kernel," ACM Transactions on Computer Systems, vol. 4, pp. 110–129, May 1986. • V. Krishnaswamy, "A Language Based Architecture for Parallel Computing," PhD thesis, Yale University, 1991. • Linda G. DeMichiel, Enterprise JavaBeans Specification, v2.1, 2003, http://java.sun.com/products/ejb/docs.html