Linda and TupleSpaces
Prabhaker Mateti

Linda Overview
• An example of asynchronous message passing
  • send never blocks (i.e., implicit infinite-capacity buffering)
  • ignores the order of sends
• An associative, abstract distributed shared memory system on heterogeneous networks
• http://lindaspaces.com/

Tuple Space
• A tuple is an ordered list of (possibly dissimilar) items
  • (x, y): coordinates in a 2-d plane, both numbers
  • (true, 'a', "hello", (x, y)): a quadruple of dissimilar items
  • Instead of ( ), some papers use < >
• A tuple space (TS) is a collection of tuples
  • Consider it a bag, not a set
  • The count of occurrences matters
  • T # TS stands for the number of occurrences of T in TS
• Tuples are accessed associatively
• Tuples are equally accessible to all processes

Linda's Primitives
• Four primitives are added to a host programming language
• out(T)
  • outputs T into TS
  • the number of T's in TS increases by 1
  • atomic
  • no processes are created
• eval(T)
  • creates a process that "evaluates" T
  • the residual tuple is output to TS
• in(T)
  • inputs (withdraws) T from TS
  • the number of T's in TS decreases by 1
  • no processes are created
  • more …
• rd(T), an abbreviation of read(T)
  • reads T from TS without withdrawing it
  • the number of T's in TS does not change
  • no processes are created

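The counting behavior of out, in and rd can be sketched in a few lines. This is a toy, single-machine simulation in Python, not C-Linda; the class name TupleSpace, the method names in_ and count, and the exact-match lookup are all assumptions made for illustration (pattern matching with formals is left out).

```python
import threading

class TupleSpace:
    """Toy in-memory tuple space (hypothetical): a bag guarded by one lock."""

    def __init__(self):
        self._bag = []                        # a list, so duplicates are kept
        self._cond = threading.Condition()

    def out(self, *tup):
        """out(T): add one occurrence of T; never blocks."""
        with self._cond:
            self._bag.append(tup)
            self._cond.notify_all()           # wake any blocked in_/rd

    def _get(self, tup, remove):
        with self._cond:
            while tup not in self._bag:       # block until T is present
                self._cond.wait()
            if remove:
                self._bag.remove(tup)         # removes exactly one occurrence
            return tup

    def in_(self, *tup):
        """in(T): withdraw one occurrence of T."""
        return self._get(tup, remove=True)

    def rd(self, *tup):
        """rd(T): read T without withdrawing it."""
        return self._get(tup, remove=False)

    def count(self, *tup):
        """T # TS: number of occurrences of T."""
        with self._cond:
            return self._bag.count(tup)

ts = TupleSpace()
ts.out("hi", 2, False)
ts.out("hi", 2, False)
n_after_out = ts.count("hi", 2, False)    # two occurrences in the bag
ts.rd("hi", 2, False)
n_after_rd = ts.count("hi", 2, False)     # rd leaves the count unchanged
ts.in_("hi", 2, False)
n_after_in = ts.count("hi", 2, False)     # in withdraws one occurrence
```

A single condition variable keeps every operation atomic, which is the simplest way to honor the "atomic" bullets; a real implementation would distribute the bag.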
Example: in(T) and inp(T)
• Suppose multiple processes are attempting in(T)
• Let T # TS stand for the number of occurrences of T in TS
• if T # TS ≥ 1:
  • input the tuple T
  • T # TS decreases by 1
  • atomic operation
• if T # TS = 1:
  • Only one process succeeds
  • Which? Unspecified; nondeterministic
• if T # TS = 0:
  • All processes wait for some process to out(T)
  • may block forever
• inp(T)
  • a "predicated" in(T)
  • if T # TS = 0, inp(T) fails but the process is not blocked
  • if T # TS ≥ 1, inp(T) succeeds
    • effect is identical to in(T)
    • process is not blocked
• rdp(T)
  • the corresponding "predicated" rd(T)

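The case T # TS = 1 with several competing processes can be sketched as follows. This is a Python simulation with threads standing in for processes; the class and its methods are hypothetical names, and only the non-blocking predicates inp and rdp are modeled.

```python
import threading

class TupleSpace:
    """Toy bag of tuples with the non-blocking predicates only (a sketch)."""

    def __init__(self):
        self._bag = []
        self._lock = threading.Lock()

    def out(self, *tup):
        with self._lock:
            self._bag.append(tup)

    def inp(self, *tup):
        """Predicated in(T): withdraw T if present; never blocks."""
        with self._lock:
            if tup in self._bag:
                self._bag.remove(tup)
                return True
            return False

    def rdp(self, *tup):
        """Predicated rd(T): report whether T is present; never blocks."""
        with self._lock:
            return tup in self._bag

ts = TupleSpace()
ts.out("token")                 # exactly one occurrence: T # TS = 1

results = []                    # one True/False outcome per competing thread

def worker():
    results.append(ts.inp("token"))

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one of the four calls succeeds; which thread wins depends on scheduling, mirroring the "unspecified; nondeterministic" bullet.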
Example: in("hi", ?x, false)
• x is declared to be an int
• the tuple pattern matches any tuple T provided:
  • length of T = 3
  • T.1 = "hi"
  • T.2 is any int
  • T.3 = false
• x is then assigned that int
• Suppose TS = {| ("hi", 2, false), ("hi", 2, false), ("hi", 35, false), ("hi", 7, false), … |}
• in("hi", ?x, false) inputs one of the above
  • which? unspecified
• Tuple patterns may have multiple ? symbols

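The matching rule above can be written down directly. In this Python sketch a formal such as ?x of type int is represented by the type object int, which is an assumption of the sketch, as are the helper names matches and in_. The helper returns the first match, whereas Linda leaves the choice unspecified.

```python
def matches(pattern, tup):
    """True if tup matches pattern: same length, actuals equal, formals typed."""
    if len(pattern) != len(tup):
        return False
    for p, v in zip(pattern, tup):
        if isinstance(p, type):                      # formal: any value of type p
            if type(v) is not p:
                return False
        elif type(p) is not type(v) or p != v:       # actual: same type and value
            return False
    return True

def in_(bag, pattern):
    """Withdraw and return one matching tuple, or None (non-blocking sketch)."""
    for i, tup in enumerate(bag):
        if matches(pattern, tup):
            return bag.pop(i)
    return None

bag = [("hi", 2, False), ("hi", 2, False), ("hi", 35, False), ("hi", 7, False)]
_, x, _ = in_(bag, ("hi", int, False))               # x is bound to the matched int
```

Comparing types as well as values keeps Python's False from matching 0, which would otherwise blur the T.3 = false condition.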
in(N, P2, …, Pj)
• N is an actual argument of type Name
• P2 … Pj are actual or formal parameters
• The values found in the matched tuple are assigned to the formals; the process then continues
• The withdrawal of the matched tuple is atomic.
• If multiple tuples match, the choice is nondeterministic
• If no matching tuple exists, in(…) suspends until one becomes available, and then does the above.

Example: eval("i", i, sqrt(i))
• Creates new process(es)
  • to evaluate each field of eval("i", i, sqrt(i))
  • the result is output to TS
• The tuple ("i", i, sqrt(i)) is known as an "active" tuple.
• Suppose i = 4
  • sqrt(i) is computed by the new process.
• The resulting tuple is ("i", 4, 2.0)
  • known as a passive tuple
  • can also be ("i", 4, -2.0)
• ("i", 4, 2.0) is output to TS
• The process(es) terminate(s).
• Bindings are inherited from the eval-executing process only for names cited explicitly.

Example: eval("Q", f(x,y))
• Suppose eval("Q", f(x,y)) is being executed by process P0
• P0 creates two new processes, say, P1 and P2.
  • P1 evaluates "Q"
  • P2 evaluates f(x,y)
• P0 moves on
  • P0 does not wait for P1 to terminate
  • P0 does not wait for P2 to terminate
  • P0 may later on do an in("Q", ?result)
• P2 evaluates f(x,y) in a context where f, x and y have the same values they had in P0
• No bindings are inherited for any variables that happen to be free (i.e., global) in f, unless explicitly in the eval

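A rough Python sketch of the active-to-passive transition follows. It is a simulation under stated assumptions: a single thread stands in for the process(es) eval creates, callable fields play the role of unevaluated expressions, and TupleSpace, in_ and eval_ are hypothetical names. The default argument i=i copies the caller's value at the time of the eval, mimicking the explicit binding inheritance.

```python
import math
import threading

class TupleSpace:
    """Toy tuple space; in_ matches on the first field only (a simplification)."""

    def __init__(self):
        self._bag = []
        self._cond = threading.Condition()

    def out(self, *tup):
        with self._cond:
            self._bag.append(tup)
            self._cond.notify_all()

    def in_(self, tag):
        with self._cond:
            while True:
                for t in self._bag:
                    if t[0] == tag:
                        self._bag.remove(t)
                        return t
                self._cond.wait()             # block until some out() arrives

def eval_(ts, *fields):
    """Evaluate callable fields in a new thread, then out the passive tuple."""
    def run():
        ts.out(*[f() if callable(f) else f for f in fields])
    worker = threading.Thread(target=run)
    worker.start()                            # the caller does not wait
    return worker

ts = TupleSpace()
i = 4
eval_(ts, "i", i, lambda i=i: math.sqrt(i))   # active tuple ("i", i, sqrt(i))
i = 9                                         # the caller moves on; the copy stays 4
result = ts.in_("i")                          # blocks until the passive tuple exists
```

Here result is the passive tuple ("i", 4, 2.0), regardless of the later change to i in the caller.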
Linda Algorithm Design Example
• Given a finite bag B of numbers, as well as the size nb of the bag B, find the second largest number in B.
• Use p processes
• Assume the TS is preloaded with B:
  • ("bi", bi) for i: 1..nb
  • ("size", nb)
• Each process inputs nb/p numbers of B
  • Is nb % p == 0?
• Each process outputs the largest and the second largest it found
• A selected process considers these 2*p numbers and does as above
• Paradigm: Result Parallel

Linda Algorithm: Second Largest

    #include <limits.h>

    /* Worker: withdraw nx numbers; output the largest and second largest. */
    int firstAndSecond(int nx)
    {
        int bi, fst, snd;

        in("bi", ?bi);
        fst = bi;
        snd = INT_MIN;               /* sentinel: no second value seen yet */
        for (int i = 1; i < nx; i++) {
            in("bi", ?bi);
            if (bi > fst) {
                snd = fst;
                fst = bi;
            } else if (bi > snd) {   /* bi lies between snd and fst */
                snd = bi;
            }
        }
        out("first", fst);
        out("second", snd);
        return 0;
    }

    main(int argc, char *argv[])
    {
        /* open a file, read numbers, …
         * out("bi", bi)
         * out("nb", nb)
         * p = …
         */
        int i, nx = nb / p;          /* Is nb % p == 0? */

        for (i = 0; i < p; i++)
            eval(firstAndSecond(nx));
        /* in("first", ?fst) and
         * in("second", ?snd) tuples …
         * finish the computation
         */
    }

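The whole scheme, including the combining step the slide leaves as a comment, might look like this in a Python simulation. Several choices here are assumptions of the sketch rather than the slide's design: threads replace eval'd processes, each worker emits one ("pair", fst, snd) tuple instead of separate "first" and "second" tuples, None marks a missing value, and a remainder nb % p is handled by letting some workers take one extra number.

```python
import threading

class TupleSpace:
    """Toy tuple space; in_ matches on the first field only (a simplification)."""

    def __init__(self):
        self._bag = []
        self._cond = threading.Condition()

    def out(self, *tup):
        with self._cond:
            self._bag.append(tup)
            self._cond.notify_all()

    def in_(self, tag):
        with self._cond:
            while True:
                for t in self._bag:
                    if t[0] == tag:
                        self._bag.remove(t)
                        return t
                self._cond.wait()

def first_and_second(ts, nx):
    """Worker: withdraw nx numbers, output the two largest seen."""
    fst = snd = None
    for _ in range(nx):
        _, b = ts.in_("bi")
        if fst is None or b > fst:
            fst, snd = b, fst
        elif snd is None or b > snd:
            snd = b
    ts.out("pair", fst, snd)

def second_largest(numbers, p):
    """Second largest of a bag with at least two numbers, using p workers."""
    ts = TupleSpace()
    for b in numbers:                         # preload TS with the bag B
        ts.out("bi", b)
    nb = len(numbers)
    shares = [nb // p + (1 if w < nb % p else 0) for w in range(p)]
    workers = [threading.Thread(target=first_and_second, args=(ts, nx))
               for nx in shares]
    for w in workers:
        w.start()
    candidates = []
    for _ in range(p):                        # combine the (at most) 2*p values
        _, fst, snd = ts.in_("pair")
        candidates += [v for v in (fst, snd) if v is not None]
    for w in workers:
        w.join()
    return sorted(candidates)[-2]

answer = second_largest([7, 3, 9, 1, 9, 4, 8], 3)
```

Because B is a bag, a largest value that occurs twice is also the second largest, so answer is 9 for this input.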
Arrays and Matrices
• An array
  • (array name, index fields, value)
  • ("V", 14, 123.5)
  • ("A", 12, 18, 5, 123.5)
  • That A is 3-d you know from your design; it does not follow from the tuple
• Tuple elements can be tuples
  • ("A", (12, 18, 5), 123.5)

"Linked" Data Structures in Linda
• A binary tree
  • Number the nodes: 1 ..
  • Number the root with 1
  • Use the number 0 for nil
  • ("node", nodeNumber, nodedata, leftChildNumber, rightChildNumber)
• A directed graph
  • Represent it as a collection of directed edges.
  • Number the nodes: 1 ..
  • ("edge", fromNodeNumber, toNodeNumber)

More on Data Structures in Linda
• Binary tree (again)
  • A Lisp-like cons cell
    • ("C", "cons", ["A", "B"])
    • ("B", "cons", [])
  • An atom
    • ("A", "atom", value)
• Undirected graphs
  • Similar to directed graphs
  • How to ignore the "direction" in ("edge", fromNodeNumber, toNodeNumber)?
    • Add ("edge", toNodeNumber, fromNodeNumber)
    • Or, use a set representation.

Coordinated Programming
• Programming = Computation + Coordination
• The term coordination refers to the process of building programs by gluing together processes.
  • Unix glue operation: pipe
• "Coordination is managing dependencies between activities."
• Barrier synchronization: each process within some group must wait until all processes in the group have reached a "barrier"; then all can proceed.
  • Set up the barrier: out("barrier", n);
  • Each process does the following:
      in("barrier", ?val);
      out("barrier", val-1);
      rd("barrier", 0);

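The barrier protocol can be exercised with a small Python simulation. The TupleSpace class, the use of None as a stand-in for a formal such as ?val, and the event log are assumptions of this sketch; the three tuple operations inside worker follow the slide.

```python
import threading

class TupleSpace:
    """Toy tuple space; None in a pattern acts as a formal (wildcard)."""

    def __init__(self):
        self._bag = []
        self._cond = threading.Condition()

    def out(self, *tup):
        with self._cond:
            self._bag.append(tup)
            self._cond.notify_all()

    def _find(self, pattern):
        for t in self._bag:
            if len(t) == len(pattern) and all(
                    p is None or p == v for p, v in zip(pattern, t)):
                return t
        return None

    def _get(self, pattern, remove):
        with self._cond:
            t = self._find(pattern)
            while t is None:                  # block until a match appears
                self._cond.wait()
                t = self._find(pattern)
            if remove:
                self._bag.remove(t)
            return t

    def in_(self, *pattern):
        return self._get(pattern, remove=True)

    def rd(self, *pattern):
        return self._get(pattern, remove=False)

n = 4
ts = TupleSpace()
ts.out("barrier", n)                          # set up the barrier
log = []                                      # event order, for inspection

def worker(i):
    log.append(("arrive", i))
    _, val = ts.in_("barrier", None)          # in("barrier", ?val)
    ts.out("barrier", val - 1)                # out("barrier", val-1)
    ts.rd("barrier", 0)                       # wait until everyone has arrived
    log.append(("pass", i))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
for t in threads:
    t.start()
for t in threads:
    t.join()
kinds = [kind for kind, _ in log]
```

Every "arrive" event precedes every "pass" event, because rd("barrier", 0) can only succeed after the n-th decrement.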
RPC Clients and Servers

    /* a server process */
    serviceARequest()
    {
        int ix, cid;
        typeRQ req;
        typeRS response;
        …
        for (;;) {
            in("request", ?cid, ?ix, ?req);
            …
            out("response", cid, ix, response);
        }
    }

    /* a client process */
    int clientid = …, rqix = 0;
    typeRQ req;
    typeRS response;
    …
    out("request", clientid, ++rqix, req);
    …
    in("response", clientid, rqix, ?response);
    …

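The request/response pairing by (client id, request index) can be tried out in a Python simulation. Everything beyond the tuple shapes is an assumption of this sketch: the TupleSpace class, None as a stand-in for formals, a daemon thread as the server, and upper-casing a string as the "service".

```python
import threading

class TupleSpace:
    """Toy tuple space; None in a pattern acts as a formal (wildcard)."""

    def __init__(self):
        self._bag = []
        self._cond = threading.Condition()

    def out(self, *tup):
        with self._cond:
            self._bag.append(tup)
            self._cond.notify_all()

    def in_(self, *pattern):
        with self._cond:
            while True:
                for t in self._bag:
                    if len(t) == len(pattern) and all(
                            p is None or p == v for p, v in zip(pattern, t)):
                        self._bag.remove(t)
                        return t
                self._cond.wait()

def service_requests(ts):
    """Server loop: withdraw any request, answer it under the same (cid, ix)."""
    while True:
        _, cid, ix, req = ts.in_("request", None, None, None)
        ts.out("response", cid, ix, req.upper())

def call(ts, cid, ix, req):
    """Client side: post a request, then wait for the matching response only."""
    ts.out("request", cid, ix, req)
    return ts.in_("response", cid, ix, None)[3]

ts = TupleSpace()
threading.Thread(target=service_requests, args=(ts,), daemon=True).start()

replies = {}

def client(cid, text):
    replies[cid] = call(ts, cid, 1, text)

clients = [threading.Thread(target=client, args=(1, "hello")),
           threading.Thread(target=client, args=(2, "world"))]
for c in clients:
    c.start()
for c in clients:
    c.join()
```

Each client passes its own cid and ix as actuals in the in("response", …) pattern, so it can never withdraw another client's response.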
Dining Philosophers, Readers/Writers

    phil(int i)
    {
        while (1) {
            think();
            in("room ticket");
            in("chopstick", i);
            in("chopstick", (i+1) % Num);
            eat();
            out("chopstick", i);
            out("chopstick", (i+1) % Num);
            out("room ticket");
        }
    }

    initialize()
    {
        int i;

        for (i = 0; i < Num; i++) {
            out("chopstick", i);
            eval(phil(i));
            if (i < (Num-1))
                out("room ticket");
        }
    }

    /* A reader does: startread(); … read; … stopread(); */
    startread()
    {
        rd("rw-head", incr("rw-tail"));
        rd("writers", 0);
        incr("active-readers");
        incr("rw-head");
    }

    /* Atomically increment the named counter; return its old value. */
    int incr(char *CounterName)
    {
        int value;

        in(CounterName, ?value);
        out(CounterName, value + 1);
        return value;
    }

    /* complete the rest of the implementation of
     * the readers-writers */

Semaphores in Linda
• Create a semaphore named xyz whose initial value is 3.
• Solution:
  • Load the tuple space with ("xyz"), ("xyz"), ("xyz")
  • P(nm) { in(nm); }
  • V(nm) { out(nm); }
• Properties:
  • Is it a semaphore satisfying the "weak semaphore assumption"?

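A short Python simulation shows the three ("xyz") tuples acting as a counting semaphore. The TupleSpace class, the six worker threads and the bookkeeping dictionary are assumptions of this sketch; P and V are the one-line definitions from the slide.

```python
import threading
import time

class TupleSpace:
    """Toy tuple space with exact-match out/in only (a sketch)."""

    def __init__(self):
        self._bag = []
        self._cond = threading.Condition()

    def out(self, *tup):
        with self._cond:
            self._bag.append(tup)
            self._cond.notify_all()

    def in_(self, *tup):
        with self._cond:
            while tup not in self._bag:       # block while the count is 0
                self._cond.wait()
            self._bag.remove(tup)

    def count(self, *tup):
        with self._cond:
            return self._bag.count(tup)

def P(ts, nm):
    ts.in_(nm)                                # withdraw one token, or block

def V(ts, nm):
    ts.out(nm)                                # return one token

ts = TupleSpace()
for _ in range(3):                            # semaphore "xyz", initial value 3
    ts.out("xyz")

state = {"active": 0, "max_active": 0}
guard = threading.Lock()                      # protects the bookkeeping only

def job():
    P(ts, "xyz")
    with guard:
        state["active"] += 1
        state["max_active"] = max(state["max_active"], state["active"])
    time.sleep(0.01)                          # pretend to use the resource
    with guard:
        state["active"] -= 1
    V(ts, "xyz")

threads = [threading.Thread(target=job) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

At most three jobs hold a token at once, and all three tokens are back in the tuple space at the end. The sketch does not settle the fairness question raised under "Properties".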
Programming Paradigms
• Result Parallel
  • focus on the "structure" of the input space.
  • Divide this into many pieces of the same structure.
  • Solve each piece the same way
  • Combine the sub-results into a final result
  • Divide-and-Conquer
  • Hadoop
• Agenda of Activities
  • A list of things to do and their order
  • Example: Build a House
    • Build Walls
      • Frame the walls
      • Plumbing
      • Electrical Wiring
      • Drywalls
      • Doors, Windows
    • Build a Driveway
    • Paint the House
• Ensemble of Specialists
  • Example: Build a House
    • Carpenters
    • Masons
    • Electrician
    • Plumbers
    • Painters
  • Master-slave Architecture
• These paradigms are applicable not only to Linda but also to other languages and systems.

Result Parallel: Generate Primes

    /* From Linda book, Chapter 5 */
    int isprime(int me)
    {
        int p, limit, ok;

        limit = sqrt((double) me) + 1;
        for (p = 2; p < limit; ++p) {
            rd("primes", p, ?ok);
            if (ok && (me % p == 0))
                return 0;
        }
        return 1;
    }

    real_main()
    {
        int count = 0, i, ok;

        for (i = 2; i <= LIMIT; ++i)
            eval("primes", i, isprime(i));
        for (i = 2; i <= LIMIT; ++i) {
            rd("primes", i, ?ok);
            if (ok) {
                ++count;
                printf("prime: %d\n", i);
            }
        }
    }

Paradigm: Agenda Parallelism

    /* From Linda book */
    real_main(int argc, char *argv[])
    {
        int eot, first_num, i, length, new_primes[GRAIN], np2;
        int num, num_primes, num_workers, primes[MAX], p2[MAX];

        num_workers = atoi(argv[1]);
        for (i = 0; i < num_workers; ++i)
            eval("worker", worker());

        num_primes = init_primes(primes, p2);
        first_num = primes[num_primes-1] + 2;
        out("next task", first_num);

        eot = 0;    /* Becomes 1 at end of table */
        for (num = first_num; num < LIMIT; num += GRAIN) {
            in("result", num, ? new_primes:length);
            for (i = 0; i < length; ++i, ++num_primes) {
                primes[num_primes] = new_primes[i];
                if (!eot) {
                    np2 = new_primes[i]*new_primes[i];
                    if (np2 > LIMIT) {
                        eot = 1;
                        np2 = -1;
                    }
                    out("primes", num_primes, new_primes[i], np2);
                }
            }
        }
        /* "? int" matches any int and throws out the value */
        for (i = 0; i < num_workers; ++i)
            in("worker", ? int);
        printf("count: %d\n", num_primes);
    }

    worker()
    {
        int count, eot, i, limit, num, num_primes, ok, start;
        int my_primes[GRAIN], primes[MAX], p2[MAX];

        num_primes = init_primes(primes, p2);
        eot = 0;
        while (1) {
            in("next task", ? num);
            if (num == -1) {
                out("next task", -1);
                return;
            }
            limit = num + GRAIN;
            out("next task", (limit > LIMIT) ? -1 : limit);
            if (limit > LIMIT) limit = LIMIT;
            start = num;
            for (count = 0; num < limit; num += 2) {
                while (!eot && num > p2[num_primes-1]) {
                    rd("primes", num_primes, ? primes[num_primes], ? p2[num_primes]);
                    if (p2[num_primes] < 0)
                        eot = 1;
                    else
                        ++num_primes;
                }
                for (i = 1, ok = 1; i < num_primes; ++i) {
                    if (!(num % primes[i])) {
                        ok = 0;
                        break;
                    }
                    if (num < p2[i]) break;
                }
                if (ok) {
                    my_primes[count] = num;
                    ++count;
                }
            }
            /* Send the control process any primes found. */
            out("result", start, my_primes:count);
        }
    }

Paradigm: Specialist Parallelism

    /* From Linda book */
    source()
    {
        int i, out_index = 0;

        for (i = 5; i < LIMIT; i += 2)
            out("seg", 3, out_index++, i);
        out("seg", 3, out_index, 0);
    }

    pipe_seg(prime, next, in_index)
    {
        int num, out_index = 0;

        while (1) {
            in("seg", prime, in_index++, ? num);
            if (!num) break;
            if (num % prime)
                out("seg", next, out_index++, num);
        }
        out("seg", next, out_index, num);
    }

    sink()
    {
        int in_index = 0, num, pipe_seg(), prime = 3, prime_count = 2;

        while (1) {
            in("seg", prime, in_index++, ? num);
            if (!num) break;
            if (num % prime) {
                ++prime_count;
                if (num*num < LIMIT) {
                    eval("pipe seg", pipe_seg(prime, num, in_index));
                    prime = num;
                    in_index = 0;
                }
            }
        }
        printf("count: %d.\n", prime_count);
    }

    real_main()
    {
        eval("source", source());
        eval("sink", sink());
    }

Linda Summary
• out(), in(), rd(), inp(), rdp() are heavier than host-language computations.
• eval() is the heaviest of the Linda primitives
• Nondeterminism in pattern matching
• Time uncoupling
  • Communication between time-disjoint processes
  • A process can even send messages to itself
• Distributed sharing
  • Variables shared between disjoint processes
• Many implementations permit multiple tuple spaces
• No security (no encapsulation)
• Linda is not fault-tolerant
  • Processes are assumed to be fail-safe
• Beginners do this in a loop: { in(?T); if notOK(T) out(T); }
  • There is no guarantee that the next in() will not return the same T.
• The following can sequentialize the processes using this code block: { in(?count); out(count+1); }
• "Where most distributed languages are partially distributed in space and non-distributed in time, Linda is fully distributed in space and distributed in time as well."

JavaSpaces and TSpaces
• JavaSpaces is Linda adapted to Java
  • net.jini.space.JavaSpace
  • write(…): into a space
  • take(…): from a space
  • read(…): …
  • notify: notifies a specified object when entries that match the given template are written into a space
  • java.sun.com/developer/technicalArticles/tools/JavaSpaces/
• TSpaces is an IBM adaptation of Linda.
  • "TSpaces is network middleware for the new age of ubiquitous computing."
  • TSpaces = Tuple + Database + Java
  • write(…): into a space
  • take(…): from a space
  • read(…): …
  • Scan and ConsumingScan
  • rendezvous operator, Rhonda.
  • TSpaces Whiteboard
  • www.almaden.ibm.com/cs/TSpaces/

http://lindaspaces.com/
• NetWorkSpaces
  • "open-source software package that makes it easy to use clusters from within scripting languages like Matlab, Python, and R."
• Nicholas Carriero and David Gelernter, "How to Write Parallel Programs" (book), MIT Press, 1992
• Tutorial on Parallel Programming with Linda

CEG 730 Preferences
• Assume TS is preloaded with input data in a form that is helpful.
• At the end of the algorithm,
  • TS should have only the results
  • the preloaded input data is removed
• Any C program can be embedded into C-Linda
  • this is not acceptable at all as a solution
• Use p processes
  • In general, you choose p so that "elapsed" time is minimized, assuming the p processes do time-overlapped parallel computation.
• Is nb % p == 0?
  • pad the input data space with dummy values that preserve the solutions
  • or let some worker processes do more
• Avoid using inp() and/or rdp() because
  • it confuses our thinking
  • we can get better designs without them
  • A badly used inp() can produce a livelock where a plain in() would have caused a block.
• Typically, we can avoid the use of inp().
  • Not always.
  • Problem: Compute the number of elements in a bag B. Assume B is preloaded into TS.
  • Solution needs inp().

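The bag-counting problem shows why inp() is sometimes unavoidable: with the size unknown, a plain in() would block forever once the bag is empty. The Python sketch below uses a single counting process; the TupleSpace class, the tag "b" for bag elements and the result tuple ("count", n) are assumptions made for illustration.

```python
import threading

class TupleSpace:
    """Toy tuple space; inp matches on the first field only (a simplification)."""

    def __init__(self):
        self.bag = []
        self._lock = threading.Lock()

    def out(self, *tup):
        with self._lock:
            self.bag.append(tup)

    def inp(self, tag):
        """Withdraw one tuple tagged `tag`, or return None without blocking."""
        with self._lock:
            for t in self.bag:
                if t[0] == tag:
                    self.bag.remove(t)
                    return t
            return None

def count_bag(ts):
    """Count and remove the elements of B; leave only the result in TS."""
    n = 0
    while ts.inp("b") is not None:            # inp fails once B is exhausted
        n += 1
    ts.out("count", n)
    return n

ts = TupleSpace()
for value in [4, 4, 9, 1, 7]:                 # preload the bag B
    ts.out("b", value)
size = count_bag(ts)
```

When the loop ends, the preloaded input has been removed and TS holds only ("count", 5), in line with the preferences above.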
References
• Sudhir Ahuja, Nicholas Carriero and David Gelernter, "Linda and Friends," IEEE Computer (magazine), Vol. 19, No. 8, 26-34. www.lindaspaces.com/ has an entire book.
• JavaSpaces, en.wikipedia.org/wiki/Tuple_space
• Andrews, Section 10.x on Linda. Yet another prime number generator.
• Jeremy R. Johnson, www.cs.drexel.edu/~jjohnson/2010-11/winter/cs676.htm