
CIS5930 Internet Computing

Learn about Linda, a versatile parallel programming model developed in the 1980s. Understand its implementation and use in distributed computing, and explore its primitives and task-farming applications.


Presentation Transcript


  1. CIS5930 Internet Computing • Parallel Programming Models for the Web • Prof. Robert van Engelen

  2. Overview • Parallel programming models related to Internet Computing • Linda • Message passing • ORB • CSP • Pi calculus CIS 5930 Fall 2006

  3. Linda • Linda is a concurrent programming model (not a programming language) • Proposed by David Gelernter and Nicholas Carriero in the mid-1980s at Yale • Supports distributed computing independent of the programming languages and platforms used • Attracted recent interest from the Semantic Web (Web spaces) and ubiquitous computing communities • Linda is implemented in • Java (JavaSpaces, TSpaces, GigaSpaces) • Java+XML (XMLSpaces) • C/C++ • Prolog • Ruby • Python • … CIS 5930 Fall 2006

  4. Linda • Linda provides uncoupling in both time and space using a tuple space • The tuple space is shared memory containing data tuples accessible to all processes • Pattern matching is used to read and extract tuples from the tuple space that match a tuple signature (aka anti-tuple) • By contrast, communication based on synchronous rendezvous couples processes in time and space • Time: synchronous rendezvous assumes that both processes exist simultaneously • Space: process identification in a global namespace is needed to establish communication CIS 5930 Fall 2006

  5. Linda • The primitives of Linda are: • Output(T) add the tuple T to tuple space • Input(T) remove a matching tuple T from the tuple space; if there is no matching tuple then suspend until one does match • Read(T) is similar to Input(T), but does not remove the tuple from TS • Try_input(T) is a non-blocking version of Input(T) • Try_read(T) is a non-blocking version of Read(T) • Tuples may contain variables, e.g. ?x • Variables will be set when a matching tuple is read from the TS • Compare this to a function’s actual and formal parameters • Examples: • Output(“point”, 12, 67) • Input(“point”, ?x, ?y), which is also called an anti-tuple and matches the tuple (“point”, 12, 67) => x=12, y=67 • Some implementations also handle types in anti-tuples: • Input(s:string, x:integer, 67) matches (“point”, 12, 67) => s=“point”, x=12 CIS 5930 Fall 2006
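
  To make the matching behavior concrete, the sketch below models a toy single-process tuple space in C, holding ("name", int, int) tuples only. The ts_out/ts_match functions and the WILD marker are hypothetical stand-ins for Output/Input/Read and ?-formals, not the API of any real Linda implementation (a real system is concurrent and blocks in Input until a match appears).

    #include <stdio.h>
    #include <string.h>

    /* Toy single-process "tuple space" of ("name", int, int) tuples.
       Illustrates tuple/anti-tuple matching only; real Linda systems are
       concurrent and Input() blocks until a matching tuple appears. */

    #define WILD       -1          /* stands in for a ?formal field */
    #define MAX_TUPLES 16

    struct tuple { const char *name; int a, b; int used; };
    static struct tuple space[MAX_TUPLES];

    static void ts_out(const char *name, int a, int b)      /* Output(T) */
    {
        for (int i = 0; i < MAX_TUPLES; i++)
            if (!space[i].used) {
                space[i] = (struct tuple){ name, a, b, 1 };
                return;
            }
    }

    /* Match an anti-tuple; WILD fields are bound from the matching tuple.
       take=1 behaves like Input(), take=0 like Read(). Returns 1 on match. */
    static int ts_match(const char *name, int *a, int *b, int take)
    {
        for (int i = 0; i < MAX_TUPLES; i++) {
            if (!space[i].used || strcmp(space[i].name, name) != 0) continue;
            if (*a != WILD && *a != space[i].a) continue;
            if (*b != WILD && *b != space[i].b) continue;
            *a = space[i].a;
            *b = space[i].b;
            if (take) space[i].used = 0;
            return 1;
        }
        return 0;                  /* a real Input() would block here */
    }

    int main(void)
    {
        int x = WILD, y = WILD;
        ts_out("point", 12, 67);                  /* Output("point", 12, 67) */
        if (ts_match("point", &x, &y, 1))         /* Input("point", ?x, ?y)  */
            printf("x=%d y=%d\n", x, y);          /* prints x=12 y=67        */
        return 0;
    }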

  6. Linda • TS implementations • Centralized, where a central server maintains the TS • Hashing, to allocate tuples to particular processors • Partitioned, where tuples with a common structure are allocated to particular processors • Distributed, where any tuple may reside on any processor • The TS implementation hides the underlying complexity • Pro: transparency • Con: efficiency (users may know the best place to put tuples) CIS 5930 Fall 2006

  7. Task Farming with Linda • A master process repeatedly generates a request that is served by a specific worker process or any process that matches the tuple • Tuples have • The name of the job request, e.g. the work to be performed • Integer Job ID • A name to identify the worker’s process ID • Master: • Output(“job”, 17, 24) send request to worker process 24 • Output(“job”, 18, ?) send request to any worker process • Workers: • Input(“job”, ?id, 12) this worker process only handles “job” tasks • Input(?job, ?id, 24) this worker process handles any task CIS 5930 Fall 2006

  8. Matrix Multiplication with Linda • Generate matrix A by rows and matrix B by columns: • Output(“A”, 1, (1,2,3)) • Output(“A”, 2, (4,5,6)) • Output(“A”, 3, (7,8,9)) • Output(“B”, 1, (1,0,2)) • Output(“B”, 2, (1,2,1)) • Output(“B”, 3, (1,1,1)) • Add a job counter to iterate over the output matrix’s elements 1..9 • Output(“Next”, 1) CIS 5930 Fall 2006

  9. Matrix Multiplication with Linda • Code for the workers:

    loop
      Input(“Next”, ?elt)
      Output(“Next”, elt+1)
      if (elt > 9) then exit
      row = (elt-1)/3 + 1
      col = (elt-1)%3 + 1
      Read(“A”, row, ?v1)
      Read(“B”, col, ?v2)
      x = dot_product(v1, v2)
      Output(“C”, row, col, x)
    end loop

  CIS 5930 Fall 2006

  10. Message Passing • Message passing libraries • PVM (parallel virtual machine) • MPI (message passing interface), de facto standard • Message passing operations • Point-to-point • Global/collective communications such as barriers, broadcast, and reductions • Goals/features • Low-level APIs • Parallelism expressed explicitly • Process-oriented instead of processor-oriented, i.e. multiple processes can be mapped to single processor • Intended for distributed memory processing, but also works with shared memory • Distributed programs that utilize message passing are typically written using the SPMD model (single program, multiple data) CIS 5930 Fall 2006

  11. PVM • PVM message passing library • Process creation (portable) • Point-to-point messaging • Barrier synchronization • Reductions • PVM environment runs PVMD per machine • All processes communicate through the PVMD • Many applications per PVMD CIS 5930 Fall 2006

  12. PVM • Sends must be initialized to clear the buffer and specify the encoding • pvm_initsend(enc) • Encodings are PvmDataDefault (XDR), PvmDataRaw, PvmDataInPlace • Messages are (un)packed before send (after recv) • pvm_pk{int,char,float}(*var, count, stride) • pvm_unpk{int,char,float}(*var, count, stride) • Messages are tagged and sent/received (blocking) • pvm_send(tid, tag) • pvm_recv(tid, tag) • User-defined tags are used to identify messages • Process ids are used to identify senders and receivers CIS 5930 Fall 2006
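
  A minimal sketch of the pack/send/receive/unpack sequence; to keep it self-contained the task sends the message to itself, and the tag value and data are illustrative (a running pvmd is assumed).

    #include <stdio.h>
    #include "pvm3.h"

    #define MSG_TAG 1                      /* illustrative user-defined tag */

    int main(void)
    {
        int out[4] = { 1, 2, 3, 4 }, in[4];
        int me = pvm_mytid();              /* enroll in PVM, get my task id */

        /* clear the send buffer, pick XDR encoding, pack 4 ints with
           stride 1, and send with a user-defined tag */
        pvm_initsend(PvmDataDefault);
        pvm_pkint(out, 4, 1);
        pvm_send(me, MSG_TAG);

        /* blocking receive of the matching message, then unpack in the
           same order it was packed */
        pvm_recv(me, MSG_TAG);
        pvm_upkint(in, 4, 1);

        printf("received %d %d %d %d\n", in[0], in[1], in[2], in[3]);
        pvm_exit();
        return 0;
    }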

  13. PVM • Group creation • A group is a collection of processes • Processes join with pvm_joingroup(“groupname”) • Processes in the group have unique ids: pvm_gettid(“groupname”) • Group synchronization • pvm_barrier(“groupname”, count) • Non-blocking reduction operations • pvm_reduce(func, data, count, type, tag, “group”, rootinst) • The result is sent to the rootinst process CIS 5930 Fall 2006
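
  A sketch of a barrier followed by a group-wide sum, assuming this program is started as four cooperating PVM tasks; the group name, size, tag, and contributed value are illustrative, and PvmSum is one of PVM's predefined reduction functions.

    #include <stdio.h>
    #include "pvm3.h"

    #define NMEMBERS   4                   /* illustrative group size  */
    #define REDUCE_TAG 2                   /* illustrative message tag */

    int main(void)
    {
        int value = 1;                     /* each member contributes 1 */

        /* join the group; the return value is my instance number in it */
        int me = pvm_joingroup("workers");

        /* wait until all members have joined */
        pvm_barrier("workers", NMEMBERS);

        /* sum "value" across the group; the result arrives at instance 0 */
        pvm_reduce(PvmSum, &value, 1, PVM_INT, REDUCE_TAG, "workers", 0);

        if (me == 0)
            printf("group sum = %d\n", value);

        pvm_lvgroup("workers");            /* leave the group */
        pvm_exit();
        return 0;
    }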

  14. PVM • Process creation with pvm_spawn(task, argv, flag, loc, ntask, tids) • flag and loc determine where tasks are started • ntask processes are spawned • tids[] is populated with the created process ids • A process ends with pvm_exit() • Ends PVM operations, but does not kill the process • Other • pvm_mytid() gives the process id • pvm_parent() gives the parent’s id (the spawner) CIS 5930 Fall 2006

  15. PVM Example

    #include <stdio.h>
    #include <stdlib.h>
    #include "pvm3.h"

    /* The slide assumed these were defined elsewhere; the values here are
       illustrative. MYNAME must be the name of this executable so the
       first task can spawn a copy of itself. */
    #define MYNAME      "ping-pong"
    #define MESSAGESIZE 64
    #define ITERATIONS  100
    #define msgid       1

    int main(int argc, char **argv) {
      int myGroupNum, friendTid, mytid;
      int tids[2];
      int message[MESSAGESIZE];
      int i, okSpawn;

      /* Initialize process and spawn if necessary */
      myGroupNum = pvm_joingroup("ping-pong");
      mytid = pvm_mytid();
      if (myGroupNum == 0) {            /* I am the first process */
        pvm_catchout(stdout);
        okSpawn = pvm_spawn(MYNAME, argv, 0, "", 1, &friendTid);
        if (okSpawn != 1) {
          printf("Can't spawn a copy of myself!\n");
          pvm_exit();
          exit(1);
        }
        tids[0] = mytid;
        tids[1] = friendTid;
      } else {                          /* I am the second process */
        friendTid = pvm_parent();
        tids[0] = friendTid;
        tids[1] = mytid;
      }
      pvm_barrier("ping-pong", 2);

      if (myGroupNum == 0) {            /* Initialize the message */
        for (i = 0; i < MESSAGESIZE; i++)
          message[i] = '1';
      }

      /* Now start passing the message back and forth */
      for (i = 0; i < ITERATIONS; i++) {
        if (myGroupNum == 0) {
          pvm_initsend(PvmDataDefault);
          pvm_pkint(message, MESSAGESIZE, 1);
          pvm_send(friendTid, msgid);
          pvm_recv(friendTid, msgid);
          pvm_upkint(message, MESSAGESIZE, 1);
        } else {
          pvm_recv(friendTid, msgid);
          pvm_upkint(message, MESSAGESIZE, 1);
          pvm_initsend(PvmDataDefault);
          pvm_pkint(message, MESSAGESIZE, 1);
          pvm_send(friendTid, msgid);
        }
      }
      pvm_exit();
      exit(0);
    }

  CIS 5930 Fall 2006

  16. MPI • MPI is a standardized message passing interface based on earlier libraries • Two versions: MPI-1 and MPI-2 • Features • Portable • (Virtual) processor topologies (cubes, tori, hypercubes, …) • Copy-free messaging (no packing overhead) • Process creation is implicit • Point-to-point messaging (blocking/non-blocking) • Group/collective communications • Barriers, broadcast, reductions, and scans • Buffering, but send may block until receive is called • Message delivery order is guaranteed (no tagging needed) CIS 5930 Fall 2006

  17. MPI • MPI communicators generalize the notion of process groups • Communicators are named • Processes within a communicator are numbered (ranked) and can be named • MPI_COMM_WORLD is the global communicator with which each program starts • New communicators can be split off from others and duplicated • Queries determine the size of a communicator, the rank of the current process in the communicator, etc.: MPI_Comm_size(comm, *size) MPI_Comm_rank(comm, *rank) CIS 5930 Fall 2006
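
  A sketch of querying a communicator and splitting MPI_COMM_WORLD into two new communicators; the even/odd split criterion is just an illustration.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int world_rank, world_size, sub_rank;
        MPI_Comm sub_comm;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);   /* my rank             */
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);   /* number of processes */

        /* split the world communicator by color: even ranks in one
           sub-communicator, odd ranks in the other */
        MPI_Comm_split(MPI_COMM_WORLD, world_rank % 2, world_rank, &sub_comm);
        MPI_Comm_rank(sub_comm, &sub_rank);

        printf("world rank %d of %d has rank %d in its sub-communicator\n",
               world_rank, world_size, sub_rank);

        MPI_Comm_free(&sub_comm);
        MPI_Finalize();
        return 0;
    }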

  18. MPI • Initialization and finalization • MPI_Init(argc, argv) • MPI_Finalize() • Messages • MPI_Send(message, size, type, rank, tag, comm) • MPI_Recv(message, size, type, rank, tag, comm, *status) • Types are MPI_CHAR, MPI_INT, etc. • Non-blocking variants can be polled to check whether the operation has finished • Send: must not change the message while the operation is in progress • Recv: must not use the data until the operation has completed • Collective • MPI_Barrier(comm) • MPI_Bcast(buffer, count, type, root, comm) • MPI_Reduce(sendbuf, recvbuf, count, type, op, root, comm) CIS 5930 Fall 2006
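
  A sketch combining a broadcast with a reduction, with rank 0 acting as the root of both; the broadcast value and the sum of ranks are only illustrations.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, n = 0, sum = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) n = 100;            /* the root chooses a value */

        /* broadcast n from rank 0 to every process in the communicator */
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

        /* every process contributes its rank; the sum arrives at rank 0 */
        MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("received n=%d, sum of ranks=%d\n", n, sum);

        MPI_Finalize();
        return 0;
    }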

  19. MPI Example

    #include <stdlib.h>
    #include "mpi.h"

    /* The slide assumed these constants were defined elsewhere;
       the values here are illustrative. */
    #define MESSAGESIZE 64
    #define ITERATIONS  100
    #define MSG_TAG     1

    int main(int argc, char **argv) {
      int myrank, friendRank;
      int message[MESSAGESIZE];
      int i, tag = MSG_TAG;
      MPI_Status status;

      /* Initialize, no spawning necessary */
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
      if (myrank == 0) {   /* I am the first process */
        friendRank = 1;
      } else {             /* I am the second process */
        friendRank = 0;
      }
      MPI_Barrier(MPI_COMM_WORLD);

      if (myrank == 0) {   /* Initialize the message */
        for (i = 0; i < MESSAGESIZE; i++)
          message[i] = '1';
      }

      /* Now start passing the message back and forth; the message
         buffer holds ints, so it is sent and received as MPI_INT */
      for (i = 0; i < ITERATIONS; i++) {
        if (myrank == 0) {
          MPI_Send(message, MESSAGESIZE, MPI_INT, friendRank, tag, MPI_COMM_WORLD);
          MPI_Recv(message, MESSAGESIZE, MPI_INT, friendRank, tag, MPI_COMM_WORLD, &status);
        } else {
          MPI_Recv(message, MESSAGESIZE, MPI_INT, friendRank, tag, MPI_COMM_WORLD, &status);
          MPI_Send(message, MESSAGESIZE, MPI_INT, friendRank, tag, MPI_COMM_WORLD);
        }
      }
      MPI_Finalize();
      exit(0);
    }

  CIS 5930 Fall 2006
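
  A two-process program like this is typically compiled with an MPI wrapper compiler and launched with exactly two ranks, for example with mpicc pingpong.c -o pingpong and mpirun -np 2 ./pingpong (the file name is illustrative and the exact commands depend on the MPI installation).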

  20. ORB • Object request broker (ORB) • Middleware that allows programmers to make program calls from one computer to another • Handles the transformation of in-process data structures to and from the byte sequence that is transmitted over the network • This is called marshalling, which uses a form of object serialization • ORBs • CORBA: Common Object Request Broker Architecture, a standard defined by the Object Management Group (OMG) • DCOM CIS 5930 Fall 2006

  21. CORBA • Interface definition language (IDL) to specify the interfaces • Internet Inter-ORB Protocol (IIOP) is a protocol for communication between CORBA ORBs • Marshals objects in the IIOP serialization format • Objects are exchanged by value • Variants exist that work over SSL, HTTP, etc. • Implementations for C++, Java, .NET, Python, Perl, … CIS 5930 Fall 2006

  22. DCOM • Distributed component object model (DCOM) by Microsoft is similar to CORBA • DCE/RPC is the underlying RPC mechanism • DCE/RPC has strictly defined rules regarding marshalling and who is responsible for freeing memory • Marshalling: serializing and deserializing the arguments and return values of method calls "over the wire" • Distributed garbage collection: ensuring that references held by clients of interfaces are released when, for example, the client process crashes or the network connection is lost CIS 5930 Fall 2006

  23. π-Calculus • Concurrency and communication are fundamental aspects of distributed systems • The π-calculus generalizes earlier process calculi • Fresh channel names can be dynamically created and communicated (can express mobility) • Replication of processes and parallel composition of processes • Nondeterminism • Turing complete • The π-calculus models communication and processes • Communication is on named channels c • Channels are created within a local scope of a set of communicating processes • Processes synchronously send and receive messages over channels, including the sending of named channels CIS 5930 Fall 2006

  24. π-Calculus Syntax • $(\nu c)\,P$ : create new channel c with local scope restricted to P, also sometimes written $\mathrm{new}\;c\;\mathrm{in}\;P$ • $\bar{c}\langle v\rangle$ : output v on channel c • $\bar{c}\langle v\rangle.P$ : output v on channel c, then continue with P when the output has been received • $c(w).P$ : input w from channel c and hand it to process P, where w has a local scope restricted to P • $P \mid Q$ : parallel composition of P and Q • $\mathbf{0}$ : the nil process (stop execution) CIS 5930 Fall 2006

  25. π-Calculus Example • The server has access to a printer, the client requests access • The server sends the access link a to the client (received as c), and the client then sends data d over a to the printer CIS 5930 Fall 2006
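
  A possible rendering in π-calculus notation; the channel names b (client-server link) and a (server-printer link) and the datum d are assumptions, since the original formula did not survive transcription: $Server = \bar{b}\langle a\rangle.0$, $Client = b(c).\bar{c}\langle d\rangle.0$, $Printer = a(x).0$, composed as $(\nu a)(Server \mid Printer) \mid Client$. The first communication (on b) extrudes the scope of a over the client, giving $(\nu a)(Printer \mid \bar{a}\langle d\rangle.0)$; the second (on a) delivers d to the printer.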

  26. π-Calculus Semantics Examples • Sending a along channel x, where $\{a/u\}(\dots)$ denotes substitution of u by a in $(\dots)$: $\bar{x}\langle a\rangle.0 \mid x(u).P \;\to\; \{a/u\}P$ • Nondeterministic send with two writers, one of which will succeed: $\bar{x}\langle a\rangle.0 \mid \bar{x}\langle b\rangle.0 \mid x(u).P \;\to\; \bar{x}\langle b\rangle.0 \mid \{a/u\}P$ or $\bar{x}\langle a\rangle.0 \mid \{b/u\}P$ • Nondeterministic receive with two readers, one of which will succeed: $\bar{x}\langle a\rangle.0 \mid x(u).P \mid x(v).Q \;\to\; \{a/u\}P \mid x(v).Q$ or $x(u).P \mid \{a/v\}Q$ CIS 5930 Fall 2006

  27. π-Calculus Reductions • COM: $\bar{x}\langle z\rangle.P \mid x(y).Q \;\to\; P \mid \{z/y\}Q$ • PAR: $P \mid R \;\to\; P' \mid R$ if $P \to P'$ • RES: $(\nu x)P \;\to\; (\nu x)P'$ if $P \to P'$ CIS 5930 Fall 2006

  28. π-Calculus Equivalences • Structural congruence $\equiv$ satisfies the standard axioms: commutativity and associativity of $\mid$, $P \mid \mathbf{0} \equiv P$, renaming of bound names, and scope extrusion $(\nu x)(P \mid Q) \equiv ((\nu x)P) \mid Q$ if x is not free in Q • STRUCT: $P \to Q$ if $P \equiv P'$, $P' \to Q'$, and $Q' \equiv Q$ CIS 5930 Fall 2006

  29. π-Calculus Reduction Examples • Sending a channel y to another process that uses it to output c: $\bar{x}\langle y\rangle.0 \mid x(u).\bar{u}\langle c\rangle.0 \;\to\; \bar{y}\langle c\rangle.0$ • Reduction with scope extrusion (the restricted name y is carried outside its original scope): $(\nu y)(\bar{x}\langle y\rangle.0 \mid y(v).P) \mid x(u).\bar{u}\langle c\rangle.0 \;\to\; (\nu y)(y(v).P \mid \bar{y}\langle c\rangle.0) \;\to\; (\nu y)\{c/v\}P$ CIS 5930 Fall 2006

  30. π-Calculus Extensions • Nondeterministic choice operator: P+Q • A test for name equality: [x=y]P means proceed with P only if x=y • Replication: !P reproduces infinite copies of P running in parallel • Polyadic π-calculus (multiple values): $\bar{x}\langle z_1,\dots,z_n\rangle.P$ and $x(y_1,\dots,y_n).Q$ • Note: the polyadic form can be encoded in the monadic calculus by sending the values one at a time over a fresh private channel • Higher-order processes, sending processes over channels: $\bar{x}\langle P\rangle.Q$ and $x(Y).(Y \mid R)$ CIS 5930 Fall 2006

  31. π-Calculus Applications • Modeling business processes • Business Process Modeling Language (BPML) • Business Process Execution Language for Web services (BPEL4WS) extends the Web Services interaction model and enables it to support business transactions • Modeling systems of autonomous (mobile) agents • Modeling cellular telephone networks • Spi-calculus: a formal notation for describing and reasoning about cryptographic protocols • Bioinformatics: the cellular signaling pathway (RTK/MAPK cascade) modeled in an extension of the π-calculus • Implementations • The Pict programming language • Nomadic Pict: programming for mobile computations CIS 5930 Fall 2006

  32. CSP • Communicating Sequential Processes (CSP) is a notation first introduced by C.A.R. Hoare • CSP is a process calculus to model communicating processes (sequential+parallel) • CSP was influential in the development of the occam programming language • CSP provides two classes of primitives in its process algebra: • Events – represent communications or interactions; assumed to be indivisible and instantaneous • May be atomic names (e.g. on, off), compound names (e.g. valve.open, valve.close), or input/output events (e.g. mouse?xy, screen!bitmap) • Primitive processes – represent fundamental behaviors • Examples include STOP (the process that communicates nothing, also called deadlock), and SKIP (which represents successful termination) CIS 5930 Fall 2006

  33. CSP Syntax • Prefix: $a \to P$ (engage in event a, then behave as P) • Deterministic (external) choice: $P \;\Box\; Q$ • Nondeterministic (internal) choice: $P \sqcap Q$ • Interleaving: $P \;|||\; Q$ CIS 5930 Fall 2006
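
  As a small worked example (the vending machine is a standard textbook process, not taken from these slides): a machine that accepts a coin and then lets its environment pick a chocolate or a toffee is written with prefix and deterministic choice as $VM = coin \to (choc \to VM \;\Box\; toffee \to VM)$. Replacing $\Box$ by $\sqcap$ would let the machine itself make the choice, and $VM \;|||\; VM$ describes two such machines running independently.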

  34. CSP Introduction • See Bill Roscoe’s lecture notes on CSP: http://web.comlab.ox.ac.uk/oucl/publications/books/concurrency/slides/ CIS 5930 Fall 2006

  35. CSP Applications • Software design • Hardware design • Security protocol analysis • Implementations/tools • Failures/Divergence Refinement (FDR) • ProBE • ARC CIS 5930 Fall 2006
