200 likes | 375 Views
POSH Python Object Sharing. Steffen Viken Valvåg In collaboration with Kjetil Jacobsen & Åge Kvalnes. University of Tromsø, Norway Sponsored by Fast Search & Transfer. Test.py. for x in "TEST PROGRAM": if x not in "FORGET IT": print x,. Test.pyc. 0 SETUP_LOOP
E N D
POSHPython Object Sharing Steffen Viken Valvåg In collaboration with Kjetil Jacobsen & Åge Kvalnes University of Tromsø, Norway Sponsored by Fast Search & Transfer
Test.py for x in "TEST PROGRAM": if x not in "FORGET IT": print x, Test.pyc 0 SETUP_LOOP 3 LOAD_CONST ('TEST PROGRAM') 6 GET_ITER 7 FOR_ITER 10 STORE_FAST (x) 13 LOAD_FAST (x) 16 LOAD_CONST ('FORGET IT') 19 COMPARE_OP (not in) 22 JUMP_IF_FALSE (to 33) 25 POP_TOP 26 LOAD_FAST (x) 29 PRINT_ITEM 30 JUMP_FORWARD (to 34) 33 POP_TOP 34 JUMP_ABSOLUTE 37 POP_BLOCK Byte-code compilation Output Interpretation SPAM Python Execution Model
GIL Python Threading Model Thread A Thread B Byte codes • Each thread executes a separate sequence of byte codes • All threads must contend for one global interpreter lock
Example: Matrix Multiplication • Performs a matrix multiplication A = B x C • The work is split between several worker threads • The application runs on a machine with 8 CPUs • Threads do not scale for multiple CPUs due to lock contention on the GIL
Workaround: Processes Process A Process B IPC • Each process has its own interpreter lock • Requires inter-process communication, using e.g. message passing by means of pipes
Matrix Multiplication using Processes • A master process distributes the input matrices to a set of worker processes • Each worker process computes some part of the output matrix, and returns its result to the master • The master process assembles the final result matrix • More communication, and more complex pattern than using threads
Ways Ahead • Communication through standard Python container objects favors threads • Scalability on multiprocessor architectures favors processes • The GIL is here to stay, so making threads scale better is hard • However, there might be room for improvement of inter-process communication mechanisms
Using Shared Memory for IPC Process B Process A Shared Memory • Processes communicate by accessing a shared memory region • Requires explicit synchronization and data marshalling, imposes a flat data structure
X X.method1() L L.extend([X, Y]) Y Using POSH for IPC Process B Process A Shared Memory • Allocates regular Python objects in shared memory • Shared objects are accessed transparently through regular method calls • IPC is done by modifying shared, mutable objects
Complications • Processes must synchronize their access to shared, mutable objects (just like threads) • Explicit synchronization of critical regions must be possible, while implicit synchronization upon accessing shared objects is desireable • Python’s regular garbage collection algorithm is inadequate for shared objects, which may be referenced by multiple processes
X.method2() return value Proxy Objects X.method1() X return value Proxy Object Shared Object • Provides transparent access to a shared object by forwarding all attribute accesses and method calls • Provides a single entry point to a shared object, where synchronization policies may be enforced
Multi-Process Garbage Collection • Must account for references from all live processes • Must stay up-to-date when processes fork, as this may create new references to shared objects • Should be able to handle abnormal process termination without leaking shared objects
Proxy Object Garbage Collection in POSH Process A Shared Memory Process B X M Y L Shared Object Regular Python reference Reference from a process to a shared object (type I) Reference from one shared object to another (type II)
Garbage Collection Details • POSH creates at most one proxy object per process for any given shared object • Shared objects are always referenced through their proxy objects • A bitmap in each shared object records the processes that have a corresponding proxy object. This tracks references of type I (from a process) • A separate count in each shared object records the number of references to the object from other shared objects. This tracks references of type II • Shared objects are deleted when there are no references to them of either type
Performance • Performing a matrix multiplication A = B x C using POSH • The work is split between several worker processes • The application runs on a machine with 8 CPUs • More overhead, but scales for multiple CPUs
Summary • Python uses a global interpreter lock (GIL) to serialize execution of byte codes • This entails a lack of scalability on multiprocessor architectures for CPU-intensive multi-threaded apps • However, threads offer an attractive programming model, with implicit communication • Processes + shared memory reduce IPC overheads, but normally impose flat data structures and require data marshalling • POSH uses processes + shared memory to offer a programming model similar to threads, with the scalability of processes
Availability • Open source, hosted at SourceForge • http://poshmodule.sf.net/ • Still not very stable • Developers wanted
Example Usage import posh class Stuff(object): pass posh.allow_sharing(Stuff, posh.generic_init) mystuff = posh.share(Stuff()) def worker1(): mystuff.money = 0 def worker2(): mystuff.debt = 100000 def worker3(): mystuff.balance = mystuff.money - mystuff.debt for w in worker1, worker2, worker3: posh.forkcall(w) posh.waitall()