Multi-Process Programming in Linux • Nuha Elmaghrabi • Michelle Clifford • Mark Gibriano
Introduction • Linux can deliver supercomputer-class performance for programs that perform complex computations • Four basic approaches to parallel processing: • Symmetric Multi-Processor (SMP) Linux Systems • Clusters of Networked Linux Systems • Execution of Multimedia Instructions (MMX) • Attached Processors Hosted by a Linux System • Example: an SMP parallel computation implemented with two threads that execute in parallel (sketched below)
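A minimal sketch of such a two-thread SMP computation, using POSIX threads. The summed array, its size, and the function name are illustrative assumptions, not taken from the slides; compile with gcc -lpthread.

    #include <pthread.h>
    #include <stdio.h>

    #define N 1000000               /* assumed problem size */
    static double data[N];
    static double partial[2];       /* one partial sum per thread */

    /* Each thread sums its half of the array. */
    static void *sum_half(void *arg)
    {
        long id = (long) arg;
        double s = 0.0;
        for (long i = id * (N / 2); i < (id + 1) * (N / 2); i++)
            s += data[i];
        partial[id] = s;
        return NULL;
    }

    int main(void)
    {
        pthread_t t0, t1;
        for (long i = 0; i < N; i++)
            data[i] = 1.0;
        pthread_create(&t0, NULL, sum_half, (void *) 0);
        pthread_create(&t1, NULL, sum_half, (void *) 1);
        pthread_join(t0, NULL);     /* wait for both halves */
        pthread_join(t1, NULL);
        printf("total = %f\n", partial[0] + partial[1]);
        return 0;
    }

On an SMP machine the kernel can schedule the two threads on different processors, so the two halves of the sum genuinely run in parallel.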
SMP Linux • Hardware supported by SMP Linux systems • Mechanisms for sharing memory • Shared Everything vs. Shared Something • Scheduling Issues • Resources for shared-memory parallel processing • System V Shared Memory • Memory Map Call (mmap())
SMP Linux - Hardware • Supports most Intel MPS (Multiprocessor Specification) version 1.1 or 1.4 compliant machines • Supports up to 16 processors (Pentium, Pentium MMX, Pentium II) • Allows only one processor in the kernel at a time
SMP Linux - Mechanisms of Sharing Memory • Two different models are used for shared-memory programming: Shared Everything and Shared Something • Shared Everything • Places all data structures in shared memory • Lets you convert a sequential program without determining which data needs to be accessible to other processors • Shared Something • Requires you to explicitly state which data structures are shared and which are private • Makes it easier to predict performance, tune performance, and debug code • Scheduler Issues • To achieve the best performance, the number of processes in a parallel program should equal the expected number of processors available to run them (see the sketch below)
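A minimal sketch of matching worker count to processor count. It queries the number of online processors with sysconf(_SC_NPROCESSORS_ONLN), a widely available glibc extension that is an assumption here, not something named in the slides.

    #include <unistd.h>
    #include <sys/wait.h>
    #include <stdio.h>

    int main(void)
    {
        /* one worker process per online processor */
        long nproc = sysconf(_SC_NPROCESSORS_ONLN);
        for (long i = 0; i < nproc; i++) {
            if (fork() == 0) {
                /* ... this worker's share of the computation ... */
                _exit(0);
            }
        }
        while (wait(NULL) > 0)      /* reap all workers */
            ;
        printf("ran %ld workers\n", nproc);
        return 0;
    }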
SMP Linux - Resources • System V IPC (Inter-Process Communication) • Consists of a number of system calls providing message queues, semaphores, and a shared memory mechanism • Changes made by one process are automatically visible to all processes sharing the segment (see the sketch below) • Memory Map Call • The mmap() system call maps a portion of a file into user memory • Uses virtual-memory paging mechanisms to propagate updates
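A minimal System V shared-memory sketch: the parent creates a segment with shmget(), attaches it with shmat(), and a write by a forked child is immediately visible to the parent. The segment size and the value written are illustrative assumptions.

    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <sys/wait.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* create a shared segment big enough for one int */
        int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
        int *shared = (int *) shmat(shmid, NULL, 0);
        *shared = 0;

        if (fork() == 0) {          /* child writes to shared memory */
            *shared = 42;
            _exit(0);
        }
        wait(NULL);                 /* parent sees the child's update */
        printf("child wrote %d\n", *shared);

        shmdt(shared);
        shmctl(shmid, IPC_RMID, NULL);  /* remove the segment */
        return 0;
    }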
Linux Clusters • Advantages • Can reclaim "wasted cycles" on existing machines • Ability to scale to very large systems • Can include heterogeneous machines • Disadvantages • Communication is ten to 1,000 times slower than SMP shared memory • Lack of software that treats the cluster as a single system
Network Hardware • ATM • Ethernet • Fast Ethernet • Gigabit Ethernet • Myrinet • SCSI • USB • WAPERS
Software Support • Advantages of Linux for clusters • Free • Available source code • More reliable and flexible • Network software interface • Sockets - TCP or UDP (a minimal example follows below) • Device driver support for most PCI Ethernet and Fast Ethernet chips (as of July 1999) • User-level libraries avoid operating-system call overhead
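A minimal sketch of node-to-node communication over the socket interface the slide mentions, here using UDP. The port number and peer address are hypothetical placeholders.

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int s = socket(AF_INET, SOCK_DGRAM, 0);   /* UDP socket */
        struct sockaddr_in peer;
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons(5000);              /* assumed port */
        inet_pton(AF_INET, "192.168.1.2", &peer.sin_addr); /* assumed node */

        const char msg[] = "hello from node 0";
        sendto(s, msg, sizeof msg, 0,
               (struct sockaddr *) &peer, sizeof peer);
        close(s);
        return 0;
    }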
Message Passing Libraries • Parallel Virtual Machine (PVM) • Works on heterogeneous Linux clusters linked by socket-capable networks • Provides parallel job control • Adds overhead to standard socket operations • Message Passing Interface (MPI) • Assumes a homogeneous cluster • Freely available implementations for Linux clusters (a minimal MPI program is sketched below): • LAM (Local Area Multicomputer) • MPICH (MPI CHameleon) • AFMPI (Aggregate Function MPI)
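A minimal MPI program in C. This uses only standard MPI calls, so it should build with either LAM or MPICH (e.g., mpicc hello.c and mpirun -np 4 a.out).

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this node's rank */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* nodes in the job */
        printf("node %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }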
Linux Research Groups • Beowulf • Based at CESDIS, operated by NASA • Focuses on software production • SHRIMP • Aims to construct high-performance servers • Uses virtual-memory-mapped communication • Parallel Processing Using Linux • PAPERS/WAPERS Project • 386, 486, and Pentium clusters with PAPERS networks
SIMD Within A Register (SWAR) • Partitions a register into multiple integer fields • Register-width operations are used to perform SIMD-parallel computations across the fields • The multimedia push (e.g., MMX) brought hardware support for the 2x to 8x speedups offered by SWAR techniques
What Is SWAR Good For? • Integers only (preferably small) • SIMD (Single Instruction stream, Multiple Data stream) or vector-style parallelism • Localized, regular memory reference patterns
SWAR Programming • Operations on word-length registers are used to speed up computations • Polymorphic operations are used so that functions are unaffected by field types • Partitioned operations cut carry/borrow interactions between fields when doing arithmetic • Yields the highest performance, but must be supported by the hardware • Restrictions must be placed on field size, which can make algorithms non-portable
SWAR Programming (cont.) • Regular instructions can be used to perform operations with carry/borrow across fields • Must correct for undesired field interactions • This software approach introduces overhead, but works with completely general field partitioning • SWAR programs using this approach are fully portable when written in a language like C (see the sketch below) • Controlling field values so that inter-field carry/borrow never occurs makes ordinary arithmetic instructions even more efficient • The best method for a given SWAR operation is whichever yields the best speedup
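A minimal sketch of the software approach in plain C: four unsigned 8-bit fields are packed in one 32-bit word, and masking corrects for carries so they cannot leak between fields. The function name is an illustrative assumption.

    #include <stdio.h>

    /* Add four 8-bit fields packed in a 32-bit word, using ordinary
       32-bit arithmetic plus masking to suppress inter-field carries. */
    static unsigned int swar_add_u8x4(unsigned int a, unsigned int b)
    {
        /* add the low 7 bits of each field; carries stay inside fields */
        unsigned int low = (a & 0x7f7f7f7fU) + (b & 0x7f7f7f7fU);
        /* restore each field's top bit with XOR, which never carries */
        return low ^ ((a ^ b) & 0x80808080U);
    }

    int main(void)
    {
        /* 0xff + 0x01 wraps to 0x00 in each field, with no carry
           leaking into the neighboring field */
        printf("%08x\n", swar_add_u8x4(0xffffffffU, 0x01010101U));
        return 0;
    }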
Linux-Hosted Attached Processors • Attaching a parallel computing system to a host Linux system yields high performance at low cost • Attached processors are specialized to perform specific types of functions • A Linux PC is one of the few platforms well suited to this type of use
Linux PC Is A Good Host • There are two primary reasons PCs make a good host • Cheap and easy expansion: resources such as memory, disks, and networks are easily added to a PC • Ease of interfacing • Bus prototyping cards are widely available • The parallel port offers reasonable performance as a non-invasive interface
Linux Is A Good Host Operating System • Free access to all source code, plus extensive "hacking" guides • Linux provides good near-real-time scheduling • Supports development tools written to run under MS-DOS or Windows while providing a full UNIX environment
Conclusion • Linux is a useful tool for parallel processing and complex computing • Multiple approaches to parallel processing provide many options for parallel execution of instructions and computations • Used appropriately, parallel processing provides significant speed-up of programs • Many methods are available to suit your computing, support, and performance needs