Aspen -- A language for highly concurrent network applications
Gautam Upadhyaya, Vijay S. Pai, Samuel P. Midkiff
Purdue University
Motivation
• Writing parallel programs is hard
  • Sequential programs are often bug-ridden and over schedule
• Parallel programming is harder
  • Many loci of control
  • Weak programming models and language support
• Most programmers have little experience in parallel programming
• Most programmers have very little experience in programming for performance!
  • HPC and gaming programmers are the exception
• Multicore processors are fast becoming the norm
  • The long-promised “era of ubiquitous parallelism” is finally here
  • And we are still unprepared
• Programmers will need to worry about performance (and parallelism) for the foreseeable future
Motivation
• Current sequential programming models are sophisticated
  • High levels of abstraction
  • Encapsulation within well-defined interfaces
• Current parallel programming models are not
  • No encapsulation
  • Values can change mid-statement in shared memory
• Programmers need to understand
  • Parallel structure of the entire program
  • Details of how their code interacts with other code
  • Details of data structure internals for message passing
  • Details of communication patterns for message passing
• Too much programmer knowledge required
  • Low-level resource management
  • Synchronization
Aspen’s contributions
• Factor out parallel specification from sequential logic
  • Programmer skills can be prioritized and reused
  • Code reuse becomes easier
• Handles resource allocation automatically and dynamically
  • Thread management
  • Sockets
  • Communication buffers
• Relies on messaging that is part of the language
  • Platform independent: programmer does not “see” the architecture
  • Can involve compiler optimizations (esp. on shared-memory platforms)
  • Can take advantage of future architectural changes
What Aspen is…
• A high-level language (and runtime system) that currently targets parallel network service applications
• Aspen programs are composed structurally as a task flowchart, with individual nodes connected via explicit communication channels (queues)
• Nodes communicate via messaging – no shared-memory semantics and no (user-visible) synchronization
• Runtime resources are allocated/managed autonomously
(Figure: example task flowchart with nodes Network Input, Service Request, File Cache, Network Output)
…and what it isn’t
• Aspen is NOT an auto-parallelizing compiler
• Aspen is NOT a library
• Aspen is NOT MPI
  • Autonomous thread and resource management; much higher level
• Aspen is NOT a stream language
  • Autonomous thread management; use of a hybrid event-driven/threaded execution model; designed for server applications on general-purpose processors
Aspen Program Components
• Aspen programs consist of modules
  • Modules are analogous to classes in OO languages
  • Contain declarations and methods/functions
• Instances of modules are nodes on an Aspen control flow graph
  • Instances of modules may be implicitly parallel
• Root modules specify the parallel structure of a program; action modules actually communicate and specify computation
Communication among Aspen components
• Modules are connected to each other via Queues
  • Queues correspond to flows (or edges) in the graph representing an Aspen program
• All communication is via dequeue and send operations on queues
Root modules and expressing parallelism in Aspen
• The root module defines the graph that describes communication and parallelism among modules
(Figure: graph of module instances d, b, w[0] … w[n-1], r)

    Module Main is Root requires Module Broadcast, Module Worker {
        void initialize() {
            Director d;
            Worker w[n];
            Broadcast b;
            Reduce r;
            flow: d ||| b ||| w ||| r;
        }
    }
    . . .
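As a further illustration, the web-server flowchart from the “What Aspen is…” slide could be written with the same root-module syntax. This is only a sketch: the module names (NetworkInput, ServiceRequest, FileCache, NetworkOutput), the linear topology, and the C-style comments are assumptions based on that figure, not code taken from the Aspen distribution.

    Module WebServer is Root requires Module NetworkInput, Module ServiceRequest, Module FileCache, Module NetworkOutput {
        void initialize() {
            NetworkInput in;       // accepts client connections (assumed module)
            ServiceRequest req;    // parses incoming requests (assumed module)
            FileCache cache;       // serves content from the file cache (assumed module)
            NetworkOutput out;     // writes responses back to clients (assumed module)
            flow: in ||| req ||| cache ||| out;
        }
    }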
Action modules and doing work in Aspen programs

    Module Director {
        void run() {
            int data;
            QueueElement qsend;
            …
            qsend = QueueElement(data);
            send qsend;
            …
        }
    }
Aspen encapsulates global communication structure
• The programmer of a module worries about the function of the module, not about global communication issues

    Module Worker {
        void run() {
            QueueElement qrecv, qsend;
            …
            qrecv = dequeue();
            …
            qsend = QueueElement(data);
            send qsend;
        }
    }
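For completeness, here is a hedged sketch of the Reduce module r from the graph on the root-module slide, written in the same style as Director and Worker. The accumulation logic, the loop bound n, and the field access qrecv.data are assumptions; the slides themselves show only Director and Worker.

    Module Reduce {
        void run() {
            QueueElement qrecv, qsend;
            int sum = 0;
            // dequeue one partial result per worker flow and combine them
            // (assumes n is visible here and that QueueElement exposes a data field)
            for (int i = 0; i < n; i++) {
                qrecv = dequeue();
                sum = sum + qrecv.data;
            }
            qsend = QueueElement(sum);
            send qsend;    // forward the combined result downstream
        }
    }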
Sequential semantics within a method
• All data used by the method is obtained either
  • from a parameter
  • using a dequeue
  • via I/O
• No surprises, no need to know the communication structure
• Improves reuse

    Module Worker {
        void run() {
            QueueElement qrecv, qsend;
            …
            qrecv = dequeue();
            …
            qsend = QueueElement(data);
            send qsend;
        }
    }
Aspen parallelism across flows
• For network service applications, flows indicate independent sources of work
• Aspen parallelizes across flows by performing a module’s work for different flows in parallel
• The Aspen runtime manages resources
  • Threads are adaptively allocated to modules
  • Connections are a managed resource, and sockets are garbage collected
Aspen Benefits
• Better code reusability (see the sketch following this slide)
  • Kernels are not dependent on the communication structure
  • Points of interaction with external code are well defined
• Action module programmers are sequential programmers, with a different interface
• Complex parallelism issues are restricted to the root module
• But what about performance?
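As a sketch of the reuse claim above: the same Worker module can be dropped into a different root module with a different topology, leaving the Worker code untouched. The Pipeline name, the instance count m, and the two-stage flow below are illustrative assumptions, written by analogy with the Main root module shown earlier.

    Module Pipeline is Root requires Module Worker {
        void initialize() {
            Director d;
            Worker w[m];          // same Worker module as before, only the instance count differs
            Reduce r;
            flow: d ||| w ||| r;  // no Broadcast stage in this configuration
        }
    }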
Experimental Methodology
• Web server
  • Dynamic: SPECWeb2005
    • Aspen (with external PHP + FastCGI)
    • Apache (with mod_php)
    • Apache (with external PHP + FastCGI)
  • Static: SPECWeb99-like
    • Aspen
    • Flux
    • Haboob
    • Flash
• Video-On-Demand (“VoD”) server (disk-intensive)
  • Aspen
  • Hand-coded server using single-thread-per-request
• Effectiveness of adaptive thread allocation
• Language usability: lines of code (non-whitespace, non-comment)
• Micro-benchmarks for collective operations on shared-memory and distributed-memory machines
SPECWeb2005
Aspen serves approximately 15% more simultaneous sessions
SPECWeb99-like
Aspen performs 18% better at maximum load and is more scalable
“VoD” (iPod quality: 1 Mbps)
Adaptive Thread Allocation allows Aspen to serve > 1300 clients simultaneously. Single-thread-per-connection cannot scale because of OS scheduler limitations.
“VoD” (DVD quality: 6 Mbps)
Aspen performs within 5% of the hand-coded server.
Effectiveness of Adaptive Thread Allocation (SPECWeb2005)
Number of threads tracks load without any spikes
Aspen collective communication: Ping-Pong latency
(Panels: shared memory, Sun with Opterons; distributed memory, Intel Core Duos)
Aspen collective communication: AllReduce latency
(Panels: distributed memory, Intel Core Duos; shared memory, Sun with Opterons)
Distributed Aspen: bandwidth
(Distributed memory, Intel Core Duos)
Conclusions
• Aspen facilitates parallel programming!
• Provides an infrastructure for expressing applications as networks of independent modules connected by queues
• Factors out the specification of parallelism from sequential code
• Provides primitives for connection-oriented data processing and collective communication
• Dynamically manages resources
• Provides performance equal to or better than many contemporary solutions
• Is simple and easy to code