Learn how to make parallel programming more usable and scalable by implementing high-level computational patterns, reducing source code size, and avoiding deadlocks and race conditions. Explore the advantages of pattern programming and its automated conversion into parallel programs. Discover how patterns provide a guide to best practices and improve the design structure of parallel programs.
Pattern Programming
ITCS 4/5145 Parallel Programming, UNC-Charlotte
B. Wilkinson, 2012. Aug 30, 2012. PatternProg-1
Acknowledgment This work was initiated by Jeremy Villalobos and described in his PhD thesis: “RUNNING PARALLEL APPLICATIONS ON A HETEROGENEOUS ENVIRONMENT WITH ACCESSIBLE DEVELOPMENT PRACTICES AND AUTOMATIC SCALABILITY,” UNC-Charlotte, 2011.
Pattern Programming Research Group
2011:
• Jeremy Villalobos (PhD awarded, continuing involvement)
• Saurav Bhattara (MS thesis, graduated)
Spring 2012:
• Yawo Adibolo (ITCS 6880 Individual Study)
• Ayay Ramesh (ITCS 6880 Individual Study)
Fall 2012:
• Haoqi Zhao (MS thesis)
• Pohua Lee (BS senior project)
Openings!
Problem Addressed
• To make parallel programming more usable and scalable.
• Parallel programming (writing programs to solve problems using multiple computers, processors, and cores) has a very long history but remains a challenge.
• The traditional approach involves explicitly specifying message passing (for clusters and distributed computers) and threads (for shared memory) with low-level APIs.
• A better-structured approach is needed.
Pattern Programming Concept
The programmer begins by constructing the program using established computational or algorithmic “patterns” that provide a structure. What patterns are we talking about?
• Low-level algorithmic patterns that might be embedded into a program, such as fork-join and broadcast/scatter/gather.
• Higher-level algorithmic patterns for forming a complete program, such as workpool, pipeline, stencil, and map-reduce.
We concentrate on the higher-level “computational/algorithm” patterns rather than the lower-level ones.
Some Patterns: Workpool
[Diagram: a master node connected by two-way links to worker compute nodes; the master is the source/sink. Derived from Jeremy Villalobos’s PhD thesis defense.]
Pipeline
[Diagram: worker compute nodes arranged in stages 1, 2, and 3, linked stage-to-stage by one-way connections, with a two-way connection to the master (source/sink).]
Divide and Conquer
[Diagram: compute nodes with two-way connections; work fans out in a divide phase and results fan in through a merge phase; the root is the source/sink.]
All-to-All
[Diagram: compute nodes with two-way connections between every pair of nodes; source/sink.]
Stencil
Usually a synchronous computation: performs a number of iterations to converge on a solution, e.g. for solving Laplace’s/heat equation. On each iteration, each node communicates with its neighbors to get stored computed values.
[Diagram: a grid of compute nodes, each with two-way connections to its neighbors; source/sink.]
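To make the stencil idea concrete, here is a minimal sequential sketch (illustration only, not Seeds framework code; the class name, grid size, tolerance, and boundary values are hypothetical) of the kind of computation this pattern parallelizes: Jacobi relaxation for Laplace’s equation on a 2-D grid. In the parallel stencil pattern, each compute node holds a block of such a grid and exchanges its edge values with neighboring nodes on every iteration.

// Illustrative sketch only -- not part of the Seeds framework.
public class JacobiSketch {
    public static void main(String[] args) {
        int n = 100;              // grid dimension (hypothetical)
        double tolerance = 1e-4;  // convergence threshold (hypothetical)
        double[][] grid = new double[n][n];
        double[][] next = new double[n][n];
        // Example boundary condition: hold the top edge at 100.0 in both buffers.
        for (int j = 0; j < n; j++) {
            grid[0][j] = 100.0;
            next[0][j] = 100.0;
        }
        double maxChange;
        int iterations = 0;
        do {
            maxChange = 0.0;
            // Each interior point becomes the average of its four neighbors.
            for (int i = 1; i < n - 1; i++) {
                for (int j = 1; j < n - 1; j++) {
                    next[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                       + grid[i][j - 1] + grid[i][j + 1]);
                    maxChange = Math.max(maxChange, Math.abs(next[i][j] - grid[i][j]));
                }
            }
            double[][] tmp = grid; grid = next; next = tmp;  // swap buffers
            iterations++;
        } while (maxChange > tolerance);  // iterate until converged
        System.out.println("Converged after " + iterations + " iterations");
    }
}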
Note on Terminology: “Skeletons”
The term “skeleton” is sometimes used instead of “pattern”, especially for directed acyclic graphs with a source, a computation, and a sink. We do not make that distinction, and use the term “pattern” whether directed or undirected, and whether acyclic or cyclic. This is also done elsewhere.
Patterns
Advantages:
• Possible to create parallel code from the pattern specification automatically (see later).
• Abstracts/hides the underlying computing environment.
• Generally avoids deadlocks and race conditions.
• Reduces source code size (lines of code).
Disadvantages:
• A new approach to learn.
• Takes away some freedom from the programmer.
• Performance reduced (cf. using high-level languages instead of assembly language).
More Advantages/Notes
• “Design patterns” have been part of software engineering for many years: reusable solutions to commonly occurring problems.*
• Patterns provide a guide to “best practices”, not a final implementation.
• Patterns provide a good, scalable design structure for parallel programs.
• One can reason more easily about programs.
• Hierarchical designs are possible, with patterns embedded within patterns and pattern operators to combine patterns.
• Leads to automated conversion into parallel programs without the need to write low-level message-passing routines such as MPI.
* http://en.wikipedia.org/wiki/Design_pattern_(computer_science)
Previous/Existing Work
Patterns/skeletons have been explored in several projects.
Universities:
• University of Illinois at Urbana-Champaign and University of California, Berkeley
• University of Torino / Università di Pisa, Italy
• ...
Industrial efforts:
• Intel
• Microsoft
• ...
Universal Parallel Computing Research Centers (UPCRC)
Established at the University of Illinois at Urbana-Champaign and the University of California, Berkeley, with Microsoft and Intel, in 2008 (with combined funding of at least $35 million). Co-developed OPL (Our Pattern Language). A group of twelve computational patterns in seven general application areas was identified, including:
• Finite State Machines
• Circuits
• Graph Algorithms
• Structured Grid
• Dense Matrix
• Sparse Matrix
Intel
Focused on very low-level patterns such as fork-join, and provides constructs for them in:
• Intel Threading Building Blocks (TBB): a template library for C++ to support parallelism.
• Intel Cilk Plus: compiler extensions for C/C++ to support parallelism.
• Intel Array Building Blocks (ArBB): a pure C++ library-based solution for vector parallelism.
The above are somewhat competing tools, obtained through takeovers of small companies and each implemented differently.
A new book from Intel authors (2012): “Structured Parallel Programming: Patterns for Efficient Computation,” Michael McCool, James Reinders, Arch Robison, Morgan Kaufmann, 2012. Focuses on Intel tools.
Using patterns with Microsoft C#
http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=19222
Again very low-level, with patterns such as parallel for loops.
Closest to our work
http://calvados.di.unipi.it/dokuwiki/doku.php?id=ffnamespace:about
University of Torino / Università di Pisa, Italy
Our Approach (Jeremy Villalobos’ UNC-Charlotte PhD thesis)
Focuses on a few patterns of wide applicability (e.g. workpool, synchronous all-to-all, pipeline, stencil), but takes the approach much further than UPCRC and Intel: Jeremy developed a higher-level framework called “Seeds” that uses the pattern approach to automatically distribute code across processor cores, computers, or geographically distributed computers and execute the parallel code.
“Seeds” Parallel Grid Application Framework
Some key features:
• Pattern programming
• (Java) user interface
• Self-deploys on computers, clusters, and geographically distributed computers
• Load balances
• Three levels of user interface
http://coit-grid01.uncc.edu/seeds/
Seeds Development Layers
• Basic: intended for programmers who have a basic parallel computing background; based on skeletons and patterns.
• Advanced: used to add or extend functionality, such as creating new patterns, optimizing existing patterns, or adapting an existing pattern to non-functional requirements specific to the application.
• Expert: used to provide basic services: deployment, security, communication/connectivity, and changes in the environment.
Derived from Jeremy Villalobos’s PhD thesis defense.
Deployment
Several deployment mechanisms were implemented during the PhD work; deployment with SSH is now the preferred method.
Basic User Programmer Interface
To create and execute parallel programs, the programmer selects a pattern and implements three principal Java methods:
• Diffuse method: distributes pieces of data to the workers.
• Compute method: performs the actual computation.
• Gather method: gathers the results.
The programmer also fills in details in a “bootstrap” class to deploy and start the framework. The framework then self-deploys on a geographically distributed platform and executes the pattern. (See the complete example below.)
Complete code (the computation for the Monte Carlo pi program in Assignment 1; see later for more details). Note: no explicit message passing.

package edu.uncc.grid.example.workpool;

import java.util.Random;
import java.util.logging.Level;
import edu.uncc.grid.pgaf.datamodules.Data;
import edu.uncc.grid.pgaf.datamodules.DataMap;
import edu.uncc.grid.pgaf.interfaces.basic.Workpool;
import edu.uncc.grid.pgaf.p2p.Node;

public class MonteCarloPiModule extends Workpool {
    private static final long serialVersionUID = 1L;
    private static final int DoubleDataSize = 1000;
    double total;
    int random_samples;
    Random R;

    public MonteCarloPiModule() {
        R = new Random();
    }

    @Override
    public void initializeModule(String[] args) {
        total = 0;
        Node.getLog().setLevel(Level.WARNING);  // reduce verbosity for logging
        random_samples = 3000;                  // set number of random samples
    }

    public Data DiffuseData(int segment) {
        DataMap<String, Object> d = new DataMap<String, Object>();
        d.put("seed", R.nextLong());
        return d;  // returns a random seed for each job unit
    }

    public Data Compute(Data data) {
        // input gets the data produced by DiffuseData()
        DataMap<String, Object> input = (DataMap<String, Object>) data;
        // output will emit the partial answers done by this method
        DataMap<String, Object> output = new DataMap<String, Object>();
        Long seed = (Long) input.get("seed");  // get random seed
        Random r = new Random();
        r.setSeed(seed);
        Long inside = 0L;
        for (int i = 0; i < DoubleDataSize; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            double dist = x * x + y * y;
            if (dist <= 1.0) {
                ++inside;
            }
        }
        output.put("inside", inside);  // store partial answer to return to GatherData()
        return output;
    }

    public void GatherData(int segment, Data dat) {
        DataMap<String, Object> out = (DataMap<String, Object>) dat;
        Long inside = (Long) out.get("inside");
        total += inside;  // aggregate answers from all the worker nodes
    }

    public double getPi() {
        // returns value of pi based on the job done by all the workers
        double pi = (total / (random_samples * DoubleDataSize)) * 4;
        return pi;
    }

    public int getDataCount() {
        return random_samples;
    }
}
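Why this estimates pi: each job unit generates DoubleDataSize random (x, y) points in the unit square and counts those with x*x + y*y <= 1, i.e. those falling inside the quarter circle of radius 1, whose area is pi/4. The fraction of points inside therefore approaches pi/4, so with random_samples job units the estimate returned by getPi() is pi ≈ 4 × total / (random_samples × DoubleDataSize).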
Bootstrap class (this code deploys the framework and starts execution of the pattern):

package edu.uncc.grid.example.workpool;

import java.io.IOException;
import net.jxta.pipe.PipeID;
import edu.uncc.grid.pgaf.Anchor;
import edu.uncc.grid.pgaf.Operand;
import edu.uncc.grid.pgaf.Seeds;
import edu.uncc.grid.pgaf.p2p.Types;

public class RunMonteCarloPiModule {
    public static void main(String[] args) {
        try {
            MonteCarloPiModule pi = new MonteCarloPiModule();
            Seeds.start("/path/to/seeds/seed/folder", false);
            PipeID id = Seeds.startPattern(new Operand(
                    (String[]) null,
                    new Anchor("hostname", Types.DataFlowRoll.SINK_SOURCE),
                    pi));
            System.out.println(id.toString());
            Seeds.waitOnPattern(id);
            System.out.println("The result is: " + pi.getPi());
            Seeds.stop();
        } catch (SecurityException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Different patterns have similar code.
Compiling/Executing
Can be done on the command line (an ant script is provided) or through an IDE (Eclipse).
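Purely as a hypothetical sketch (the supplied ant script handles this in practice, and the actual jar names and classpath depend on the Seeds distribution; see the tutorial page), a manual command-line build and run might look something like:

# Hypothetical commands; the jar name and paths are assumptions.
# Compile against the Seeds jar from the source root:
javac -cp seeds.jar edu/uncc/grid/example/workpool/*.java
# Run the bootstrap class:
java -cp .:seeds.jar edu.uncc.grid.example.workpool.RunMonteCarloPiModule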
Tutorial page: http://coit-grid01.uncc.edu/seeds/
Next step • Assignment 1 – using the Seeds framework