Seeds Framework is a parallel grid application framework with key features like pattern-programming and a Java user interface. It self-deploys on computers, clusters, and geographically distributed computers.
Seeds Framework B. Wilkinson/Clayton Ferner SeedsFramework.ppt Modification date August 15 2014
“Seeds” Parallel Grid Application Framework
Some key features:
• Pattern programming
• Java user interface
• (C++ version developed)
• Self-deploys on computers, clusters, and geographically distributed computers.
• Three development layers (basic, advanced and expert), exposing increasing detail.
• We will use the basic level.
http://coit-grid01.uncc.edu/seeds/
Seeds programming
Several standard patterns are implemented, including Workpool, Pipeline, All-to-all, Stencil, etc.
Workpool has three phases:
• Master diffuses data to slaves
• Slaves perform computations
• Master gathers results from slaves
The programmer specifies what the master and slaves do, and what is transferred between them, without implementing low-level message-passing routines.
[Figure: Master → Diffuse → Slaves (Compute) → Gather → Master. Message passing is done by Seeds.]
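The three phases can be sketched in plain Java with a thread pool. This is only an illustration of the pattern, not the Seeds API: the class `WorkpoolSketch` and the methods `diffuse`, `compute` and `run` are hypothetical names, and threads stand in for slaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the workpool pattern; this is not the Seeds API.
class WorkpoolSketch {
    // Master side: produce the input for work unit number `segment`.
    static int diffuse(int segment) {
        return segment + 1;
    }

    // Slave side: compute a result from one input.
    static int compute(int input) {
        return input * input;
    }

    // Runs `tasks` work units on `workers` threads and returns the gathered sum.
    static int run(int tasks, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int segment = 0; segment < tasks; segment++) {
                final int input = diffuse(segment);            // diffuse phase
                Callable<Integer> job = () -> compute(input);  // compute phase
                pending.add(pool.submit(job));
            }
            int total = 0;
            for (Future<Integer> result : pending) {
                total += result.get();                         // gather phase
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Sum of squares = " + run(4, 2));
    }
}
```

Here the number of work units (4) and the number of workers (2) are independent, just as the number of data objects and the number of physical slaves can differ.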
User Program
Two classes:
• “Module” class: diffuse, compute and gather methods, and any other methods associated with the application
• “Bootstrap” class: creates an instance of the module class, starts the framework, and executes the module pattern
[Figure: Bootstrap class runs the module; Module class contains Diffuse, Compute and Gather.]
Seeds Workpool DiffuseData, Compute, and GatherData Methods
[Figure: The master runs DiffuseData and GatherData; the slaves run Compute. DiffuseData returns DataMap d to each slave. Compute receives it as its Data argument data, uses it as DataMap input, and returns DataMap output. GatherData receives a Data argument and updates the private variable total (the answer).]
The DiffuseData, Compute and GatherData method names start with a capital letter, although method names should not!
d is created in DiffuseData; output is created in Compute.
Data and DataMap classes
For implementation convenience there are two classes:
• Data class: used to pass data between master and slaves (uses a “segment” number to keep track of packets, see later).
• DataMap class: used inside the compute method. DataMap is a subclass of Data, which allows casting.
DataMap methods:
• put(String, data): puts data into the DataMap, identified by the string
• get(String): gets the stored data identified by the string
DataMap extends Java HashMap, which implements Map; see http://doc.java.sun.com/DocWeb/api/java.util.HashMap
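Because DataMap is described as extending HashMap, the put/get behaviour can be tried with a plain `java.util.HashMap`. This sketch does not use the Seeds DataMap class itself; `DataMapSketch` and `roundTrip` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Uses java.util.HashMap as a stand-in for the Seeds DataMap class.
class DataMapSketch {
    // Stores a value under a string key, then reads it back with a cast.
    static long roundTrip(long seed) {
        Map<String, Object> d = new HashMap<>();
        d.put("seed", seed);            // the long is autoboxed to a Long
        return (Long) d.get("seed");    // get returns Object, so a cast is needed
    }

    public static void main(String[] args) {
        System.out.println("Stored and retrieved: " + roundTrip(42L));
    }
}
```

With a value type of Object, every `get` needs a cast back to the stored type, which is why the module code in these slides casts the results of `input.get(...)` and `out.get(...)`.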
Module class
    public Data DiffuseData (int segment) {
        DataMap<String, Object> d = new DataMap<String, Object>();
        inputData = ….
        d.put("name_of_inputdata", inputData);
        return d;
    }
    public Data Compute (Data data) {
        DataMap<String, Object> input = (DataMap<String,Object>) data;  // data produced by DiffuseData()
        DataMap<String, Object> output = new DataMap<String, Object>(); // output returned to GatherData()
        inputData = input.get("name_of_inputdata");
        … // computation
        output.put("name_of_results", results); // to return to GatherData()
        return output;
    }
    public void GatherData (int segment, Data dat) {
        DataMap<String,Object> out = (DataMap<String,Object>) dat;
        outdata = out.get("name_of_results");
        result … // aggregate outdata from all the worker nodes; result is a private variable
    }
Notes:
• segment is used by Seeds to keep track of where to put results; it is supplied by the framework.
• In Compute, the Data argument is cast into a DataMap.
• GatherData gives back a Data object with a segment number, supplied by the framework.
Question: Will a class field modified in the DiffuseData or GatherData methods be updated with the same values as in the Compute method?
Answer: No. DiffuseData and GatherData run on the master, while Compute runs on the slaves, so they run on different JVMs (and different nodes).
Seeds Implementations
Three Java versions developed:
• Full JXTA P2P version, intended for a cluster and a fully distributed system (grid system). Requires an Internet connection.
• JXTA P2P version not needing an external network but otherwise identical, suitable for testing on a single computer.
• Multicore (thread-based) version, specifically for a single multicore computer.
The multicore version gives much faster execution on a single computer. The only difference is minor coding changes in the bootstrap class.
Bootstrap class: JXTA P2P version
This code deploys the framework and starts execution of the pattern.
    package edu.uncc.grid.example.workpool;
    import java.io.IOException;
    import net.jxta.pipe.PipeID;
    import edu.uncc.grid.pgaf.Anchor;
    import edu.uncc.grid.pgaf.Operand;
    import edu.uncc.grid.pgaf.Seeds;
    import edu.uncc.grid.pgaf.p2p.Types;
    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                MyModule pi = new MyModule();
                Seeds.start("/path/to/seeds/seed/folder", false);
                PipeID id = Seeds.startPattern(new Operand(
                    (String[]) null,
                    new Anchor("hostname", Types.DataFlowRoll.SINK_SOURCE),
                    pi));
                System.out.println(id.toString());
                Seeds.waitOnPattern(id);
                Seeds.stop();
                System.out.println("The result is: " + pi.getPi());
            } catch (SecurityException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
Different patterns have similar code.
Bootstrap class: Multicore version
    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                MyModule pi = new MyModule();
                Thread id = Seeds.startPatternMulticore(
                    new Operand(
                        (String[]) null,
                        new Anchor(args[0], Types.DataFlowRole.SINK_SOURCE),
                        pi),
                    4);
                id.join();
                System.out.println("The result is: " + pi.getPi());
            } catch (SecurityException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
Multicore version:
• Much faster on a multicore platform
• Thread based
• Bootstrap class does not need to start and stop JXTA P2P, so Seeds.start() and Seeds.stop() are not needed. Otherwise the user code is similar.
Measuring Time
Can instrument code in the bootstrap class:
    public class RunMyModule {
        public static void main (String[] args) {
            try {
                long start = System.currentTimeMillis();
                MyModule m = new MyModule();
                Seeds.start( … );
                PipeID id = ( … );
                Seeds.waitOnPattern(id);
                Seeds.stop();
                long stop = System.currentTimeMillis();
                double time = (double) (stop - start) / 1000.0;
                System.out.println("Execution time = " + time);
            } catch (SecurityException e) {
                …
Compiling/executing • Can be done on the command line (ant script provided) or through an IDE (Eclipse)
Examples of applications using Workpool Pattern
• Computing π by the Monte Carlo method
Monte Carlo Methods A so-called “embarrassingly parallel” computation as it decomposes into obviously independent tasks that can be done in parallel without any task communications during the computation. Monte Carlo methods use random selections. For parallelizing Monte Carlo code, must address best way to generate random numbers in parallel.
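Calculate π using the Monte Carlo method
A circle is formed within a 2 x 2 square. The ratio of the area of the circle to the area of the square is given by:
$$\frac{\text{Area of circle}}{\text{Area of square}} = \frac{\pi (1)^2}{2 \times 2} = \frac{\pi}{4}$$
Points within the square are chosen randomly. A score is kept of how many points happen to lie within the circle. The fraction of points within the circle will be $\pi/4$, given a sufficient number of randomly selected samples.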
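Typically only one quadrant is used. One quadrant can be described by the integral:
$$\int_0^1 \sqrt{1 - x^2}\,dx = \frac{\pi}{4}$$
Random pairs of numbers $(x_r, y_r)$ are generated, each between 0 and 1. A pair is counted as in the circle if
$$x_r^2 + y_r^2 \le 1$$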
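A minimal sequential sketch of the quadrant method follows, before any parallelization. The class name `MonteCarloPiSketch` and the method `estimatePi` are hypothetical; a fixed seed is used only so that a run can be repeated.

```java
import java.util.Random;

// Sequential estimate of pi using one quadrant of the unit circle.
class MonteCarloPiSketch {
    // Returns 4 * (points inside the quarter circle) / (total points).
    static double estimatePi(long seed, int samples) {
        Random r = new Random(seed);
        long inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = r.nextDouble();   // x in [0, 1)
            double y = r.nextDouble();   // y in [0, 1)
            if (x * x + y * y <= 1.0) {  // point lies within the quarter circle
                inside++;
            }
        }
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println("Estimate of pi: " + estimatePi(42L, 1_000_000));
    }
}
```

The fraction of points inside approaches π/4, so multiplying by 4 gives the estimate. Accuracy improves only slowly with the number of samples, which is why the work is worth spreading over many slaves.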
Alternative (better) Monte Carlo Method (not used here)
Generate random values of x to compute f(x), and sum the values of f(x):
$$\text{Area} = \int_{x_1}^{x_2} f(x)\,dx = \lim_{N \to \infty} \frac{1}{N} \sum_{r=1}^{N} f(x_r)\,(x_2 - x_1)$$
where $x_r$ are randomly generated values of x between $x_1$ and $x_2$.
The Monte Carlo method is very useful if the function cannot be integrated numerically (maybe having a large number of variables).
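The alternative method can be sketched as follows, assuming a finite number of samples N in place of the limit. The names `MonteCarloIntegralSketch` and `integrate` are hypothetical, and the quarter-circle function is used only as a convenient test case.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Monte Carlo integration: average f at random points, scaled by the interval width.
class MonteCarloIntegralSketch {
    static double integrate(DoubleUnaryOperator f, double x1, double x2, int n, long seed) {
        Random r = new Random(seed);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double xr = x1 + (x2 - x1) * r.nextDouble(); // random x between x1 and x2
            sum += f.applyAsDouble(xr);
        }
        return (x2 - x1) * sum / n;
    }

    public static void main(String[] args) {
        // Quarter circle: the integral of sqrt(1 - x^2) over [0, 1] is pi / 4.
        double quarter = integrate(x -> Math.sqrt(1.0 - x * x), 0.0, 1.0, 1_000_000, 7L);
        System.out.println("Estimate of pi: " + 4.0 * quarter);
    }
}
```

Each sample here needs only one random number instead of two, and every sample contributes a function value rather than a hit-or-miss count.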
Workpool implementation
[Figure: Master (source/sink) runs DiffuseData and GatherData; slaves (compute nodes) run Compute.]
• DiffuseData: master sends a starting seed for a random sequence to each slave
• Compute: each slave returns the number of 1000 random points that fall inside the arc of the circle
• GatherData: master aggregates the answers
Seeds Monte Carlo code: MonteCarloPiModule.java
DiffuseData Method (required to be implemented)
    public Data DiffuseData (int segment) {
        DataMap<String, Object> d = new DataMap<String, Object>();
        d.put("seed", R.nextLong());
        return d; // returns a random seed for each job unit
    }
Compute Method (required to be implemented)
    public Data Compute (Data data) {
        DataMap<String, Object> input = (DataMap<String,Object>) data;
        DataMap<String, Object> output = new DataMap<String, Object>();
        Long seed = (Long) input.get("seed"); // get random seed
        Random r = new Random();
        r.setSeed(seed);
        Long inside = 0L;
        for (int i = 0; i < DoubleDataSize; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            double dist = x * x + y * y;
            if (dist <= 1.0) {
                ++inside;
            }
        }
        output.put("inside", inside); // to return to GatherData()
        return output;
    }
GatherData Method (required to be implemented)
    public void GatherData (int segment, Data dat) {
        DataMap<String,Object> out = (DataMap<String,Object>) dat;
        Long inside = (Long) out.get("inside");
        total += inside; // aggregate answer from all the worker nodes
    }
getDataCount Method (required to be implemented)
    public int getDataCount() {
        return random_samples;
    }
Sets the number of data “envelopes” sent from the master by DiffuseData to the slaves, in this case the number of “seeds”. (The number of physical slave processors might be different.)
Initialized in:
    initializeModule(String[] args) {
        random_samples = 3000;
    }
Method to compute π result (used in bootstrap module)
    public double getPi() {
        // returns value of pi based on all workers
        double pi = (total / (random_samples * DoubleDataSize)) * 4;
        return pi;
    }
Complete module class MonteCarloPiModule
    package edu.uncc.grid.example.workpool;
    import java.util.Random;
    import java.util.logging.Level;
    import edu.uncc.grid.pgaf.datamodules.Data;
    import edu.uncc.grid.pgaf.datamodules.DataMap;
    import edu.uncc.grid.pgaf.interfaces.basic.Workpool;
    import edu.uncc.grid.pgaf.p2p.Node;
    public class MonteCarloPiModule extends Workpool {
        private static final long serialVersionUID = 1L;
        private static final int DoubleDataSize = 1000;
        double total;
        int random_samples;
        Random R;
        public MonteCarloPiModule() {
            R = new Random();
        }
        public void initializeModule(String[] args) {
            total = 0;
            Node.getLog().setLevel(Level.WARNING); // reduce verbosity for logging
            random_samples = 3000; // set # of random samples
        }
        public Data Compute (Data data) {
            // input gets the data produced by DiffuseData()
            DataMap<String, Object> input = (DataMap<String,Object>) data;
            DataMap<String, Object> output = new DataMap<String, Object>();
            Long seed = (Long) input.get("seed"); // get random seed
            Random r = new Random();
            r.setSeed(seed);
            Long inside = 0L;
            for (int i = 0; i < DoubleDataSize; i++) {
                double x = r.nextDouble();
                double y = r.nextDouble();
                double dist = x * x + y * y;
                if (dist <= 1.0) {
                    ++inside;
                }
            }
            output.put("inside", inside); // store partial answer to return to GatherData()
            return output; // output will emit partial answers done by this method
        }
        public Data DiffuseData (int segment) {
            DataMap<String, Object> d = new DataMap<String, Object>();
            d.put("seed", R.nextLong());
            return d; // returns a random seed for each job unit
        }
        public void GatherData (int segment, Data dat) {
            DataMap<String,Object> out = (DataMap<String,Object>) dat;
            Long inside = (Long) out.get("inside");
            total += inside; // aggregate answer from all the worker nodes
        }
        public double getPi() {
            // returns value of pi based on the job done by all the workers
            double pi = (total / (random_samples * DoubleDataSize)) * 4;
            return pi;
        }
        public int getDataCount() {
            return random_samples;
        }
    }
Bootstrap class (Multicore version)
    ...
    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                MonteCarloPiModule pi = new MonteCarloPiModule();
                Seeds.start("/path/to/seeds/seed/folder", false);
                PipeID id = Seeds.startPattern(new Operand(
                    (String[]) null,
                    new Anchor("hostname", Types.DataFlowRoll.SINK_SOURCE),
                    pi));
                System.out.println(id.toString());
                Seeds.waitOnPattern(id);
                Seeds.stop();
                System.out.println("The result is: " + pi.getPi());
            } catch (SecurityException e) {
                ...
Discussion
• Does anyone see a potential flaw in the code? (Clue: random number generation)
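The slide leaves the answer open. One commonly raised concern with this kind of design is that seeding many `java.util.Random` generators with values drawn from another `Random` gives no guarantee that the resulting sequences are statistically independent. A possible alternative, shown here only as a sketch and not as anything Seeds provides, is `java.util.SplittableRandom`, whose `split()` method is designed to produce separate streams for parallel tasks. The names `SplitStreamsSketch`, `countInside` and `estimatePi` are hypothetical, and the tasks run sequentially here.

```java
import java.util.SplittableRandom;

// Gives each work unit its own generator obtained by splitting a master generator.
class SplitStreamsSketch {
    // Counts points falling inside the quarter circle using the given stream.
    static long countInside(SplittableRandom r, int samples) {
        long inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return inside;
    }

    // Splits one stream per task from the master generator, then combines the counts.
    static double estimatePi(long masterSeed, int tasks, int samplesPerTask) {
        SplittableRandom master = new SplittableRandom(masterSeed);
        long inside = 0;
        for (int t = 0; t < tasks; t++) {
            SplittableRandom stream = master.split(); // separate stream for this task
            inside += countInside(stream, samplesPerTask);
        }
        return 4.0 * inside / ((double) tasks * samplesPerTask);
    }

    public static void main(String[] args) {
        System.out.println("Estimate of pi: " + estimatePi(2014L, 100, 10_000));
    }
}
```

Fixing the master seed also makes a run repeatable, which the module as shown is not, since its master generator is created with `new Random()`.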
Workpool pattern
2. Matrix addition and multiplication
Matrix addition and multiplication are very easy to parallelize, as each result value is independent of the other result values.
Matrix Addition, C = A + B
Add corresponding elements of each matrix to form the elements of the result matrix. Given elements of A as $a_{i,j}$ and elements of B as $b_{i,j}$, each element of C is computed as:
$$c_{i,j} = a_{i,j} + b_{i,j}$$
[Figure: A and B added to give C]
Easy to parallelize: each processor computes one C element or a group of C elements.
Workpool Implementation
Slave computation: adds one row of A to one row of B to create one row of C (rather than each slave adding single elements).
[Figure: one row of A and one row of B added to give one row of C]
Note: generally we want the computation/communication ratio to be as large as possible. Here it is O(1)!
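The row-per-task decomposition can be sketched in plain Java, without the framework. The names `MatrixAddSketch`, `addRows` and `add` are hypothetical; the loop over `segment` stands in for the work units that would be handed to slaves.

```java
import java.util.Arrays;

// Row-wise matrix addition: each work unit adds one row of A to one row of B.
class MatrixAddSketch {
    // The per-slave computation: one row of C from one row of A and one row of B.
    static int[] addRows(int[] rowA, int[] rowB) {
        int[] rowC = new int[rowA.length];
        for (int i = 0; i < rowA.length; i++) {
            rowC[i] = rowA[i] + rowB[i];
        }
        return rowC;
    }

    // The master's view: one work unit per row, indexed by segment.
    static int[][] add(int[][] a, int[][] b) {
        int[][] c = new int[a.length][];
        for (int segment = 0; segment < a.length; segment++) {
            c[segment] = addRows(a[segment], b[segment]);
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] m = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        System.out.println(Arrays.deepToString(add(m, m)));
    }
}
```

Each work unit receives 2n values and performs n additions, which is why the computation/communication ratio stays constant as n grows.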
Workpool implementation
[Figure: Master (source/sink) holds A, B and C; slaves (compute nodes), one for each row.]
• Master sends one row of A and one row of B to each slave
• Each slave returns one row of C
The following example uses 3 x 3 arrays and 3 slaves.
MatrixAddModule.java (continues on several slides)
    package edu.uncc.grid.example.workpool;
    import …
    public class MatrixAddModule extends Workpool {
        private static final long serialVersionUID = 1L;
        int[][] matrixA;
        int[][] matrixB;
        int[][] matrixC;
        public MatrixAddModule() {
            matrixC = new int[3][3]; // in this example matrices are 3 x 3
        }
        public void initMatrices() {
            // some initial values
            matrixA = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
            matrixB = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
        }
        public int getDataCount() {
            return 3; // required method: number of data objects (slaves)
        }
        public void initializeModule(String[] args) {
            Node.getLog().setLevel(Level.WARNING);
        }
DiffuseData method
The DataMap d returned holds pairs of a string key and an associated array.
    public Data DiffuseData(int segment) {
        int[] rowA = new int[3];
        int[] rowB = new int[3];
        DataMap<String, int[]> d = new DataMap<String, int[]>();
        int k = segment; // segment variable used to select rows
        // copy one row of A and one row of B into rowA, rowB to be sent to slaves
        for (int i = 0; i < 3; i++) {
            rowA[i] = matrixA[k][i];
            rowB[i] = matrixB[k][i];
        }
        // rowA and rowB put in d DataMap to send to slaves
        d.put("rowA", rowA);
        d.put("rowB", rowB);
        return d;
    }
Compute method
    public Data Compute(Data data) {
        int[] rowC = new int[3];
        DataMap<String, int[]> input = (DataMap<String,int[]>) data;
        DataMap<String, int[]> output = new DataMap<String, int[]>();
        // get two rows from data received
        int[] rowA = (int[]) input.get("rowA");
        int[] rowB = (int[]) input.get("rowB");
        // add rows
        for (int i = 0; i < 3; i++) {
            rowC[i] = rowA[i] + rowB[i];
        }
        // put result row into output with key, to be sent back to master
        output.put("rowC", rowC);
        return output;
    }
GatherData method
Note the segment variable and the Data from the slave.
    public void GatherData(int segment, Data dat) {
        DataMap<String,int[]> out = (DataMap<String,int[]>) dat;
        int[] rowC = (int[]) out.get("rowC"); // get C row sent from slave
        // place row into result matrix
        for (int i = 0; i < 3; i++) {
            matrixC[segment][i] = rowC[i];
        }
    }
The segment variable associated with the Data is used to choose the correct row.
Bootstrap class: Multicore version
    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                long start = System.currentTimeMillis();
                MatrixAddModule m = new MatrixAddModule();
                m.initMatrices();
                Thread id = Seeds.startPatternMulticore(
                    new Operand(
                        (String[]) null,
                        new Anchor(args[0], Types.DataFlowRole.SINK_SOURCE),
                        m), // pass the MatrixAddModule instance created above
                    4);
                id.join();
                long stop = System.currentTimeMillis();
                double time = (double) (stop - start) / 1000.0;
                System.out.println("Execution time = " + time);
                m.printResult();
            } catch …
Matrix Multiplication
Sequential code to compute A x B (square n x n matrices):
    for (i = 0; i < n; i++)           // for each row of A
        for (j = 0; j < n; j++) {     // for each column of B
            c[i][j] = 0;
            for (k = 0; k < n; k++)
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
        }
Requires $n^3$ multiplications and $n^3$ additions. Sequential time complexity of $O(n^3)$. Very easy to parallelize as each result is independent.
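For reference, the sequential loop can be wrapped into a small runnable class. The class name `MatrixMultiplySketch` and the method `multiply` are hypothetical; the input matrix is the 3 x 3 example used elsewhere in these slides.

```java
import java.util.Arrays;

// Sequential n x n matrix multiplication, C = A x B.
class MatrixMultiplySketch {
    static int[][] multiply(int[][] a, int[][] b) {
        int n = a.length;
        int[][] c = new int[n][n];
        for (int i = 0; i < n; i++) {          // for each row of A
            for (int j = 0; j < n; j++) {      // for each column of B
                c[i][j] = 0;
                for (int k = 0; k < n; k++) {
                    c[i][j] = c[i][j] + a[i][k] * b[k][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] m = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        System.out.println(Arrays.deepToString(multiply(m, m)));
    }
}
```

The body of the two outer loops depends only on row i of A and column j of B, so each (i, j) pair can become an independent work unit.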
Workpool implementation
With one slave computing one element of the result:
[Figure: Master (source/sink) holds A, B and C; slaves (compute nodes), one for each element of the result.]
• Master sends one row of A and one column of B to each slave
• Each slave returns one element of C
The following example uses 3 x 3 arrays and 9 slaves.
MatrixAddModule.java (continues on several slides)
    package edu.uncc.grid.example.workpool;
    import …
    public class MatrixAddModule extends Workpool {
        private static final long serialVersionUID = 1L;
        int[][] matrixA;
        int[][] matrixB;
        int[][] matrixC;
        public MatrixAddModule() {
            matrixC = new int[3][3]; // in this example matrices are 3 x 3
        }
        public void initMatrices() {
            // some initial values
            matrixA = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
            matrixB = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
        }
        public int getDataCount() {
            return 9; // required method: number of data objects (slaves)
        }
        public void initializeModule(String[] args) {
            Node.getLog().setLevel(Level.WARNING);
        }
DiffuseData method
The DataMap d returned holds pairs of a string key and an associated array.
    public Data DiffuseData(int segment) {
        int[] rowA = new int[3];
        int[] colB = new int[3];
        DataMap<String, int[]> d = new DataMap<String, int[]>();
        int a = segment / 3, b = segment % 3; // segment variable used to select row of A and column of B
        // copy one row of A and one column of B into rowA, colB to be sent to slaves
        for (int i = 0; i < 3; i++) {
            rowA[i] = matrixA[a][i];
            colB[i] = matrixB[i][b];
        }
        // rowA and colB put in d DataMap to send to slaves
        d.put("rowA", rowA);
        d.put("colB", colB);
        return d;
    }
Note on mapping rows and columns to segments
    segment   Arow   Bcol
    0         0      0
    1         0      1
    2         0      2
    3         1      0
    4         1      1
    5         1      2
    6         2      0
    7         2      1
    8         2      2
    int Arow = segment / 3;
    int Bcol = segment % 3;
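The mapping generalizes from 3 to any matrix size n by dividing and taking the remainder by n. A small sketch, with hypothetical names `SegmentMappingSketch`, `rowOf` and `colOf`, prints the same table:

```java
// Maps a flat segment number to a (row of A, column of B) pair for n x n matrices.
class SegmentMappingSketch {
    static int rowOf(int segment, int n) {
        return segment / n;   // integer division selects the row of A
    }

    static int colOf(int segment, int n) {
        return segment % n;   // remainder selects the column of B
    }

    public static void main(String[] args) {
        int n = 3;
        for (int segment = 0; segment < n * n; segment++) {
            System.out.println("segment " + segment
                + "  Arow " + rowOf(segment, n)
                + "  Bcol " + colOf(segment, n));
        }
    }
}
```

The inverse mapping is `segment = Arow * n + Bcol`, so every element of C corresponds to exactly one segment.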
Compute method
    public Data Compute(Data data) {
        DataMap<String, int[]> input = (DataMap<String,int[]>) data;
        DataMap<String, Integer> output = new DataMap<String, Integer>();
        // get the row of A and the column of B from data received
        int[] rowA = (int[]) input.get("rowA");
        int[] colB = (int[]) input.get("colB");
        // matrix multiplication, one result
        int out = 0;
        for (int i = 0; i < 3; i++) {
            out += rowA[i] * colB[i];
        }
        // put result into output with key, to be sent back to master
        output.put("out", out);
        return output;
    }
GatherData method
Note the segment variable and the Data from the slave.
    public void GatherData(int segment, Data dat) {
        DataMap<String,Integer> out = (DataMap<String,Integer>) dat;
        int answer = out.get("out"); // get result sent from slave; cast from Integer to int not necessary
        int a = segment / 3, b = segment % 3;
        matrixC[a][b] = answer; // place element into result matrix
    }
The segment variable associated with the Data is used to choose the correct element.
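Workpool: Numerical integration
[Figure: curve F(x) against x between Start and End, divided into partitions; Master (source/sink) and slaves (compute nodes), one for each partition.]
• Master sends the start and end of a partition to each slave
• Each slave returns the computed area under the curve for its partition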
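The slide gives only the decomposition, so the following is a sketch under assumptions: the trapezoidal rule is chosen here as the per-partition method, the names `IntegrationWorkpoolSketch`, `area` and `integrate` are hypothetical, and the partitions are processed in a sequential loop rather than by slaves.

```java
import java.util.function.DoubleUnaryOperator;

// Splits [a, b] into partitions; each partition's area is computed independently.
class IntegrationWorkpoolSketch {
    // Slave side: area under f between start and end by the trapezoidal rule.
    static double area(DoubleUnaryOperator f, double start, double end, int steps) {
        double h = (end - start) / steps;
        double sum = 0.5 * (f.applyAsDouble(start) + f.applyAsDouble(end));
        for (int i = 1; i < steps; i++) {
            sum += f.applyAsDouble(start + i * h);
        }
        return sum * h;
    }

    // Master side: hand out one (start, end) pair per segment and add up the areas.
    static double integrate(DoubleUnaryOperator f, double a, double b,
                            int partitions, int steps) {
        double width = (b - a) / partitions;
        double total = 0.0;
        for (int segment = 0; segment < partitions; segment++) {
            double start = a + segment * width;
            total += area(f, start, start + width, steps);
        }
        return total;
    }

    public static void main(String[] args) {
        // The integral of x^2 over [0, 3] is 9.
        System.out.println(integrate(x -> x * x, 0.0, 3.0, 4, 1000));
    }
}
```

Only two numbers travel to each slave and one comes back, while the work per partition grows with the number of steps, so the computation/communication ratio can be made large.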