CIS 610: Many-core visualization libraries
Hank Childs, University of Oregon
Jan. 21st, 2013
Schedule for this class
• We have done 5 lectures in 2 weeks
• We should have done 4 lectures over the last two weeks
• We will do 3 lectures this week
• We will be one full week ahead of schedule.
• We will cancel two lectures over the coming weeks.
Schedule this week
• Tuesday lecture: today
  • Review of data parallel operations, general discussion of packages so far
• Thursday lecture: Ken Moreland
  • (Thursday colloquium @ 12: Ken Moreland)
• Friday lecture: Ken Moreland
  • 8:30-10 (I can’t make this time)
  • 11-12:30
  • 11:30-1:00
Upcoming schedule
• Tuesday, Jan 28th
  • 10 minute presentation by each student on the project they want to pursue
  • Non-binding
  • Discuss the problem, and some initial thoughts about how to do it in many-core libraries
Upcoming schedule
• Thursday, Jan 30th
  • Group session debugging problems.
  • Important that you have started your project by then.
Upcoming schedule
• Weeks following
  • Series of 20 minute presentations, 3 per lecture
  • Two flavors of presentation:
    • “Update on my project”
    • “Overview of a paper I read”
How this class will be graded
• You will all submit a report at the end of the quarter.
• The report will include:
  • A summary of what you have done
  • It will focus on your project
• You should also include
  • Presentations made
  • Porting of libraries
  • Assistance to other students
  • Bugs debugged (or reported)
  • Etc…
How this class will be graded
• It is not curved
  • If you all decide not to present papers, you will all be penalized
• I expect you will all get very good grades
• But it is important that you work hard and accomplish something in this class
  • Play with the libraries, present papers in class, and really try to nail your “research project”
Lectures
• I expect you will all make about 3 presentations
  • 1 research update, 2 papers
  • 2 research updates, 1 paper
• Some lectures in the short term on CUDA, Thrust, data parallelism, etc., would probably be helpful.
EAVL: Extreme-scale Analysis and Visualization Library
Jeremy Meredith
January, 2014
A Simple Data-Parallel Operation

Internal Library API Provides This:
    void CellToCellDivide(Field &a, Field &b, Field &c)
    {
        for_each(i)
            c[i] = a[i] / b[i];
    }

Algorithm Developer Writes This:
    void CalculateDensity(...)
    {
        //...
        CellToCellDivide(mass, volume, density);
    }
Functor + Iterator Approach

Algorithm Developer Writes This:
    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, Divide());
    }

Internal Library API Provides This:
    template <class T>
    void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }

    struct Divide
    {
        void operator()(float &a, float &b, float &c)
        {
            c = a / b;
        }
    };
Custom Functor

Algorithm Developer Writes These:
    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, MyFunctor());
    }

    struct MyFunctor
    {
        void operator()(float &a, float &b, float &c)
        {
            c = a + 2*log(b);
        }
    };

Internal Library API Provides This:
    template <class T>
    void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }
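The functor pattern on these slides can be tried out in ordinary C++. The following is a minimal sketch, not EAVL's actual API: `Field` is assumed here to be a plain `std::vector<float>`, the serial loop stands in for the slides' parallel `for_each(i)`, and the return-by-value `CalculateDensity` signature is an illustrative choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's Field type.
using Field = std::vector<float>;

// Generic cell-to-cell operation: applies any three-argument functor per cell.
// The serial loop stands in for the slides' parallel for_each(i).
template <class Functor>
void CellToCellBinaryOp(const Field &a, const Field &b, Field &c, Functor f)
{
    assert(a.size() == b.size());
    c.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        f(a[i], b[i], c[i]);
}

// The slides' Divide functor.
struct Divide
{
    void operator()(const float &a, const float &b, float &c) const { c = a / b; }
};

// The slides' custom functor.
struct MyFunctor
{
    void operator()(const float &a, const float &b, float &c) const
    {
        c = a + 2 * std::log(b);
    }
};

// Caller side: picks the functor, never writes the loop.
Field CalculateDensity(const Field &mass, const Field &volume)
{
    Field density;
    CellToCellBinaryOp(mass, volume, density, Divide());
    return density;
}
```

Swapping `Divide()` for `MyFunctor()` changes the per-cell computation without touching the loop, which is the point of the design.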
Map with 1 input, 1 output
Simplest data-parallel operation. Each result item can be calculated from its corresponding input item alone.

index    0  1  2  3  4  5  6  7  8  9 10 11
x        3  7  0  1  4  0  0  4  5  3  1  0
result   6 14  0  2  8  0  0  8 10  6  2  0

    struct f { float operator()(float x) { return x*2; } };
Map with 2 inputs, 1 output
With two input arrays, the functor takes two inputs. You can also have multiple outputs.

index    0  1  2  3  4  5  6  7  8  9 10 11
x        3  7  0  1  4  0  0  4  5  3  1  0
y        2  4  2  1  8  3  9  5  5  1  2  1
result   5 11  2  2 12  3  9  9 10  4  3  1

    struct f { float operator()(float a, float b) { return a+b; } };
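Both map flavors correspond closely to `std::transform` in standard C++. This is a sketch using the standard library rather than EAVL or Thrust; the names `Twice`, `Add`, `Map1` and `Map2` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One-input functor from the first map slide.
struct Twice { float operator()(float x) const { return x * 2; } };

// Two-input functor from the second map slide.
struct Add { float operator()(float a, float b) const { return a + b; } };

// Map with 1 input, 1 output.
std::vector<float> Map1(const std::vector<float> &x)
{
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), result.begin(), Twice());
    return result;
}

// Map with 2 inputs, 1 output; both inputs must have the same length.
std::vector<float> Map2(const std::vector<float> &x, const std::vector<float> &y)
{
    assert(x.size() == y.size());
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), y.begin(), result.begin(), Add());
    return result;
}
```

Because each output element depends only on the matching input elements, the iterations are independent, which is what makes a map trivially parallel.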
Scatter with 1 input (and thus 1 output)
Possibly inefficient, risks of race conditions and uninitialized results. (Can also scatter to larger array if desired.) Often used in a scatter_if–type construct.

index    0  1  2  3  4  5  6  7  8  9 10 11
x        3  7  0  1  4  0  0  4  5  3  1  0
indices  2  4  1  5  5  0  4  2  1  2  1  4

index    0  1  2  3  4  5
result   0  1  3  -  0  4

(Slot 3 is never written, so it stays uninitialized; slots 1, 2, 4 and 5 each receive several writes.)

No functor
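A serial sketch makes both hazards on this slide visible. The function name `Scatter` and the `fill` parameter (used to mark slots that never receive a write) are assumptions for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scatter: result[indices[i]] = x[i].
// Slots that no index refers to keep the caller-supplied fill value.
std::vector<float> Scatter(const std::vector<float> &x,
                           const std::vector<int> &indices,
                           std::size_t outputSize, float fill)
{
    assert(x.size() == indices.size());
    std::vector<float> result(outputSize, fill);
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < outputSize);
        // Several i may target the same slot. In this serial loop the last
        // writer wins; in a parallel run the winner is not determined.
        result[indices[i]] = x[i];
    }
    return result;
}
```

With the slide's data and a fill of -1, slot 3 keeps the fill value, while the contested slots end up holding whichever element was written last in serial order.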
Gather with 1 input (and thus 1 output)
Unlike scatter, no risk of uninitialized data or race condition. Plus, parallelization is over a shorter indices array, and caching helps more, so can be more efficient.

index    0  1  2  3  4  5  6  7  8  9 10 11
x        3  7  0  1  4  0  0  4  5  3  1  0

indices  1  9  6  9  3
result   7  3  0  3  1

No functor
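For comparison with scatter, here is a minimal serial gather; the name `Gather` is illustrative. Each output slot is written exactly once, by its own iteration, which is why the race and the uninitialized-slot problems do not arise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gather: result[i] = x[indices[i]].
// The loop runs over the (often shorter) indices array.
std::vector<float> Gather(const std::vector<float> &x,
                          const std::vector<int> &indices)
{
    std::vector<float> result(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i)
    {
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < x.size());
        result[i] = x[indices[i]];  // reading the same x slot twice is harmless
    }
    return result;
}
```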
Reduction with 1 input (and thus 1 output)
Example: max-reduction. Sum is also common. Often a fat-tree-based implementation.

index    0  1  2  3  4  5  6  7  8  9 10 11
x        3  7  0  1  4  0  0  4  5  3  1  0
result   7

    struct f { float operator()(float a, float b) { return a>b ? a : b; } };
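The tree-based idea can be sketched as repeated pairwise combination. This is a serial illustration under the assumption that the functor is associative; `TreeReduce` and `Max` are hypothetical names, and a real parallel version would run each pass's pairs concurrently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Max functor from the slide.
struct Max { float operator()(float a, float b) const { return a > b ? a : b; } };

// Pairwise (tree-shaped) reduction. Each pass combines neighbouring pairs,
// so the active length roughly halves and about log2(n) passes are needed.
template <class Functor>
float TreeReduce(std::vector<float> values, Functor f)
{
    assert(!values.empty());
    for (std::size_t n = values.size(); n > 1; n = (n + 1) / 2)
    {
        for (std::size_t i = 0; i < n / 2; ++i)
            values[i] = f(values[2 * i], values[2 * i + 1]);
        if (n % 2 == 1)
            values[n / 2] = values[n - 1];  // odd element carries to next pass
    }
    return values[0];
}
```

Passing a sum functor instead of `Max` gives the sum reduction the slide mentions.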
Inclusive Prefix Sum (a.k.a. Scan) with 1 input/output
Value at result[i] is sum of values x[0]..x[i]. Surprisingly efficient parallel implementation. Basis for many more complex algorithms.

index    0  1  2  3  4  5  6  7  8  9 10 11
x        3  7  0  1  4  0  0  4  5  3  1  0
result   3 10 10 11 15 15 15 19 24 27 28 28

No functor.
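One well-known way to parallelize a scan is the Hillis-Steele formulation, where every element is updated independently within each pass. The sketch below runs those passes serially and is only an illustration; it is not claimed to be what EAVL or Thrust use internally, and work-efficient variants exist.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Inclusive scan in log2(n) passes. Within a pass, every i reads only the
// previous pass's array, so all i could be processed in parallel.
std::vector<int> InclusiveScan(const std::vector<int> &x)
{
    std::vector<int> cur = x;
    std::vector<int> next(x.size());
    for (std::size_t offset = 1; offset < x.size(); offset *= 2)
    {
        for (std::size_t i = 0; i < x.size(); ++i)
            next[i] = (i >= offset) ? cur[i] + cur[i - offset] : cur[i];
        cur.swap(next);
    }
    // Sanity check: the last entry of an inclusive scan is the total sum.
    assert(cur.empty() ||
           cur.back() == std::accumulate(x.begin(), x.end(), 0));
    return cur;
}
```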
Exclusive Prefix Sum (a.k.a. Scan) with 1 input/output
Initialize with zero, value is sum of only up to x[i-1]. May be more commonly used than inclusive scan.

index    0  1  2  3  4  5  6  7  8  9 10 11
x        3  7  0  1  4  0  0  4  5  3  1  0
result   0  3 10 10 11 15 15 15 19 24 27 28

No functor.
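The relationship between the two scans is easy to see in a serial sketch: the exclusive result is the running total before adding x[i], and adding x[i] back recovers the inclusive result. The function names here are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive scan: result[i] = x[0] + ... + x[i-1], with result[0] = 0.
std::vector<int> ExclusiveScan(const std::vector<int> &x)
{
    std::vector<int> result(x.size());
    int running = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        result[i] = running;  // store the total *before* adding x[i]
        running += x[i];
    }
    return result;
}

// inclusive[i] = exclusive[i] + x[i]; a map over the two arrays.
std::vector<int> InclusiveFromExclusive(const std::vector<int> &exclusive,
                                        const std::vector<int> &x)
{
    assert(exclusive.size() == x.size());
    std::vector<int> inclusive(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        inclusive[i] = exclusive[i] + x[i];
    return inclusive;
}
```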
Threshold
• Keep cell if it meets some criteria, else discard
• Criteria:
  • Pressure > 2
  • 10 < temperature < 20
(Figure: cells that meet criteria)
How to implement threshold
• Iterate over cells
• If a cell meets the criteria, then place that cell in the output
• Output is an unstructured mesh
Example: Thresholding an RGrid (a)
• Explicit cells can be combined with structured coordinates.
(Figure: input mesh with eavlStructuredCellSet, eavlCoordinates, eavlField#0, eavlField#1, eavlField#2; output mesh with eavlExplicitCellSet, eavlCoordinates, eavlField#0, eavlField#1, eavlField#2)
Example: Thresholding an RGrid (b)
• A second Cell Set can be added which refers to the first one
(Figure: input mesh with eavlStructuredCellSet, eavlCoordinates, eavlField#0, eavlField#1, eavlField#2; output mesh with eavlStructuredCellSet and eavlSubset, eavlCoordinates, eavlField#0, eavlField#1, eavlField#2)
Starting Mesh
We want to threshold a mesh based on its density values (shown here).

43 47 52 63
32 38 42 49
31 37 41 38

index    0  1  2  3  4  5  6  7  8  9 10 11
density 43 47 52 63 32 38 42 49 31 37 41 38

If we threshold 35 < density < 45, we want this result (discarded cells shown as --):

43 -- -- --
-- 38 42 --
-- 37 41 38
Which Cells to Include?
Evaluate a Map operation with this functor:

    struct InRange
    {
        float lo, hi;
        InRange(float l, float h) : lo(l), hi(h) { }
        int operator()(float x) { return x>lo && x<hi; }
    };

1 0 0 0
0 1 1 0
0 1 1 1

index    0  1  2  3  4  5  6  7  8  9 10 11
density 43 47 52 63 32 38 42 49 31 37 41 38
inrange  1  0  0  0  0  1  1  0  0  1  1  1   (InRange() applied to density)
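The flagging step is a one-input map. The sketch below uses `std::transform` as a serial stand-in for the parallel map; the wrapper name `MarkCells` is an assumption for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Functor from the slide: 1 if lo < x < hi, else 0.
struct InRange
{
    float lo, hi;
    InRange(float l, float h) : lo(l), hi(h) {}
    int operator()(float x) const { return x > lo && x < hi; }
};

// Map the functor over the field to get a 0/1 flag per cell.
std::vector<int> MarkCells(const std::vector<float> &density, float lo, float hi)
{
    assert(lo < hi);
    std::vector<int> inrange(density.size());
    std::transform(density.begin(), density.end(), inrange.begin(),
                   InRange(lo, hi));
    return inrange;
}
```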
How Many Cells in Output?
Evaluate a Reduce operation using the Add<> functor. We can use this to create output cell length arrays.

1 0 0 0
0 1 1 0
0 1 1 1

index    0  1  2  3  4  5  6  7  8  9 10 11
inrange  1  0  0  0  0  1  1  0  0  1  1  1
result   6   (plus-reduction of inrange)
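Counting the kept cells is a sum-reduction over the flags. In this sketch `std::accumulate` stands in for the parallel reduce, and the `Add<T>` template is a guess at the shape of the slide's `Add<>` functor rather than EAVL's definition.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Assumed shape of the slide's Add<> functor.
template <class T>
struct Add { T operator()(T a, T b) const { return a + b; } };

// Sum the 0/1 flags to get the number of output cells.
int CountOutputCells(const std::vector<int> &inrange)
{
    int count = std::accumulate(inrange.begin(), inrange.end(), 0, Add<int>());
    // Holds as long as every flag is 0 or 1.
    assert(count >= 0 && static_cast<std::size_t>(count) <= inrange.size());
    return count;
}
```

The returned count is what sizes every output-length array, for example `std::vector<float> output_density(count)`.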
Where Do the Output Cells Go?

Input indices      Output indices
0  1  2  3         0  -  -  -
4  5  6  7         -  1  2  -
8  9 10 11         -  3  4  5

input cell   0  1  2  3  4  5  6  7  8  9 10 11
output cell  0  -  -  -  -  1  2  -  -  3  4  5

How do we create this mapping?
Create Input-to-Output Indexing?
Exclusive Scan (exclusive prefix sum) gives us the output index positions.

index     0  1  2  3  4  5  6  7  8  9 10 11
inrange   1  0  0  0  0  1  1  0  0  1  1  1
startidx  0  1  1  1  1  1  2  3  3  3  4  5
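Note that `startidx` has an entry for every input cell, but only the entries at flagged cells are meaningful output positions. A small sketch of that reading follows; `StartIndices`, `OutputSlot` and the -1 sentinel are hypothetical choices for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive scan of the 0/1 flags: number of kept cells before cell i.
std::vector<int> StartIndices(const std::vector<int> &inrange)
{
    std::vector<int> startidx(inrange.size());
    int running = 0;
    for (std::size_t i = 0; i < inrange.size(); ++i)
    {
        startidx[i] = running;
        running += inrange[i];
    }
    return startidx;
}

// Output slot for input cell i, or -1 (a sentinel chosen here) if the
// cell is discarded and its startidx entry should be ignored.
int OutputSlot(const std::vector<int> &inrange,
               const std::vector<int> &startidx, std::size_t i)
{
    assert(inrange.size() == startidx.size() && i < inrange.size());
    return inrange[i] ? startidx[i] : -1;
}
```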
Scatter Input Arrays to Output?
NO. We can do this, but scatters can be risky/inefficient. Assuming we have multiple arrays to process, we can do something better....

index     0  1  2  3  4  5  6  7  8  9 10 11
density  43 47 52 63 32 38 42 49 31 37 41 38
startidx  0  1  1  1  1  1  2  3  3  3  4  5

Race condition unless we add a mask array!

input index     0  5  6  9 10 11
output_density 43 38 42 37 41 38
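To see why the mask matters: cells 1 through 5 all have startidx 1, so an unmasked scatter would have five writers competing for output slot 1. A masked scatter, sketched serially below with the hypothetical name `ScatterIf`, lets only flagged cells write.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Masked scatter: out[startidx[i]] = values[i], but only where mask[i] != 0.
// With a 0/1 mask and its exclusive scan as startidx, each output slot has
// exactly one writer, which removes the race.
std::vector<float> ScatterIf(const std::vector<float> &values,
                             const std::vector<int> &startidx,
                             const std::vector<int> &mask,
                             std::size_t outputSize)
{
    assert(values.size() == startidx.size() && values.size() == mask.size());
    std::vector<float> out(outputSize);
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        if (!mask[i])
            continue;  // discarded cell: must not write
        assert(startidx[i] >= 0 &&
               static_cast<std::size_t>(startidx[i]) < outputSize);
        out[startidx[i]] = values[i];
    }
    return out;
}
```

This still loops over the full input length for every array being processed, which is the inefficiency the slide wants to avoid.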
Create Output-to-Input Indexing?
We want to work in the shorter output-length arrays and use gathers. A specialized scatter in EAVL creates this reverse index.

index     0  1  2  3  4  5  6  7  8  9 10 11
startidx  0  1  1  1  1  1  2  3  3  3  4  5

output index  0  1  2  3  4  5
revindex      0  5  6  9 10 11
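The slides do not show how EAVL's specialized scatter is written, so the following is only one plausible way to build the reverse index: a masked scatter that writes the input cell's own index, rather than a field value, into its output slot. The name `ReverseIndex` and its signature are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build revindex so that revindex[outputCell] == inputCell.
// Done once; every field can then be moved with a plain gather.
std::vector<int> ReverseIndex(const std::vector<int> &inrange,
                              const std::vector<int> &startidx,
                              std::size_t outputCount)
{
    assert(inrange.size() == startidx.size());
    std::vector<int> revindex(outputCount);
    for (std::size_t i = 0; i < inrange.size(); ++i)
    {
        if (!inrange[i])
            continue;  // discarded cells have no output slot
        assert(startidx[i] >= 0 &&
               static_cast<std::size_t>(startidx[i]) < outputCount);
        revindex[startidx[i]] = static_cast<int>(i);
    }
    return revindex;
}
```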
Gather Input Mesh Arrays to Output?
We can now use simple gathers to pull input arrays (density, pressure) into the output mesh.

index     0  1  2  3  4  5  6  7  8  9 10 11
density  43 47 52 63 32 38 42 49 31 37 41 38

output index    0  1  2  3  4  5
revindex        0  5  6  9 10 11
output_density 43 38 42 37 41 38
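Putting the steps together, a serial sketch of the whole threshold pipeline might look like this. The map, exclusive scan and count are fused into one loop here for brevity, which a data-parallel library would run as separate primitives; `BuildReverseIndex` and `Gather` are illustrative names, not EAVL functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct InRange
{
    float lo, hi;
    InRange(float l, float h) : lo(l), hi(h) {}
    int operator()(float x) const { return x > lo && x < hi; }
};

// Flags -> exclusive scan -> reverse index, for one thresholding field.
std::vector<int> BuildReverseIndex(const std::vector<float> &field,
                                   float lo, float hi)
{
    const InRange keep(lo, hi);
    std::vector<int> inrange(field.size()), startidx(field.size());
    int count = 0;
    for (std::size_t i = 0; i < field.size(); ++i)
    {
        inrange[i] = keep(field[i]);  // map
        startidx[i] = count;          // exclusive scan
        count += inrange[i];          // final value is the reduce result
    }
    std::vector<int> revindex(static_cast<std::size_t>(count));
    for (std::size_t i = 0; i < field.size(); ++i)
        if (inrange[i])
            revindex[startidx[i]] = static_cast<int>(i);
    return revindex;
}

// Plain gather, reused for every field on the mesh.
std::vector<float> Gather(const std::vector<float> &x,
                          const std::vector<int> &revindex)
{
    std::vector<float> out(revindex.size());
    for (std::size_t i = 0; i < revindex.size(); ++i)
    {
        assert(revindex[i] >= 0 &&
               static_cast<std::size_t>(revindex[i]) < x.size());
        out[i] = x[revindex[i]];
    }
    return out;
}
```

The reverse index is computed once from density and then reused: gathering pressure, or any other field, costs only one output-length pass each.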