River Trail: Adding Data Parallelism to JavaScript*
Stephan Herhut, Richard L. Hudson, Tatiana Shpeisman, Jaswanth Sreeram
QCon NYC, June 19, 2012, 14:00
JavaScript* – What You Need To Know
• It is not Java*
• Blend of many programming paradigms
  • Object oriented with prototypes
  • Higher-order functions and first class function objects
  • Dynamically typed and interpreted
• Safety and security built in
  • Requirement for web programming
  • Managed runtime
  • No pointers, no overflows, …
• Designed for portability
  • Fully abstracts hardware capabilities
  • No byte-codes, no dusty decks
Concurrency in JavaScript*
• Cooperative multi-tasking
  • Scripts compete with the browser for computing resources
• Event driven execution model
  • Concurrent programming mindset
  • Asynchronous call-backs for latency hiding
• Fully deterministic
  • Run-to-completion semantics
  • No concurrent side effects, no race conditions
• No support for concurrent execution
  • Single threaded evaluation of JavaScript
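Run-to-completion can be seen in a few lines. The sketch below is illustrative only (the names `log` and `onTimer` are made up for this example): a callback scheduled with a 0 ms delay still cannot interrupt the script that scheduled it.

```javascript
// Run-to-completion: a callback queued with setTimeout cannot interrupt
// the script that scheduled it, even with a delay of 0 ms.
const log = [];

setTimeout(function onTimer() {
  log.push("callback");
  console.log(log.join(" -> ")); // start -> busy loop done -> callback
}, 0);

log.push("start");

// Keep the single JavaScript thread busy for a while.
let total = 0;
for (let i = 0; i < 1e6; i++) {
  total += i;
}
log.push("busy loop done");
```

The callback only runs once the current script has returned to the event loop, which is why no race on `log` is possible.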
Language Design with the Web in Mind
• Ease of use
  • Build on developers’ existing knowledge
  • Allow for mash-up of sequential and parallel code
• Platform independent
  • Support all kinds of platforms, parallel or not
  • Perform well on different parallel architectures (multi-core, GPUs, …)
• Suitable for the Open Web
  • Meet existing safety and security promises
  • Needs to be reasonably easy to implement in JavaScript JIT engines
Challenge: meet these criteria and get good performance
Design Choices
• Performance portability
  • Use High-Level Parallel Patterns
• Deterministic execution model
  • No side effects: shared state is immutable
  • Require commutative and associative operators
  • No magic: floating point anomalies may still occur
• Support mash-up coding
  • All code still written purely in JavaScript
  • Looks like JavaScript*, behaves like JavaScript*
• Maintain JavaScript*’s Safety and Security
  • Use fully managed runtime
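The "floating point anomalies" bullet can be made concrete with plain JavaScript numbers. This is a minimal sketch, independent of River Trail: addition of doubles is commutative but not associative, so a reduction that groups operands differently may differ in the last bits.

```javascript
// Same three operands, two different groupings.
const leftToRight = (0.1 + 0.2) + 0.3;
const rightToLeft = 0.1 + (0.2 + 0.3);

console.log(leftToRight);                 // 0.6000000000000001
console.log(rightToLeft);                 // 0.6
console.log(leftToRight === rightToLeft); // false

// With small integers every grouping is exact, so the order does not matter.
console.log((1 + 2) + 3 === 1 + (2 + 3)); // true
```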
River Trail API
3 Pillars:
• ParallelArray
• Methods
• Kernel
ParallelArray
• Basic data type for parallel computation
• Created from
  • A JavaScript array
  • Canvas
  • Comprehension
• Immutable
• Dense
• Homogeneous
• Single or multiple dimensions
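To make "immutable, dense, created from an array or a comprehension" tangible, here is a small sequential stand-in. `SketchParallelArray` is a hypothetical name invented for this example, not the River Trail class, and the `(length, function)` comprehension form is an assumption about how such a constructor could look.

```javascript
// Hypothetical sequential stand-in; NOT the River Trail implementation.
class SketchParallelArray {
  // new SketchParallelArray(array)       copies an existing array
  // new SketchParallelArray(length, fn)  comprehension: element i is fn(i)
  constructor(source, elementFn) {
    const data = typeof elementFn === "function"
      ? Array.from({ length: source }, (_, i) => elementFn(i))
      : Array.from(source); // Array.from also fills holes, so the copy is dense
    this.data = Object.freeze(data); // immutable element storage
    this.length = data.length;
    Object.freeze(this);
  }

  // Accessor for a single element.
  get(i) {
    return this.data[i];
  }
}

const fromArray = new SketchParallelArray([1, 2, 3]);
const squares = new SketchParallelArray(4, (i) => i * i);
console.log(fromArray.length); // 3
console.log(squares.get(3));   // 9
```

Freezing the storage is one simple way to get the "shared state is immutable" property in ordinary JavaScript.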
ParallelArray Methods
• Provide the basic skeletons for parallel computing
• Typically create a freshly minted ParallelArray
• Combine, Reduce, Scan, Scatter, Filter, Map
• Plus a constructor and accessor
• Others can be built on top of the above
  • Sum, Max, Add, Gather, Histogram, etc.
• Do Few Things Well
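As a rough illustration of "others can be built on top of the above", the sketch below derives sum, max, add and gather from map and reduce. Plain JavaScript arrays stand in for ParallelArray here, so everything runs sequentially; the helper names are made up for this example.

```javascript
// Plain arrays stand in for ParallelArray; Array.prototype.map and
// Array.prototype.reduce play the role of the parallel skeletons.
const sum = (xs) => xs.reduce((a, b) => a + b);
const max = (xs) => xs.reduce((a, b) => (a > b ? a : b));

// add and gather need the element index, which Array.prototype.map
// happens to pass as a second argument.
const add = (xs, ys) => xs.map((x, i) => x + ys[i]);
const gather = (xs, indices) => indices.map((i) => xs[i]);

const data = [3, 1, 4, 1, 5];
console.log(sum(data));                  // 14
console.log(max(data));                  // 5
console.log(add(data, [1, 1, 1, 1, 1])); // [ 4, 2, 5, 2, 6 ]
console.log(gather(data, [4, 0]));       // [ 5, 3 ]
```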
Kernel Function
• Methods take kernel function as an argument
• Written purely in JavaScript, side effect free
• combine and filter arguments
  • index and array
  • get can use the index regardless of depth (dimensionality)
• reduce, scan
  • 2 values passed in, 1 returned
• scatter conflict arguments
  • Array of target indices, conflict function for collisions
• map
  • Value passed as argument
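The kernel shapes for scan and scatter are the least familiar ones, so a sequential sketch may help. The free functions `scan` and `scatter` below are hypothetical helpers over plain arrays, and the parameter order of `scatter` is an assumption made for this example rather than a statement about the River Trail signature.

```javascript
// scan: inclusive prefix computation with a two-argument kernel.
function scan(xs, kernel) {
  const out = [];
  xs.forEach((x, i) => {
    out.push(i === 0 ? x : kernel(out[i - 1], x));
  });
  return out;
}

// scatter: element i is written to position targets[i]; when two elements
// land on the same position, the conflict function combines them.
function scatter(xs, targets, defaultValue, conflict, length) {
  const out = new Array(length).fill(defaultValue);
  const written = new Array(length).fill(false);
  xs.forEach((x, i) => {
    const t = targets[i];
    out[t] = written[t] ? conflict(out[t], x) : x;
    written[t] = true;
  });
  return out;
}

const plus = (a, b) => a + b;
console.log(scan([1, 2, 3, 4], plus));                    // [ 1, 3, 6, 10 ]
console.log(scatter([10, 20, 30], [0, 2, 0], 0, plus, 3)); // [ 40, 0, 20 ]
```

In the scatter call, the first and third elements both target index 0, so the conflict function adds them.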
Add 1 to Every Element in A
Sequential
    var i;
    var a = new Array(...);
    var b = new Array(a.length);
    for (i = 0; i < a.length; i++) {
      b[i] = a[i] + 1;
    }
Data parallel
    var a = new ParallelArray(...);
    var b = a.map(function (val) { return val + 1; });
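A runnable comparison of the two styles, using a plain array in place of ParallelArray (an assumption made so the snippet runs in any JavaScript engine). The function names are invented for this example.

```javascript
// Sequential style: explicit loop with an index variable.
function addOneLoop(a) {
  var b = new Array(a.length);
  for (var i = 0; i < a.length; i++) {
    b[i] = a[i] + 1;
  }
  return b;
}

// Data-parallel style: the kernel only sees one value and has no side
// effects, so each element could be processed independently.
function addOneMap(a) {
  return a.map(function (val) { return val + 1; });
}

var input = [1, 2, 3];
console.log(addOneLoop(input)); // [ 2, 3, 4 ]
console.log(addOneMap(input));  // [ 2, 3, 4 ]
```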
Sum Reduce-Style
Sequential
    var i;
    var a = new Array(...);
    var sum = 0;
    for (i = 0; i < a.length; i++) {
      sum += a[i];
    }
Data parallel
    var pa = new ParallelArray(...);
    var sum = pa.reduce(function (a, b) { return a + b; });
    // or, with an arrow function:
    var sum = pa.reduce((a, b) => a + b);
• Data Parallelism is Beautiful
More complex example in backup slides if we have time…
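Why reduce asks for an associative operator becomes visible once the elements are combined pairwise instead of left to right. `treeReduce` is a hypothetical helper written for this sketch; it only imitates one grouping a parallel runtime might choose and says nothing about the order River Trail actually uses.

```javascript
// Combine neighbours pairwise, level by level, like a reduction tree.
function treeReduce(xs, kernel) {
  if (xs.length === 0) {
    throw new TypeError("cannot reduce an empty array");
  }
  let level = xs.slice();
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      // An odd element at the end is carried over unchanged.
      next.push(i + 1 < level.length ? kernel(level[i], level[i + 1]) : level[i]);
    }
    level = next;
  }
  return level[0];
}

const plus = (a, b) => a + b;
const minus = (a, b) => a - b;

// Associative operator: tree order and sequential order agree.
console.log(treeReduce([1, 2, 3, 4, 5, 6, 7], plus)); // 28
console.log([1, 2, 3, 4, 5, 6, 7].reduce(plus));      // 28

// Non-associative operator: the grouping changes the result.
console.log(treeReduce([1, 2, 3, 4], minus)); // 0
console.log([1, 2, 3, 4].reduce(minus));      // -8
```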
An Example: Grayscale Conversion
    pixelData.map(toGrayScale)
             .map(function toRGBA(color) {
               return [color, color, color, 255];
             });
toGrayScale – Given a pixel return the gray value
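The slide does not show the body of `toGrayScale`. One possible version is sketched below, assuming each pixel is an `[r, g, b, a]` array and using the common 0.299 / 0.587 / 0.114 luma weights; both the pixel layout and the weights are assumptions for illustration, and a plain array stands in for the ParallelArray of pixels.

```javascript
// Assumed pixel layout: [red, green, blue, alpha], each 0..255.
function toGrayScale(pixel) {
  const [r, g, b] = pixel;
  // Weighted sum of the colour channels (assumed weights), rounded to an integer.
  return Math.round(0.299 * r + 0.587 * g + 0.114 * b);
}

function toRGBA(color) {
  return [color, color, color, 255];
}

const pixelData = [
  [255, 0, 0, 255],     // red
  [0, 0, 0, 255],       // black
  [255, 255, 255, 255], // white
];

const gray = pixelData.map(toGrayScale).map(toRGBA);
console.log(gray); // [ [76, 76, 76, 255], [0, 0, 0, 255], [255, 255, 255, 255] ]
```

Both kernels are pure functions of a single element, which is what makes the two-stage map chain a natural fit for data parallelism.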
Compiling River Trail (Prototype)
• Type inference
  • Infers array types and shapes
  • Checks for side effects
• Representation analysis
  • Computes bounds on local variables
  • Updates type information of known Integer numbers
• Static memory allocation
• Bounds check elimination
• Code generation
  • Emits OpenCL code
Diagram labels: JavaScript Engine, Script, River Trail Compiler
Compiling River Trail (Prototype)
Diagram labels: Script, River Trail Compiler, JavaScript Engine, OpenCL Kernel, OpenCL Runtime, Hardware (multi-core CPUs, SIMD instructions, GPU)
Performance Results: Particle Physics
Particle model (O(n²)) computed using River Trail on a 2nd Generation Core i7 with 4 cores
http://github.com/RiverTrail/RiverTrail/wiki
Performance Results: Matrix Matrix Multiply
O(n³) dense matrix-matrix multiplication on 1000 x 1000 element matrices; dual-core 2nd Generation Core i5 with Hyper-Threading enabled and 4 GB RAM; JavaScript* benchmarks use Firefox 8
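For readers unfamiliar with the workload, a dense O(n³) matrix multiply can be written so that every output element is computed independently from its (row, column) index. The sketch below uses nested plain arrays and is not the benchmark code; `matMul` is a name chosen for this example.

```javascript
// Each output element depends only on its (row, col) index and on the
// read-only inputs, so all elements could be computed in parallel.
function matMul(a, b) {
  const rows = a.length;
  const cols = b[0].length;
  const inner = b.length;
  return Array.from({ length: rows }, (_, row) =>
    Array.from({ length: cols }, (_, col) => {
      let acc = 0;
      for (let k = 0; k < inner; k++) {
        acc += a[row][k] * b[k][col];
      }
      return acc;
    })
  );
}

console.log(matMul([[1, 2], [3, 4]], [[5, 6], [7, 8]])); // [ [ 19, 22 ], [ 43, 50 ] ]
```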
Status Quo
• Open source Firefox prototype available on GitHub
  • Pre-built binary extension for Firefox 12
  • Sequential library fall back for other browsers
• ECMAScript proposal of the full API published
  • Removes many limitations of the prototype
• First sequential implementation for SpiderMonkey
  • Lives in Mozilla’s IonMonkey branch
  • Intended as API testing vehicle
http://github.com/RiverTrail/RiverTrail/wiki
http://wiki.ecmascript.org/doku.php?id=strawman:data_parallelism
The other routes… Web Workers *
What About Web Workers?
• Good for task parallelism
• Implement actors model
  • No shared state
  • Communication using messages
• Heavy weight
  • Typically implemented using OS threads
  • Marshaling / Unmarshaling uses JSON (think strings)
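The cost and the isolation implied by "marshaling uses JSON" can be simulated without starting a worker. The sketch below only imitates the string round trip the slide describes; `marshalRoundTrip` is a hypothetical helper, and how a given engine actually transfers messages may differ.

```javascript
// Imitates sending a message: serialize to a string, then rebuild a copy.
function marshalRoundTrip(message) {
  const wire = JSON.stringify(message);
  return { wire: wire, received: JSON.parse(wire) };
}

const original = { particles: [1.5, 2.5], step: 1 };
const { wire, received } = marshalRoundTrip(original);

// The receiver works on its own copy, so there is no shared state.
received.particles[0] = 99;

console.log(typeof wire);           // string
console.log(original.particles[0]); // 1.5
console.log(received.particles[0]); // 99
```

For large arrays this copy is made on every message, which is one reason the slide calls the mechanism heavy weight.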
What about WebCL?
• JavaScript binding for OpenCL
  • Provides HPC parallelism on CPU & GPGPU
  • Portable and efficient access to heterogeneous devices
• WebCL stays close to the OpenCL standard
  • Preserves OpenCL familiarity to facilitate adoption
  • Allows developers to translate OpenCL knowledge to web
  • Easier to keep OpenCL and WebCL in sync, as the two evolve
• An interface just above OpenCL
  • Higher level abstractions built on top of WebCL
• Intended for performance programmers
  • Useful HW abstraction
  • Allows ultimate control, performance, and access to HW
Challenges
• WebCL / OpenCL challenges
  • OpenCL standard leaves things undefined
    • For example, out-of-bounds accesses
    • OpenCL makes these the programmer’s responsibility
    • Not a reasonable approach for web
• Shared challenge – context management
  • GPUs do a poor job
  • Creates Denial of Service and Performance hazards
  • River Trail can fall back to JavaScript library or OpenCL CPU execution
  • Currently River Trail is focused on CPU
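For contrast with the undefined out-of-bounds behaviour mentioned above, plain JavaScript defines what an out-of-range read returns. The snippet uses a standard typed array and is only meant to show the language-level guarantee, not how River Trail or WebCL handle such accesses.

```javascript
const values = new Int32Array([2, 4, 8]);

// Reads outside 0..length-1 are well defined: they yield undefined
// instead of touching unrelated memory.
console.log(values[3]);     // undefined
console.log(values[-1]);    // undefined

// The result is predictable even if it is not useful arithmetic.
console.log(values[3] + 1); // NaN
```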
River Trail
• Gently extended JavaScript
• Preserves Web Security
• Standardization by ECMA TC39
• Unified high level JavaScript programming model and tool chain
• Execution determinism maintained
• Apps: visual computing, physics simulation, games, augmented reality
WebCL
• Gently extends C99
• Retrofits Web Security
• Standardization by Khronos
• Bifurcated JavaScript / OpenCL-C99 programming model, multiple tool chains
• OpenCL (non) deterministic model
• Apps: visual computing, physics simulation, games, augmented reality
Dealing with boundary conditions
    var pa = new ParallelArray(new Int32Array([2, 4, 8, 16, 32]));
    function blur(i) { return (this[i] + this[i - 1]) / 2; }
    pa.combine(blur);                      // -> throws error on this[-1]
    function halo(boundary, work) {
      return function (i) {
        if (i < boundary) { return this[i]; }
        else { return work.call(this, i); }
      };
    }
    pa.combine(halo(3, blur));             // -> [2 4 8 12 24]
    pa.combine(halo(1, blur));             // -> [2 3 6 12 24]
    pa.combine(halo(pa.length - 1, blur)); // -> [2 4 8 16 24]
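The halo pattern can be tried without River Trail by using a small sequential stand-in for combine. In the sketch below, the free function `combine` and its `get` accessor are hypothetical: they call the kernel once per index with `this` bound to a read-only view that throws on out-of-range reads, which mimics the error behaviour the slide describes.

```javascript
// Hypothetical sequential stand-in for combine on a one-dimensional array.
function combine(data, kernel) {
  const view = {
    length: data.length,
    get(i) {
      if (!Number.isInteger(i) || i < 0 || i >= data.length) {
        throw new RangeError("index out of bounds: " + i);
      }
      return data[i];
    },
  };
  // One kernel call per index; the kernel receives only the index.
  return Array.from(data, (_, i) => kernel.call(view, i));
}

// Averages an element with its left neighbour.
function blur(i) {
  return (this.get(i) + this.get(i - 1)) / 2;
}

// Wraps a kernel so that indices below `boundary` keep their original value.
function halo(boundary, work) {
  return function (i) {
    return i < boundary ? this.get(i) : work.call(this, i);
  };
}

const pa = new Int32Array([2, 4, 8, 16, 32]);
console.log(combine(pa, halo(3, blur)));             // [ 2, 4, 8, 12, 24 ]
console.log(combine(pa, halo(1, blur)));             // [ 2, 3, 6, 12, 24 ]
console.log(combine(pa, halo(pa.length - 1, blur))); // [ 2, 4, 8, 16, 24 ]
```

Calling `combine(pa, blur)` directly throws a RangeError at index 0, because the kernel reads the element at index -1.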