
Mojtaba Mehrara Po-Chun Hsu Mehrzad Samadi Scott Mahlke

Dynamic Parallelization of JavaScript Applications Using an Ultra-lightweight Speculation Mechanism. Mojtaba Mehrara, Po-Chun Hsu, Mehrzad Samadi, Scott Mahlke. Advanced Computer Architecture Laboratory, University of Michigan.


Presentation Transcript


  1. Dynamic Parallelization of JavaScript Applications Using an Ultra-lightweight Speculation Mechanism
     Mojtaba Mehrara, Po-Chun Hsu, Mehrzad Samadi, Scott Mahlke
     Advanced Computer Architecture Laboratory, University of Michigan

  2. Web 2.0 is Here
     • More computation is moved to the client side
     • More responsive browsing
     • Avoid unnecessary network traffic
     [Figure: Web 1.0 versus Web 2.0. Labels: static HTML, server-side computation, client-side rendering, JavaScript + DHTML, client-side computation. Source: Vikram et al., CCS’09]

  3. Client-side Computation in JavaScript
     • Flexibility, ease of prototyping, and portability
     • Poor performance is one of the main challenges

  4. Client-side Applications
     • Interaction-intensive:
        • Largely composed of event handlers, triggered by the user
        • Examples are Gmail, Facebook, etc.
     • Compute-intensive:
        • Dominated by loops and hot functions
        • Online image editing such as Adobe’s Photoshop.com and Google’s Picnik
     • A lot more potential:
        • Online games
        • Video editing
        • Sound editing and voice recognition

  5. Improving JavaScript Performance
     • Many efforts are underway by browser developers to improve sequential JavaScript performance
        • These hit a performance wall as multi-cores become dominant
     • Parallelism must be exploited to make use of multi-core clients
     • JavaScript is inherently sequential
        • Language and run-time system provide little or no concurrency support
     Our proposal: low-cost dynamic & speculative parallelization of JavaScript applications

  6. JavaScript Parallelization
     • A typical static parallelization flow
     • Dynamic parallelization
     [Figure: parallelization flow split into compile-time and runtime phases. Labels: source code, memory profiling, data flow analysis, memory dependence analysis, parallel code generation, speculation engine (software transactional memory), parallel execution. Several stages carry a "$" marker in the original slide.]

  7. Our Approach: ParaScript
     • Light-weight dynamic analysis & code generation for speculative DOALL loops
     • Low-cost customized SW speculation with a single checkpoint
     [Figure: ParaScript runtime flow. Labels: sequential execution, hot loop detection, initial parallelizability assessment, loop selection, parallel code generation, customized speculation, parallel execution, finish, abort.]

  8. Dependence Analysis
     • Data flow analysis at JIT compilation time
     • Runtime initial tests + range-based monitoring
     • Runtime reference-counting-based monitoring

  9. Scalar Array Conflict Detection
     • Initial assessment catches trivial conflicts
     • Keep track of max and min accessed element indices
     • Cross-check RD/WR sets after thread execution

     Thread 1: A[0] = …; A[5] = …; B[7] = …
     Thread 2: A[6] = A[5]+1

     Set                  ptr  min  max
     Thread 1 write-set   &A   0    5
                          &B   7    7
     Thread 2 read-set    &A   5    5
     Thread 2 write-set   &A   6    6
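The range-based cross-check on this slide can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not ParaScript's implementation: `AccessRanges`, `record`, `overlaps`, and `conflicts` are hypothetical names, and an array's identity is modeled with a string key rather than a pointer such as `&A`.

```javascript
// Hypothetical sketch: one min/max index range per array, per thread.
class AccessRanges {
  constructor() {
    this.ranges = new Map(); // array key -> { min, max }
  }
  // Widen the recorded range for `key` so that it includes `index`.
  record(key, index) {
    const r = this.ranges.get(key);
    if (r === undefined) {
      this.ranges.set(key, { min: index, max: index });
    } else {
      r.min = Math.min(r.min, index);
      r.max = Math.max(r.max, index);
    }
  }
}

// Two closed ranges overlap when neither lies entirely before the other.
function overlaps(a, b) {
  return a.min <= b.max && b.min <= a.max;
}

// True if some array has overlapping ranges in both sets.
function conflicts(setA, setB) {
  for (const [key, a] of setA.ranges) {
    const b = setB.ranges.get(key);
    if (b !== undefined && overlaps(a, b)) return true;
  }
  return false;
}

// Thread 1 executes: A[0] = ...; A[5] = ...; B[7] = ...
const t1Writes = new AccessRanges();
t1Writes.record("A", 0);
t1Writes.record("A", 5);
t1Writes.record("B", 7);

// Thread 2 executes: A[6] = A[5] + 1
const t2Reads = new AccessRanges();
t2Reads.record("A", 5);
const t2Writes = new AccessRanges();
t2Writes.record("A", 6);

// Cross-check the sets after both threads finish.
const readAfterWrite = conflicts(t1Writes, t2Reads);   // A[0..5] vs A[5..5]
const writeAfterWrite = conflicts(t1Writes, t2Writes); // A[0..5] vs A[6..6]
console.log({ readAfterWrite, writeAfterWrite });
```

Because only the bounds are kept, the check is conservative: a thread that writes A[0] and A[5] is treated as touching everything in between, so a range overlap can be reported even when no individual element is shared.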

  10. Object Array Conflict Detection
     • More involved than scalar arrays
     • Different indices of the same array may point to the same object
     [Figure: arrays A and B referencing objects myObj0 and myObj1, whose headers hold ptr and RefCnt fields. Caption: "If dependent based on data-flow analysis".]
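The aliasing problem named on this slide (two indices of one array pointing at the same object) can be illustrated with a small JavaScript sketch. The slide's diagram does not survive well enough to reconstruct the exact header layout, so this is an assumption-laden stand-in: it counts slot references in a `Map` instead of a per-object header field, and `slotRefCounts` and `hasAliasedSlots` are hypothetical names.

```javascript
// Hypothetical sketch: count how many slots of `arr` point at each object.
function slotRefCounts(arr) {
  const counts = new Map(); // object -> number of slots referencing it
  for (const item of arr) {
    if (item !== null && typeof item === "object") {
      counts.set(item, (counts.get(item) || 0) + 1);
    }
  }
  return counts;
}

// An array has aliased slots when one object is reachable from two indices.
function hasAliasedSlots(arr) {
  for (const count of slotRefCounts(arr).values()) {
    if (count > 1) return true;
  }
  return false;
}

const myObj0 = { value: 0 };
const myObj1 = { value: 1 };
const A = [myObj0, myObj1, myObj0]; // indices 0 and 2 share myObj0
const B = [myObj1];

console.log(hasAliasedSlots(A)); // iterations 0 and 2 could touch the same object
console.log(hasAliasedSlots(B));
```

In this sketch a count above one for any object means that loop iterations over different indices are not guaranteed to be independent, which is why object arrays need more than the min/max ranges used for scalar arrays.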

  11. Loop Selection
     • Focus on counted DOALL loops (e.g. for loops)
     • Avoid parallelizing loops with:
        • Browser interactions (requires locks on browser internals)
        • HTTP request functions (requires server-side speculation)
        • Runtime code insertion

     Runtime code insertion examples, each paired with its static equivalent:

        var addFunction = new Function("a", "b", "return a+b;");
        function addFunction(a, b) { return a+b; }

        eval("a = 7; b = 13; document.write(a+b);");
        a = 7; b = 13; document.write(a+b);

  12. Checkpointing Mechanism
     • Go through all references, clone them, and ask the GC not to touch the clones
     • Monitor overhead, back out if more expensive than a threshold
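A single-checkpoint scheme of this kind can be sketched at the JavaScript level as follows. This is a minimal illustration for plain arrays and objects only, not the engine-level mechanism the slide describes: `deepClone` and `runWithCheckpoint` are hypothetical names, holding a reference to the clone stands in for asking the GC to leave it alone, and the overhead monitoring is not modeled.

```javascript
// Hypothetical sketch: deep-copy plain arrays and objects.
function deepClone(value) {
  if (Array.isArray(value)) return value.map(deepClone);
  if (value !== null && typeof value === "object") {
    const copy = {};
    for (const key of Object.keys(value)) copy[key] = deepClone(value[key]);
    return copy;
  }
  return value; // primitives are copied by value
}

// Run `speculativeBody` on `state`; fall back to the checkpoint if it aborts.
function runWithCheckpoint(state, speculativeBody) {
  const checkpoint = deepClone(state); // taken once, before speculation starts
  try {
    speculativeBody(state);
    return { committed: true, state };
  } catch (err) {
    return { committed: false, state: checkpoint };
  }
}

const ok = runWithCheckpoint({ pixels: [1, 2, 3] }, (s) => {
  s.pixels = s.pixels.map((p) => p * 2);
});

const aborted = runWithCheckpoint({ pixels: [1, 2, 3] }, (s) => {
  s.pixels[0] = 99;                     // speculative write
  throw new Error("conflict detected"); // stand-in for a failed conflict check
});

console.log(ok.committed, ok.state.pixels);
console.log(aborted.committed, aborted.state.pixels);
```

The committed run keeps its doubled pixels, while the aborted run returns the untouched copy taken before the speculative write.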

  13. Checkpointing Optimizations
     • Selective variable cloning
        • Only clone a variable if it is touched during speculative execution
     • Array clone elimination
        • Large arrays holding results of browser functions
        • Instead of cloning the array, just call the function again for recovery
        • E.g. getImageData in the canvas HTML5 element
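Selective variable cloning can be sketched as clone-on-first-touch. The code below is an assumption-based illustration, not the authors' implementation: `SelectiveCheckpoint`, `touch`, `setElement`, and `rollback` are hypothetical names, only writes count as a touch, and arrays are assumed to hold primitives so a shallow copy suffices.

```javascript
// Hypothetical sketch: clone a variable only when speculation first writes to it.
class SelectiveCheckpoint {
  constructor(vars) {
    this.vars = vars;       // live variables, keyed by name
    this.saved = new Map(); // name -> copy taken at first touch
  }
  // Save a copy of `name` the first time it is touched; later touches are free.
  touch(name) {
    if (!this.saved.has(name)) {
      const current = this.vars[name];
      this.saved.set(name, Array.isArray(current) ? current.slice() : current);
    }
  }
  setElement(name, index, value) {
    this.touch(name);
    this.vars[name][index] = value;
  }
  // Restore every touched variable to its saved copy.
  rollback() {
    for (const [name, old] of this.saved) this.vars[name] = old;
    this.saved.clear();
  }
}

const vars = { src: [10, 20, 30], dst: [0, 0, 0] };
const cp = new SelectiveCheckpoint(vars);
for (let i = 0; i < vars.src.length; i++) {
  cp.setElement("dst", i, vars.src[i] + 1); // only `dst` is written
}
const clonedNames = [...cp.saved.keys()]; // `src` was read but never cloned
const speculativeDst = vars.dst.slice();
cp.rollback();
console.log(clonedNames, speculativeDst, vars.dst);
```

Only `dst` is copied here, so a large read-only input such as `src` adds nothing to the checkpoint cost.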

  14. Parallel Code Generation
     - Take checkpoint
     - Spawn threads
     Parallel loop:
        IE = min(IS+CS*SS, n);
        for (i = IS; i < IE; i += SS)
           // original loop code
        conflictcheck();
        chunkbarrier();
        IS += CS * TC * SS;
     Loop barrier
     Reduction variable & conditional live-out aggregation
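The chunk arithmetic on this slide can be traced with a small sequential simulation. The slide does not define its abbreviations, so the meanings used here are assumptions: IS and IE as a chunk's start and end, CS as the chunk size in iterations, SS as the loop step, TC as the thread count, and n as the loop bound. `chunkSchedule` is a hypothetical helper, and the per-thread starting offset `t * CS * SS` is also assumed.

```javascript
// Hypothetical sketch: list the iterations each thread would execute.
function chunkSchedule(n, CS, SS, TC) {
  const perThread = [];
  for (let t = 0; t < TC; t++) {
    const iterations = [];
    let IS = t * CS * SS; // assumed initial offset for thread t
    while (IS < n) {
      const IE = Math.min(IS + CS * SS, n);
      for (let i = IS; i < IE; i += SS) {
        iterations.push(i); // the original loop body would run here
      }
      // conflictcheck() and chunkbarrier() would run here in the real system
      IS += CS * TC * SS; // skip the chunks owned by the other threads
    }
    perThread.push(iterations);
  }
  return perThread;
}

const schedule = chunkSchedule(10, 2, 1, 2);
console.log(schedule); // [[0, 1, 4, 5, 8, 9], [2, 3, 6, 7]]
```

With these values the two threads interleave chunks of two iterations, and together they cover iterations 0 through 9 exactly once.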

  15. Experimental Setup
     • Implemented in Firefox 3.7a1pre
     • Subset of SunSpider benchmark suite
        • Others identified as not parallelizable early on, causing 2% slow-down due to the initial analysis
     • A set of Pixastic Image Processing filters
     • 8-processor system: 2 Intel Xeon Quad-cores, running Ubuntu 9.10
     • Ran each benchmark 10 times and took the average

  16. Parallelism Coverage
     [Chart annotations:]
     • High fraction of sequential execution in the getImageData() browser function
     • DOALL loop that extracts pixel RGB & alpha values

  17. SunSpider
     [Chart annotation:]
     • A long iteration dominates execution

  18. Pixastic Image Processing
     [Chart: results for 1, 2, 4, and 8 threads]
     • High memory-op-to-computation ratio

  19. Conclusion
     • Web applications’ dominance is pushing JavaScript to the forefront of computing
     • Dynamic environment and performance constraints make parallelization challenging
     • We introduce efficient solutions for exploiting parallelism
     • 17% speculation overhead across all benchmarks
     • ParaScript achieved an average of 2.55x and 1.82x speedup on 8 processors for SunSpider and Pixastic, respectively

  20. Thank You! Questions?
