Scaling Charts with Design and GPUs

Superconductor: Scaling Charts with Design and GPUs. Leo Meyerovich (@LMeyerov), CEO of Graphistry.com | UC Berkeley.

Presentation Transcript


  1. Superconductor: Scaling Charts with Design and GPUs. Leo Meyerovich (@LMeyerov), CEO of Graphistry.com | UC Berkeley

  2. Visibility

  3. Visibility through design+speed

  4. Histogram of Voter Turnout by Town. Y-axis: # towns; x-axis: voter turnout (0% to 100%). Annotations: "Most towns had ~40% of people vote" and "ballot box stuffing?"
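
The histogram on this slide can be sketched as a simple binning pass. This is only an illustration: the function name `turnoutHistogram`, the bin count, and the sample data are assumptions, not taken from the talk.

```javascript
// Hypothetical helper: bin turnout fractions (0..1) into equal-width buckets.
function turnoutHistogram(turnouts, binCount) {
  const counts = new Array(binCount).fill(0);
  for (const t of turnouts) {
    // Clamp so a turnout of exactly 1.0 lands in the last bin.
    const bin = Math.min(binCount - 1, Math.floor(t * binCount));
    counts[bin] += 1;
  }
  return counts;
}

// Made-up sample: most towns near 40% turnout, two near 100%.
const sampleTurnouts = [0.38, 0.41, 0.40, 0.43, 0.37, 0.99, 1.0];
const histogram = turnoutHistogram(sampleTurnouts, 4);
```

A second bump near 100% in the counts is the kind of outlier the slide flags with "ballot box stuffing?".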

  5. Each tiny square shows a town's size (area) and vote (color). Legend: Incumbent, Opposition

  6. Filter for towns w/ high turnout

  7. Tag suspicious with black
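
The filter-and-tag steps on the two slides above might look like the following sketch. The threshold value, the field names (`turnout`, `color`), and the sample towns are all hypothetical; the talk does not give them.

```javascript
// Hypothetical cutoff for "high turnout"; the talk does not state one.
const TURNOUT_THRESHOLD = 0.9;

// Return a copy of each town, recolored black when turnout is suspicious.
function tagSuspicious(towns, threshold) {
  return towns.map((town) => ({
    ...town,
    color: town.turnout >= threshold ? "black" : town.color,
  }));
}

// Made-up sample data.
const sampleTowns = [
  { name: "A", turnout: 0.41, color: "blue" },
  { name: "B", turnout: 0.98, color: "red" },
];
const taggedTowns = tagSuspicious(sampleTowns, TURNOUT_THRESHOLD);
```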

  8. For visibility, speed design

  9. Problem: Plot 10+ Time Series Signals

  10. Design → 200 Time Series Signals (time axis: 0 s to 100 s)

  11. Speed → Pan/Zoom Interactions (time axis: 37 s to 38 s)

  12. CPU Bottlenecks: naïve and offline. Real-time is 30ms. Stages shown on a 0ms to 1600ms timeline: Render, Layout, Transform, Parse

  13. Optimize: Binary Data, Multicore Layout, GPU Render. Real-time interaction; stream from server at 12MB+/s. Stages shown on a 0ms to 1600ms timeline: Render, Layout, Prep
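
One way to read "binary data" here is that the server sends typed numeric buffers that the browser can view in place, rather than text that must be parsed. The sketch below assumes a made-up wire format (a little-endian uint32 count followed by float32 samples) and a little-endian host; the talk does not specify the actual format.

```javascript
// Hypothetical wire format: uint32 sample count, then that many float32 values,
// all little-endian. Not the format used in the talk.
function encodeSeries(samples) {
  const buffer = new ArrayBuffer(4 + 4 * samples.length);
  const view = new DataView(buffer);
  view.setUint32(0, samples.length, true);
  samples.forEach((value, i) => view.setFloat32(4 + 4 * i, value, true));
  return buffer;
}

function decodeSeries(buffer) {
  const count = new DataView(buffer).getUint32(0, true);
  // View the payload in place: no per-value text parsing and no copy.
  // Assumes a little-endian host, since typed arrays use platform byte order.
  return new Float32Array(buffer, 4, count);
}

const decodedSeries = decodeSeries(encodeSeries([1.5, -2, 0.25]));
```

A `Float32Array` view like this can also be handed to a GPU upload call without further conversion, which fits the "Prep" stage shrinking on the slide's timeline.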

  14. Graphs: Placing Nodes and Edges

  15. Direct Feedback on Settings

  16. Uber: Trip Start to End

  17. Direct Edge Placement: Overplotting

  18. Speed → Design → Edge Bundling

  19.  web

  20. Bare Metal in the Browser. Sequential: 5X; SIMD: 4 lanes; Multicore: 4+ cores; GPU: 1024 lanes

  21. Superconductor: Parallel JS Viz Engine. Diagram labels: webpage (HTML data, CSS styling, JS script) and data viz (data, styling, widgets); pipeline stages Parser, Selectors, Layout, Renderer, Pixels; SUPERCONDUCTOR.js components Parser.js, Selectors.CL, Layout.CL, Renderer.GL; Compiler; BROWSER (JavaScript VM, GPU)

  22. Layout as Parallel Tree Traversals. Parallelism in each traversal! 1. Works for all data sets. 2. Compiler: CSS → Schedule. Diagram labels: logical spawns, logical joins, leaf; w,h computed per node, then x,y
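
As a rough illustration of layout as tree traversals, the sketch below runs a bottom-up pass for sizes (w,h) and a top-down pass for positions (x,y). The vertical-stacking rule and all names are assumptions for the example; in the talk the traversal schedule is generated by a compiler from CSS, which is not shown here. The code is sequential, with comments marking where the independent work lies.

```javascript
// Bottom-up pass: a parent's size depends only on its children's sizes.
function computeSizes(node) {
  if (node.children.length === 0) return; // leaves carry their own w,h
  node.children.forEach(computeSizes);    // sibling subtrees are independent
  node.w = Math.max(...node.children.map((c) => c.w));
  node.h = node.children.reduce((sum, c) => sum + c.h, 0);
}

// Top-down pass: children are stacked vertically inside their parent.
function computePositions(node, x, y) {
  node.x = x;
  node.y = y;
  let offset = y;
  for (const child of node.children) {
    computePositions(child, x, offset);
    offset += child.h;
  }
}

// Made-up tree: a leaf plus a subtree of two leaves.
const leaf = (w, h) => ({ w, h, children: [] });
const layoutTree = { children: [leaf(10, 2), { children: [leaf(4, 3), leaf(6, 1)] }] };
computeSizes(layoutTree);
computePositions(layoutTree, 0, 0);
```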

  23. GPU Traversals: Flat & Level-Synchronous. Nodes in arrays (flat): one array per attribute (x, y, w, h). Level-synchronous: a parallel for loop over each level of the tree, level 1 to level n. Compiler handles transform of code & data

  24. More Scalable Designs: Immens (Stanford), Nanocubes (AT&T), MapD (MIT), Abstract Rendering (Continuum), Synerscope

  25. Achieve data visibility through hardware-accelerated designs (and deploy on the web)
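
A minimal sketch of the flat, level-synchronous idea, written as plain JavaScript rather than GPU code. It assumes the tree is stored in breadth-first order with one typed array per attribute; the array names and the sample tree are hypothetical. The inner loop over a level is the part that would become a parallel for loop on the GPU.

```javascript
// Tree in breadth-first order, one typed array per attribute.
// Node 0 is the root; nodes 1 and 2 are its children; nodes 3 and 4 are children of node 2.
const firstChild = Int32Array.from([1, -1, 3, -1, -1]);
const childCount = Int32Array.from([2, 0, 2, 0, 0]);
const heights = Float32Array.from([0, 2, 0, 3, 1]); // leaves hold their own height
const levelStart = [0, 1, 3, 5]; // level k spans [levelStart[k], levelStart[k + 1])

function sumHeightsBottomUp() {
  // Visit levels from deepest to shallowest, finishing one level before the next.
  for (let level = levelStart.length - 2; level >= 0; level--) {
    // Iterations of this loop touch disjoint nodes, so they could run in parallel.
    for (let i = levelStart[level]; i < levelStart[level + 1]; i++) {
      if (childCount[i] === 0) continue;
      let total = 0;
      for (let c = firstChild[i]; c < firstChild[i] + childCount[i]; c++) {
        total += heights[c];
      }
      heights[i] = total;
    }
  }
}
sumHeightsBottomUp();
```

Because each node only reads its children, which sit in an already-finished deeper level, no locking is needed within a level.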

  26. Graphistry: Visualize Magnitudes More Data in the Browser. Leo Meyerovich (@LMeyerov), CEO of Graphistry.com | UC Berkeley

  27. Layout as Parallel Tree Traversals. Parallelism in each traversal! 1. Works for all data sets. 2. Compiler: CSS → Schedule. Diagram labels: logical spawns, logical joins, leaf; w,h computed per node, then x,y

  28. GPU Traversals: Flat & Level-Synchronous. Nodes in arrays (flat): one array per attribute (x, y, w, h). Level-synchronous: a parallel for loop over each level of the tree, level 1 to level n. Compiler handles transform of code & data

  29. Today’s Supercomputer-in-a-Pocket. Phone: 16-lane CPU (cores 1 to 4, 4-way SIMD, L1d: 32KB) and 1024-lane GPU (GPGPU cores 1 to 4, 256-way SIMT); Prefetch Engine; L2: 1MB; RAM: 2GB. Challenge: Parallelize Data Visualization

  30. Problem: Dynamic Memory Allocation on GPU? function circ(x, y, r) { const buffer = new Array(r * 10); for (let i = 0; i < r * 10; i++) buffer[i] = Math.cos(i); } Diagram: calls such as circ(…), oval(…), rect(…), line(…), and square(…) each perform a dynamic allocation of a buffer of values (1.0, 0.8, 0.5, 0.2, 0, 0.2)

  31. Dynamic Allocation as SIMD Traversals. 1. Prefix sum for needed space. 2. Allocate buffers. 3. Distribute offsets & compute. 4. Give OpenGL buffer pointer. Diagram: allocCirc(…) requests 4, allocRect(…) 7, allocRect(…) 6, allocLine(…) 6; then fillCirc(…), fillRect(…), fillRect(…), fillLine(…) write values into the buffers
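
The first three steps on this slide can be sketched in plain JavaScript as follows. The request sizes (4, 7, 6, 6) come from the slide's diagram; the function and variable names and the fill values are assumptions, and step 4 (handing the buffer to OpenGL) is left out. The prefix sum here is sequential; on a GPU it would be a parallel scan.

```javascript
// Step 1: an exclusive prefix sum turns per-shape sizes into buffer offsets.
function exclusivePrefixSum(sizes) {
  const offsets = new Uint32Array(sizes.length);
  let running = 0;
  for (let i = 0; i < sizes.length; i++) {
    offsets[i] = running;
    running += sizes[i];
  }
  return { offsets, total: running };
}

// Space requested by allocCirc, allocRect, allocRect, allocLine on the slide.
const requestedSizes = [4, 7, 6, 6];
const { offsets: shapeOffsets, total: totalSize } = exclusivePrefixSum(requestedSizes);

// Step 2: a single allocation replaces the per-shape dynamic allocations.
const vertexBuffer = new Float32Array(totalSize);

// Step 3: each shape fills only its own slice, so the fills are independent.
// The fill value (shape index + 1) is a placeholder for real vertex data.
requestedSizes.forEach((size, shape) => {
  for (let i = 0; i < size; i++) {
    vertexBuffer[shapeOffsets[shape] + i] = shape + 1;
  }
});
```

Splitting each drawing call into an "alloc" pass that reports a size and a "fill" pass that writes at a known offset is what removes the need for `new Array(...)` inside the parallel code.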

  32. CPU vs. GPU for Election Treemap: 5 traversals over 100K nodes. WebCL: 70X; WebCL: 30X; COMBINED: 54X!
