Explore web data visualization with Superconductor: designing for speed, and scaling charts with GPUs for better visibility and performance. Learn about parallel JavaScript visualization engines and how to achieve data visibility through hardware-accelerated designs deployed on the web.
Superconductor: Scaling Charts with Design and GPUs. Leo Meyerovich (@LMeyerov), CEO of Graphistry.com | UC Berkeley
Visibility through design+speed
Histogram of Voter Turnout by Town. [Chart: x-axis is voter turnout, 0% to 100%; y-axis is number of towns.] Most towns had ~40% of people vote. Ballot box stuffing?
Each tiny square shows a town's size (area) and vote (color: incumbent vs. opposition)
Filter for towns w/ high turnout
For visibility, speed design
Design: 200 Time Series Signals. [Chart: time axis from 0 s to 100 s.]
Speed: Pan/Zoom Interactions. [Chart: zoomed time axis from 37 s to 38 s.]
CPU Bottlenecks: naïve and offline. Real-time is 30ms. [Timeline from 0ms to 1600ms with stages: Parse, Transform, Layout, Render.]
Optimize: Binary Data, Multicore Layout, GPU Render. • Real-time interaction • Stream from server at 12MB+/s. [Timeline from 0ms to 1600ms with stages: Prep, Layout, Render.]
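The binary-data point can be sketched as follows. This is a minimal illustration, not Superconductor's actual wire format: the payload layout (a little-endian uint32 count followed by that many float32 values) and the names encodeSeries and decodeSeries are assumptions made for the example. The idea is that typed arrays let the browser read numbers directly, with no text parsing step.

```javascript
// Hypothetical wire format (an assumption for this sketch):
// [uint32 count][float32 * count], all little-endian.
function encodeSeries(values) {
  const buf = new ArrayBuffer(4 + 4 * values.length);
  const view = new DataView(buf);
  view.setUint32(0, values.length, true);
  values.forEach((v, i) => view.setFloat32(4 + 4 * i, v, true));
  return buf;
}

// Decode straight into a Float32Array; no string parsing is involved.
function decodeSeries(buf) {
  const view = new DataView(buf);
  const count = view.getUint32(0, true);
  const out = new Float32Array(count);
  for (let i = 0; i < count; i++) {
    out[i] = view.getFloat32(4 + 4 * i, true);
  }
  return out;
}

const roundTrip = decodeSeries(encodeSeries([0.5, 1.5, -2]));
```

A Float32Array like the one returned here is also the kind of value that GPU upload calls accept, which is why a binary format pairs well with GPU rendering.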
Bare Metal in the Browser. Sequential: 5X. SIMD: 4 lanes. Multicore: 4+ cores. GPU: 1024 lanes.
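Superconductor: Parallel JS Viz Engine. [Architecture diagram: a BROWSER turns a webpage (HTML data, CSS styling, JS script) into pixels through Parser, Selectors, Layout, and Renderer. SUPERCONDUCTOR.js turns a data viz (data, styling, widgets) into pixels through Parser.js, Selectors.CL, Layout.CL, and Renderer.GL, which are generated by a Compiler and run on the JavaScript VM and the GPU.]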
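Layout as Parallel Tree Traversals. [Diagram: a tree in which each leaf computes w,h and inner nodes compute x,y; logical spawns fan out to the children and logical joins collect their results …] Parallelism in each traversal! 1. Works for all data sets. 2. Compiler: CSS to schedule.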
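The two kinds of traversal on this slide can be sketched as below. This is a sequential illustration under an assumed layout rule (a container stacks its children vertically); the rule and the function names are hypothetical, and the real engine derives its schedule from the compiler. A bottom-up pass computes w,h and a top-down pass computes x,y.

```javascript
// Assumed layout rule for this sketch: a container stacks its children vertically.

// Pass 1 (bottom-up): each inner node's w,h comes from its children's w,h.
function computeSizes(node) {
  if (node.children.length === 0) return; // leaf: w,h come from the data
  // Sibling subtrees do not touch each other, so these calls could run in
  // parallel (logical spawn); the reductions below act as the logical join.
  node.children.forEach(computeSizes);
  node.w = Math.max(...node.children.map((c) => c.w));
  node.h = node.children.reduce((sum, c) => sum + c.h, 0);
}

// Pass 2 (top-down): each node's x,y comes from its parent and earlier siblings.
function computePositions(node, x = 0, y = 0) {
  node.x = x;
  node.y = y;
  let offset = y; // running sum of sibling heights
  for (const child of node.children) {
    computePositions(child, x, offset);
    offset += child.h;
  }
}

const leaf = (w, h) => ({ w, h, children: [] });
const inner = { children: [leaf(4, 2), leaf(6, 3)] };
const tree = { children: [leaf(10, 5), inner] };
computeSizes(tree);
computePositions(tree);
```

Each pass only reads attributes that an earlier pass (or the other direction of the tree) has already produced, which is what makes the work inside one traversal independent across subtrees.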
GPU Traversals: Flat & Level-Synchronous. Flat: nodes are stored in arrays, one array per attribute (x, y, w, h). Level-synchronous: one parallel for loop per tree level, from level 1 to level n. The compiler handles the transform of code & data.
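A minimal sketch of the flat, level-synchronous idea, written as plain sequential JavaScript rather than GPU code. The function names, the breadth-first ordering, and the stacking rule (width is the max of the children, height is their sum) are assumptions for illustration. The tree is flattened into one typed array per attribute, and the bottom-up pass visits one level at a time.

```javascript
// Flatten a pointer-based tree into breadth-first order,
// with one typed array per attribute.
function flatten(root) {
  const nodes = [root];
  const levelStart = [0]; // index at which each level begins
  for (let begin = 0; begin < nodes.length; ) {
    const end = nodes.length;
    for (let i = begin; i < end; i++) nodes.push(...nodes[i].children);
    begin = end;
    if (nodes.length > end) levelStart.push(end);
  }
  levelStart.push(nodes.length); // sentinel: end of the last level

  const n = nodes.length;
  const flat = {
    levelStart,
    firstChild: new Int32Array(n),
    childCount: new Int32Array(n),
    w: new Float32Array(n),
    h: new Float32Array(n),
  };
  let next = 1; // in breadth-first order each node's children are contiguous
  nodes.forEach((node, i) => {
    flat.firstChild[i] = next;
    flat.childCount[i] = node.children.length;
    next += node.children.length;
    flat.w[i] = node.w || 0;
    flat.h[i] = node.h || 0;
  });
  return flat;
}

// Bottom-up pass, one level at a time (deepest level first).
function sizesLevelSynchronous(flat) {
  const { levelStart, firstChild, childCount, w, h } = flat;
  for (let level = levelStart.length - 2; level >= 0; level--) {
    // Every iteration writes only w[i] and h[i], so on a GPU this inner
    // loop would be the parallel for loop over one level.
    for (let i = levelStart[level]; i < levelStart[level + 1]; i++) {
      if (childCount[i] === 0) continue; // leaf keeps its own size
      let maxW = 0;
      let sumH = 0;
      for (let c = firstChild[i]; c < firstChild[i] + childCount[i]; c++) {
        maxW = Math.max(maxW, w[c]);
        sumH += h[c];
      }
      w[i] = maxW;
      h[i] = sumH;
    }
  }
}

const mkLeaf = (w, h) => ({ w, h, children: [] });
const flatTree = flatten({
  children: [mkLeaf(10, 5), { children: [mkLeaf(4, 2), mkLeaf(6, 3)] }],
});
sizesLevelSynchronous(flatTree);
```

Gathering from children (instead of having children write into their parent) keeps each loop iteration free of write conflicts, which is the property a parallel for loop needs.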
More Scalable Designs: imMens (Stanford), Nanocubes (AT&T), MapD (MIT), Abstract Rendering (Continuum), Synerscope
Achieve data visibility through hardware-accelerated designs (and deploy on the web)
Graphistry: Visualize Magnitudes More Data in the Browser. Leo Meyerovich (@LMeyerov), CEO of Graphistry.com | UC Berkeley
Today’s Supercomputer-in-a-Pocket. Phone: 16-lane CPU (cores 1 to 4, each 4-way SIMD with L1d: 32KB) and 1024-lane GPU (GPGPU cores 1 to 4, each 256-way SIMT), plus a prefetch engine, L2: 1MB, RAM: 2GB. Challenge: Parallelize Data Visualization
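Problem: Dynamic Memory Allocation on GPU?

    function circ(x, y, r) {
      // The buffer size depends on r, so it is only known at run time.
      const buffer = new Array(r * 10);
      for (let i = 0; i < r * 10; i++) buffer[i] = Math.cos(i);
    }

[Diagram: render calls on the tree nodes, circ(…) oval(…) rect(…); … line(…); … square(…) rect(…); …, each performing a dynamic allocation of vertex data such as 1.0 0.8 0.5 0.2 0 0.2.]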
Dynamic Allocation as SIMD Traversals. 1. Prefix sum for needed space. 2. Allocate buffers. 3. Distribute offsets & compute. 4. Give OpenGL buffer pointer. [Diagram: each render call is split into an alloc call that reports the space it needs and a fill call that writes into the shared buffer: allocCirc(…) 4, fillCirc(…); allocRect(…) 7, fillRect(…); allocRect(…) 6, fillRect(…); allocLine(…) 6, fillLine(…).]
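The four steps can be sketched as below, using the sizes 4, 7, 6, 6 from the diagram. This is a sequential illustration, not the engine's code: the names exclusiveScan and renderAll are hypothetical, the fill rule is borrowed from the circ example as a stand-in, and on a GPU the scan itself would be a parallel prefix sum.

```javascript
// Step 1: an exclusive prefix sum over per-node sizes gives each node's
// starting offset and the total space needed.
function exclusiveScan(sizes) {
  const offsets = new Uint32Array(sizes.length);
  let total = 0;
  for (let i = 0; i < sizes.length; i++) {
    offsets[i] = total;
    total += sizes[i];
  }
  return { offsets, total };
}

function renderAll(sizes) {
  const { offsets, total } = exclusiveScan(sizes); // 1. prefix sum
  const buffer = new Float32Array(total);          // 2. allocate once
  // 3. Each node fills its own disjoint slice, so the outer loop has no
  //    write conflicts and could run as a parallel for.
  for (let i = 0; i < sizes.length; i++) {
    for (let j = 0; j < sizes[i]; j++) {
      buffer[offsets[i] + j] = Math.cos(j); // stand-in fill rule (assumption)
    }
  }
  // 4. The single buffer is what would be handed to the graphics API.
  return { offsets, buffer };
}

const { offsets, buffer } = renderAll([4, 7, 6, 6]);
```

With these sizes the offsets are 0, 4, 11, 17 and the buffer holds 23 values, so no per-call allocation is needed during the fill step.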
CPU vs. GPU for Election Treemap: 5 traversals over 100K nodes. WebCL: 70X. WebCL: 30X. COMBINED: 54X!