Hank's Activities
Longhorn/XD AHM, Austin, TX, December 20, 2010
[Title-slide images: volume rendering of 4608^3 combustion data set (image credit: Mark Howison); volume rendering of flame data set using VisIt + IceT on Longhorn (image credit: Tom Fogal)]
My perception of my role in Longhorn/XD
• Help users succeed via:
  • Direct support
  • Ensuring necessary algorithms/functionality are in place
• Research the most effective way to utilize Longhorn
  • Also help test the machine through aggressive usage
• Collaborate with / facilitate for other project members
• Provide visibility for the center externally (outreach, etc.)
Outline
• Researching how to best use Longhorn
  • HW-accelerated volume rendering on Longhorn
  • SW ray-casting on Longhorn
• Collaborations
  • Manta/VisIt
  • VDF/VisIt
• User support
  • Analysis of 4K^3 turbulent data
  • Connected components algorithms
  • Other user support
• Outreach
HW-accelerated volume rendering on Longhorn
• "Large Data Visualization on Distributed Memory Multi-GPU Clusters", HPG 2010
  • Authors: Fogal, Childs, Shankar, Krueger, Bergeron, and Hatcher
• Ran VisIt + IceT on Longhorn, varying data size and number of GPUs
• Staged data on the CPU and transferred it to the GPU (high transfer time, but can look at bigger data sets)
[Image: volume rendering of flame data set using VisIt + IceT on Longhorn. Image credit: Tom Fogal]
HW-accelerated volume rendering on Longhorn
• Observation about CPU volume rendering [diagram on slide; labels: lots of GPUs, fast-ish on small data, big data]
• Idea: GPU volume rendering has the computational horsepower to do ray evaluation quickly, but will have many fewer MPI participants.
• Paper purpose: study the performance characteristics of GPU volume rendering at high concurrency on big data.
Software ray-casting
• Previous work (not XD-related):
  • "MPI-Hybrid Parallelism for Volume Rendering on Large Multi-Core Systems", EGPGV 2010
  • Authors: Howison, Bethel, and Childs
  • Strong scaling study up to 216,000 cores on the ORNL Jaguar machine, looking at 4608^3 data.
  • Study outcome: hybrid parallelism benefits this algorithm, primarily during the compositing phase, since there are fewer participants in MPI communication.
  • One of two EGPGV best paper winners; invited to submit a follow-on article to TVCG.
[Image: volume rendering of combustion data set. Image credit: Mark Howison]
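As a rough illustration of the hybrid idea (not the implementation from the EGPGV paper), the sketch below assumes one MPI rank per node with OpenMP threads sharing the ray work. The image resolution, the dummy ray loop, and the use of MPI_Reduce in place of a real order-dependent compositor are all simplifications for the example.

```cpp
// Minimal MPI+OpenMP hybrid ray-casting sketch (illustrative only).
#include <mpi.h>
#include <omp.h>
#include <vector>
#include <cstdio>

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const int W = 512, H = 512;               // image resolution
    std::vector<float> local(W * H, 0.0f);    // partial image for this rank's brick

    // Each rank owns one data brick; OpenMP threads share the ray work,
    // so the number of MPI participants stays at one per node.
    #pragma omp parallel for schedule(dynamic)
    for (int p = 0; p < W * H; ++p)
    {
        // Placeholder for sampling along the ray through the local brick;
        // a real ray caster integrates color and opacity here.
        local[p] = 0.001f * (rank + 1);
    }

    // Compositing stage: far fewer participants than a pure-MPI run.
    // A real renderer uses an order-dependent "over" operator (e.g. via
    // direct-send or IceT); a simple sum stands in for it here.
    std::vector<float> image(W * H, 0.0f);
    MPI_Reduce(local.data(), image.data(), W * H, MPI_FLOAT,
               MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        std::printf("composited %dx%d image from %d ranks\n", W, H, nranks);

    MPI_Finalize();
    return 0;
}
```

The point of the structure is that compositing involves one MPI participant per node rather than one per core, which is where the paper reports most of the hybrid benefit.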
Software ray-casting
• TVCG article (unpublished research) adds:
  • Weak scaling study (up to 22K^3) on Jaguar
  • GPU scaling study on Longhorn
• GPU scaling study:
  • Went up to 448 GPUs
  • Purpose: similar to the Fogal work, but with a different spin: show that hybrid parallelism is beneficial. Instead of pthreads or OpenMP on the CPU, we are now using CUDA on the GPU.
[Plots: scaling results on GPU, 2308^3 data and 2308^3 to 4608^3 data]
Software ray-casting on Longhorn
• Takeaway: for this algorithm and this data size, Longhorn is as powerful as 46K processors of Jaguar.
• Two caveats:
  • We didn't optimize for CUDA, so the favorable comparison could have held up to an even higher concurrency level.
  • Jaguar at 46K processors has more memory and can look at far bigger data sets.
Manta/VisIt
• Carson Brownlee delivers integration of VisIt and Manta via vtkManta objects.
• Hank does some small work:
  • Updates the work from VisIt 2.0 to VisIt 2.2 and makes a branch for Hank and Carson to put fixes on.
  • Testing
• Carson and Hank create a list of issues and are in the process of tracking them down.
[Image: rendering of an isosurface by VisIt using Manta]
Visualizing and Analyzing Large-Scale Turbulent Flow
• Goal: detect, track, classify, and visualize features in large-scale turbulent flow.
• Analysis effort by Kelly Gaither (TACC), Hank Childs (LBNL), and Cyrus Harrison (LLNL).
• Stresses two algorithms that are difficult in a distributed-memory parallel setting:
  • Can we identify connected components?
  • Can we characterize their shape?
[Image: VisIt calculated connected components on 4K^3 turbulence data in parallel using TACC's Longhorn machine. Two million components were initially identified, and the map expression was then used to select only the components with total volume greater than 15. Data courtesy of P.K. Yeung and Diego Donzis.]
Identifying connected components in parallel is difficult
• Hard to do efficiently
• Tremendous bookkeeping problem
• Four-stage algorithm that finds local connectivity and then merges globally (a minimal sketch of the local-then-global idea follows below)
• Participating in a 2011 EGPGV submission describing this algorithm and its performance. Authors: Harrison, Childs, Gaither.
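As a rough, serial illustration of the local-then-global idea (not the actual four-stage VisIt algorithm), the sketch below labels a toy 2D mask independently in two halves, which stand in for two processors' sub-grids, and then merges labels across the shared boundary with union-find. The mask, grid size, and helper names are made up for the example.

```cpp
// Minimal serial sketch of "label locally, then merge globally" for
// connected-component labeling (illustrative only).
#include <cstdio>
#include <vector>
#include <numeric>

struct UnionFind {
    std::vector<int> parent;
    explicit UnionFind(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) { while (parent[x] != x) x = parent[x] = parent[parent[x]]; return x; }
    void unite(int a, int b) { parent[find(a)] = find(b); }
};

int main()
{
    const int W = 8, H = 4;
    // 1 = inside a feature, 0 = background (toy mask with two components).
    int mask[H][W] = {
        {1,1,0,0, 0,0,1,1},
        {0,1,0,0, 0,0,1,0},
        {0,1,1,0, 0,1,1,0},
        {0,0,0,0, 0,0,0,0},
    };

    UnionFind uf(W * H);
    auto idx = [&](int r, int c) { return r * W + c; };

    // Stage 1: "local" connectivity, done independently for the left and
    // right halves (stand-ins for two processors' sub-grids).
    for (int half = 0; half < 2; ++half)
        for (int r = 0; r < H; ++r)
            for (int c = half * W / 2; c < (half + 1) * W / 2; ++c) {
                if (!mask[r][c]) continue;
                if (c > half * W / 2 && mask[r][c - 1]) uf.unite(idx(r, c), idx(r, c - 1));
                if (r > 0 && mask[r - 1][c])            uf.unite(idx(r, c), idx(r - 1, c));
            }

    // Stage 2: "global" merge across the boundary between the two halves.
    for (int r = 0; r < H; ++r)
        if (mask[r][W / 2 - 1] && mask[r][W / 2])
            uf.unite(idx(r, W / 2 - 1), idx(r, W / 2));

    // Count distinct component roots over the feature cells.
    std::vector<int> roots;
    for (int r = 0; r < H; ++r)
        for (int c = 0; c < W; ++c)
            if (mask[r][c]) {
                int root = uf.find(idx(r, c));
                bool seen = false;
                for (int s : roots) if (s == root) { seen = true; break; }
                if (!seen) roots.push_back(root);
            }
    std::printf("found %zu connected components\n", roots.size());
    return 0;
}
```

The bookkeeping problem on the slide comes from doing the second stage at scale, where the boundary merges and component relabeling must be coordinated across thousands of processors.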
We used shape characterization to assist our feature tracking
• Shape characterization metric: chord length distribution
• Difficult to perform efficiently in a distributed-memory setting
• Line scan filter pipeline: 1) choose lines, 2) calculate intersections, 3) redistribute segments across processors (P0, P1, P2, P3), 4) analyze lines, 5) collect results at the line scan analysis sink
• It is our hope that chord length distributions, a characteristic function, can assist in tracking component behavior over time (a serial sketch of the chord-length computation follows below)
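Below is a hedged, serial sketch of how a chord length distribution can be tabulated for a binary feature mask: scan lines through the volume and histogram the lengths of contiguous in-feature runs. It uses axis-aligned lines through a toy spherical feature and skips the parallel line-selection and segment-redistribution steps from the pipeline above; all names and sizes are illustrative.

```cpp
// Serial chord-length-distribution sketch for a binary feature mask
// (illustrative only; axis-aligned lines, toy spherical feature).
#include <cstdio>
#include <vector>
#include <cmath>

int main()
{
    const int N = 64;                        // toy volume resolution
    std::vector<char> inside(N * N * N, 0);  // 1 = inside the feature

    // Toy feature: a sphere of radius N/4 centered in the volume.
    const double cx = N / 2.0, r = N / 4.0;
    for (int z = 0; z < N; ++z)
        for (int y = 0; y < N; ++y)
            for (int x = 0; x < N; ++x) {
                double d = std::sqrt((x - cx) * (x - cx) +
                                     (y - cx) * (y - cx) +
                                     (z - cx) * (z - cx));
                inside[(z * N + y) * N + x] = (d <= r) ? 1 : 0;
            }

    // Scan x-parallel lines through every (y, z); each maximal run of
    // "inside" cells along a line is one chord.
    std::vector<long> histogram(N + 1, 0);
    for (int z = 0; z < N; ++z)
        for (int y = 0; y < N; ++y) {
            int run = 0;
            for (int x = 0; x < N; ++x) {
                if (inside[(z * N + y) * N + x]) {
                    ++run;
                } else if (run > 0) {
                    ++histogram[run];
                    run = 0;
                }
            }
            if (run > 0) ++histogram[run];   // chord ending at the volume boundary
        }

    // Print the non-empty histogram bins (chord length in cells).
    for (int len = 1; len <= N; ++len)
        if (histogram[len] > 0)
            std::printf("chord length %2d cells: %ld chords\n", len, histogram[len]);
    return 0;
}
```

In the distributed setting, the hard part is that a single line crosses many processors' bricks, which is why the pipeline above redistributes line segments before analyzing them.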
My role in this effort
• Easily summarized: "use VisIt to get results to Kelly"
• Several iterations:
  • Started with just statistics of components
  • Looked at how variation in isovalue affected statistics
  • Added in chord length distributions as a characteristic function
  • Took still images of each component for visual inspection
  • (Recently) extracted each component as its own surface for combined inspection
VDF/VisIt
• John Clyne and Dan Lagreca add a VDF reader to VisIt.
• Hank performs some testing and debugging.
• Still lots to do:
  • Formal commit to the VisIt repo; also add in the new VisIt multi-res hooks.
  • Study how well large features are preserved across refinement levels.
  • Use the coarsest versions in conjunction with analysis code from Janine Bennett.
Other user support
• Small amount of effort helping Saju Varghese and Kentaro Nagmine of UNLV:
  • Fixed a VisIt bug with ray-casting + point meshes
  • Helped them format their data into the BOV format (an example header sketch follows below)
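For reference, a BOV ("brick of values") dataset is a raw binary array described by a small ASCII header that VisIt's BOV reader understands. The sketch below shows the general shape of such a header; the file name, variable name, and sizes are hypothetical, and the authoritative keyword list is in the VisIt "Getting Data Into VisIt" documentation.

```
TIME: 0.0
DATA_FILE: velocity_magnitude.dat
DATA_SIZE: 256 256 256
DATA_FORMAT: FLOAT
VARIABLE: velocity_magnitude
DATA_ENDIAN: LITTLE
CENTERING: zonal
BRICK_ORIGIN: 0.0 0.0 0.0
BRICK_SIZE: 1.0 1.0 1.0
```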
Outreach & Service
• VisIt tutorials:
  • SC10 (beginning and advanced), Nov 2010, NOLA
  • Users at US ARL, Sep 2010, Aberdeen, MD
  • SciDAC 2010, July 2010, Chattanooga, TN
• Speaker at NSF Extreme Scale I/O and Data Analysis Workshop, March 2010, Austin, TX
• Participant in NSF Workshop on SW Development Environments, Sep 2010, Washington, DC
• Gave ~10 additional talks at various venues this year
Proposed Future Plans
• Continue collaboration with Kelly on analyzing turbulent flow
  • Formally integrate VDF
  • Multi-res study with John & Kelly
  • Would like to do 1T cell runs on Longhorn
• Continued user support
  • Esp. CIG
• Connected components @ EGPGV
• VisIt + GPU
[Image: two trillion cell data set, rendered in VisIt by David Pugmire on the ORNL Jaguar machine]
Summary
• Researching how to best use Longhorn
  • HW-accelerated volume rendering on Longhorn
  • SW ray-casting on Longhorn
• Collaborating with other Longhorn/XD members
  • Manta/VisIt
  • VDF/VisIt
• Doing user support
  • Helping Kelly analyze 4K^3 turbulent data
  • Working to make sure the connected components algorithm is up to snuff
  • Some user support and more to come…
• Performing outreach activities