Integration of Reverse Monte-Carlo Ray Tracing within Uintah Todd Harman Department of Mechanical Engineering Jeremy Thornock Department of Chemical Engineering Isaac Hunsaker Graduate Student Department of Chemical Engineering
Deliverables
• Year 2: Demonstration of a fully coupled problem using RMCRT within ARCHES; scalability demonstration.
Approach
• CFD: finest level (always)
• RMCRT options:
  • 1 level: same level as the CFD
  • 2 levels: coarsest level
  • "Data Onion": finest level
• Research topic: Region of Interest (ROI)
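The core reverse ray-tracing idea behind all of these options can be sketched in miniature. The toy below is a hypothetical 1-D gray-medium version, not the ARCHES implementation: the incident intensity at a cell is estimated by marching rays outward and accumulating each traversed cell's emission, attenuated by the optical depth back to the origin.

```python
import math
import random

SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)

def rmcrt_incident_1d(cells_T, cells_k, dx, origin, n_rays, rng):
    """Estimate incident intensity at cell `origin` (W/(m^2 sr)) by
    tracing rays left or right through a 1-D gray medium and summing
    attenuated emission.  Rays exiting the domain see cold black walls."""
    total = 0.0
    for _ in range(n_rays):
        direction = rng.choice((-1, 1))   # "isotropic" in 1-D
        tau = 0.0                         # optical depth back to origin
        i = origin + direction
        while 0 <= i < len(cells_T):
            k = cells_k[i]
            emission = SIGMA * cells_T[i] ** 4 / math.pi
            # cell i emits, is attenuated over tau on the way back
            total += emission * math.exp(-tau) * (1.0 - math.exp(-k * dx))
            tau += k * dx
            i += direction
    return total / n_rays

rng = random.Random(42)
T = [1000.0] * 20   # uniform temperature field, K
k = [1.0] * 20      # uniform absorption coefficient, 1/m
I = rmcrt_incident_1d(T, k, 0.05, 10, 2000, rng)
```

The multilevel options above differ only in which level this ray march runs on and how far a ray travels on fine versus coarse data.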
2 Levels
Data Onion: 3 Levels
Data Onion: ROI
Implemented. Research topic: ROI location?
Static:
• User-defined region?
Data Onion: ROI
Implemented. Research topic: ROI location?
Dynamic:
• ROI computed every timestep (from abskg and sigmaT4)?
• ROI proportional to the size of the fine-level patches?
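One way the dynamic criterion above could work is sketched below. This is an illustration with made-up names, not the Uintah code: flag cells whose emission proxy abskg * sigma * T^4 exceeds a fraction of the field maximum, then take the bounding indices of the flagged cells as the fine-level ROI.

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)

def dynamic_roi(abskg, temperature, threshold):
    """Return (lo, hi) index bounds of the cells whose
    abskg * sigma * T^4 exceeds `threshold` times the field maximum
    (a hypothetical ROI criterion, recomputed each timestep)."""
    proxy = [k * SIGMA * T ** 4 for k, T in zip(abskg, temperature)]
    cutoff = threshold * max(proxy)
    flagged = [i for i, p in enumerate(proxy) if p >= cutoff]
    return min(flagged), max(flagged)

abskg = [0.1, 0.1, 2.0, 5.0, 2.0, 0.1]   # strongly absorbing core
T     = [300.0] * 6                       # uniform temperature, K
lo, hi = dynamic_roi(abskg, T, threshold=0.3)
```

With a uniform temperature the proxy is proportional to abskg, so the ROI brackets the absorbing core; the `threshold` knob trades fine-level coverage against cost.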
Status: Completed
• 80% complete: Data Onion, dynamic & static regions of interest. In the testing phase; benchmarks needed.
• 90% complete: Integration of RMCRT tasks within ARCHES (2 levels)
Status: Work in Progress
• Single-level verification:
  Order of accuracy: number of rays, grid resolution
  Scalability studies with the new mixed scheduler
• 2-level verification:
  Errors associated with coarsening
Benchmark Problem
Initial conditions:
- Uniform temperature field
- Analytical function for the absorption coefficient

S. P. Burns and M. A. Christon. Spatial domain-based parallelism in large-scale, participating-media, radiative transport applications. Numerical Heat Transfer, Part B, 31(4):401-421, 1997.
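The analytical absorption-coefficient field for this benchmark is commonly quoted in the form below (peaked at the center of a unit cube, falling to a small floor at the corners); verify the exact coefficients against Burns & Christon (1997) before relying on them.

```python
def burns_christon_abskg(x, y, z):
    """Analytical absorption coefficient (1/m) of the form commonly
    quoted for the Burns & Christon benchmark, on a unit cube centered
    at the origin: 1.0 at the center, 0.1 at the corners.
    Coefficients should be checked against the 1997 paper."""
    return (0.9 * (1 - 2 * abs(x))
                * (1 - 2 * abs(y))
                * (1 - 2 * abs(z)) + 0.1)

center = burns_christon_abskg(0.0, 0.0, 0.0)
corner = burns_christon_abskg(0.5, 0.5, 0.5)
```

An analytic field like this, paired with a uniform temperature, makes the divergence of the radiative flux computable to high accuracy for order-of-accuracy studies.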
Verification: 1L
S. P. Burns and M. A. Christon. Spatial domain-based parallelism in large-scale, participating-media, radiative transport applications. Numerical Heat Transfer, Part B, 31(4):401-421, 1997.
Verification: 2L
4x increase in error from coarsening abskg
Verification: 2L
Coarsening error: apply a smoothing filter to abskg
Collaboration
Leverage the work of Dr. Berzins' team:
• Hybrid MPI-threaded task scheduler (Qingyu Meng)
• GPU-RMCRT (Alan Humphrey)
Hybrid MPI-Threaded Task Scheduler
• Hybrid MPI-threaded task scheduler*:
• Memory reduction: 13.5 GB -> 1 GB per node (12 cores/node)*
  (2-material CFD problem, 2048^3 cells, on 110,592 cores of Jaguar)
• Interconnect drivers and MPI software must be thread-safe.
• RMCRT requires an MPI environment-variable expert!

*Q. Meng, M. Berzins, and J. Schmidt. Using hybrid parallelism to improve memory use in Uintah. In Proceedings of TeraGrid 2011.
MPI-Threaded Task Scheduler
[Figure: Kraken, 100 rays per cell]
MPI-Threaded Task Scheduler
• Difficult to run on Kraken; crashes in MVAPICH.
• Further testing is needed on larger machines.
GPU-RMCRT
Motivation: utilize all available hardware
• Keeneland Initial Delivery System: 360 Nvidia M2070/90 Tesla GPUs
• DoE Titan: 1000s of GPUs + multi-core CPUs
• Uintah's asynchronous task-based approach is well suited to take advantage of GPUs
• RMCRT is ideal for GPUs
GPU-RMCRT
• Offload ray tracing and RNG to the GPU(s); available CPU cores can perform other computation.
• Uintah infrastructure supports GPU task scheduling and execution:
  • Can access multiple GPUs on-node
  • Uses Nvidia CUDA C/C++
  • Uses the NVIDIA cuRAND library for GPU-accelerated random number generation (RNG)
Uintah Hybrid CPU/GPU Scheduler
• Creates & schedules CPU & GPU tasks
• Enables Uintah to "pre-fetch" GPU data
• Uintah infrastructure manages:
  • Queues of CUDA stream and event handles
  • Device memory allocation and transfers
• Utilizes all available CPU cores and GPUs
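The overlap this pre-fetching buys can be mimicked in miniature. The sketch below uses hypothetical stand-in functions, not the Uintah scheduler or CUDA streams: while task n computes, the data for task n+1 is fetched on a second thread, so copy and compute overlap instead of serializing.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(data_id):
    """Stand-in for a host-to-device copy."""
    time.sleep(0.01)
    return f"data{data_id}"

def compute(data):
    """Stand-in for a kernel launch on the fetched data."""
    time.sleep(0.01)
    return f"done({data})"

def run_pipelined(task_ids):
    """Overlap the fetch of task n+1 with the compute of task n."""
    results = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        pending = pool.submit(fetch, task_ids[0])
        for nxt in task_ids[1:]:
            data = pending.result()            # wait for current data
            pending = pool.submit(fetch, nxt)  # pre-fetch next task's data
            results.append(compute(data))      # compute current task
        results.append(compute(pending.result()))
    return results

out = run_pipelined([1, 2, 3])
```

In the real scheduler the same idea is expressed with queues of CUDA stream/event handles rather than Python futures.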
Uintah GPU Scheduler Abilities
• Capability jobs run on:
  • Keeneland Initial Delivery System (NICS): 1440 CPU cores & 360 GPUs simultaneously
  • Jaguar, GPU partition (OLCF): 15360 CPU cores & 960 GPUs simultaneously
• Development of a GPU RMCRT prototype is underway.
Status: Pending
• Head-to-head comparison of RMCRT with the Discrete Ordinates Method (single level): accuracy versus computational cost.
• 2 levels: coarsening error for variable temperature and radiative properties.
• Data Onion:
  • Serial performance: accuracy versus number of levels, refinement ratio, dynamic/static ROI
  • Scalability studies
Summary
• Order of accuracy: # rays^0.5, grid cells^1
• Accuracy issues related to coarsening data.
• Cost = f(# rays, grid cells^1.4-1.5, communication, ...): doubling the grid resolution gives roughly a 20x increase in cost.
• Good scalability characteristics
Year 2: Demonstration of a fully coupled problem using RMCRT within ARCHES; scalability demonstration.
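The # rays^0.5 convergence quoted above is the standard Monte-Carlo rate: error shrinks like 1/sqrt(N). A quick check with a toy estimator (not RMCRT itself) shows the RMS error falling roughly 8x when the sample count grows 64x:

```python
import math
import random

def mc_estimate(n, rng):
    """Toy Monte-Carlo estimate of the mean of U(0,1); true value 0.5."""
    return sum(rng.random() for _ in range(n)) / n

def rms_error(n, trials, rng):
    """RMS error of the estimator over repeated trials."""
    return math.sqrt(sum((mc_estimate(n, rng) - 0.5) ** 2
                         for _ in range(trials)) / trials)

rng = random.Random(7)
e_small = rms_error(100, 200, rng)    # N = 100 samples
e_big   = rms_error(6400, 200, rng)   # N = 64 * 100 samples
# 1/sqrt(N) scaling predicts e_small / e_big ~ sqrt(64) = 8
```

The same rate is why halving RMCRT's error requires 4x the rays, while the grid-resolution term above grows the per-ray cost on top of that.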
Acknowledgements
• DoE for funding the CSAFE project from 1997-2012; DOE NETL; DOE NNSA; INCITE
• NSF for funding via SDCI and PetaApps
• Keeneland Computing Facility, supported by NSF under contract OCI-0910735
• Oak Ridge Leadership Computing Facility (DoE), Jaguar XK6 system (GPU partition)
http://www.uintah.utah.edu
Physics
• Isotropic scattering added to the model
• Verification testing performed using an exact solution (Siegel, 1987)
• Grid convergence analysis performed; the discrepancy diminishes with increased mesh refinement
Isotropic Scattering: Verification
• Benchmark case of Siegel (1987)
• Cube (1 m^3)
• Uniform temperature: 64.7 K
• Mirror surfaces on all sides; black top and bottom walls
• Computed surface fluxes on the top & bottom walls
• 10 rays per cell (low)

Siegel, R. "Transient Radiative Cooling of a Droplet-Filled Layer," ASME Journal of Heat Transfer, 109:159-164, 1987.
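Isotropic scattering means a scattered ray's new direction is drawn uniformly over the unit sphere. A standard sampling sketch (illustrative, not the production code): take cos(theta) uniform on [-1, 1] and the azimuth uniform on [0, 2*pi).

```python
import math
import random

def sample_isotropic_direction(rng):
    """Draw a unit vector uniformly distributed on the sphere:
    cos(theta) ~ U(-1, 1), phi ~ U(0, 2*pi)."""
    cos_t = 2.0 * rng.random() - 1.0
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = 2.0 * math.pi * rng.random()
    return (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)

rng = random.Random(0)
samples = [sample_isotropic_direction(rng) for _ in range(20000)]
mean_z = sum(s[2] for s in samples) / len(samples)  # should be near 0
```

Sampling cos(theta), not theta, uniformly is what makes the distribution isotropic; uniform theta would cluster directions near the poles.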
Isotropic Scattering: Verification
[Figure: radiative flux vs. optical thickness; RMCRT (dots), exact solution (lines)]

Siegel, R. "Transient Radiative Cooling of a Droplet-Filled Layer," ASME Journal of Heat Transfer, 109:159-164, 1987.
Isotropic Scattering: Verification
Grid convergence of the L1 error norms, where the scattering coefficient is 8 m^-1 and the absorption coefficient is 2 m^-1.
DOM vs RMCRT
• IFRF burner simulation (production-size run)
• 1344 processors/cores
• Initial conditions taken from a previous run with DOM
• Domain: 1 m x 4.11 m x 1 m
• Resolution: 4.4 mm x 8.8 mm x 4.4 mm (24 million cells)