Query Reordering for Photon Mapping
Rohit Saboo

Photon Mapping
A two-step solution for global illumination:
Step 1: Build the photon map
Step 2: Shoot eye rays and perform a “gather”
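The gather in Step 2 can be sketched as a nearest-photon density estimate. This is a minimal illustration and not the implementation used in the talk: it assumes each photon is a (position, power) pair, uses a brute-force search where a real photon map would use a kd-tree, and estimates irradiance as the summed power of the k nearest photons divided by the area of the disc that contains them. The name `gather_irradiance` is hypothetical.

```python
import heapq
import math


def gather_irradiance(photons, x, k):
    """Estimate irradiance at point x from the k nearest photons.

    photons: list of (position, power), position a 3-tuple, power a float.
    Brute force for clarity; a real photon map would search a kd-tree.
    """
    def dist2(p):
        # Squared Euclidean distance from photon position p to x.
        return sum((a - b) ** 2 for a, b in zip(p, x))

    nearest = heapq.nsmallest(k, photons, key=lambda ph: dist2(ph[0]))
    if not nearest:
        return 0.0
    # Squared radius of the disc enclosing the gathered photons.
    r2 = max(dist2(pos) for pos, _ in nearest)
    if r2 == 0.0:
        return 0.0
    total_power = sum(power for _, power in nearest)
    return total_power / (math.pi * r2)
```

Each call touches k photon records plus whatever the search structure visits, which is where the memory traffic discussed in the following slides comes from.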

Gather variants
• Approximate
• Accurate

Bandwidth estimate
• 100 queries for one eye ray
• 20 bytes per photon
• 512x512 image
• super-sampling
Together these give a bandwidth estimate of about 50 GB. Because caches fetch data in blocks, among other factors, the bandwidth requirement could go up to 200 GB.
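The arithmetic behind this estimate can be checked with a short calculation. The slide does not state the number of photons read per query or the super-sampling rate, so both appear below as explicit parameters defaulting to 1; they are assumptions for illustration, as is taking 1 GB = 10^9 bytes. The function name `gather_traffic_bytes` is hypothetical.

```python
def gather_traffic_bytes(width, height, queries_per_ray, bytes_per_photon,
                         photons_per_query=1, samples_per_pixel=1):
    """Bytes of photon data read if every query fetches its photons from memory."""
    eye_rays = width * height * samples_per_pixel
    return eye_rays * queries_per_ray * photons_per_query * bytes_per_photon


# Figures stated on the slide; the two unstated factors are left at 1.
base = gather_traffic_bytes(512, 512, 100, 20)

# Combined factor (photons per query x samples per pixel) that would be
# needed to reach the quoted 50 GB.
needed_factor = 50e9 / base
```

With only the stated figures, `base` is about 0.52 GB, so the 50 GB estimate implies that photons per query times samples per pixel is on the order of 95.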

Reordering Queries
Improve the locality of data and queries so that the cache is more effective. Two ways:
• Generate the queries in a chosen order
• Generate the queries, reorder them in some manner, and then run them
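The second approach (generate, reorder, then run) can be sketched as follows. This is only an illustration under stated assumptions: `sort_key` stands for whichever ordering is chosen and `run_query` for the photon-map lookup, and both names, like `run_reordered` itself, are hypothetical. Results are written back in the original query order so the caller does not see the reordering.

```python
def run_reordered(queries, sort_key, run_query):
    """Run queries in sort_key order; return results in original order."""
    # Indices of the queries, visited in the cache-friendly order.
    order = sorted(range(len(queries)), key=lambda i: sort_key(queries[i]))
    results = [None] * len(queries)
    for i in order:
        results[i] = run_query(queries[i])
    return results
```

The sort itself costs time and memory, so this only pays off if the improved locality saves more than the reordering adds.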

Cache Hierarchy
[Figure: a very naïve hierarchy showing the processor, I-cache, D-cache, L2 cache, and RAM]

Reordering methods
• Row ordering
• Tiled row ordering
• Direction binning
• Hashed
• Tiled direction binned hashed
• Hilbert curve
• Tiled Hilbert curve
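Two of these orderings can be sketched as sort keys. The sketch assumes each query is keyed by the integer pixel coordinates (x, y) of the eye ray that produced it and that the image side n is a power of two; the talk does not say which space its orderings were applied in, so this is an illustration rather than the author's method. `hilbert_index` follows the standard iterative conversion from grid coordinates to distance along a Hilbert curve; the function names are hypothetical.

```python
def tiled_row_key(x, y, tile):
    """Sort key: tiles in row order, then pixels in row order inside a tile."""
    return (y // tile, x // tile, y % tile, x % tile)


def hilbert_index(n, x, y):
    """Distance of cell (x, y) along a Hilbert curve over an n x n grid.

    n must be a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Example: order the pixels of an 8x8 image along the Hilbert curve.
pixels = [(x, y) for y in range(8) for x in range(8)]
hilbert_order = sorted(pixels, key=lambda p: hilbert_index(8, p[0], p[1]))
```

Consecutive pixels in `hilbert_order` are always neighbours in the image, whereas row ordering jumps across the whole image at the end of every row.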

The Cornell Box
[Figure: Cornell Box rendering; not one of the results]

Performance monitoring
• Intel Pentium M processor, 1.7 GHz (the frequency scaling feature was disabled)
• FSB 533 MHz
• 2 MB L2 cache
• Separate I-cache and D-cache
• I-cache and D-cache are each 32 KB, 8-way set associative
• 768 MB RAM
• Windows XP (with most services disabled)
• pbrt
• Intel C++ compiler with all optimizations
• VTune performance analysis package

Results
With plain irradiance caching:
• Branch mispredictions account for ~25% of the time
• The algorithm seems too complicated to be optimized successfully
• Bus utilization factor: 0.0024 (number of times the bus was asserted busy vs. clockticks), which is very low
• ~10% of the time is spent on cache misses

Results…
Naïve reordering…
• Bus utilization: 0.0014, again very low
• CPU load port: 0.54 loads per clocktick (the maximum I could achieve is 1.07)
• ~7% of the time is wasted on cache misses

Results…
Hilbert curve…
• Bus utilization: 0.00074 (roughly half the naïve-reordering figure of 0.0014 and a third of the plain irradiance caching figure of 0.0024)
• 0.93 loads per clocktick (almost as high as one can get)
• Not much impact from L2 cache misses

Multithreading
• Multithreaded the kd-tree data structure
• Simply starts two threads to do the search
• Results show very small changes
• Maybe some other threading approach would be better?
• The cost of threading overshadows any gains

Possible Discrepancies
• Pentium M processor vs. desktop processors: results are highly architecture dependent (e.g. if the processor has more than one port connected to the D-cache)
• The analysis was not run over the entire duration of the run

Conclusions
• L2-memory bandwidth is not the bottleneck.
• The bottleneck lies more in CPU-L1 accesses and computation.
• There is scope for improving performance.
• But this would need algorithms that have very little overhead, are simple enough to be optimized by the compiler, and at the same time exploit cache coherency.

References
• Reordering for Cache Conscious Photon Mapping, Josh Steinhurst
• Realistic Image Synthesis Using Photon Mapping, Jensen
• IA-32 Intel Architecture Software Developer’s Manual, Volume 3: System Programming Guide. ftp://download.intel.com/design/Pentium4/manuals/25366814.pdf
• VTune Performance Analyzer. http://www.intel.com/software/products/vtune/vpa/
