A Parallel Delaunay algorithm for CGAL

David Millman • Advisor: Sylvain Pion • July 26th, 2007

Goal To create a parallel implementation of Delaunay Triangulation in R3 with CGAL for shared memory parallel machines using OpenMP.

Motivation • Delaunay’s many uses • Meshing in finite element theory • computational biology • geometric modeling • anything that can be done with a Voronoi diagram • Multi-Processor systems • more common • multi core systems

Motivation (cont.) • Big data sets • Robust algorithms to mesh billions of points • Sequential CGAL • 1 processor and 16 GB RAM: 10 million points in ~120 seconds using 5.5 GB RAM • Blandford, Blelloch, Kadow '06 • 64 processors and 200 GB RAM: 1 billion points in 5512 seconds using 197 GB RAM

Tools • CGAL - Computational Geometry Algorithms Library www.cgal.org • OpenMP - API for shared memory parallel programming www.openmp.org • Capricorne - 2 quad-core processors (8 cores), 16 GB RAM

CGAL Delaunay Algorithm • Locate • Find Conflict Region • Remove invalid cells • Create New Cells

Steps to Parallelization • Compact Container • Locate • Find Conflict Region • Create New Cells

Locks • OpenMP provides: test lock, wait lock • Priority lock: a lock and priority pair, with test lock and priority lock operations

CGAL Locks • Omp_lock_traits • Exported types: Lock_type, Priority_type • Constants: max_num_threads, is_parallel • Static functions wrapping OpenMP calls: static void set_num_threads(int i), static size_t get_num_threads(), static void wait_lock(Lock_type* lock) • Priority lock: bool priority_lock(Priority_type p), bool test_lock(Priority_type p), void unset_lock(), bool is_priority(Priority_type p) const • Omp_empty_lock_traits • Same interface
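The slides list the priority lock's member functions but not their semantics. The following is a minimal sketch of one plausible reading, written with std::atomic rather than OpenMP locks so it compiles on its own. The assumptions are mine, not the talk's: priorities are positive integers, a larger value wins, priority_lock waits only while a lower-priority holder has the lock, and it returns false (the caller should back off) when the holder has equal or higher priority.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical sketch; not the CGAL class presented in the talk.
class Priority_lock {
public:
    using Priority_type = int;  // assumption: positive, larger value wins

    // Try once to take the lock with priority p; never waits.
    bool test_lock(Priority_type p) {
        assert(p > 0);
        Priority_type expected = kFree;
        return owner_.compare_exchange_strong(expected, p);
    }

    // Take the lock, waiting only while the current holder has a lower
    // priority. Returns false when the holder's priority is >= p, so a
    // low-priority thread never waits on a high-priority one.
    bool priority_lock(Priority_type p) {
        assert(p > 0);
        for (;;) {
            Priority_type expected = kFree;
            if (owner_.compare_exchange_weak(expected, p)) return true;
            if (expected != kFree && expected >= p) return false;
            std::this_thread::yield();  // holder has lower priority: wait
        }
    }

    // Release the lock.
    void unset_lock() { owner_.store(kFree); }

    // True when the lock is currently held with priority p.
    bool is_priority(Priority_type p) const { return owner_.load() == p; }

private:
    static constexpr Priority_type kFree = 0;  // 0 means "not held"
    std::atomic<Priority_type> owner_{kFree};
};
```

Under these assumptions a thread that receives false is expected to release the locks it already holds and retry, which rules out a cycle of threads waiting on each other.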

Compact Container • STL-like container • Pointers to 4-byte-aligned objects • Iterators are not invalidated during insert and delete • [Figure: free list and memory layout]

MT-Compact Container • Each thread maintains its own free list • Insert • Delete • Allocate • Only lock for allocation • Size formula [not preserved in this transcript], where NT = number of threads • [Figure: memory and per-thread free lists]

MT-Compact Container (cont.) • Old: • Compact_container<T, Allocator = Default_allocator> • New: • Compact_container<T, Allocator = Default_allocator, Lock_type = Omp_empty_lock_traits> • No new functions • Free list array is a Boost array parameterized on lock_traits::max_num_threads
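The per-thread free list idea can be sketched as a small pool. This is an illustration only, not CGAL's Compact_container: the class name Mt_pool, the explicit tid argument, the block size, and the use of std::mutex and std::array (in place of OpenMP locks and the Boost array) are all assumptions made for the example. Each thread pushes and pops slots on its own list without locking; the mutex is taken only when a new block is registered.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical sketch of a pool with one free list per thread.
// Elements still alive when the pool is destroyed are not destructed.
template <class T, std::size_t MaxThreads>
class Mt_pool {
    union Slot {
        Slot* next;                                   // link while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // payload while in use
    };

public:
    // Construct a copy of value in a slot from thread tid's free list.
    T* insert(std::size_t tid, const T& value) {
        assert(tid < MaxThreads);
        if (free_[tid] == nullptr) refill(tid);
        Slot* s = free_[tid];
        free_[tid] = s->next;
        return new (s->storage) T(value);
    }

    // Destroy *p and push its slot onto thread tid's free list.
    // Other elements keep their addresses, so pointers stay valid.
    void erase(std::size_t tid, T* p) {
        assert(tid < MaxThreads);
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);
        s->next = free_[tid];
        free_[tid] = s;
    }

    // Number of blocks allocated so far.
    std::size_t block_count() {
        std::lock_guard<std::mutex> guard(alloc_mutex_);
        return blocks_.size();
    }

private:
    static constexpr std::size_t kBlock = 64;  // slots per block (arbitrary)

    // Allocate a block and thread all its slots onto tid's free list.
    // Only the shared block registry is protected by the mutex.
    void refill(std::size_t tid) {
        std::unique_ptr<Slot[]> block(new Slot[kBlock]);
        for (std::size_t i = 0; i < kBlock; ++i) {
            block[i].next = free_[tid];
            free_[tid] = &block[i];
        }
        std::lock_guard<std::mutex> guard(alloc_mutex_);
        blocks_.push_back(std::move(block));
    }

    std::array<Slot*, MaxThreads> free_{};          // one list head per thread
    std::vector<std::unique_ptr<Slot[]>> blocks_;   // shared, mutex-protected
    std::mutex alloc_mutex_;
};
```

In an OpenMP setting the tid argument would typically come from omp_get_thread_num(); it is passed explicitly here so the sketch has no OpenMP dependency.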

Locate point p • Start at some cell c • Determine a face f of c such that p lies outside f • Repeat with the adjacent cell that shares f with c • Continue until p is contained in the current cell
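The "which face is p outside of" step of the walk can be sketched with an orientation test. This is a simplified illustration, assuming a positively oriented tetrahedron and plain double arithmetic; CGAL uses exact, filtered predicates, and the names Point3, orient3d, and exit_face are made up for the example.

```cpp
#include <array>
#include <cassert>

struct Point3 { double x, y, z; };

// Determinant of the rows (b-a, c-a, d-a). Positive for a positively
// oriented tetrahedron. Not robust: plain floating point.
double orient3d(const Point3& a, const Point3& b,
                const Point3& c, const Point3& d) {
    const double bx = b.x - a.x, by = b.y - a.y, bz = b.z - a.z;
    const double cx = c.x - a.x, cy = c.y - a.y, cz = c.z - a.z;
    const double dx = d.x - a.x, dy = d.y - a.y, dz = d.z - a.z;
    return bx * (cy * dz - cz * dy)
         - by * (cx * dz - cz * dx)
         + bz * (cx * dy - cy * dx);
}

// Returns the index i of a face (the one opposite vertex i) that p lies
// strictly outside of, or -1 when p is inside or on the cell.
// Precondition: orient3d(cell[0], cell[1], cell[2], cell[3]) > 0.
int exit_face(const std::array<Point3, 4>& cell, const Point3& p) {
    for (int i = 0; i < 4; ++i) {
        std::array<Point3, 4> q = cell;
        q[i] = p;  // replace vertex i by p and look at the orientation
        if (orient3d(q[0], q[1], q[2], q[3]) < 0) return i;
    }
    return -1;
}
```

A walk would call exit_face on the current cell, step to the neighbor across the returned face, and stop when the result is -1.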

MT-Locate • Same steps as Locate, but we must lock and unlock the vertices of the cells to avoid a cell being destroyed while it is visited.

Find Conflict Region • Let c be the cell containing p • If p lies inside the circumsphere of the vertices of c, mark c as in conflict • Expand to neighboring cells until the whole conflict region is found
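The expansion step can be sketched as a breadth-first traversal over cell adjacency. This is an illustration under assumptions, not CGAL's implementation: cells are plain indices, neighbors[c] lists the cells adjacent to c, and in_conflict stands in for the in-sphere test of p against a cell's circumsphere.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical sketch: collect the connected set of conflicting cells
// reachable from start. Cells that are not in conflict bound the
// region and are not expanded.
std::vector<std::size_t> find_conflict_region(
    const std::vector<std::vector<std::size_t>>& neighbors,
    std::size_t start,
    const std::function<bool(std::size_t)>& in_conflict)
{
    std::vector<std::size_t> region;
    if (!in_conflict(start)) return region;

    std::vector<bool> visited(neighbors.size(), false);
    std::queue<std::size_t> frontier;
    visited[start] = true;
    frontier.push(start);

    while (!frontier.empty()) {
        const std::size_t c = frontier.front();
        frontier.pop();
        region.push_back(c);
        for (std::size_t n : neighbors[c]) {
            if (visited[n]) continue;
            visited[n] = true;                     // test each cell once
            if (in_conflict(n)) frontier.push(n);  // expand only conflicts
        }
    }
    return region;
}
```

In the multithreaded variant each cell's vertices would additionally be locked before the cell is tested, which this sequential sketch leaves out.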

MT-Find Conflict Region • Once again, same steps, but we must lock and unlock vertices to avoid deadlocks

Create New Cell • Remove cells which are in conflict creating a hole • Triangulate the hole with a star

MT-Create New Cell The same as Create New Cell • Remove cells which are in conflict creating a hole • Triangulate the hole with a star …Just release the locks at the end.

TDS • Vertex base • Old: TDS_vertex_base<TDS> • New: TDS_vertex_base<TDS, LT = Omp_empty_lock_traits> • Private derivation of Priority_lock • Functions for locking, unlocking, etc. • Cell base: no changes • TDS • Added functions to help with locking and unlocking • priority_lock_cell, priority_lock_mirror_vertex • is_locked (vertex and cell) • lock (vertex and cell)

Triangulation_3 and Delaunay_3 • Triangulation_3 • parallel_locate(Point p, Vertex start) • vertex as hint • cell returned is locked • error_vertex • query and access functions (similar to infinite vertex) • Delaunay_3 • parallel_insert(Iterator begin, Iterator end, int num_threads)

Compact Container Results

Compact Container Results (cont.)

Locate Results

Locate Results (cont.)

Delaunay Results

Delaunay Results (cont.)

Results Summary • Compact Container • Locate • Delaunay

Future work • Optimize • Optimize • Optimize • Optimize • Parallel mesh refinement • Mesh compression

Thank you • INRIA, NSF, REUSSI, Sylvain Pion, Chee Yap, and everyone responsible for putting this program together.
