A Parallel Delaunay algorithm for CGAL . David Millman Advisor: Sylvain Pion July 26th 2007. Goal. To create a parallel implementation of Delaunay Triangulation in R 3 with CGAL for shared memory parallel machines using OpenMP. Motivation. Delaunay’s many uses
A Parallel Delaunay algorithm for CGAL David Millman Advisor: Sylvain Pion July 26th 2007
Goal To create a parallel implementation of Delaunay Triangulation in R3 with CGAL for shared memory parallel machines using OpenMP.
Motivation • Delaunay’s many uses • Meshing in finite element theory • computational biology • geometric modeling • anything that can be done with a Voronoi diagram • Multi-Processor systems • more common • multi core systems
Motivation (cont.) • Big data sets • Robust algorithms to mesh billions of points • Sequential CGAL • 1 processor and 16 GB RAM: 10 million points in ~120 seconds, using 5.5 GB RAM • Blandford, Blelloch, Kadow ’06 • 64 processors and 200 GB RAM: 1 billion points in 5512 seconds, using 197 GB
Tools • CGAL - Computational Geometry Algorithms Library www.cgal.org • OpenMP - API for shared memory parallel programming www.openmp.org • Capricorne - 2 quad-core processors (8 cores), 16 GB RAM
CGAL Delaunay Algorithm • Locate • Find Conflict Region • Remove invalid cells • Create New Cells
Steps to Parallelization • Compact Container • Locate • Find Conflict Region • Create New Cells
Locks • OpenMP provides: test lock (a single non-blocking attempt), wait lock (blocks until acquired) • Priority lock: a lock and priority pair, with two operations: test lock, priority lock
CGAL Locks • Omp_lock_traits • Exported types: Lock_type, Priority_type • Constants: max_num_threads, is_parallel • Static functions wrapping the OpenMP calls: static void set_num_threads(int i); static size_t get_num_threads(); static void wait_lock(Lock_type* lock) • Priority lock: bool priority_lock(Priority_type p); bool test_lock(Priority_type p); void unset_lock(); bool is_priority(Priority_type p) const • Omp_empty_lock_traits • Same interface
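The slides list the priority-lock interface but not its behaviour, so the following is only a minimal sketch of one plausible reading. It assumes that priority 0 means "unlocked", that a thread backs off when the current holder has a higher priority, and that it keeps trying otherwise. The class name `Priority_lock_sketch` is hypothetical, and `std::atomic` stands in here for the OpenMP lock the real traits class wraps.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sketch, not the CGAL implementation.
// Assumption: the lock stores the priority of its holder, 0 meaning "free".
class Priority_lock_sketch {
public:
    using Priority_type = unsigned;

    // One non-blocking attempt. Succeeds if the lock is free,
    // or if it is already held with the same priority p.
    bool test_lock(Priority_type p) {
        assert(p != 0);
        Priority_type expected = 0;
        if (owner_.compare_exchange_strong(expected, p)) return true;
        return expected == p;
    }

    // Keep trying while the holder has a lower priority;
    // give up (return false) as soon as the holder has a higher one.
    bool priority_lock(Priority_type p) {
        assert(p != 0);
        for (;;) {
            Priority_type expected = 0;
            if (owner_.compare_exchange_weak(expected, p)) return true;
            if (expected == p) return true;   // already ours
            if (expected > p) return false;   // higher priority holds it: back off
            // otherwise: lower-priority holder or spurious failure, retry
        }
    }

    void unset_lock() { owner_.store(0); }

    bool is_priority(Priority_type p) const { return owner_.load() == p; }

private:
    std::atomic<Priority_type> owner_{0};
};
```

Letting the lower-priority thread back off rather than wait is one common way to keep two threads from blocking each other forever; whether the original code resolves conflicts this way is not stated in the slides.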
Compact Container Free List • STL-like container • Pointers to 4-byte-aligned objects • Iterators are not invalidated during insert and delete • Memory
MT-Compact Container • Each thread maintains its own free list • Insert • Delete • Allocate • Only lock for allocation • Size Formula • Memory Free List Where NT = number of threads
MT-Compact Container (cont.) • Old: • Compact_container<T, Allocator = Default_allocator> • New: • Compact_container<T, Allocator = Default_allocator, Lock_type = Omp_empty_lock_traits> • No new functions • Free list array is a boost array parameterized on lock_traits::max_num_threads
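The per-thread free list idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: the name `Mt_pool_sketch`, the fixed capacity, the block size of 4, and the explicit thread index `tid` (in an OpenMP program this would typically come from `omp_get_thread_num()`). A `std::array` replaces the boost array, and a spin flag replaces the lock traits. Storage is pre-allocated so that handing out a block never moves existing elements.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a pool with one free list per thread.
// Only refill() (the "allocate" step) takes the shared lock.
template <class T, std::size_t NT>
class Mt_pool_sketch {
public:
    explicit Mt_pool_sketch(std::size_t capacity) : slots_(capacity) {}

    // Store value in a free slot owned by thread tid; returns the slot index.
    std::size_t insert(const T& value, std::size_t tid) {
        assert(tid < NT);
        std::vector<std::size_t>& mine = free_[tid];
        if (mine.empty()) refill(mine);       // the only locked path
        std::size_t slot = mine.back();
        mine.pop_back();
        slots_[slot] = value;
        return slot;
    }

    // Return a slot to the free list of the calling thread (no lock needed).
    void erase(std::size_t slot, std::size_t tid) {
        assert(tid < NT);
        free_[tid].push_back(slot);
    }

    const T& operator[](std::size_t slot) const { return slots_[slot]; }

private:
    static const std::size_t Block = 4;       // assumed block size

    // Hand a fresh block of slots to one thread's free list.
    void refill(std::vector<std::size_t>& mine) {
        while (lock_.test_and_set(std::memory_order_acquire)) {}  // spin
        std::size_t first = next_;
        next_ += Block;
        lock_.clear(std::memory_order_release);
        assert(first + Block <= slots_.size());
        for (std::size_t i = 0; i < Block; ++i)
            mine.push_back(first + Block - 1 - i);  // lowest index ends on top
    }

    std::vector<T> slots_;                                // fixed-size storage
    std::array<std::vector<std::size_t>, NT> free_;       // one list per thread
    std::atomic_flag lock_ = ATOMIC_FLAG_INIT;
    std::size_t next_ = 0;                                // guarded by lock_
};
```

Because each thread only touches its own free list on insert and erase, those paths need no synchronization; the cost is that a thread may hold free slots another thread could have used.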
Locate point p • Start at some cell, c • Determine which face, f, of c has p on its outer side • Repeat with the adjacent cell that shares f with c • Continue until p is contained in the current cell
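The face test in the second step can be sketched with an orientation determinant. This is an illustration only: it uses plain `double` arithmetic, whereas CGAL relies on exact predicates, and the names `Point3`, `orient3d` and `outside_face` are hypothetical. Face i is taken to be the face opposite vertex i.

```cpp
#include <array>
#include <cassert>

struct Point3 { double x, y, z; };

// Signed volume (times 6) of tetrahedron abcd: det[b-a, c-a, d-a].
double orient3d(const Point3& a, const Point3& b,
                const Point3& c, const Point3& d) {
    double bx = b.x - a.x, by = b.y - a.y, bz = b.z - a.z;
    double cx = c.x - a.x, cy = c.y - a.y, cz = c.z - a.z;
    double dx = d.x - a.x, dy = d.y - a.y, dz = d.z - a.z;
    return bx * (cy * dz - cz * dy)
         - by * (cx * dz - cz * dx)
         + bz * (cx * dy - cy * dx);
}

// Returns the index i of a face (opposite vertex i) that has p strictly on
// its outer side, or -1 if p lies inside the cell or on its boundary.
int outside_face(const std::array<Point3, 4>& cell, const Point3& p) {
    double ref = orient3d(cell[0], cell[1], cell[2], cell[3]);
    assert(ref != 0.0);  // degenerate cells are not handled in this sketch
    for (int i = 0; i < 4; ++i) {
        std::array<Point3, 4> q = cell;
        q[i] = p;  // replace vertex i by p
        // A sign flip means p and vertex i lie on opposite sides of face i.
        if (orient3d(q[0], q[1], q[2], q[3]) * ref < 0.0) return i;
    }
    return -1;
}
```

A walk would call `outside_face` on the current cell, step to the neighbor across the returned face, and stop when the result is -1.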
MT-Locate • Same steps as Locate, but we must lock and unlock the vertices of each cell we visit, so that the cell is not destroyed by another thread while we are in it.
Find Conflict Region • Initialize c to be the cell containing p • If p is inside the circumsphere of the vertices of c, mark c as in conflict • Expand to neighboring cells until the whole conflict region is found
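The conflict test for a single cell can be sketched as a lifted 4x4 determinant. As before, this is an assumption-laden illustration: it uses floating-point arithmetic rather than CGAL's exact predicates, and the names `Point3`, `lifted`, `det3` and `in_conflict` are hypothetical. Multiplying by the cell's orientation makes the result independent of vertex order.

```cpp
#include <array>
#include <cassert>

struct Point3 { double x, y, z; };

using Row = std::array<double, 4>;

// Row (a - p, |a - p|^2) of the lifted matrix.
static Row lifted(const Point3& a, const Point3& p) {
    double x = a.x - p.x, y = a.y - p.y, z = a.z - p.z;
    return {{x, y, z, x * x + y * y + z * z}};
}

// 3x3 determinant of the first three entries of rows u, v, w.
static double det3(const Row& u, const Row& v, const Row& w) {
    return u[0] * (v[1] * w[2] - v[2] * w[1])
         - u[1] * (v[0] * w[2] - v[2] * w[0])
         + u[2] * (v[0] * w[1] - v[1] * w[0]);
}

// True if p lies strictly inside the circumsphere of the cell's vertices.
bool in_conflict(const std::array<Point3, 4>& cell, const Point3& p) {
    Row r0 = lifted(cell[0], p), r1 = lifted(cell[1], p);
    Row r2 = lifted(cell[2], p), r3 = lifted(cell[3], p);
    // 4x4 determinant, expanded along the last (squared length) column.
    double d = -r0[3] * det3(r1, r2, r3) + r1[3] * det3(r0, r2, r3)
               - r2[3] * det3(r0, r1, r3) + r3[3] * det3(r0, r1, r2);
    // Orientation of the cell: det[b-a, c-a, d-a].
    double o = det3(lifted(cell[1], cell[0]), lifted(cell[2], cell[0]),
                    lifted(cell[3], cell[0]));
    assert(o != 0.0);  // degenerate cells are not handled in this sketch
    return d * o < 0.0;
}
```

The expansion step would then apply `in_conflict` to the neighbors of every cell already marked, stopping at cells for which it returns false.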
MT-Find Conflict Region • Once again, same steps, but we must lock and unlock vertices to avoid deadlocks
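One way to take vertex locks without risking deadlock is to never wait while holding a lock: try each of a cell's four vertices once and, on the first failure, release whatever this call acquired. The sketch below assumes this strategy; the slides do not say whether the original code works this way. `Vertex_lock`, `try_lock_cell` and `unlock_cell` are hypothetical names, vertices are identified by index, and thread ids are assumed to be nonzero.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical per-vertex lock: owner is 0 when free, else a thread id.
struct Vertex_lock {
    std::atomic<int> owner{0};
};

// Try to lock all four vertices of a cell for thread `me`.
// On failure, vertices acquired by this call are released again, so the
// caller never holds a partial set while waiting for another thread.
bool try_lock_cell(Vertex_lock* verts,
                   const std::array<std::size_t, 4>& cell, int me) {
    assert(me != 0);
    std::array<bool, 4> newly = {{false, false, false, false}};
    for (std::size_t i = 0; i < 4; ++i) {
        int expected = 0;
        if (verts[cell[i]].owner.compare_exchange_strong(expected, me)) {
            newly[i] = true;                 // acquired by this call
        } else if (expected != me) {         // held by another thread
            for (std::size_t j = 0; j < i; ++j)
                if (newly[j]) verts[cell[j]].owner.store(0);
            return false;
        }                                    // else: already held by `me`
    }
    return true;
}

// Release every vertex of the cell that `me` currently holds.
void unlock_cell(Vertex_lock* verts,
                 const std::array<std::size_t, 4>& cell, int me) {
    for (std::size_t i = 0; i < 4; ++i) {
        int expected = me;
        verts[cell[i]].owner.compare_exchange_strong(expected, 0);
    }
}
```

A caller that gets `false` can retry later or move to other work. Note that `unlock_cell` also releases vertices shared with a neighboring cell; a real walk would release only the vertices it no longer needs.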
Create New Cell • Remove cells which are in conflict creating a hole • Triangulate the hole with a star
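Triangulating the hole with a star amounts to joining the new point to every facet on the boundary of the hole. A minimal sketch, with cells and facets reduced to vertex indices (the `Facet` and `Cell` aliases and `star_hole` are hypothetical names; neighbor links and orientation bookkeeping are left out):

```cpp
#include <array>
#include <cassert>
#include <vector>

using Facet = std::array<int, 3>;  // three vertex indices
using Cell  = std::array<int, 4>;  // four vertex indices

// Build one new cell per boundary facet of the hole, each joined to p.
std::vector<Cell> star_hole(const std::vector<Facet>& hole_boundary, int p) {
    assert(p >= 0);
    std::vector<Cell> cells;
    cells.reserve(hole_boundary.size());
    for (const Facet& f : hole_boundary)
        cells.push_back(Cell{{f[0], f[1], f[2], p}});
    return cells;
}
```

For a hole left by a single tetrahedron, the boundary has four facets, so inserting p produces four new cells.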
MT-Create New Cell The same as Create New Cell • Remove cells which are in conflict creating a hole • Triangulate the hole with a star …Just release the locks at the end.
TDS • Vertex base • Old: TDS_vertex_base<TDS> • New: TDS_vertex_base<TDS, LT=Omp_empty_lock_traits> • Private derivation of Priority_lock • Functions for locking, unlocking, etc. • Cell base – no changes • TDS • Added functions to help with locking and unlocking • priority_lock_cell, priority_lock_mirror_vertex, • is_locked (vertex and cell) • lock (vertex and cell)
Triangulation_3 and Delaunay_3 • Triangulation_3 • parallel_locate(Point p, Vertex start) • vertex as hint • cell returned is locked • error_vertex • query and access functions (similar to infinite vertex) • Delaunay_3 • parallel_insert(Iterator begin, Iterator end, int num_threads)
Results Summary • Compact Container • Locate • Delaunay
Future work • Optimize • Optimize • Optimize • Optimize • Parallel mesh refinement • Mesh compression
Thank you • INRIA, NSF, REUSSI, Sylvain Pion, Chee Yap, and everyone responsible for putting this program together.