Explore a ring pipelined algorithm for the algebraic path problem on the CELL Broadband Engine, showcasing high performance computing techniques and implementation considerations.
Claude Tadonki
Laboratoire de l’Accélérateur Linéaire/IN2P3/CNRS, University of Orsay, Orsay, France
claude.tadonki@u-psud.fr
1st Workshop on Applications for Multi and Many Core Architectures, 22nd International Symposium on Computer Architecture and High Performance Computing (SBAC PAD 2010), October 27 – 30, 2010, Petrópolis, Rio de Janeiro, Brazil.
Ring pipelined algorithm for the algebraic path problem on the CELL Broadband Engine
C. TADONKI

The Algebraic Path Problem
The Warshall-Floyd Algorithm
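The content of this slide did not survive extraction. As a reminder of the textbook formulation only, here is a minimal sketch of the Warshall-Floyd triple loop written over a generic semiring, which is how the algebraic path problem is usually stated. The function name `warshall_floyd`, its `plus`/`times` parameters, and the sample matrix are illustrative assumptions, not taken from the talk. The sketch assumes an idempotent semiring and a trivial pivot closure (for shortest paths: zero diagonal and no negative cycles).

```python
import math

def warshall_floyd(a, plus=min, times=lambda x, y: x + y):
    """Closure of the square matrix `a` over the semiring (plus, times).

    The defaults give the (min, +) semiring, i.e. all-pairs shortest paths.
    Assumption: the semiring is idempotent and the pivot closure is trivial
    (zero diagonal, no negative cycles), so no explicit closure step is needed.
    """
    n = len(a)
    d = [row[:] for row in a]              # work on a copy of the input
    for k in range(n):                     # k is the pivot index
        for i in range(n):
            for j in range(n):
                d[i][j] = plus(d[i][j], times(d[i][k], d[k][j]))
    return d

INF = math.inf
# Hypothetical 4-node weighted graph (INF means "no edge").
graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
dist = warshall_floyd(graph)

# The same loop computes a transitive closure over the Boolean semiring.
reach = warshall_floyd(
    [[True, True, False], [False, True, True], [False, False, True]],
    plus=lambda x, y: x or y,
    times=lambda x, y: x and y,
)
```

Swapping the two operators is all it takes to move between instances of the problem, which is why the pivot loop structure, rather than the arithmetic, is what a parallel schedule has to handle.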
Shift-toroidal Reindexation (Kung-Lo-Lewis, 1987)
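The formulas on this slide were lost in extraction. The sketch below illustrates one common reading of the idea, under the assumption that the reindexation amounts to rotating the matrix by one row and one column (toroidally) after every pivot step, so that the pivot always sits at index 0. The function names `plain` and `shifted` and the sample matrix are hypothetical; the (min, +) instance with a zero diagonal is assumed.

```python
import math

def plain(a):
    """Reference Warshall-Floyd on the (min, +) semiring."""
    n = len(a)
    d = [row[:] for row in a]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

def shifted(a):
    """Same computation, but the pivot is always row 0 / column 0.

    After each step the matrix is rotated up by one row and left by one
    column, so the next pivot moves into position 0. After n steps the
    rotation has gone full circle and the original indexing is restored.
    """
    n = len(a)
    b = [row[:] for row in a]
    for _ in range(n):
        for i in range(n):
            for j in range(n):
                b[i][j] = min(b[i][j], b[i][0] + b[0][j])
        # Toroidal shift: row 0 goes to the bottom, column 0 to the right.
        b = [row[1:] + row[:1] for row in b[1:] + b[:1]]
    return b

INF = math.inf
# Hypothetical 4-node weighted graph (INF means "no edge").
graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
same = shifted(graph) == plain(graph)
```

Because every step now looks identical (use row 0 and column 0, then shift), the dependence pattern becomes uniform, which is what makes a regular pipelined schedule possible.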
The CELL Broadband Engine
Ring Pipelined Algorithm for the APP (algorithm)
Ring Pipelined Algorithm for the APP (algorithm)
Interesting properties of our algorithm:
- Can run with any number of processors p <= N (natural LPGS)
- Generic tiling applies (LSGP by blocking)
- Each processor only requires a buffer of size bN (block of size b)
- Fully pipelined process with local synchronization only
- Perfect computation-communication overlap
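The properties above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the author's implementation: Python generators stand in for processors, the block size is b = 1 (so each stage buffers exactly one row of length N), there are p = N stages, and the (min, +) instance with a zero diagonal is assumed. The names `stage` and `ring_pipeline` are hypothetical. Each stage keeps the first row it receives as its pivot, updates the rows that stream past, shifts them left by one column, and finally releases its pivot row, so it only ever talks to its immediate neighbours.

```python
import math

def stage(upstream):
    """One pipeline stage = one pivot step, buffering a single row (b = 1)."""
    pivot = next(upstream)                 # first incoming row is the pivot row
    for row in upstream:
        head = row[0]                      # this row's entry in the pivot column
        updated = [min(x, head + p) for x, p in zip(row, pivot)]
        yield updated[1:] + updated[:1]    # shift columns left by one
    yield pivot[1:] + pivot[:1]            # pivot row leaves last (row shift)

def ring_pipeline(a):
    """Stream the rows of `a` through len(a) chained stages."""
    stream = iter([row[:] for row in a])   # rows enter from "main memory"
    for _ in range(len(a)):
        stream = stage(stream)             # each stage reads only its predecessor
    return list(stream)                    # rows leave the last stage

INF = math.inf
# Hypothetical 4-node weighted graph (INF means "no edge").
graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
result = ring_pipeline(graph)
```

With fewer stages than pivot steps (p < N), one would presumably feed the output stream back into the first stage for further passes, which is how the LPGS remark above can be read; that extension is not shown here.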
Ring Pipelined Algorithm for the APP (implementation on the CELL BE)
- PPE-DMA is issued only by the first and the last processor
- Inner SPEs communicate and synchronize locally
- Computation-communication overlap occurs for all communications
- Can run on more SPEs or CELL Blades by natural extension
Performance
Conclusion and Perspectives
- Our ring SPMD algorithm suits the CELL BE and scales well
- Communication and synchronization account for less than 5% overhead
- Absolute performance can be improved by optimizing the APP kernel
- Close to 80% of the peak performance expected
- Our scheduling can be applied to similar problems
END & QUESTIONS