Claude Tadonki, Mines ParisTech – CRI – Mathématiques et Systèmes, Laboratoire de l’Accélérateur Linéaire/IN2P3/CNRS, France, claude.tadonki@u-psud.fr. 2nd Workshop on Architecture and Multi-Core Applications, 23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2011), October 26–29, 2011, Vitória, Espírito Santo, Brazil.
Large Scale Kronecker Product on Supercomputers (C. Tadonki)
The Kronecker product (definition and applications)
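As a reminder of the definition, here is a minimal pure-Python sketch of the Kronecker product of two dense matrices. The function name `kron` and the list-of-rows representation are illustrative assumptions, not taken from the slides.

```python
def kron(A, B):
    """Kronecker product of two dense matrices given as lists of rows.

    If A is m-by-n and B is p-by-q, the result is (m*p)-by-(n*q), and its
    entry at row i*p + k, column j*q + l equals A[i][j] * B[k][l].
    """
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in A
        for row_b in B
    ]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
C = kron(A, B)  # 4-by-4 matrix made of the blocks A[i][j] * B
```

Each entry of `A` scales a full copy of `B`, so the size of the result grows multiplicatively with every extra factor.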
The Kronecker product (properties and problem formulation)
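For reference, these are two standard identities that typically underlie this kind of formulation, followed by a common statement of the problem. The notation is an assumption made here for illustration (square factors $A^{(i)}$ of order $n_i$, with $L = n_1 n_2 \cdots n_N$) and is not taken from the slide itself.

```latex
% Mixed-product property (whenever the products AC and BD are defined)
(A \otimes B)(C \otimes D) = (AC) \otimes (BD)

% Normal factorization of a Kronecker product of N square matrices
\bigotimes_{i=1}^{N} A^{(i)}
  = \prod_{i=1}^{N}
    \left( I_{n_1 \cdots n_{i-1}} \otimes A^{(i)} \otimes I_{n_{i+1} \cdots n_N} \right)

% Problem (assumed form): given a row vector x of length L, compute
y = x \left( \bigotimes_{i=1}^{N} A^{(i)} \right)
```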
The Kronecker product (complexity and recurrence equation)
Forming the matrix first would
• require a huge amount of memory
• yield many redundant multiplications
Using the so-called normal factorization, we can derive an optimal scheme that reduces the number of floating-point multiplications.
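The difference between the two approaches can be sketched as follows, assuming square factors and a row vector `x` of length `L` (the product of the orders). The function names are hypothetical. The operation counts in the comments (`L*L` for the explicit matrix, `L * (n_1 + ... + n_N)` for the factor-by-factor scheme) are the standard ones for this technique and are an assumption about what the slide displayed.

```python
from functools import reduce


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def explicit_multiply(x, mats):
    """Reference: build the full L-by-L matrix K, then compute x * K.

    Needs L*L storage and L*L multiplications.
    """
    K = reduce(kron, mats)
    L = len(x)
    return [sum(x[i] * K[i][j] for i in range(L)) for j in range(L)]


def factored_multiply(x, mats):
    """Compute x * (mats[0] (x) ... (x) mats[-1]) one factor at a time.

    The vector is viewed as a tensor of shape (left, n, right); each factor
    is applied along its own axis. Returns the result and the number of
    scalar multiplications performed, which is L * sum of the orders.
    """
    L = len(x)
    y = list(x)
    mults = 0
    left = 1  # product of the orders of the factors already applied
    for A in mats:
        n = len(A)
        right = L // (left * n)  # product of the orders still to come
        z = [0] * L
        for a in range(left):
            for c in range(right):
                base = a * n * right + c
                for j in range(n):
                    s = 0
                    for i in range(n):
                        s += y[base + i * right] * A[i][j]
                        mults += 1
                    z[base + j * right] = s
        y = z
        left *= n
    return y, mults


mats = [
    [[1, 2], [3, 4]],
    [[0, 1], [1, 0]],
    [[2, 0, 1], [1, 1, 0], [0, 3, 1]],
]
x = list(range(1, 13))  # L = 2 * 2 * 3 = 12
y, mults = factored_multiply(x, mats)
```

In this small case the factored scheme uses 12 × 7 = 84 multiplications against 12 × 12 = 144 for the explicit product, and the gap widens quickly as the number and size of the factors grow.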
The Kronecker product and its applications
Performance issues and a heuristic for finding a good topology
The total (parallel) execution time depends on
• the sizes of the matrices
• the gap between the virtual topology and the physical topology
• the way the task is split among the processors (decomposition)
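To make the notion of decomposition concrete, the sketch below enumerates candidate virtual topologies for a given number of processors. It is not the author's heuristic, which the slides do not spell out. It assumes, purely for illustration, that a topology is a tuple `(p_1, ..., p_N)` whose product is the processor count and where each `p_i` divides the order of the i-th matrix. The function name `candidate_topologies` is hypothetical.

```python
def candidate_topologies(orders, num_procs):
    """Yield tuples (p_1, ..., p_N) with p_1 * ... * p_N == num_procs
    and p_i dividing orders[i] (an illustrative constraint, see above)."""

    def rec(i, remaining, prefix):
        if i == len(orders):
            # Accept only if every processor has been assigned.
            if remaining == 1:
                yield tuple(prefix)
            return
        for p in range(1, remaining + 1):
            if remaining % p == 0 and orders[i] % p == 0:
                yield from rec(i + 1, remaining // p, prefix + [p])

    yield from rec(0, num_procs, [])


# Two matrices of orders 4 and 6 distributed over 4 processors.
grids = list(candidate_topologies([4, 6], 4))
```

A heuristic would then rank these candidates with a cost model covering computation and communication. That model is deliberately left out here, since it depends on the target machine.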
Performance
We consider N = 6 matrices of orders 30, 36, 32, 18, 24, 16, thus L = 159 252 480
We see that
• our heuristic yields a significant improvement compared to trivial decompositions
• we start losing scalability as the number of cores increases (communication)
We then turn to a hybrid implementation
Performance of the hybrid implementation
We see that
• the hybrid implementation is better for larger numbers of cores
• for smaller numbers of cores, the shared-memory (SM) implementation suffers from more cache misses
We need to investigate the compromise between the two and a better memory layout.
END & QUESTIONS