
Parallel Architectures: Topologies

Heiko Schröder, 2003



Presentation Transcript


  1. Parallel Architectures: Topologies Heiko Schröder, 2003

  2. Types of sequential processors (SISD) • Von Neumann bottleneck [Figure: processor and memory block diagrams, one including a cache]

  3. SIMD • MIMD • SPMD [Figure: two block diagrams. In one, PEs without their own control units are driven by a global control unit and joined by an interconnection network. In the other, each PE has its own control unit and the PEs are joined by an interconnection network.]

  4. Message passing / shared address space [Figure: two P/M organizations. In one, each PE + control unit has its own memory M and the nodes are joined by an interconnection network. In the other, processors P reach memory modules M through an interconnection network.]

  5. Various communication networks • State of the art technology • Important aspects of routing schemes • Known results (theory) • The internet

  6. Desirable features of a network • 1. Algorithmic • Low diameter (1 for the complete graph) • High bisection width (complete graph) • but the complete graph needs n(n-1)/2 edges and degree n-1 • 2. Technical • Low degree (pin limitations – constant – modular – mesh) • Short wires (mesh) • Small area (mesh) • Regular structure (mesh)

  7. Connection networks I: 1-D mesh (linear array) • Diameter: n-1 • Bisection width: 1
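The two measures used throughout these slides can be checked by brute force on small graphs. The following is a minimal sketch, not part of the original slides: the helper names (`linear_array`, `diameter`, `cut_size`) are made up for illustration, and `cut_size` only counts the edges crossing one given cut, which is an upper bound on the bisection width rather than a proof of it.

```python
from collections import deque

def linear_array(n):
    """Adjacency lists of a 1-D mesh with nodes 0..n-1 (hypothetical helper)."""
    return {v: [u for u in (v - 1, v + 1) if 0 <= u < n] for v in range(n)}

def eccentricity(graph, source):
    """Largest breadth-first-search distance from source to any node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return max(dist.values())

def diameter(graph):
    """Longest shortest path over all pairs of nodes."""
    return max(eccentricity(graph, v) for v in graph)

def cut_size(graph, half):
    """Number of edges with exactly one endpoint in the node set `half`."""
    return sum(1 for v in half for u in graph[v] if u not in half)

g = linear_array(8)
print(diameter(g))                  # n - 1 for the linear array
print(cut_size(g, set(range(4))))   # edges cut when splitting into left/right halves
```

The same `diameter` function works for any of the later topologies once they are written as adjacency lists.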

  8. Tree • Diameter: 2 log n • Bisection width: 1

  9. H-tree • Area: O(n) • Longest wire: O(√n) • Clock distribution

  10. 2-D Mesh (√n x √n) • Diameter: 2(√n - 1) • Bisection width: √n

  11. Torus • Reduced diameter • Increased bisection width • All nodes equivalent • Long wires? [Figure: a ring of nodes 1 to 8 drawn in natural order 1 2 3 4 5 6 7 8 and in the interleaved order 1 8 2 7 3 6 4 5]
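The interleaved node order on this slide is one way to answer the "Long wires?" question: placing the ring as 1, n, 2, n-1, ... keeps every ring edge short. Here is a small sketch under that reading of the figure; the function names are assumptions for illustration.

```python
def folded_order(n):
    """Interleave ring nodes 1..n from both ends: 1, n, 2, n-1, ..."""
    order = []
    lo, hi = 1, n
    while lo <= hi:
        order.append(lo)
        if lo != hi:
            order.append(hi)
        lo, hi = lo + 1, hi - 1
    return order

def longest_wire(order):
    """Longest physical distance between ring neighbours in a linear layout."""
    n = len(order)
    pos = {node: i for i, node in enumerate(order)}
    # Ring neighbour of node v is v + 1, with node n wrapping around to node 1.
    return max(abs(pos[v] - pos[v % n + 1]) for v in order)

print(folded_order(8))                        # interleaved layout
print(longest_wire(list(range(1, 9))))        # natural order: wrap-around wire spans the row
print(longest_wire(folded_order(8)))          # folded order: all wires short
```

In the natural order the wrap-around edge spans the whole row, while in the folded order no edge spans more than two positions.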

  12. 3-D Mesh (n^(1/3) nodes per side) • Diameter: 3(n^(1/3) - 1) • Bisection width: n^(2/3)

  13. Hypercube • Diameter: log n • Bisection width: n/2 [Figure: 0-D to 4-D hypercubes; nodes labelled 0, 1 in 1-D, 00 to 11 in 2-D, 000 to 111 in 3-D]
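The hypercube figures can be reproduced from the node labels alone: two nodes are adjacent exactly when their binary labels differ in one bit. The sketch below assumes nodes are numbered 0 to 2^d - 1; the function names are illustrative, and `top_bit_cut` counts the edges across one particular bisection (top bit 0 versus top bit 1).

```python
def neighbours(v, d):
    """Neighbours of node v in a d-dimensional hypercube: flip one bit."""
    return [v ^ (1 << i) for i in range(d)]

def distance(u, v):
    """Shortest-path length equals the Hamming distance of the labels."""
    return bin(u ^ v).count("1")

def top_bit_cut(d):
    """Edges between the sub-cube with top bit 0 and the one with top bit 1."""
    top = 1 << (d - 1)
    return sum(
        1
        for v in range(1 << d) if not v & top
        for u in neighbours(v, d) if u & top
    )

d = 4
n = 1 << d
print(max(distance(0, v) for v in range(n)))  # diameter: d = log n
print(top_bit_cut(d))                          # n / 2 edges cross this cut
```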

  14. Cube Connected Cycles (d-dimensional) • # nodes: n = d · 2^d • Degree: 3 • Diameter: Θ(d) = Θ(log n) • Bisection width: Θ(2^d) = Θ(n / log n)

  15. 8-node shuffle-exchange graph (nodes 000 to 111) • Edges: exchange (flip the lsb) and shuffle (rotate left or right) • Degree: 3 • Diameter: 2 log n - 1: at most (log n - 1) shuffles + (log n) exchanges • Bisection width: Θ(n / log n)

  16. 16-node shuffle-exchange graph (nodes 0000 to 1111) • Edges: exchange (flip the lsb) and shuffle (rotate left or right) • Degree: 3 • Routing from u1u2…uk to v1v2…vk, where ex = exchange and ls = left shuffle: u1u2…uk-1uk → (ex) u1u2…uk-1v1 → (ls+ex) u2…uk-1v1v2 → … → (ls+ex) v1v2…vk • Diameter: 2 log n - 1: at most (log n - 1) shuffles + (log n) exchanges • Bisection width: Θ(n / log n)
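The diameter bound comes from a simple routing scheme: fix one destination bit with an exchange, rotate, and repeat. The following is a minimal sketch of that scheme, assuming k-bit integer labels and left shuffles only; the names `exchange`, `shuffle_left` and `route` are made up for this example.

```python
def exchange(v):
    """Exchange edge: flip the least significant bit."""
    return v ^ 1

def shuffle_left(v, k):
    """Shuffle edge: rotate the k-bit label v left by one position."""
    mask = (1 << k) - 1
    return ((v << 1) & mask) | (v >> (k - 1))

def route(src, dst, k):
    """Path from src to dst that sets one destination bit per round, msb first."""
    path = [src]
    cur = src
    for j in range(k):
        if j > 0:                      # k - 1 shuffles in total
            cur = shuffle_left(cur, k)
            path.append(cur)
        want = (dst >> (k - 1 - j)) & 1
        if cur & 1 != want:            # at most k exchanges in total
            cur = exchange(cur)
            path.append(cur)
    return path

path = route(0b0000, 0b1111, 4)
print([format(v, "04b") for v in path])
print(len(path) - 1)   # number of edges used, at most 2k - 1
```

Routing 0000 to 1111 is a worst case, since every round needs an exchange.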

  17. 3-dimensional de Bruijn graph (nodes 000 to 111) • Edge labelled 0: u1u2…uk-1uk → u2u3…uk-1uk0 • Edge labelled 1: u1u2…uk-1uk → u2u3…uk-1uk1 • In-degree = out-degree = 2 • Diameter: log n • Bisection width: Θ(n / log n) • Each Eulerian tour = De Bruijn sequence = contains each possible sub-string of length 4 exactly once, e.g. 1111001011010000
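The claim about the sequence 1111001011010000 can be checked mechanically by reading it cyclically and collecting every window of length 4. This is a small sketch with illustrative function names; labels are handled as strings of '0' and '1'.

```python
def cyclic_windows(seq, k):
    """All length-k substrings of seq when it is read as a cycle."""
    doubled = seq + seq[:k - 1]
    return [doubled[i:i + k] for i in range(len(seq))]

def is_de_bruijn(seq, k):
    """True if every binary string of length k occurs exactly once in the cycle."""
    windows = cyclic_windows(seq, k)
    return len(seq) == 2 ** k and len(set(windows)) == 2 ** k

def successors(node):
    """Out-neighbours in the de Bruijn graph: drop the first bit, append 0 or 1."""
    return [node[1:] + bit for bit in "01"]

print(is_de_bruijn("1111001011010000", 4))
print(successors("011"))
```

Each window of length 4 corresponds to one edge of the 3-dimensional graph (a 3-bit node plus the appended bit), which is why an Eulerian tour yields such a sequence.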

  18. Butterfly network • FFT • routing • sorting • Unique path

  19. Benes network

  20. Diameter (log n) Bisection width ( ) Mesh of trees

  21. The Power of Hypercubes • Hamiltonian cycle • Gray codes • k-D meshes (tori), N-nodes • simulates mesh of trees • simulates hypercubic networks • contains complete binary tree, almost • normal algorithms [Figure: 4-D hypercube]

  22. Hamiltonian Cycle A hypercube contains a Hamiltonian cycle -- proof by induction. Each Hamiltonian cycle corresponds to a Gray code (only one bit is changed per link).

  23. Gray code (built by reflection) • 1 bit: 0 1 • 2 bits: 00 01 11 10 • 3 bits: 000 001 011 010 110 111 101 100
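The reflection step can be written in a few lines: prefix the current list with 0, then append the reversed list prefixed with 1. A minimal sketch, with `gray_code` as an illustrative name:

```python
def gray_code(bits):
    """Reflected Gray code as a list of bit strings."""
    codes = [""]
    for _ in range(bits):
        # Keep the old list with a leading 0, then the mirrored list with a leading 1.
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes

print(gray_code(2))
print(gray_code(3))
```

Consecutive codes, including the last and the first, differ in exactly one bit, so the list traces a Hamiltonian cycle in the hypercube.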

  24. Hypercube contains meshes/tori • Theorem: Any n1 x n2 x … x nk mesh (with or without wrap-arounds) is a sub-graph of an n-D hypercube if n1 · n2 · … · nk = 2^n. • Proof: see Leighton (each sub-cube has a Hamiltonian cycle). [Figure: 4 x 4 mesh with wrap-around, rows and columns labelled in the order 0 1 3 2: 00 01 03 02 / 10 11 13 12 / 30 31 33 32 / 20 21 23 22]
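One standard way to realize such an embedding is to Gray-code each mesh coordinate separately and concatenate the results. The sketch below assumes every side length is a power of two; `gray` and `embed` are illustrative names, not taken from the slides or from Leighton.

```python
def gray(i):
    """i-th word of the reflected Gray code, as an integer."""
    return i ^ (i >> 1)

def embed(coords, bits_per_dim):
    """Hypercube label of a mesh node: concatenated Gray codes of its coordinates."""
    label = 0
    for c, b in zip(coords, bits_per_dim):
        label = (label << b) | gray(c)
    return label

def hamming(a, b):
    return bin(a ^ b).count("1")

# 4 x 4 torus into a 4-D hypercube: every mesh edge, including wrap-arounds,
# should map to a hypercube edge (labels differing in exactly one bit).
side, bits = 4, (2, 2)
ok = all(
    hamming(embed((r, c), bits), embed((r, (c + 1) % side), bits)) == 1
    and hamming(embed((r, c), bits), embed(((r + 1) % side, c), bits)) == 1
    for r in range(side) for c in range(side)
)
print(ok)
```

With 2 bits per coordinate, row 2 receives Gray value 3, which matches the 0 1 3 2 ordering of rows and columns in the slide's figure.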

  25. Hypercube contains double-rooted trees • HC can implement all tree algorithms and also all mesh-of-tree algorithms (possibly with minor delay). [Figure label: double-roots (different dimension)]

  26. Normal algorithms • A hypercube algorithm is said to be normal if • only one dimension of hypercube edges is used at any step and • if consecutive dimensions are used in consecutive steps. • Most hypercube algorithms are normal. • Normal algorithms can be embedded efficiently on hypercubic networks
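A global sum is a typical normal algorithm: step i uses only the edges of dimension i, and the dimensions are visited in consecutive order. The sketch below simulates this sequentially on a list, assuming node v holds `values[v]` and the node count is a power of two; `hypercube_sum` is an illustrative name.

```python
def hypercube_sum(values):
    """All-to-all sum on a hypercube, simulated one dimension per step."""
    n = len(values)
    d = n.bit_length() - 1
    if n != 1 << d:
        raise ValueError("number of nodes must be a power of two")
    vals = list(values)
    for dim in range(d):   # dimensions 0, 1, ..., d-1 in consecutive steps
        # Every node exchanges its partial sum with its neighbour across `dim`.
        vals = [vals[v] + vals[v ^ (1 << dim)] for v in range(n)]
    return vals

print(hypercube_sum([1, 2, 3, 4, 5, 6, 7, 8]))
```

After log n steps every node holds the total, and each step touched a single dimension, which is the property that lets such algorithms run on hypercubic networks.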

  27. Josephus graph • Every even node k is connected to k+2i-3 • Diameter: about (log n) / 2 [Figure: 32 nodes, numbered 0 to 31, arranged on a ring]

  28. Star graph • Set of nodes: permutations of k elements; k! nodes of degree k-1 • Set of edges: exchange of the first element with one other • Small degree, diameter about 2 log n • Open problems: e.g. are there (k-1)/2 edge-disjoint Hamiltonian cycles? • Number of nodes versus degree: Star: 24, 120, 720, 5040, 40320, 362880; HC: 16, 32, 64, 128, 256, 512 [Figure: the 24-node star graph on the permutations of 1234]
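The node and edge definitions translate directly into code. This is a small sketch that builds the star graph on permutations of 1..k as tuples; the function names are assumptions for illustration.

```python
from itertools import permutations
from math import factorial

def star_neighbours(perm):
    """Swap the first element with each of the other k - 1 positions."""
    out = []
    for i in range(1, len(perm)):
        p = list(perm)
        p[0], p[i] = p[i], p[0]
        out.append(tuple(p))
    return out

def star_graph(k):
    """Adjacency lists of the star graph on all permutations of 1..k."""
    return {p: star_neighbours(p) for p in permutations(range(1, k + 1))}

g = star_graph(4)
print(len(g))                                  # k! nodes
print({len(nbrs) for nbrs in g.values()})      # every node has degree k - 1
print([factorial(k) for k in range(4, 10)])    # star graph sizes for degree 3..8
```

For comparison, a hypercube of degree d has only 2^d nodes, so the star graph packs far more nodes at the same degree.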

  29. Pin limitations [Figure: 4-D hypercube; labels 12, 192, 256, 16, 16, 1]

  30. Wiring limitations • 2^16 nodes • bisection width: 256 versus 32 K • 25 cm versus 32 m [Figure: 4-D hypercube; labels 12, 1]

  31. Improve the topology? The internet

  32. Against parallelism • cost(large) < cost(2 small) • all the FORTRAN / C software • let's stick to pipelining • let's wait for faster machines • Amdahl's Law
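The slide only names Amdahl's Law, so the following is the usual textbook statement rather than anything taken from the deck: if a fraction p of the work can be parallelized over N processors, the speedup is 1 / ((1 - p) + p / N). The function name is illustrative.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's Law for a given parallelizable fraction."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / processors)

for n in (10, 100, 1000):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

With 10% serial work the speedup stays below 10 no matter how many processors are added, which is the argument the slide alludes to.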
