
Real Parallel Computers

Learn about the background, recent trends, and performance development in high-performance computing, including clusters, GPUs, and advanced architectures such as BlueGene and the Earth Simulator.


Presentation Transcript


  1. Real Parallel Computers

  2. Background information: Strohmaier, Dongarra, Meuer, and Simon, "Recent trends in the marketplace of high performance computing," Parallel Computing, 2005

  3. Short history of parallel machines
  • 1970s: vector computers
  • 1990s: Massively Parallel Processors (MPPs)
    • Standard microprocessors, special network and I/O
  • 2000s:
    • Cluster computers (using standard PCs)
    • Advanced architectures (BlueGene)
    • Comeback of the vector computer (Japanese Earth Simulator)
    • GPUs, IBM Cell/BE

  4. Performance development and predictions

  5. Clusters
  • Cluster computing
    • Standard PCs or workstations connected by a fast network
    • Good price/performance ratio
    • Exploit existing (idle) machines or use (new) dedicated machines
  • Cluster computers versus supercomputers (MPPs)
    • Processing power is similar: both are based on microprocessors
    • Communication performance was the key difference
    • Modern networks (Myrinet, Infiniband) have bridged this gap

  6. Overview
  • Cluster computers at our department:
    • DAS-1: 128-node Pentium Pro / Myrinet cluster (gone)
    • DAS-2: 72-node dual-Pentium-III / Myrinet-2000 cluster
    • DAS-3: 85-node dual-core dual-Opteron / Myrinet-10G cluster
  • Part of a wide-area system: the Distributed ASCI Supercomputer

  7. Distributed ASCI Supercomputer (1997-2001)

  8. DAS-1 node configuration
  • 200 MHz Pentium Pro
  • 128 MB memory
  • 2.5 GB disk
  • Fast Ethernet 100 Mbit/s
  • Myrinet 1.28 Gbit/s (full duplex)
  • Operating system: Red Hat Linux

  9. DAS-2 Cluster (2002-2006)
  • 72 nodes, each with 2 CPUs (144 CPUs in total)
  • 1 GHz Pentium III
  • 1 GB memory per node
  • 20 GB disk
  • Fast Ethernet 100 Mbit/s
  • Myrinet-2000 2 Gbit/s (crossbar)
  • Operating system: Red Hat Linux
  • Part of the wide-area DAS-2 system (5 clusters with 200 nodes in total)
  [Photos: Ethernet switch, Myrinet switch]

  10. DAS-3 Cluster (Sept. 2006)
  • 85 nodes, each with 2 dual-core CPUs (340 cores in total)
  • 2.4 GHz AMD Opterons (64-bit)
  • 4 GB memory per node
  • 250 GB disk
  • Gigabit Ethernet
  • Myrinet-10G 10 Gb/s (crossbar)
  • Operating system: Scientific Linux
  • Part of the wide-area DAS-3 system (5 clusters with 263 nodes in total), using the SURFnet6 optical network with 40-80 Gb/s wide-area links

  11. DAS-3 Networks [diagram]
  • 85 compute nodes, each with a 1 Gb/s Ethernet link and a 10 Gb/s Myrinet link
  • Ethernet: Nortel 5530 + 3 * 5510 switch stack, with a 1 or 10 Gb/s campus uplink
  • Myrinet: Myri-10G switch with a 10 Gb/s Ethernet blade, connected to the Ethernet stack by 8 * 10 Gb/s fiber links
  • Wide area: Nortel OME 6500 with DWDM blade, 80 Gb/s DWDM to SURFnet6
  • Headnode (10 TB mass storage)

  12. DAS-1 Myrinet
  Components:
  • 8-port switches
  • Network interface card for each node (on the PCI bus)
  • Electrical cables: reliable links
  Myrinet switches:
  • 8 x 8 crossbar switch
  • Each port connects to a node (network interface) or to another switch
  • Source-based, cut-through routing
  • Less than 1 microsecond switching delay
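  With source-based routing, the sender prepends the complete path (one output-port number per hop) to each packet; every switch strips the leading header byte and immediately forwards the rest on that port, which is what makes sub-microsecond cut-through switching possible. A minimal Python sketch of the idea (the Switch class and host names are illustrative, not Myrinet's actual interface):

    # Source-based routing sketch: the sender computes the whole path up
    # front; each switch just pops the next output port off the header.
    # The Switch class and "hostB" are illustrative, not Myrinet's real API.

    class Switch:
        def __init__(self, name, num_ports=8):
            self.name = name
            self.ports = [None] * num_ports  # port number -> Switch or host name

        def connect(self, port, neighbor):
            self.ports[port] = neighbor

        def forward(self, route, payload):
            # Cut-through flavor: inspect only the first header byte,
            # strip it, and forward immediately on that output port.
            next_port, rest = route[0], route[1:]
            target = self.ports[next_port]
            if isinstance(target, Switch):
                target.forward(rest, payload)
            else:
                print(f"{target} received: {payload}")

    # Path: host A -> sw1 (out port 7) -> sw2 (out port 3) -> host B
    sw1, sw2 = Switch("sw1"), Switch("sw2")
    sw1.connect(7, sw2)
    sw2.connect(3, "hostB")
    sw1.forward([7, 3], "hello")  # source injects packet with route [7, 3]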

  13. 24-node DAS-1 cluster

  14. 128-node DAS-1 cluster
  • A ring topology would need:
    • 22 switches (each 8-port switch spends 2 ports on ring links, leaving 6 for nodes: ceil(128/6) = 22)
    • Poor diameter: 11
    • Poor bisection width: 2 (cutting the ring into halves severs only 2 links)

  15. Topology of the 128-node cluster
  • 4 x 8 grid with wrap-around (a torus)
  • Each switch is connected to 4 other switches and to 4 PCs
  • 32 switches (128/4)
  • Diameter: 6
  • Bisection width: 8
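  The diameter figures for both candidate switch topologies are easy to verify by brute force. A small self-contained Python sketch (BFS over hand-built adjacency lists; the function names are ours, not from the original slides):

    # Verify the diameter claims for the two switch topologies by brute
    # force: BFS from every vertex, take the longest shortest path.
    from collections import deque

    def diameter(adj):
        best = 0
        for src in adj:
            dist = {src: 0}
            queue = deque([src])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        queue.append(v)
            best = max(best, max(dist.values()))
        return best

    def ring(n):
        return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

    def torus(rows, cols):
        return {(r, c): [((r - 1) % rows, c), ((r + 1) % rows, c),
                         (r, (c - 1) % cols), (r, (c + 1) % cols)]
                for r in range(rows) for c in range(cols)}

    print(diameter(ring(22)))      # 11
    print(diameter(torus(4, 8)))   # 6

    # Bisection width counts the links cut by the best equal split:
    # a ring always loses just 2 links, while splitting the 4x8 torus
    # between columns cuts 4 rows * 2 links (grid edge + wrap-around) = 8.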

  16. Performance
  • DAS-2:
    • 9.6 μsec 1-way null-latency
    • 168 MB/sec throughput
  • DAS-3:
    • 2.6 μsec 1-way null-latency
    • 950 MB/sec throughput
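  Numbers like these are conventionally measured with a ping-pong microbenchmark: one process sends a message, the other echoes it back, and half the round-trip time is the one-way latency. A sketch using mpi4py (illustrative only, not the benchmark actually used for the DAS figures; run with mpirun -np 2):

    # Ping-pong microbenchmark sketch (mpi4py): one-way latency from a
    # zero-byte round trip, throughput from a 1 MB round trip.
    from mpi4py import MPI
    import time

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    REPS = 1000

    def one_way_time(buf):
        comm.Barrier()
        start = time.perf_counter()
        for _ in range(REPS):
            if rank == 0:
                comm.Send(buf, dest=1)
                comm.Recv(buf, source=1)
            else:
                comm.Recv(buf, source=0)
                comm.Send(buf, dest=0)
        return (time.perf_counter() - start) / (2 * REPS)

    lat = one_way_time(bytearray(0))         # "null" (empty) message
    bw_t = one_way_time(bytearray(1 << 20))  # 1 MB message
    if rank == 0:
        print(f"1-way null-latency: {lat * 1e6:.2f} usec")
        print(f"throughput: {(1 << 20) / bw_t / 1e6:.1f} MB/sec")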

  17. MareNostrum: a large Myrinet cluster
  • IBM system at the Barcelona Supercomputing Center
  • 4812 PowerPC 970 processors, 9.6 TB memory (2006)
