Explore the use of superclusters in astrophysical research and special effects, with a focus on the Swinburne Centre for Astrophysics and Supercomputing. Discover the benefits, challenges, and breakthroughs in using superclusters for advanced computational tasks.
Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing
Outline • No: • Linpack Mflops • latencies • bandwidths • evangelism • Why a Supercluster? • What is the Supercluster? • How do we use the Supercluster? • What does it do?
Why a Supercluster? • Swinburne wants reputation. • Hypothesis: • 30 times the power • Six years of Moore’s law • We can do problems 30x as complex as other groups.
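The "30 times the power = six years of Moore's law" equivalence can be checked with a short calculation. This is a minimal sketch that assumes Moore's law means a fixed performance doubling time; the function names are illustrative and not from the talk.

```python
import math

def doubling_time_years(speedup, years):
    """Doubling time implied by gaining `speedup` over `years` of steady exponential growth."""
    return years * math.log(2) / math.log(speedup)

def years_to_match(speedup, doubling_time):
    """Years of exponential growth needed to reach `speedup` at a given doubling time."""
    return doubling_time * math.log2(speedup)

# Figures from the slide: 30x the power, equated with six years of Moore's law.
implied = doubling_time_years(30, 6)
print(f"Implied doubling time: {implied:.2f} years")

# Assumption: the commonly quoted 18-month doubling time, for comparison.
print(f"Years to reach 30x at 18 months per doubling: {years_to_match(30, 1.5):.1f}")
```

The slide's figures imply a doubling time of roughly 1.2 years; with an 18-month doubling time the same factor of 30 would take a little over seven years.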
Centre Goals: • Fundamental Research. • Public Outreach and Education. • Commercial Supercomputing. • Astrophysical Special Effects • Cluster Monitoring Tools • Commercial Rendering
What is the Supercluster? • Supercluster sounds better than Beowulf if you are an astronomer. • Design Goals SSI I (1998): • No one component worth more than A$10K • Order of magnitude more power than a single workstation. • Dedicated resource. (dispel various myths) • 10 GB scratch/node. • 10 MB/s IO node-node. • Decent Fortran/C/C++ compiler.
Case Study: CSIRO Astronomy • 1984: VAX 11/780 • 1989: Convex C2 ( > 10 times speed up) • 1995: Power Challenge ( 10 processors ) • 1999: Linux Boxes • If a package does not support parallelism, users won’t use clusters (or even SMP/NUMA) unless their science is obviously constrained.
Theorists: • Possess and use clusters effectively. • Know what MPI is. • Can’t get money.
SSI I (Jan 1998) • 16 DEC 500 MHz Alphas • 2 MB cache • 192 MB RAM • 13 GB disk • 24-port CISCO switch • MPICH/f77/C++/FFTW/emacs/gcc Zeroth Law of Cluster Computing: Cluster Computing is inevitable if your budget is finite.
SSI II (Nov 1998). • SSI I + 8 x 600 MHz DECs with 4 MB cache. First Law of Cluster Computing: Your cluster soon becomes heterogeneous. Corollary: Your first cluster is your happiest.
SSI III (March 1999) • SSI II + • 41 x 500 MHz ev6 processors • 512 MB RAM/node • 18 GB disk/node • CISCO 5500 switch • 3.2 Gb/s backplane • Virtual Reality Theatrette • Seats 37 Second Law of Cluster Computing: MTBF = MTBF0/N
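The Second Law can be illustrated numerically. This is a rough sketch that assumes independent, identical nodes, so that the expected time to the first failure anywhere in the cluster is the single-node MTBF divided by N. The 50,000-hour node MTBF is a hypothetical figure chosen only for illustration, and the node counts assume one processor per node (16 for SSI I, 24 for SSI II, 65 for SSI III).

```python
def cluster_mtbf(node_mtbf_hours, n_nodes):
    """Second Law of Cluster Computing: MTBF = MTBF0 / N."""
    if n_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return node_mtbf_hours / n_nodes

NODE_MTBF_HOURS = 50_000  # hypothetical single-node MTBF, not from the talk

for name, nodes in [("SSI I", 16), ("SSI II", 24), ("SSI III", 65)]:
    hours = cluster_mtbf(NODE_MTBF_HOURS, nodes)
    print(f"{name}: {nodes} nodes, about {hours / 24:.0f} days between failures")
```

Under these assumptions a node that fails once in several years becomes a cluster that needs attention about once a month.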
How do we use the Supercluster? • Linux Workstations. (despite free OS) • No batch system (just 3 “power” users). • Home-grown MPI programs. • C++/Fortran/Java.
Problems: • Distributed TB disk rarely has > 10% free. • MPI hangs on FPE or “p4pg” errors. • CPUs too powerful for fast ethernet and tape drive on some applications. • Difficult to monitor.
Applications. • Neutron Star Searches. • Looked at 10% of the Southern Sky • Recorded 1.4 TB in 21 days. • 1 ev56 workstation would take 7 years. • SSI III took 25 days. • Discovered 7 “millisecond” pulsars. • Could scale to 1000 nodes on TCP/IP. [Figure: processing pipeline; labels: 17 MB, 256 MB, FFT, Search, Fold, Save]
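The search figures above reduce to two simple ratios: the mean recording rate and the speedup of SSI III over a single workstation. The sketch below only reworks the numbers quoted on the slide; it assumes 1 TB = 10^12 bytes and a 365.25-day year, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25

def mean_rate_mb_per_s(total_tb, days):
    """Average data rate in MB/s for `total_tb` terabytes recorded over `days` days."""
    return total_tb * 1e6 / (days * SECONDS_PER_DAY)

def speedup(single_machine_years, cluster_days):
    """Ratio of single-workstation run time to cluster run time."""
    return single_machine_years * DAYS_PER_YEAR / cluster_days

# Slide figures: 1.4 TB in 21 days; 7 years on one ev56 versus 25 days on SSI III.
print(f"Mean recording rate: {mean_rate_mb_per_s(1.4, 21):.2f} MB/s")
print(f"Speedup over one ev56 workstation: {speedup(7, 25):.0f}x")
```

The speedup of about 100 exceeds the processor count because most SSI III processors are faster than the ev56 used as the baseline.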
Discovery Implications: • Discovered the most relativistic neutron star + white dwarf binary known. • Emits gravitational waves • Will coalesce in 7 Gyr. • Population of ultra-relativistic systems.
Problems. • Most interesting systems are relativistic. • Full sensitivity requires coherent addition. • If observation time > 10 minutes, computational penalty becomes very large.
Coherent Dedispersion. • Problem: • Cosmic signals are weak • Cosmic radio signals propagate at v ≠ c • In 1971 a new method was proposed: • Record the electric field • Apply a numerical filter to it.
What does this mean? • 20 MHz = 20 MB/second. • 200 times real time to process (ev6) • Gives 50 nanosecond time resolution • Need 7 x 8-hour observations to do science • One node: 1.5 yr • 50 nodes: 9 days • 1985 VAX 11/780: one century
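The processing-time estimates on this slide follow from the data rate and the "200 times real time" figure. The sketch below reworks that arithmetic, assuming perfect scaling across nodes and no I/O overhead; the function names are illustrative.

```python
def data_volume_tb(rate_mb_per_s, obs_hours):
    """Raw data volume in TB (10^12 bytes) for a given recording rate and duration."""
    return rate_mb_per_s * obs_hours * 3600 / 1e6

def processing_days(obs_hours, slowdown, n_nodes=1):
    """Wall-clock days to process `obs_hours` of data at `slowdown` times real time."""
    return obs_hours * slowdown / n_nodes / 24

obs_hours = 7 * 8  # seven 8-hour observations, from the slide

print(f"Data volume: {data_volume_tb(20, obs_hours):.2f} TB")
print(f"One node: {processing_days(obs_hours, 200) / 365.25:.2f} yr")
print(f"50 nodes: {processing_days(obs_hours, 200, 50):.1f} days")
```

The idealised results (about 1.3 years on one node and about 9 days on 50 nodes) are close to the slide's rounded figures.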
Discovered? • Millisecond pulsars emit short (1 µs wide) pulses across GHz bandwidths • Implies seed areas of 30 cm or less • PSR 0437-4715 in a 5.7 day orbit • 1 Mkm in radius • a − b = 180.1 mm [Figure: orbit diagram with axes labelled a and b]
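The a − b figure shows how nearly circular this orbit is. The sketch below assumes that a and b are the semi-major and semi-minor axes of the orbit (the slide does not say so explicitly) and treats the rounded "1 Mkm" radius as a, so the result is only an order-of-magnitude estimate. The function name is illustrative.

```python
import math

def eccentricity_from_axis_difference(a_m, diff_m):
    """Eccentricity of an ellipse with semi-major axis a and a - b = diff.

    Uses e^2 = 1 - (b/a)^2 = (a - b)(a + b) / a^2, which avoids the loss of
    precision from subtracting two nearly equal numbers.
    """
    b_m = a_m - diff_m
    return math.sqrt(diff_m * (a_m + b_m)) / a_m

A_METRES = 1e9            # "1 Mkm in radius", from the slide (rounded)
A_MINUS_B_METRES = 0.1801 # "a - b = 180.1 mm", from the slide

e = eccentricity_from_axis_difference(A_METRES, A_MINUS_B_METRES)
print(f"Implied eccentricity: {e:.2e}")
```

Under these assumptions the implied eccentricity is of order 10^-5.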
Future: • Search for µs-wide pulses in SN 1987A • 25 day search • HIPASS - 600 GB in < 12 hours. • SSI III + servernet can mimic CSIRO’s correlator • SSI IV: • ES40 + TB disk • SSI V: • 128 nodes + InfiniBand/servernet II?
Conclusions: • Clusters are too hard to code for most astronomers. MPI what? • Breakthroughs are possible with radical increases in computer power. www.swin.edu.au/astronomy