Explore the DAS-3 and StarPlane projects, which show that distributed clusters can serve parallel applications and grid computing effectively. The slides cover the project timeline, the cluster and network architecture, and the applications that drive the work.
DAS-3 and StarPlane have Landed: Architecture, Status ...
Freek Dijkstra
DAS history
• Project to prove distributed clusters are as effective as supercomputers
• A simple Computer Science grid that works
Parallel to Distributed Computing
Cluster Computing
• Parallel languages (Orca, Spar)
• Parallel applications
Distributed Computing
• Parallel processing on multiple clusters
• Study non-trivially parallel applications
• Exploit hierarchical structure for locality optimizations
Grid Computing
DAS-2 Usage
• 200 users; 25 Ph.D. theses
• Simple, clean, laboratory-like system
Example applications:
• Solving Awari (a 3500-year-old game)
• HIRLAM: weather forecasting
• GRAPE: simulation hardware for astrophysics
• Manta: distributed supercomputing in Java
• Ensflow: stochastic ocean flow model
http://www.cs.vu.nl/das2/
Grid Computing
• Ibis: Java-centric grid computing
• Satin: divide-and-conquer on grids (pattern sketched below)
• Zorilla: P2P distributed supercomputing
• KOALA: co-allocation of grid resources
• CrossGrid: interactive simulation and visualization of a biomedical system
• VL-e: scientific collaboration using the grid (e-Science)
• LambdaRAM: share memory among cluster nodes
(Stack diagram: Applications on top of Grid Middleware on top of Computing Clusters + Network)
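Satin expresses grid applications as divide-and-conquer jobs. A minimal sketch of that pattern in plain Java follows; it uses the standard ForkJoinPool as a stand-in for Satin's spawn/sync primitives, so it is an illustration of the programming model, not Satin code.

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    // Divide-and-conquer sketch: sum an array by recursively splitting the range.
    // ForkJoinPool's fork/join plays the role of Satin's spawn/sync here.
    public class DivideAndConquerSketch {

        static class RangeSum extends RecursiveTask<Long> {
            private final long[] data;
            private final int lo, hi;

            RangeSum(long[] data, int lo, int hi) {
                this.data = data;
                this.lo = lo;
                this.hi = hi;
            }

            @Override
            protected Long compute() {
                if (hi - lo <= 1024) {            // small enough: sum sequentially
                    long sum = 0;
                    for (int i = lo; i < hi; i++) sum += data[i];
                    return sum;
                }
                int mid = (lo + hi) / 2;
                RangeSum left = new RangeSum(data, lo, mid);
                RangeSum right = new RangeSum(data, mid, hi);
                left.fork();                       // "spawn" the left half
                long rightSum = right.compute();   // compute the right half locally
                return left.join() + rightSum;     // "sync": wait for the spawned task
            }
        }

        public static void main(String[] args) {
            long[] data = new long[1_000_000];
            for (int i = 0; i < data.length; i++) data[i] = i;
            long sum = new ForkJoinPool().invoke(new RangeSum(data, 0, data.length));
            System.out.println("sum = " + sum);
        }
    }

On a grid, a runtime such as Satin distributes the spawned subtasks over cluster nodes instead of local threads; the application code keeps the same recursive shape.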
Colourful Future: DAS-3 Timeline
• Autumn 2004: DAS-3 proposal initiated
• Summer 2005: proposal accepted
• September 2005: European tender preparation
• December 2005: tender call
• February 2006: five proposals received
• April 2006: ClusterVision chosen
• June 2006: pilot cluster at VU
• August 2006: intended installation
• End of 2006: official end of DAS-2
Funding: NWO, NCF, VL-e (UvA, Delft, part VU), MultimediaN (UvA), Universiteit Leiden
DAS-2 Cluster (diagram)
• Head node, connected at 1 Gbit/s Ethernet to the local university and the wide-area interconnect
• 32-72 compute nodes
• Local interconnect: 100 Mb/s Ethernet
• Fast interconnect: 2 Gbit/s Myrinet
DAS-3 Cluster (diagram)
• Head node, with 10 Gbit/s Ethernet connections to SURFnet and to the local university
• 32-85 compute nodes
• Local interconnect: 1 Gbit/s Ethernet into a Nortel switch with 10 Gbit/s uplinks
• Fast interconnect: 10 Gbit/s Myrinet
Problem space (diagram): CPU, Data, and Network dimensions; DAS-2 compared with DAS-3 & StarPlane.
SURFnet6 in The Netherlands
SURFnet connects about 180 institutions:
• universities;
• academic hospitals;
• most polytechnics;
• research centers.
A user base of roughly 750,000 users and about 6000 km of fiber, comparable in length to the Dutch railway network.
Common Photonic Layer (CPL)
• 5 rings
• Initially 36 lambdas (4x9)
• Later 72 lambdas (8x9)
• Throughput of each lambda is currently up to 10 Gb/s
• Later up to 40 Gb/s per lambda
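For scale, assuming every lambda on a fiber is lit: 36 lambdas x 10 Gb/s gives up to 360 Gb/s per fiber today, and the later 72 lambdas x 40 Gb/s gives up to 2.88 Tb/s.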
Quality of Service (QoS) by providing wavelengths
Old quality of service:
• One fiber with a single lambda
• Set part of it aside on request
• The rest gets less service
New quality of service:
• One fiber with multiple lambdas (separate colours)
• Move requests to other lambdas as needed
• The rest also gets happier!
StarPlane Topology
• 4 DAS-3 sites, with 5 clusters
• Interconnected with 4 to 8 dedicated lambdas of 10 Gb/s each
• Same fiber as used for the regular Internet
External connectivity:
• Grid 5000
• GridLab
• Media archives in Hilversum
StarPlane Project
• StarPlane will use the SURFnet6 infrastructure to interconnect the DAS-3 sites
• The novelty: giving flexibility directly to the applications by allowing them to choose the logical topology in real time
• Ultimately, configuration within subseconds
People and timeline:
• 1 postdoc, 1 PhD student (AIO), 1 scientific programmer (Jason Maassen - VU; Li Xu - UvA; JP Velders - UvA)
• February 2006 - February 2010
Funding:
• NWO, with major contributions from SURFnet and Nortel.
Application - Network Interaction (diagram): the application sends a request ("start", "ring", "full mesh") to the control plane; the control plane configures the network; the application then uses it.
Application - Network Interaction (diagram): two timelines. With application-initiated network configuration, App1, App2, and App3 each request the network themselves; with workflow-initiated network configuration, a workflow manager requests the network on behalf of App1, App2, and App3 (sketched below).
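A sketch of how an application, or a workflow manager acting on its behalf, might drive this interaction. The ControlPlane interface, its methods, and the simulated behaviour are illustrative assumptions, not the actual StarPlane management interface.

    // Hypothetical sketch of the application-network interaction above.
    public class TopologyRequestSketch {

        /** Stand-in for the photonic control plane (assumed interface). */
        interface ControlPlane {
            /** Ask for a logical topology ("start", "ring", "full mesh", ...);
             *  returns once the lambdas have been configured. */
            void requestTopology(String topology) throws InterruptedException;

            /** Release the topology when the application or workflow step is done. */
            void releaseTopology();
        }

        /** Dummy implementation that only simulates sub-second switching. */
        static class SimulatedControlPlane implements ControlPlane {
            public void requestTopology(String topology) throws InterruptedException {
                System.out.println("configuring lambdas for topology: " + topology);
                Thread.sleep(500);   // target: sub-second lambda switching
                System.out.println("topology ready");
            }
            public void releaseTopology() {
                System.out.println("topology released");
            }
        }

        public static void main(String[] args) throws InterruptedException {
            ControlPlane cp = new SimulatedControlPlane();

            // Application-initiated: the application asks for a full mesh,
            // runs its communication phase, then releases the bandwidth.
            cp.requestTopology("full mesh");
            System.out.println("application communicates over the full mesh");
            cp.releaseTopology();

            // Workflow-initiated: a workflow manager would issue the same calls
            // on behalf of App1, App2, App3 as each step of the workflow starts.
            cp.requestTopology("ring");
            System.out.println("workflow step runs over the ring");
            cp.releaseTopology();
        }
    }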
StarPlane Applications
• Large 'stand-alone' file transfers
  - User-driven file transfers
  - Nightly backups
  - Transfer of medical data files (MRI)
• Large-file (speedier) stage-in/stage-out
  - MEG modeling (magnetoencephalography)
  - Analysis of video data
• Applications with static bandwidth requirements
  - Distributed game-tree search
  - Remote data access for analysis of video data
  - Remote visualization
• Applications with dynamic bandwidth requirements
  - Remote data access for MEG modeling
  - SCARI
Conclusions
• This fall, DAS-3 will be available at a university near you
• StarPlane allows applications to configure the network
• We aim for fast (subsecond) lambda switching
• Workflow systems and/or applications need to become network-aware
• For details: see the StarPlane poster this evening!
DAS-3 and StarPlane have Landed: Architecture, Status ...
... and Application Research
Network Memory
• The LambdaRAM software uses memory in the local cluster as a local cache (idea sketched below)
• Faster than caching on disk (access time ~1 ms over the network vs ~10 ms for disk)
(Figure: a very high-resolution remote image; the blue box marks the active, visualized zoom region; the green area is cached on other cluster nodes)
http://www.evl.uic.edu/cavern/optiputer/lambdaram.html
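A minimal sketch of the network-memory idea, assuming a simple block interface: blocks are kept in an in-memory LRU cache and, on a miss, fetched from another node's RAM instead of from disk. Class and method names are illustrative; this is not the LambdaRAM implementation.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Sketch: cache data blocks in cluster memory; on a miss, fetch the block
    // from a remote node's RAM (~1 ms) rather than from disk (~10 ms).
    public class NetworkMemoryCacheSketch {

        static final int CAPACITY = 4;   // max blocks kept in the local cache

        // LRU cache of block id -> block contents, evicting the oldest entry.
        private final Map<Long, byte[]> cache =
                new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
                    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> e) {
                        return size() > CAPACITY;
                    }
                };

        /** Returns a block, serving it from local memory when possible. */
        byte[] readBlock(long blockId) {
            byte[] block = cache.get(blockId);
            if (block == null) {
                block = fetchFromRemoteMemory(blockId);  // network fetch on a miss
                cache.put(blockId, block);
            }
            return block;
        }

        /** Placeholder for pulling the block out of another node's RAM. */
        private byte[] fetchFromRemoteMemory(long blockId) {
            System.out.println("fetching block " + blockId + " from a remote node");
            return new byte[64 * 1024];  // dummy 64 KiB block
        }

        public static void main(String[] args) {
            NetworkMemoryCacheSketch mem = new NetworkMemoryCacheSketch();
            mem.readBlock(1);   // miss: fetched remotely
            mem.readBlock(1);   // hit: served from the local cache
            mem.readBlock(2);
        }
    }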