The Distributed ASCI Supercomputer
The third generation
Dick Epema (TUD) (with many slides from Henri Bal)
Parallel and Distributed Group

Distributed ASCI Supercomputer
• Joint infrastructure of the ASCI research school
• Clusters integrated in a single distributed testbed
• Long history and continuity: DAS-1 (1997), DAS-2 (2002), DAS-3 (2006)

DAS is a Computer Science grid
• Motivation: CS needs its own infrastructure for
  • Systems research and experimentation
  • Application experiments
• DAS is simpler and more homogeneous than most production grids
  • Single operating system
  • “A simple grid that works”

DAS-3: overall structure
• 272 AMD Opteron nodes
• 792 cores, 1 TB memory
• More heterogeneous:
  • 2.2-2.6 GHz single/dual core nodes
• Myrinet-10G (not in Delft)
• Gigabit Ethernet
• Operational: Oct. 2006
• Clusters (nodes), linked over SURFnet6 by 10 Gb/s lambdas: VU (85), TU Delft (68), UvA/MultimediaN (46), UvA/VL-e (40), Leiden (32)

Projects using DAS-3
• Virtual Lab for e-Science
  • Grid computing, scheduling, workflow, PSE, visualization
• MultimediaN
  • Searching, classifying multimedia data
• NWO projects, e.g., StarPlane and GUARD-G
• NCF projects (off-peak hours)
• And many more (P2P, …)

Projects using DAS: StarPlane
[Figure: five groups of CPUs, each with an R node, connected through a NOC]
• Key idea:
  • Applications can dynamically allocate light paths
  • Applications can change the topology of the wide-area network, possibly even at the sub-second timescale
• VU (Bal, Bos, Maassen)
• UvA (de Laat, Grosso, Xu, Velders)

Projects using DAS: GUARD-G
• How to turn grids into a predictable utility for computing (much like the telephone system)
• Problems:
  • Predictability of workloads
  • Predictability of system availability (grids are faulty!)
• Allocation of light paths very useful here
• TU Delft (Epema) + Leiden (Wolters)

Projects using DAS: KOALA, a co-allocating grid scheduler
• Main goals:
  • Processor co-allocation: (non-)fixed/flexible jobs
  • Data co-allocation: move large input files to the locations where the job components will run, prior to execution
  • Load sharing: in the absence of co-allocation
  • Run alongside local schedulers
• KOALA
  • is written in Java
  • uses Globus components (e.g., GRAM, RSL and GridFTP)
  • was deployed on DAS-2 in September 2005

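Processor co-allocation means placing the components of one job on several clusters at the same time. The sketch below illustrates the idea under stated assumptions: it uses a simple worst-fit policy (largest component first, onto the cluster with the most idle processors) and is not KOALA's actual placement algorithm or API. The function `co_allocate`, its parameters, and the idle-processor counts in the usage example are hypothetical.

```python
from typing import Dict, List, Optional


def co_allocate(components: List[int],
                idle: Dict[str, int]) -> Optional[Dict[int, str]]:
    """Place every job component on a cluster, all or nothing.

    components: processors requested by each job component.
    idle: idle processors per cluster (hypothetical snapshot).
    Returns {component index: cluster name}, or None if some
    component does not fit anywhere.
    """
    remaining = dict(idle)  # work on a copy; the caller's view is untouched
    placement: Dict[int, str] = {}

    # Largest components first, so big requests see the emptiest clusters.
    order = sorted(range(len(components)),
                   key=lambda i: components[i], reverse=True)

    for index in order:
        size = components[index]
        # Worst fit: the cluster with the most idle processors.
        # Sorting the names first makes tie-breaking deterministic.
        best = max(sorted(remaining),
                   key=lambda name: remaining[name], default=None)
        if best is None or remaining[best] < size:
            return None  # co-allocation fails as a whole
        remaining[best] -= size
        placement[index] = best

    return placement


if __name__ == "__main__":
    # Hypothetical snapshot: every node of three clusters is idle.
    idle_now = {"VU": 85, "TU Delft": 68, "Leiden": 32}
    print(co_allocate([60, 60, 30], idle_now))
    print(co_allocate([90], idle_now))
```

The all-or-nothing return value reflects the point of co-allocation: a job starts only if every component has a place. A real scheduler would also have to account for input-file locations and the local schedulers it runs alongside, which this sketch ignores.
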
Status DAS-3 clusters
• Delft cluster: accepted, up and running
• VU, UvA-MM: acceptance this week
• UvA, Leiden: acceptance this year