Explore CYBERSAR, a cooperative effort providing computational support to RTD activities. Learn about its dual purpose, members, funding, and network and computing cluster anatomy.
Gianluigi Zanetti (gianluigi.zanetti@crs4.it) — Cybersar: RTD on computational infrastructure
CYBERSAR: a cooperative effort for a shared HPC facility
• Dual purpose
  • Computational support to RTD
  • RTD on computational infrastructure
• Members of the co-op: CRS4, INAF, INFN, UNICA, UNISS, Nice, Tiscali
• Partially funded by MIUR, March 2006 – December 2008
• Supported by RAS (Sardinian Regional Authority)
  • access to dark fibers & the regional network
INFN AOB
Cybersar cluster anatomy
• Standard node configuration
  • 48 × (2× AMD F2218, 8 GB RAM, 500 GB HD, 2× 1 GbE)
  • 24 × (2× AMD F2218, 16 GB RAM, 500 GB HD, InfiniBand)
• Storage: 40 TB
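The node counts above imply the aggregate capacity of the facility. A quick back-of-the-envelope check (the helper function and its grouping are illustrative, not part of the original slides):

```python
# Aggregate capacity of the Cybersar standard configuration:
# 48 nodes with 8 GB RAM + 24 nodes with 16 GB RAM, 500 GB local
# disk each, plus the shared 40 TB storage.

def cluster_totals(node_groups, shared_storage_tb):
    """Sum nodes, RAM and disk over (count, ram_gb, disk_gb) groups."""
    nodes = sum(n for n, _, _ in node_groups)
    ram_gb = sum(n * r for n, r, _ in node_groups)
    local_disk_tb = sum(n * d for n, _, d in node_groups) / 1000
    return nodes, ram_gb, local_disk_tb + shared_storage_tb

nodes, ram_gb, storage_tb = cluster_totals([(48, 8, 500), (24, 16, 500)], 40)
print(nodes, ram_gb, storage_tb)  # 72 nodes, 768 GB RAM, 76 TB storage
```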
Cybersar network anatomy
‘Computable’ deep computing stack reconfiguration
[diagram: layered stacks — application / middleware / OS / HW — over the network, before and after reconfiguration]
Configuration computing: atomic operations and programs
• ‘Bare hardware’ services
  • IPMI control (boot/shutdown)
  • Remote install/repair
  • VLAN setup/configuration
• Virtualization layer services
  • VM deployment
  • VM cluster deployment
• Orchestration services
  • WS-BPEL
  • Jolie
  • Specialized scripts
• Modified GE/AR for resource allocation and scheduling
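The idea of composing atomic configuration operations into programs can be sketched as follows. This is an illustrative dry-run, not the actual Cybersar control plane: a real implementation would shell out to IPMI, VLAN and VM tooling, while here each operation merely records itself so the resulting plan can be inspected.

```python
# Illustrative sketch (assumed names, not Cybersar's actual scripts):
# atomic operations composed into a 'deploy virtual cluster' program.

def make_op(name):
    """Build a dry-run atomic operation that logs what it would do."""
    def op(target, log):
        log.append(f"{name} {target}")
    return op

# 'Bare hardware' and virtualization-layer atomic operations
ipmi_power_on  = make_op("ipmi-power-on")
configure_vlan = make_op("vlan-configure")
deploy_vm      = make_op("vm-deploy")

def deploy_virtual_cluster(nodes, vms_per_node, vlan, log):
    """Orchestration program: power on hosts, isolate them, place VMs."""
    for node in nodes:
        ipmi_power_on(node, log)
    configure_vlan(vlan, log)
    for node in nodes:
        for i in range(vms_per_node):
            deploy_vm(f"{node}/vm{i}", log)

log = []
deploy_virtual_cluster(["node01", "node02"], 2, "vlan42", log)
print(len(log))  # 7 steps: 2 power-ons, 1 VLAN setup, 4 VM deployments
```

In practice this composition role is played by the WS-BPEL/Jolie orchestrators and specialized scripts listed above.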
A research problem
• Work on massive collections of ‘data with structure’
  • Biology data: new sequencing machines (e.g., 454) produce 1 TB / 3 days each
  • Medical data: modalities at >1 GB per acquisition are now common
  • High-definition 3D model collections: fast, very high resolution 3D laser scanners
  • Terrain data: ~10 cm resolution at country scale
• We have configurable computing & networking → adapt the computing facility to the problem
Massively parallel sequencing (1/5)
http://www.454.com
400,000 wells × 300 bases / 7 hr → ~1 TB / 3 days
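The per-run figures above can be checked with a little arithmetic. Note that the bases called per run are far smaller than the ~1 TB / 3 days quoted earlier, which refers to raw instrument output (images etc.) rather than called bases:

```python
# Quick check of the 454 throughput numbers on the slide:
# 400,000 wells x 300 bases per ~7-hour run.
wells, bases_per_well, run_hours = 400_000, 300, 7

bases_per_run = wells * bases_per_well            # 1.2e8 bases per run
runs_per_3_days = 3 * 24 / run_hours              # ~10.3 runs
bases_per_3_days = bases_per_run * runs_per_3_days
print(f"{bases_per_3_days / 1e9:.1f} Gbases / 3 days")  # ~1.2 Gbases
```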
Massively parallel sequencing (2/5)
• Large data accumulation at production sites
  • E.g., the Sanger Center can produce 300 TB/month
• Deep, highly non-linear information content
  • Complex data analysis based on the whole dataset
  • The analysis method is itself an object of research
  • Multiple computational cultures & approaches
Massively parallel sequencing (3/5)
[diagram: data production center with local storage]
Massively parallel sequencing (4/5)
[diagram: data production center storage linked over the network to computing sites A and B; 3 TB / (10 Gbit/s) ≈ 1 hr]
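The transfer-time arithmetic behind these slides is simple line-rate division (ideal throughput, no protocol overhead — the slides round up accordingly):

```python
# Ideal transfer time for moving a dataset over a dedicated link.

def transfer_time_s(size_bytes, link_gbit_s):
    """Payload bits divided by line rate; ignores protocol overhead."""
    return size_bytes * 8 / (link_gbit_s * 1e9)

# 3 TB over a 10 Gbit/s lambda: ~40 min, i.e. on the order of an hour
print(f"{transfer_time_s(3e12, 10) / 60:.0f} min")  # 40 min

# ~4 GB of XML + images over the same link: a few seconds
print(f"{transfer_time_s(4e9, 10):.1f} s")  # 3.2 s
```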
Massively parallel sequencing (5/5)
[diagram: Lab A sends XML + images (~4 GB, ~5 s) over the network to the data production center and to virtual computing sites A and B]
State of the art, CRS4@cybersar
• Traditional cluster: 4' from scratch to running 72 physical nodes
• 8' from scratch to running 288 virtual nodes on 72 physical nodes
• Control plane computer
• Virtual computing site hosting
• Non-traditional computing, e.g., a map-reduce facility for biocomputing
Virtual bubble clusters, CRS4@cybersar (production & experimental)
• map-reduce cluster: 288 virtual nodes
• P2P storage sim.: 988 virtual nodes
• gLite cluster: 32 virtual nodes
• bio-cluster EET: 25 virtual nodes
• vs Grid'5000
• vs Science Clouds
Configurable computing facilities
• Computing pipeline
  • Not necessarily a single large machine
  • One of the clusters is ‘attached’ to the data
• Clusters can be dynamically configured
  • User cluster configuration runs in a sandbox, safe for the hosting facility
• Network is abundant
  • Dynamic allocation of lambdas (à la OptIPuter)
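With configurable clusters and abundant network, a job can either run where the data sits (spin up a virtual cluster there) or the data can be moved over a dedicated lambda to a free site. A minimal decision sketch, assuming a simple time comparison (this is an illustration, not Cybersar's actual scheduling policy):

```python
# Illustrative placement decision: provision a virtual cluster at the
# data site, or move the data to an existing free site?

def plan(data_tb, link_gbit_s, vcluster_boot_min):
    """Pick the cheaper option by comparing ideal transfer time
    against virtual-cluster provisioning time (both in minutes)."""
    move_min = data_tb * 1e12 * 8 / (link_gbit_s * 1e9) / 60
    if vcluster_boot_min < move_min:
        return "provision at data site"
    return "move data"

print(plan(3, 10, 8))      # 3 TB vs an 8-min boot -> provision at data site
print(plan(0.004, 10, 8))  # 4 GB moves in seconds -> move data
```

The 8-minute provisioning figure matches the "8' from scratch to 288 virtual nodes" result reported earlier in the deck.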
Reference application: “holodeck”
50 Mpixel, 0.8° resolution, 50–70° FOV
Light field display
[diagram: front-end PC running an adaptive multiresolution renderer with an LRU cache streams geometry/color chunks over the network to a back-end rendering cluster (GPU1…GPUm) driving the Holovizio; an off-line partitioning and simplification stage builds the multiresolution structure (data + dependency DAG); display: 50 Mpixel, 0.8° resolution, 50–70° FOV]
Agocs, Balogh, Forgács, Bettio, Gobbetti, and Zanetti. A Large Scale Interactive Holographic Display. In Proc. IEEE VR 2006 Workshop on Emerging Display Technologies, 2006. CD-ROM Proceedings.
Conclusions & future
• CYBERSAR is a good platform to support research on
  • Computing control plane
  • Virtual clusters
  • OptIPuter architecture
• Future work
  • Virtual cluster appliances for genome analysis
  • Virtualization of coprocessors (GPU, FPGA, …)
  • Use GARR-X for national & European scale tests and developments
‘HPC for the masses’
• Massive amounts of data are looming… (and it is not necessarily a cloud…)
• Digital camera: 39 Mpixel (6726 × 5040 px)
• Light field display: ‘real 3D’