INFN-GRID-WP2.4: Computing Fabric & Mass Storage

Presentation Transcript


  1. INFN-GRID-WP2.4: Computing Fabric & Mass Storage
  • Catania: C. Rocca, E. Cangiano (Tecnologo)
  • CNAF: A. Chierici, L. Dell’Agnello, F. Giacomini, P. Matteuzzi, C. Vistoli, S. Zani
  • Bologna: G.P. Siroli, P. Mazzanti
  • Genova: G. Chiola, G. Ciaccio
  • LNL: L. Berti, M. Biasotto, M. Gulmini, G. Maron, N. Toniolo
  • Lecce: G. Aloisio, M. Cafaro, Z. Zzzz, L. Depaolis, S. Campeggio, E. Fasanelli
  • Roma 1: D. Anzellotti, C. Battista, M. De Rossi, F. Marzano, S. Falciano, A. Spanu, E. Valente
  • Torino: A. Forte
  • Padova: S. Balsamo, M. Bellato, F. Costa, R. Ferrari, M. Michelotto, I. Saccarola, S. Ventura

  2. Terminology
  • PCs + network (LAN) + middleware = PC Farm / Fabric / PC Cluster
  • PC Farms + network + middleware = GRID

  3. Why this WP?
  • Commodity components such as PCs and LANs are now mature enough to form inexpensive and powerful Computing Fabrics
  • Computing Fabrics located at different sites are being integrated to form a Computational/Data GRID
  • But:
  • How do we design a fabric of thousands of nodes, balancing computing power and efficient storage access?
  • How do we control and monitor the basic system components?
  • How do we “publish” the monitored values to the GRID monitoring system? (see the sketch below)
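As a rough illustration of the last point, a minimal node-level publisher is sketched below. The collector URL is a hypothetical HTTP endpoint; the actual GRID monitoring interface is not specified in these slides.

```python
"""Minimal node monitoring agent -- a sketch only.

COLLECTOR_URL is a hypothetical endpoint; the real GRID monitoring
interface is not defined in these slides.
"""
import json
import os
import shutil
import socket
import time
import urllib.request

COLLECTOR_URL = "http://monitor.example.org/publish"   # hypothetical

def collect_metrics(data_path="/"):
    """Gather a few basic fabric-node metrics (load, disk space)."""
    load1, load5, load15 = os.getloadavg()
    disk = shutil.disk_usage(data_path)
    return {
        "host": socket.gethostname(),
        "timestamp": time.time(),
        "load1": load1,
        "load5": load5,
        "load15": load15,
        "disk_total_bytes": disk.total,
        "disk_free_bytes": disk.free,
    }

def publish(metrics):
    """POST the metrics as JSON to the (assumed) collector."""
    req = urllib.request.Request(
        COLLECTOR_URL,
        data=json.dumps(metrics).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status

if __name__ == "__main__":
    print(publish(collect_metrics()))
```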

  4. Computing Fabric WP
  • This WP addresses the problems above, adding a technology-tracking task to follow and test, with real use cases, the evolution of the basic constituents of a fabric.
  • WP Break Down
  • Fabric Design (INFNGrid WP 2.4)
    • Overall architecture and fabric setup
    • LAN and SAN (System Area Network) technologies
    • Communication protocols for high-speed network fabrics
    • Storage systems
    • Microprocessor technology
  • Fabric Management (DataGrid WP4)
    • Configuration management and automatic software installation
    • System monitoring
    • Dynamic system partitioning
    • Problem management

  5. Fabric Design - INFNGrid WP 2.4 - Institutions
  • Dipartimento di Scienze dell’Informazione - Università di Venezia
  • Dipartimento di Scienze dell’Informazione - Università di Genova
  • Ingegneria Informatica - Università di Lecce
  • 8 INFN sections, 1 National Laboratory

  6. Fabric Management - DataGrid WP4 - Institutions
  • CERN
  • Konrad Zuse Zentrum (Berlin)
  • Kirchhoff Institute (Heidelberg)
  • IN2P3 (Lyon)
  • INFN
  • Nikhef
  • RAL

  7. Fabric Design Detailed Program (2000-2001)
  • Fabric Architecture
    • Network topologies
    • Data server connections and network file systems
    • System break down
  • Interconnection Networks
    • 100/1000 Ethernet
    • Myrinet
    • InfiniBand
    • Communication protocols for high-speed network fabrics
  • Storage Systems
    • Ultra SCSI (160/320/…)
    • Ultra and Serial ATA
    • Storage Area Network (SAN)
    • Fibre Channel
  • Microprocessor Technology
    • Dual slim processors
    • IA64

  8. Fabric Design Deliverables

  9. Fabric Design Milestone

  10. DATA GRID Fabric Management WP4
  • Automatic Software Installation and Management (OS and Applications)
  • Configuration Management
  • System Monitoring
  • Problem Management
  • Local Authorization Services and Grid Integration

  11. WP4 Fabric Management Deliverables
  • Requirements document and survey of existing tools and technologies (month 6)
  • A configuration and installation management system demonstrated to work on a cluster of more than 100 nodes (month 12)
  • A fully deployed service-level monitoring system for a computer centre, with hooks to serve remote requests for meta-information such as policies and quality measures, to allow scheduling decisions (month 24)
  • A fully integrated system that accepts remote resource requests (in the form of tape mounts or jobs to run), provides monitoring information about the progress of the requests, and reports the final accounting back to the sender (month 36) - see the sketch below
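A toy sketch of the request/progress/accounting flow described in the month-36 deliverable. Class and field names are invented for illustration; they are not the actual WP4 interfaces.

```python
"""Toy request tracker -- a sketch of the month-36 deliverable's flow.
All names are invented; this is not the WP4 design."""
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class ResourceRequest:
    kind: str                      # e.g. "tape_mount" or "job"
    payload: str                   # volume label or job script
    sender: str                    # who receives the accounting report
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "queued"         # queued -> running -> done
    started: float = 0.0
    finished: float = 0.0

class FabricGateway:
    """Accepts remote requests and exposes their progress."""
    def __init__(self):
        self.requests = {}

    def submit(self, req: ResourceRequest) -> str:
        self.requests[req.request_id] = req
        return req.request_id

    def progress(self, request_id: str) -> str:
        return self.requests[request_id].status

    def run(self, request_id: str) -> dict:
        """Pretend to execute the request, then build the accounting report."""
        req = self.requests[request_id]
        req.status, req.started = "running", time.time()
        time.sleep(0.1)            # stand-in for the real tape mount / job
        req.status, req.finished = "done", time.time()
        return {"request_id": req.request_id, "sender": req.sender,
                "kind": req.kind, "wall_time_s": req.finished - req.started}

if __name__ == "__main__":
    gw = FabricGateway()
    rid = gw.submit(ResourceRequest("job", "analysis.sh", sender="remote-site"))
    print(gw.progress(rid), gw.run(rid))
```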

  12. NFS Disk-Server-Based Farm (Legnaro)
  [Diagram: farm module with nodes N1..N10 on a Fast Ethernet switch, GE uplink to the disk server, plus an ANIS boot server]
  LNL is testing this farm module (with PIII at 450 MHz); see the NFS mount sketch below.
  • Computational Nodes: dual PIII 800 MHz (40+40 SI95), 512 MB memory
  • Disk Server: dual PIII 800 MHz, 512 MB memory, SCSI adapters
  • Ethernet Switch: 10 FE ports, 1 GE uplink
  This farm has been funded by Comm. Calcolo for LNL off-line computation (gr. 2/3).
  Requests for 2000: 2 RAID SCSI controllers, 10 Ml
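Purely as an illustration of what a computational node in such a module does to reach the shared data area, a standard NFS mount is sketched below; the host name and paths are made up.

```python
"""Mount the farm module's NFS data area on a compute node -- a sketch.
The host name (diskserver01) and paths are illustrative only."""
import os
import subprocess
import sys

NFS_EXPORT = "diskserver01:/export/data"   # hypothetical disk-server export
MOUNT_POINT = "/data"

def mount_nfs_export():
    os.makedirs(MOUNT_POINT, exist_ok=True)
    # Standard NFS mount; requires root and an NFS client on the node.
    result = subprocess.run(
        ["mount", "-t", "nfs", NFS_EXPORT, MOUNT_POINT],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        sys.exit(f"NFS mount failed: {result.stderr.strip()}")
    print(f"{NFS_EXPORT} mounted on {MOUNT_POINT}")

if __name__ == "__main__":
    mount_nfs_export()
```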

  13. LNL PC Farm (phase I – May 2000)
  [Diagram: Farm Module I - nodes N1..N10 and a front-end (ANIS) on a Fast Ethernet switch, GE uplink to the disk server with 2 Ultra160 SCSI Adaptec 39160 adapters]
  • Switch: ExtremeNetworks Summit 48 (48 FE + 2 GE ports)
  • SuperMicro PIIIDME, 840 chipset, 1-2 PIII 600 MHz, PCI 33/32-66/64
  • ASUS P3BF, 440 BX chipset, 1 PIII 450 MHz, PCI 33/32 - problems with SDRAM memory

  14. LNL PC Farm (phase II – Nov 2000)
  [Diagram: Farm Module I - nodes N1..N15 on the ExtremeNetworks Summit 48 switch (48 FE + 2 GE), GE uplinks to two disk servers]
  • 500 GByte storage array: Compaq 4354R enclosure, 14 x 36 GB SCSI U3 disks, Compaq SmartArray 5300 U3 RAID controller
  • SuperMicro 370DLE, ServerWorks III LE chipset, 2 PIII 800 MHz, 512 MByte, PCI 33/32-66/64, minitower case (case under test: SuperMicro SC810)

  15. Low Price 1U case example: SM SC810

  16. CMS Fabric at LNL (2001)
  [Diagram: four farm modules (nodes N1..N10 each, HP2524 FE switches), each with its own disk server, interconnected through a Gigabit Ethernet switch (HP8000 / FastIron 4000)]

  17. LNL Farm
  • 32 PCs, PIII 600 MHz, 3 farm modules
  • 120 Gbps GE/Cu switch, 500 GByte disk
  • Used for:
    • Sadirc/CMS Event Builder prototypes
    • Data analysis production for LNL experiments
    • NFS and topological tests

  18. NFS Distributed-Disks-Based Farm (CNAF)
  [Diagram: farm module with nodes N1..N10 on a Fast Ethernet switch, plus an ANIS boot server]
  • Topology tests
  • Dual slim processor (rack-mounted) tests
  • Remote file system (nfs, ams, etc.) tests - see the throughput sketch below
  • Requests for 2000:
    • 10 dual slim processors: 90 Ml
    • Fast Ethernet switch: 10 Ml
    • 10 SCSI disks 36/72 GB: 20 Ml
    • Total: 120 Ml
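A minimal sketch of the kind of remote file system test meant here: time a large sequential write and read on an NFS-mounted directory. The mount point is an assumption, and a single sequential stream says nothing about metadata or multi-client behaviour.

```python
"""Rough sequential write/read throughput test for an NFS-mounted area.
/data/nfs-test is an assumed mount point; this is a sketch, not a benchmark suite."""
import os
import time

TEST_FILE = "/data/nfs-test/throughput.bin"   # assumed NFS mount point
SIZE_MB = 512
CHUNK = b"\0" * (1024 * 1024)                 # 1 MiB of zeros

def timed_write():
    t0 = time.time()
    with open(TEST_FILE, "wb") as f:
        for _ in range(SIZE_MB):
            f.write(CHUNK)
        f.flush()
        os.fsync(f.fileno())                  # make sure data left the client cache
    return SIZE_MB / (time.time() - t0)

def timed_read():
    t0 = time.time()
    with open(TEST_FILE, "rb") as f:
        while f.read(1024 * 1024):
            pass
    return SIZE_MB / (time.time() - t0)

if __name__ == "__main__":
    print(f"write: {timed_write():.1f} MB/s")
    print(f"read:  {timed_read():.1f} MB/s")  # may be served from the client cache
    os.remove(TEST_FILE)
```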

  19. Storage Systems: FC and SCSI road map (1995-2004)
  • SCSI: Wide Ultra 40 MB/sec, Ultra2 80 MB/sec, Ultra3/Ultra160 160 MB/sec, Ultra320 320 MB/sec, Ultra640 640 MB/sec
  • Fibre Channel: 1Gb 100 MB/sec, 2Gb 200 MB/sec (400 MB/sec full duplex), 8/10Gb 1000 MB/sec
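The MB/sec figures follow from the bus parameters. A quick check of two of them, using standard encoding and bus-width assumptions that are not stated on the slide itself:

```python
# Quick sanity check of two roadmap figures (standard assumptions,
# not taken from the slide itself).

# Fibre Channel "1Gb": 1.0625 Gbaud serial line, 8b/10b encoding,
# i.e. 8 data bits per 10 line bits.
fc_1g_MBps = 1.0625e9 * (8 / 10) / 8 / 1e6
print(f"FC 1Gb  ~ {fc_1g_MBps:.0f} MB/s")      # ~106 MB/s, quoted as 100 MB/s

# Ultra160 SCSI: 16-bit wide parallel bus, 40 MHz clock,
# double-transition clocking -> 80 MT/s * 2 bytes.
ultra160_MBps = 40e6 * 2 * 2 / 1e6
print(f"Ultra160 = {ultra160_MBps:.0f} MB/s")  # 160 MB/s
```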

  20. Storage Systems: Fibre Channel/SCSI evaluation (PD)
  [Diagram: FC arbitrated loop (AL) and switched FC (FC switch) topologies connecting nodes N1..N4 to SCSI storage]
  • Requests for 2000:
    • 4 disk servers: 30 Ml
    • 1 8-port FC switch: 20 Ml
    • 4 FC adapters: 15 Ml
    • 2 RAID arrays: 30 Ml
    • 4 Ultra/160 SCSI disks: 8 Ml
    • Total: 103 Ml

  21. Storage Systems: Serial ATA (Padova)
  [Diagram: node N1 with Serial ATA disks]
  • Requests for 2001: 4 ATA controllers, 5 Ml

  22. Interconnection Networks: Gigabit Ethernet (Genova)
  • Communication protocols for high-speed networks:
    • Communication system GAMMA
    • Use of programmable Gigabit Ethernet NICs
    • 1000Base-T Gigabit Ethernet (on copper) reliability evaluation
    • Implementation of efficient parallel/distributed RAID systems (see the striping sketch below)
  • Requests for 2000:
    • 8 PCs: 24 Ml
    • 1 12-port GEth switch: 30 Ml
    • 8 GEth NICs: 16 Ml
    • Total: 70 Ml
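To make the last point concrete, here is a toy sketch of RAID-0-style striping of a data block across several "servers". The server paths and stripe size are invented; a real parallel/distributed RAID (as studied with GAMMA) stripes over the network and adds parity and failure handling.

```python
"""Toy RAID-0-style striping across several 'servers' -- illustration only.
The servers are just local directories and the stripe size is arbitrary."""
import os

SERVERS = ["/tmp/srv0", "/tmp/srv1", "/tmp/srv2", "/tmp/srv3"]  # stand-ins
STRIPE = 64 * 1024                                              # 64 KiB stripes

def write_striped(name: str, data: bytes) -> None:
    """Split data into stripes and deal them round-robin over the servers."""
    for srv in SERVERS:
        os.makedirs(srv, exist_ok=True)
    for i in range(0, len(data), STRIPE):
        stripe_no = i // STRIPE
        srv = SERVERS[stripe_no % len(SERVERS)]
        with open(os.path.join(srv, f"{name}.{stripe_no}"), "wb") as f:
            f.write(data[i:i + STRIPE])

def read_striped(name: str) -> bytes:
    """Reassemble the stripes in order."""
    chunks, stripe_no = [], 0
    while True:
        srv = SERVERS[stripe_no % len(SERVERS)]
        path = os.path.join(srv, f"{name}.{stripe_no}")
        if not os.path.exists(path):
            break
        with open(path, "rb") as f:
            chunks.append(f.read())
        stripe_no += 1
    return b"".join(chunks)

if __name__ == "__main__":
    payload = os.urandom(1_000_000)
    write_striped("event_data", payload)
    assert read_striped("event_data") == payload
```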

  23. Interconnection Networks: Myrinet
  • Full-duplex 2.5+2.5 Gigabit/second links, switch ports, and interface ports
  • Flow control, error control, and “heartbeat” continuity monitoring on every link
  • Low latency, cut-through crossbar switches (a ping-pong latency sketch follows)
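Link latency is what ping-pong microbenchmarks measure. The sketch below uses plain TCP sockets, so it only illustrates the measurement itself; Myrinet would be exercised through its native GM/MPI layers, not TCP.

```python
"""Generic ping-pong latency microbenchmark over TCP -- measurement
illustration only; Myrinet would be driven through its own API (GM/MPI)."""
import socket
import sys
import time

PORT, REPS, MSG = 5000, 1000, b"x" * 64

def server():
    with socket.create_server(("", PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            for _ in range(REPS):
                conn.sendall(conn.recv(len(MSG)))   # echo each message back

def client(host):
    with socket.create_connection((host, PORT)) as s:
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        t0 = time.time()
        for _ in range(REPS):
            s.sendall(MSG)
            s.recv(len(MSG))
        rtt = (time.time() - t0) / REPS
        print(f"one-way latency ~ {rtt / 2 * 1e6:.1f} us")

if __name__ == "__main__":
    # usage: pingpong.py server         (on one node)
    #        pingpong.py <server-host>  (on the other)
    server() if sys.argv[1] == "server" else client(sys.argv[1])
```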

  24. Interconnection Networks: Myrinet (Lecce)
  [Diagram: Myrinet-based farm, nodes N1..N10 on a Myrinet fabric, in a disk-server-based and a distributed-disks-based variant]
  • Requests for 2000:
    • 10 biprocessor computational nodes: 50 Ml
    • 1 16-port Myrinet switch: 8 Ml
    • 10 Myrinet NICs: 35 Ml
    • 10 disks 72 GB: 35 Ml
    • 1 disk server: 10 Ml
    • Rack + cables: 5 Ml
    • Total: 143 Ml

  25. Interconnection Networks: InfiniBand (I)
  [Diagram: today's I/O architecture]

  26. InfiniBand (II)
  [Diagram: the InfiniBand model vs. the legacy host architecture]

  27. InfiniBand (III)
  • What?
    • Initial single-link signaling rate of 2.5 Gbaud
    • Means a unidirectional transfer rate of 250 MB/sec, with a theoretical full-duplex rate of 500 MB/sec (see the arithmetic below)
    • Initial support for single, 4-wide, and 12-wide links
    • Point-to-point switched fabric
    • Message-based, with multicasting support
  [Diagram: CPUs and memory controller attached through PCI-X host bridges and HCAs to a multi-stage switch, with TCAs/I/O controllers towards Fibre Channel, SCSI, and Gigabit Ethernet links]
  • HCA - Host Channel Adapter
  • TCA - Target Channel Adapter
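The 250 MB/sec figure follows from the 2.5 Gbaud line rate with 8b/10b encoding; the encoding is standard for InfiniBand but is an assumption not stated on the slide. A quick check, including the wider link widths:

```python
# InfiniBand 1x link: 2.5 Gbaud serial signaling, 8b/10b encoding
# (8 data bits per 10 line bits) -> bytes per second per direction.
signaling_baud = 2.5e9
data_bits_per_s = signaling_baud * 8 / 10          # 2.0 Gbit/s of payload
one_x_MBps = data_bits_per_s / 8 / 1e6             # 250 MB/s unidirectional
print(f"1x  : {one_x_MBps:.0f} MB/s per direction, "
      f"{2 * one_x_MBps:.0f} MB/s full duplex")
for width in (4, 12):                               # wider links scale linearly
    print(f"{width}x : {width * one_x_MBps:.0f} MB/s per direction")
```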

  28. Interconnection Networks: InfiniBand (LNL)
  • A simple test system (4 servers + Storage Area Network + network) is possible for 2001
  • Early access to the products
  • Test beds
  • Requests 2001 (estimates) → Comm. V (Sadirc2000):
    • 4 servers: 50 Ml
    • 1 IBA switch: 20 Ml
    • IBA adapters: 10 Ml

  29. Microprocessor Technology: Intel Itanium (IA64)
  • Requests for 2001:
    • Padova: IA64 investigation
      • 4 IA64 PCs: 40 Ml
    • Lecce: IA64 in Myrinet
      • 10 dual IA64: 100 Ml
      • 1 disk server + disks: 16 Ml

  30. Resources
