Cyberinfrastructure @UAB and beyond
Office of Vice President for Information Technology
NSF Cyberinfrastructure (CI) Vision
http://www.nsf.gov/od/oci/CI_Vision_March07.pdf
• Supports broad and open access to leadership computing; data and information resources; online instruments and observatories; and visualization and collaboration services.
• Enables distributed knowledge communities that collaborate and communicate across disciplines, distance and cultures.
• Research and education communities become virtual organizations that transcend geographic and institutional boundaries.
CI Complementary Areas
http://www.nsf.gov/od/oci/CI_Vision_March07.pdf
• HPC in support of modeling, simulation, and extraction of knowledge from huge data collections. NSF will invest in the 0.5–10 petascale performance range, where petascale means 10^15 operations per second, with comparable storage and networking capacity (a rough scale comparison follows below).
• Data, Data Analysis, and Visualization
• Virtual Organizations for Distributed Communities
• Learning and Workforce Development covering K-12, post-secondary education, the workforce, and the general public.
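To give a feel for the scale the NSF vision targets, here is a small, purely illustrative comparison between the petascale range above and the teraflop-class UAB clusters described later in this deck. The cluster peak figures are taken from those slides; everything else is plain arithmetic.

```python
# Purely illustrative comparison of the NSF petascale target range with the
# teraflop-class peaks of UAB clusters described later in this deck.
# Cluster peak numbers come from the slides; the rest is arithmetic.

PETAFLOP = 1e15   # operations per second (10^15)
TERAFLOP = 1e12

nsf_range = (0.5 * PETAFLOP, 10 * PETAFLOP)   # NSF investment target range

uab_peaks_tflops = {
    "BlueGene/L rack": 5.6,    # theoretical peak from the slides
    "Coosa": 1.0,              # "more than 1.0 Teraflops"
}

for name, tflops in uab_peaks_tflops.items():
    ops = tflops * TERAFLOP
    fraction = ops / nsf_range[0]
    print(f"{name}: {ops:.1e} ops/s, {fraction:.1%} of the low end of the petascale range")
```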
UAB Cyberinfrastructure
• UAB HPC Resources
  • The Shared HPC Facility, located in BEC155, has 4 clusters
  • The Computer Science HPC Facility has 2 clusters
  • UAB's overall HPC computing power has roughly tripled on a 2-year cycle over the past 4 years.
• Optical Networks – campus & regional
• UABgrid – a campus computing and collaboration environment
UAB HPC Resources
• IBM BlueGene/L System
  • IBM's BlueGene/L is a uniquely designed massively parallel supercomputer. A single BlueGene/L rack contains 1024 nodes, each with two processors and 512MB of main memory. The 2048 processors in one rack are tightly integrated in one form factor using five proprietary high-speed interconnection networks. This system has a theoretical 5.6-Teraflop computing capacity.
• DELL Xeon 64-bit Linux Cluster – "Coosa"
  • This cluster consists of 128 DELL PE1425 nodes with dual Xeon 3.6GHz processors and either 2GB or 6GB of memory per node. It uses a Gigabit Ethernet inter-node network, and 4 Terabytes of disk storage are available to the cluster. It is rated at more than 1.0 Teraflops computing capacity (a rough peak estimate follows below).
• DELL Xeon 64-bit Linux Cluster w/ InfiniBand – "Olympus"
• 2 Verari Opteron 64-bit Linux Clusters – "Cheaha" & "Everest"
  • Each is a 64-node computing cluster with dual AMD Opteron 242 processors and 2GB of memory per node. Nodes are interconnected with a Gigabit Ethernet network.
• IBM Xeon 32-bit Linux Cluster – "Cahaba"
  • This 64-node computing cluster consists of IBM x335 series computers with dual Xeon 2.4GHz processors and 2 or 4GB of memory per node, plus a 1-Terabyte storage unit. Nodes are interconnected with a Gigabit Ethernet network.
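The Teraflop ratings above can be sanity-checked with a back-of-the-envelope calculation. The sketch below uses Coosa's node count and clock speed from the slide; the flops-per-cycle figure is an assumption added for illustration and is not from the original deck.

```python
# Illustrative back-of-the-envelope peak estimate for the "Coosa" cluster.
# Node count and clock speed come from the slide; the 2 flops/cycle figure
# for these Xeon processors is an assumption for illustration only.

nodes = 128
cpus_per_node = 2
clock_hz = 3.6e9          # 3.6 GHz Xeon
flops_per_cycle = 2       # assumed double-precision flops per cycle

peak_flops = nodes * cpus_per_node * clock_hz * flops_per_cycle
print(f"Theoretical peak = {peak_flops / 1e12:.2f} TFLOPS")
# About 1.8 TFLOPS, consistent with the slide's "more than 1.0 Teraflops" rating
```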
BlueGene Cluster
DELL Xeon 64-bit Linux Clusters – "Coosa" & "Cahaba"
Verari Opteron 64-bit Linux Clusters – "Cheaha" & "Everest"
Computer Science DELL Xeon 64-bit Linux Cluster w/ InfiniBand – "Olympus"
UAB 10GigE Research Network
• Build a high-bandwidth network linking UAB compute clusters
• Leverage the network for staging and managing Grid-based compute jobs (a rough staging-time estimate follows below)
• Connect directly to high-bandwidth regional networks
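One reason staging matters: even rough numbers show how link speed dominates data movement time for large inputs. The dataset size and effective-throughput factor below are assumptions for illustration, not measurements from the UAB network.

```python
# Illustrative estimate (assumption-laden): time to stage a dataset to a
# cluster over the 10GigE research network vs. a 1GigE campus link.
# The 50% effective-throughput factor and 1 TB dataset size are assumptions.

def staging_time_hours(dataset_bytes, link_gbps, efficiency=0.5):
    """Return a rough transfer time in hours at the given link speed."""
    effective_bps = link_gbps * 1e9 * efficiency   # bits per second actually achieved
    return (dataset_bytes * 8) / effective_bps / 3600

dataset = 1e12  # hypothetical 1 TB of simulation input/output
print(f"1GigE : {staging_time_hours(dataset, 1):.1f} h")
print(f"10GigE: {staging_time_hours(dataset, 10):.1f} h")
```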
UABgrid
• Common interface for access to HPC infrastructure
• Leverage UAB identity management system for consistent identity across resources
• Provide access to regional, national, and international collaborators using the Shibboleth identity framework
• Support research collaboration through autonomous virtual organizations
• Collaboration between computer science, engineering, and IT
UABgrid Architecture
• Leverages IdM investments via InCommon
• Provides a collaboration environment for autonomous virtual organizations (a minimal authorization sketch follows below)
• Supports integration of local, shared, and regional resources
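As a minimal sketch of the virtual-organization idea (not the actual UABgrid implementation), the snippet below shows attribute-style authorization in the spirit of Shibboleth/GridShib: a federated principal is granted access to a cluster only through membership in a VO that holds an allocation there. The principals, VO names, and resource table are hypothetical placeholders.

```python
# Minimal sketch of virtual-organization (VO) authorization driven by a
# federated identity, in the spirit of Shibboleth/GridShib attribute-based
# access. All names and tables below are hypothetical placeholders.

VO_MEMBERS = {
    "proteomics-vo": {"alice@uab.edu", "bob@uah.edu"},
    "cfd-vo": {"carol@ua.edu"},
}

VO_RESOURCES = {
    "proteomics-vo": {"coosa", "cheaha"},
    "cfd-vo": {"olympus"},
}

def authorized(principal: str, vo: str, resource: str) -> bool:
    """Grant access only if the federated principal belongs to the VO
    and the VO has been allocated the requested resource."""
    return (principal in VO_MEMBERS.get(vo, set())
            and resource in VO_RESOURCES.get(vo, set()))

print(authorized("alice@uab.edu", "proteomics-vo", "cheaha"))  # True
print(authorized("alice@uab.edu", "cfd-vo", "olympus"))        # False
```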
Alabama Cyberinfrastructure
• Very high bandwidth optical network links major research areas in the state
• High-performance computational resources distributed across the state
• Campus grids like UABgrid provide uniform access to computational resources
• Regional grids like SURAgrid provide access to aggregate computational power and unique computational resources
• Cyberinfrastructure enables new research paradigms throughout the state
Alabama Regional Optical Network (RON)
• The Alabama RON is a very high bandwidth lambda network, operated by Southern Light Rail (SLR)
• Connects major research institutions across the state
• Connects Alabama to National LambdaRail and Internet2
• In collaboration with the UA System, UA, and UAH
National LambdaRail (NLR)
• Consortium of research universities and leading-edge technology companies
• Deploying national infrastructure for:
  • advanced network research
  • next-generation, network-based applications
• Supporting multiple, independent high-speed links to research universities and centers
SURAgrid
• Provides access to aggregate compute power across the region
Alabama Grid?
• Leverage Alabama's existing investments in cyberinfrastructure
• The need for dynamic access to a regional infrastructure is increasing
• Need to build a common trust infrastructure
• Benefit from shared and trusted identity management
• Enable development of advanced workflows specific to regional research expertise
Future Directions • Begin pilot of a State grid linking UAB, ASA, and UAH resources?
Southern Light Rail (SLR)
• Georgia Tech's non-profit cooperative corporation
• Provides access to NLR for 1/5 the cost of an NLR membership
• Provides access to other network initiatives
  • Commodity Internet
  • Internet2
  • NSF's ETF – Atlanta Hub
  • Georgia Tech's International Connectivity
• Leverage Georgia Tech expertise and resources
Mission Statement of HPC Services
• HPC Services is the division within the Infrastructure Services organization focused on HPC support for research and other HPC activities.
• HPC Services represents the Office of the Vice President for Information Technology to IT-related academic campus committees and to regional/national technology research organizations and committees as requested.
HPC Project Five-Year Plan
• Scope: Establish a UAB HPC data center, whose operations will be managed by IT Infrastructure and which will include additional machine room space designed for HPC and equipped with a new cluster.
• The UAB HPC Data Center and HPC resources will be used by researchers throughout UAB, the UA System, and other State of Alabama universities and research entities, in conjunction with the Alabama Supercomputer Authority.
• Oversight of the UAB HPC resources will be provided by a committee made up of UAB Deans, Department Heads, Faculty, and the VPIT.
• Daily administration of this shared resource will be provided by Infrastructure Services.
Preliminary Timeline
• FY2007: Rename Academic Computing to HPC Services (HPCS) and merge HPCS with Network and Infrastructure to leverage the HPC-related talents and resources of both organizations.
• FY2007: Connect existing HPC clusters to each other and to the 10Gig backbone.
• FY2007: Establish pilot Grid identity management system – GridShib (HPCS, Network/Services)
• FY2007: Enable Grid meta-scheduling (HPCS, CIS, ETL); a toy scheduling sketch follows below.
• FY2007: Establish Grid connectivity with SURA, UAS, and ASA.
• FY2008: Increase support staff as needed by reassigning legacy mainframe technical resources.
• FY2008: Develop requirements for expansion or replacement of older HPC systems (xxxx TeraFlops).
• FY2008: Using HPC requirements[1] (xxxx TeraFlops) for the Data Center design, begin design of the HPC Data Center.
• FY2009: Secure funding for new HPC cluster (xxxx TeraFlops).
• FY2010: Complete HPC Data Center infrastructure.
• FY2010: Secure final funding for expansion or replacement of older HPC systems.
• FY2011: Procure and deploy new HPC cluster (xxxx TeraFlops).
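The meta-scheduling milestone can be pictured with a toy selection policy like the one below: route each job to the cluster with the lowest expected wait. This is only an illustrative sketch; the cluster names echo the slides, but the core counts and queue depths are invented, and a real deployment would rely on a grid meta-scheduler querying each cluster's local scheduler.

```python
# Toy illustration of grid meta-scheduling: pick the cluster with the
# smallest expected wait for a job's core request. Core counts and queue
# depths are invented for illustration only.

CLUSTERS = {
    # name: (total_cores, cores_currently_busy, jobs_waiting)
    "coosa":   (256, 240, 12),
    "cheaha":  (128,  64,  2),
    "olympus": (128, 128, 20),
}

def score(name, cores_needed):
    """Lower is better: prefer clusters that can start the job now and
    have fewer jobs already waiting."""
    total, busy, waiting = CLUSTERS[name]
    free = total - busy
    if free >= cores_needed:
        return waiting            # can start soon; fewer waiting jobs wins
    return waiting + 100          # penalize clusters that cannot fit the job now

def pick_cluster(cores_needed):
    """Return the cluster with the best (lowest) score for this request."""
    return min(CLUSTERS, key=lambda name: score(name, cores_needed))

print(pick_cluster(32))   # -> 'cheaha' with these made-up numbers
```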