Science, Engineering, Technology… (and the Facilities that Support Them)
San Diego Supercomputer Center
University of California, San Diego
Net@EDU Annual Meeting, February 5, 2007
Dallas Thornton, IT Director, SDSC
SDSC in a nutshell
• Employs nearly 400 researchers, staff and students
• UCSD Organized Research Unit
• Strategic Focus on Data-Oriented Scientific Computing
• Home of many associated activities including
  • Geosciences Network (GEON)
  • Network for Earthquake Engineering Simulation IT (NEESit)
  • Protein Data Bank (PDB)
  • Joint Center for Structural Genomics
  • Alliance for Cell Signaling (AfCS)
  • Biomedical Informatics Research Network (BIRN) Coordinating Center
  • High Performance Wireless Research and Education Network (HPWREN)
[Diagram labels: Grid and Cluster Computing; High-end Computing; Data and Knowledge Systems; Networking; Integrated Biosciences; Integrated Computational Sciences]
A Partial List of Databases and Data Collections Currently Housed at SDSC
• Protein Data Bank (protein data)
• National Virtual Observatory (astronomical data)
• UCSD Libraries Image Collection (ArtStore)
• National Science Digital Library (education collection)
• SCEC (earthquake data)
• BIRN (neuroscience data)
• Encyclopedia of Life (genomic data)
• Protein Kinase Resource (protein data)
• TreeBase (phylogeny and ontology information)
• Transport Classification Database (protein information)
• PlantsP (plant kinase information)
• PlantsT (plant transporter information)
• PlantsUBQ (plant protein information)
• CKAAPS (protein evolutionary information)
• AfCS Molecule Pages (protein information)
• SLACC-JCSG (structural genomics data)
• APOPTOSIS DB (data on proteins related to cell death)
• NAVDAT (geochemistry data)
• QRC (NSF data on Supercomputer Centers and PACI)
• Network Topology Data (Skitter project)
• Biology Workbench Databases (mirrors and “originals” of over 80 biology databases)
• San Diego and Tijuana Watersheds (water resources mapping)
• 2 Micron All Sky Survey (astronomy data)
• Digital Palomar Observatory Sky Survey Collection (astronomy data)
• Sloan Digital Sky Survey Collection (astronomy data)
• Interpro Mirror (protein data)
• HPWREN Wireless Network Analysis Data
• HPWREN Sensor Network Data
• Security logs and archives (security information)
• Nobel Foundation Mirror (information)
• EarthRef Digital Archive (Earth Science information)
• GERM (earth reservoir information)
• PMAG (paleomagnetic information)
• GEOROC (petrological and geochemical data for igneous rocks)
• Kd’s DB (rocks and minerals)
• Braindata (Rutgers neuroscience collection)
• LTER (hyperspectral images)
• SIO-Explorer (oceanographic voyages)
• Scripps (oceanographic research data)
• Transana (classroom video)
• WebBase (web crawls)
• Alexandria Digital Library (photographs)
• Backscatter Data (from UCSD network telescope)
• Digital Earth Data Library (earth sciences related datasets)
• PETDB (petrological and chemical data)
• Seamount Catalogue (bathymetric seamount maps)
• IPBIR (primate information)
• Hayden Planetarium Collection (astronomical data)
• TeraGrid Data (science and engineering collections)
• Digital Embryo (human embryology)
• National Archives (persistent archive)
• San Diego Conservation Resources Network (sensitive species map server)
• Bionome (Biology network of modeling efforts)
• KNB (Knowledge networks for biocomplexity)
• LDAS (land data assimilation system)
• SEEK (ecology data)
• ROADNET (sensor data)
• NPACI Data Grid (scientific simulation output)
• Salk (biology data archive)
• CUAHSI (community hydrological collection)
• Backbone Packet Header Traces (OC48, OC12)
SDSC’s Funding
• Federal Grants
• State Support
• Campus Support
• Industry Partnerships
• Recharge / Fee For Service
  • Leverage Economies of Scale
  • Labor – Consulting, Support, Sys Management, etc.
  • Storage
  • Compute Cycles
  • Collocation/Hosting Services
SDSC’s Evolutionary Datacenter
• Privately built, 7,000 sq. ft., in 1985
• Transitioned to UCSD in 1997
• Expanded to 11,000 sq. ft. in 2001
• Expanded to 14,000 sq. ft. in 2006
• Expanding to 19,000 sq. ft. in 2008
• Power and cooling requirements grew and changed with new systems
• Previous upgrades have been costly.
• Developing a scalable power and cooling infrastructure with UCSD facilities to accommodate future systems.
Lessons Learned (or Learning)
• Maximize yield from the build and upgrades
  • Incremental upgrades are exceedingly expensive!
  • Engineer the facility for 2x-4x power, cooling, and space expansion capability... (No matter what the architects say.)
• Decide where to invest your money
  • 2N configurations, UPSes, generators, etc. are great but usually too expensive to be worthwhile for large research clusters.
  • Evaluate systems in need of this reliability and build accordingly.
  • Consider different rates for this extra level of service.
• Be on the same page with campus facilities
  • Ensure newly installed distribution paths provide spare capacity.
  • Carefully evaluate utilities costs in site selection.
  • Standardize, standardize, standardize!
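The 2x-4x expansion guideline can be turned into a rough planning number. The following is a minimal sketch, not a method from the talk: it assumes load grows at a constant compound annual rate, and the function name `years_of_headroom` and the example growth rates are hypothetical.

```python
import math


def years_of_headroom(expansion_factor: float, annual_growth: float) -> float:
    """Years until load grows by `expansion_factor`, assuming constant
    compound growth of `annual_growth` per year (0.25 means 25%).

    Hypothetical helper for illustration; real load growth is rarely smooth.
    """
    if expansion_factor < 1:
        raise ValueError("expansion_factor must be >= 1")
    if annual_growth <= 0:
        raise ValueError("annual_growth must be > 0")
    return math.log(expansion_factor) / math.log(1 + annual_growth)


# Compare 2x and 4x design headroom under a few assumed growth rates.
for factor in (2, 4):
    for growth in (0.25, 0.50, 1.00):
        years = years_of_headroom(factor, growth)
        print(f"{factor}x headroom at {growth:.0%}/yr lasts {years:.1f} years")
```

Under compound growth, doubling the expansion factor from 2x to 4x doubles the time before the next upgrade, whatever the growth rate, which is one way to read the advice to overbuild up front.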
The Density Problem
[Chart; note log scale]
• HPC: even more dense
• 10 kW racks in 2005 will be 100 kW in 2010
• Rising Density + Reduced Costs = Exponential Demand Growth
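The rack projection on this slide (10 kW in 2005, 100 kW in 2010) implies a compound annual growth rate that is easy to work out. The sketch below assumes smooth exponential growth between the two points; the function names are hypothetical and only the two endpoint figures come from the slide.

```python
import math


def implied_annual_growth(start_kw: float, end_kw: float, years: int) -> float:
    """Compound annual growth rate that takes start_kw to end_kw in `years`."""
    return (end_kw / start_kw) ** (1 / years) - 1


def project_density(start_kw: float, rate: float, years: int) -> list[float]:
    """Per-rack power for each year, assuming constant compound growth."""
    return [start_kw * (1 + rate) ** y for y in range(years + 1)]


# Endpoints from the slide: 10 kW racks in 2005, 100 kW racks in 2010.
rate = implied_annual_growth(10.0, 100.0, 5)
doubling_years = math.log(2) / math.log(1 + rate)

print(f"Implied growth: {rate:.1%} per year")
print(f"Doubling time: {doubling_years:.2f} years")
for year, kw in zip(range(2005, 2011), project_density(10.0, rate, 5)):
    print(f"{year}: {kw:.1f} kW per rack")
```

A tenfold increase over five years works out to roughly 58% growth per year, or a doubling of per-rack power about every year and a half.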
Who pays for the facilities?
• PIs / Faculty
  • “What do my indirect costs pay for, anyways?”
  • This varies widely by institution, but indirect costs (IDCs) do not scale well with the facilities requirements of machines over time.
  • Need to budget incremental facilities costs in grants.
• Grantors
  • “Facilities should be funded by the state.”
  • As the costs to operate and maintain increasingly facilities-hungry systems rise, states are less capable of providing adequate support.
  • Need to support incremental facilities costs in grants.
• Campuses/States
  • “The grantor should pay the costs of the grant’s needs.”
  • A valid argument, but if the state/campus wants to be competitive with its proposal, some subsidy is required.
  • Need to develop a scalable model to incrementally fund facilities, decide how much this will be subsidized, and get buy-in from PIs and faculty.
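One way to picture a "scalable model to incrementally fund facilities" is to estimate a system's incremental power cost and split it between a campus subsidy and the grant budget. This is a minimal sketch under stated assumptions, not SDSC's model: the function names, the electricity price, the cooling overhead factor, and the subsidy fraction are all hypothetical.

```python
HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def annual_power_cost(load_kw: float, price_per_kwh: float,
                      overhead_factor: float = 1.0) -> float:
    """Yearly electricity cost for a system drawing `load_kw` continuously.

    `overhead_factor` scales the IT load to include cooling and distribution
    losses (assumed value; 1.0 means no overhead).
    """
    return load_kw * overhead_factor * HOURS_PER_YEAR * price_per_kwh


def split_cost(total: float, subsidy_fraction: float) -> tuple[float, float]:
    """Return (campus_subsidy, grant_charge) for a given subsidy fraction."""
    if not 0 <= subsidy_fraction <= 1:
        raise ValueError("subsidy_fraction must be between 0 and 1")
    subsidy = total * subsidy_fraction
    return subsidy, total - subsidy


# Hypothetical example: 100 kW cluster, $0.10/kWh, 1.5x overhead, 25% subsidy.
total = annual_power_cost(load_kw=100, price_per_kwh=0.10, overhead_factor=1.5)
campus_share, grant_share = split_cost(total, subsidy_fraction=0.25)
print(f"Total: ${total:,.0f}  Campus: ${campus_share:,.0f}  Grant: ${grant_share:,.0f}")
```

Because the cost scales with the system's load, a figure like `grant_share` can be written into a grant budget per system, which is the kind of incremental charge that flat indirect-cost rates do not capture.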