Computing for Research: High Performance Computing, Data facilities, Drivers, work practices, Organisational journey. Bernard Pailthorpe, UQ bap@uq.edu.au. www.rcc.uq.edu.au vislab.uq.edu.au qcif.edu.au. Queensland Cyber-infrastructure.
Computing for Research: High Performance Computing, Data facilities, Drivers, work practices, Organisational journey
Bernard Pailthorpe, UQ
bap@uq.edu.au
www.rcc.uq.edu.au, vislab.uq.edu.au, qcif.edu.au
Queensland Cyber-infrastructure
B. Pailthorpe, UQ - AHM, 28 Mar 2012
Outline
• Our heritage: USyd and UQ early players - lead-up to RCC; e.g. projects
• Nature of “the beast”: computing in universities
• HPC infrastructure: supercomputers, data, networks, vis
  - Computational Science: 3rd route to scientific discovery
  - Computational Engineering: design new products, services
  - Networks: more distributed, real-time computing - 10 Gbps connections
  - Scientific visualisation: the user interface
• Policy & funding drivers: APAC, NCRIS, EIF rounds
• Questions for today: RCC structure & operations
From Harry Messel & SILLIAC (1955) at USyd … to The Warren Centre Project, 1992: “Engineering the Future with High Performance Computing (HPC)”
- input to PMSEC (Dec’94); APAC (2000); NCRIS – NCI (2006)
- Cray YMP at U Sydney (ARC grant)
- Greg McRae (MIT) working with industry participants
Visualisation - users interacting with data & computers
ARC + USyd & partners: 1992-2002
(was www.vislab.usyd.edu.au/)
Mathematics example: … as they do it
Why vis?
R. Bartnik, The Australian Mathematical Society Gazette, 31(3), 161-164, July 2004
Mathematics: concept illustration
Diffeomorphism of space ~ Möbius strip; 720° rotation
A well known party trick, which has on occasions been used to motivate the use of spinors in physics, is shown to have a counterpart in certain one-parameter families of diffeomorphisms of R³. One such family is constructed and then visualized as an animated sequence of deformations applied to parametric surfaces in R³.
“Spinors and Entanglement”, Mathematica J. 5(2) (Spring 1995)
Andrew Norton & Gavin Brown, Mathematics, UNSW (then USyd)
was www.vislab.usyd.edu.au/ - now: www.rcc.uq.edu.au/gallery/diffeo/
Access Grid - ATP, Sydney: introduced to Australia
- the 1st in Austr... participating in SC-Global conf. (Nov 2001): Beijing – Syd – US – UK
- ARC funding, 2001
History: Australian “Data Corridor” (APAC, c2006)
[Map figure: storage sites JCU-AIMS, UQ/QCIF, SKA (iVEC), ANU/APAC (now NCI), Monash, CSIRO-BoM; per-site capacity labels 100 TB, 150 TB, 120 TB, 100 TB, 40 TB, 45 TB (site assignment not recoverable)]
- > 500 TB installed then (“currently”)
- 100+ TB p.a. upgrades in train – now 200+
- Goal of long-term archive capacity: over 4,000 TB (4 PB)
Research Data at UQ:
740 TeraBytes of data archived & online (Apr’10)
- genomics, spectra & imaging
- incl. HPCU, QBI/AIBN, DI, Libr.
2 x StorageTek (Sun/Oracle) SL8500 robotic tape silos
- dual sites: robust
- + 2 smaller silos
- silos ~ 30 kW ea.
- anticipate x2 capacity upgrade, pa
3,000 core supercomputer (25 TeraFlops)
- ~ 200-300 kW .. planning 1 MW power upgrade
10 Gbps (10,000 Mbps) network
B. Pailthorpe, UQ at DIISR RSDI mtg, 16 Sept 2010
Update: Research Data at UQ:
- “Central” growth: 234 TB (Apr’10) to 774 TB (Sept 2011) = 380 TB pa = double in 10 mo.
740 TeraBytes of data archived & online (4/’10)
- genomics, spectra & imaging
2 x StorageTek (Sun/Oracle) SL8500 robotic tape silos
- dual sites: robust
- + 2 smaller silos
- silos ~ 30 kW ea.
- anticipate x2 capacity upgrade, pa
3,000 core supercomputer (25 TeraFlops)
- ~ 200-300 kW .. planning 1 MW power upgrade
10 Gbps (10,000 Mbps) network
B. Pailthorpe, UQ RCC mtg, Mar 2012
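The growth figures on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the interval from April 2010 to September 2011 is 17 whole months, and that "double in 10 mo." refers to an exponential fit between the two data points. The function names are hypothetical and not part of any RCC tooling.

```python
import math

def linear_growth_tb_per_year(start_tb, end_tb, months):
    """Average absolute growth over the interval, scaled to a 12-month year."""
    return (end_tb - start_tb) * 12.0 / months

def doubling_time_months(start_tb, end_tb, months):
    """Doubling time, assuming growth over the interval was exponential."""
    return months * math.log(2.0) / math.log(end_tb / start_tb)

# Holdings are taken from the slide; the 17-month interval is an assumption
# (April 2010 to September 2011, counted in whole months).
START_TB, END_TB, MONTHS = 234.0, 774.0, 17

growth_per_year = linear_growth_tb_per_year(START_TB, END_TB, MONTHS)
doubling_months = doubling_time_months(START_TB, END_TB, MONTHS)

print(f"Average growth: {growth_per_year:.0f} TB per year")
print(f"Doubling time:  {doubling_months:.1f} months")
```

Under these assumptions the result is roughly 380 TB per year and a doubling time just under 10 months, in line with the slide.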
Scientific achievements – simulations:
• Research projects supported at UQ
  - some early examples: long track record
  - emerging model.
Some examples:
B. Pailthorpe, QCIF AHM, Aug’06; RCC AHM Mar’12
Bioinformatics: Rob Beiko & Mark Ragan (IMB, ARC Centre Bioinformatics, UQ)
Lateral Gene Transfers in Prokaryotes
LGT - organism transmits DNA to non-offspring
- common in bacteria
- develop resistance to antibiotics
• BLAST alignment: track 423,000 predicted proteins from 144 organisms
  - 22k evolutionarily related
  - build phylogenetic trees for ea. protein
• Bayesian sampling: search the trees
  - probable pathways of gene sharing
'Highways of gene sharing in prokaryotes', Proc Natl Acad Sci USA 102, 14332-7 (2005).
B. Pailthorpe, QCIF AHM, Aug’06
Chem Eng: Suresh Bhatia (UQ)
Transport and Adsorption in Nanomaterials
- Pores diam ~ 5 - 50 ångström, molec ~ 3-5 Å
- Monte Carlo & molecular dynamics (MD) simulations
- Molecular sieving in microporous materials - method of separation
- many industrial applications: separations, catalytic processing.
[Figure: Zeolite rho structure]
Phys. Rev. Lett., 91, 0126102 (2003); 95, 245901 (2005).
B. Pailthorpe, QCIF AHM, Aug’06
Chemistry: Seth Olsen & Sean Smith (CCMS, UQ)
Computational Modeling of Red Fluorescent Proteins (RFPs)
- Chromophore: brighter fluorescence (than GFP)
- Gaussian03, NWChem calc. - excitation energies
- Deep-tissue biomedical imaging in cell cultures - visualization of biological processes
Chem. Phys. Lett. 420, 507 (2006).
B. Pailthorpe, QCIF AHM, Aug’06
Physics: Chao Feng & Mike Malone (Centre for Quantum-Atom Optics & RCC, UQ, 2012)
[Figure: density profile]
A simulation of a Bose-Einstein Condensate (BEC), showing the density of a cylindrically symmetric BEC after a laser beam is dragged through the condensate.
Computational grid: 64 (r) x 1024 (z); 200 timesteps. Visualised in vtk.
Movie at: www.rcc.uq.edu.au/gallery/bec/ (pto)
B. Pailthorpe, RCC AHM, Mar’12; ref. xx
Physics: Chao Feng & Mike Malone (Centre for Quantum-Atom Optics & RCC, UQ, 2012)
Movie at: www.rcc.uq.edu.au/gallery/bec/
Ecology: new user community, ARC Centre
Benefit / Cost
Figure 1: Acacia harpophylla - in need of protection. This vegetation type is down to less than 15% of its original extent, and is an example of habitat that would benefit from the scheme of Fuller et al.
- Peter Kareiva, “Trade in to trade up”, in News & Views, p322.
Replacing underperforming protected areas achieves better conservation outcomes. Richard A. Fuller, Eve McDonald-Madden, Kerrie A. Wilson, Josie Carwardine, Hedley S. Grantham, James E. M. Watson, Carissa J. Klein, David C. Green & Hugh P. Possingham. Nature 466, 365-367 (15 July 2010).
Ecology: new user community (2)
TERN
“We are interested in following topics :
=> visualising sparse uneven data distributed across both spatial and temporal dimension.
=> visualising heterogeneous data with inconsistent temporal and spatial interval at multiple dimension.
=> visualising different ecosystem domain datasets from past to future at different gradients, elevations and depths.
…. I am happy to discuss if you require.”
Dr. S.M.Guru
Collaborative working: An ARC e-Research project
B Pailthorpe, Chris Willing, N Bordes (UQ), I Atkinson (JCU)
Currently > 30 AG nodes in Australia, 10 in Qld, and > 200 worldwide
Beyond sharing ppt for meetings …
- molecular viewers (chem, bio)
- Geographic Information Systems (GIS) for urban planning, logistics, mapping, resources, response, …
- Shared Grass (GIS software app), within AG environment: Stéphane Bidet, VisLab UQ
- shared workrooms
+ Downloads at: http://www.vislab.uq.edu.au/research/accessgrid/software/
B. Pailthorpe, QCIF AHM, Aug’06
Smart Astronomy: Virtual Observatory (an ARC e-Research project)
Michael Drinkwater (Physics), Marcus Gallagher (ITEE)
• The Virtual Observatory
  - Automated, scalable management and analysis of large astronomical catalogues.
  - Use advanced data mining tools to process and analyse distributed data archives of the world’s observatories.
• Current UQ work
  - A new method for catalogue matching
  - Web-based front-end and underlying software to perform automated matching of catalogues using both conventional methods and machine learning algorithms. http://drexler.physics.uq.edu.au/uqvo/
  - Key contributor to the Australian Virtual Observatory
  - Future application to SkyMapper telescope, which will produce huge groundbreaking datasets.
• A key example problem
  - Record linkage: matching radio galaxies to their optical counterparts using machine learning / data mining algorithms.
A Data Grid: ausVO, iVO
UQ Team: M Drinkwater, M Gallagher, K Pimbblet, Alejandro Dubrovsky, David Rohde, B Pailthorpe.
Collaborators: E Sadler (USyd), P Francis (ANU).
B. Pailthorpe, QCIF AHM, Aug’06
Human society: e-archaeology (an ARC e-Research project)
Nicole Bordes, Sean Ulm (UQ)
Goal: development and implementation of an Australian archaeological digital collection platform based on existing HPC techniques & infrastructure (SRB ..).
Digital collections:
- facilitate the dissemination and interchange of archaeological data (maps, GIS, satellite images, photos, audio, artefacts) across disciplines & institutions, and across public & private sectors;
- enhance the ability of archaeological research to reach its full potential;
- contribute to discourses about Australian history, cultural heritage & identity.
Test case: Mill Point Archaeological Project, SE Qld. (NW of Noosa)
- 3rd year of funding
- currently supports 2 PhD students + 1 Tech
- will be the model for future digital collections.
Participants: UQ - VisLab/SPS, Soc. Sci.; ANU - ANUSF; SDSC
B. Pailthorpe, QCIF AHM, Aug’06
Mill Point Archaeological Project
Photos: N. Bordes & S. Ulm
B. Pailthorpe, QCIF, 24 Aug ’06
Porting scientific apps to the OptIPortal – demo at QuestNet-09
iCluster - IMB, UQ
- Classify 100s-1000s of cell images on the fly, with humans in the loop
- Also Paraview / vtk, Mayavi / vtk .. for general-purpose Sci Vis
- Run “natively” in OptIPortal … breaking out of desktop limitations
iCluster: Nick Hamilton (IMB), R Hammond, Chris Willing & B Pailthorpe (UQ Vislab)
B. Pailthorpe, UQ at SAGE BoF, SC-11
Genome Browser (UCSC) in the OptIPortal – demo xx
- IMB, UQ: Mike Pheasant
- Run “natively” in OptIPortal … breaking out of desktop limitations
B. Pailthorpe, UQ at SAGE BoF, SC-11
Computing at UQ … or at many unis, for that matter
• “Corporate” or business computing
  - for Org’n: finance, HR, web, email, library & scholarly holdings
  - for T&L, students: also web, email, class materials (e.g. Blackboard), records
  - established platforms (IBM, Microsoft, SAS …); often standardised, centralised
vs.
• Computing for researchers
  - high performance, costly, power hungry, rapidly evolving (Top500 etc), specialised, skills intensive;
  - coupled to research - algorithms, software; often unix, linux, OSS; data growing rapidly
  - good models: Princeton, Purdue (cf. Educause case studies, 2005-6)
+ Computational Science - recognised as a third route to scientific discovery;
+ Computational Engineering - design new products, services (industry)
Ecology of Computing Options … use any or all
• National (peak) facility - NCI, at ANU: nci.org.au
  - Merit allocations: USyd researchers most successful
  - nb. Agency shares rising - ARC LIEF share in 2012-
• State “shoulder” facilities (QCIF, VPAC, iVEC, …)
  - focus on industry or special purpose
  - also U Melb, Monash
• Own facility (ANU, UQ, …)
  - reflects institutional priorities
  - needs lifecycle costing
+ local clusters (chem, phys, eng …)
+ condor “flocks”, etc - harvest unused workstation cycles
& maybe commercial or research “clouds” ?
UQ HPC infrastructure (with QCIF): Supercomputers (Rackable / sgi), Data
- Supercomputers: new (2007-10); New: 3,000 cores, 25 TeraFlops; 25 kW x many
- Tape archives (StorageTek tape robots): Sun/Oracle StorageTek SL8500 (2009), 30 kW
- @UQ: 3,400 TB tapes; 400 TB discs; I/O (32) servers
- Power: upgrade ?
- 3-yr ops cost = equip cost (2000-08)
2007 usage breakdown: UQ-QCIF supercomputers
• By discipline: Chem. = 37% (decline from 64% in 2006); Eng. = 24%; Bio = 19%
• Skills limited:
  - 80% use 1 processor
  - 10% use 4-8 proc.
  - 2.4% use 16 proc.
* Need to learn parallel programming
Getting better in 2009 - usage of QCIF supercomputers
• By unit: AIBN (bio-eng) = 49%; other Eng. = 24%; Econ = 16%; Bio = 3%
• Better perf’m / skills:
  - 41% use 1 processor
  - ✓ 44% use 10 proc.
  - + 3% use 16 proc.
  - + 2% use 32 proc.
… so learning parallel programming !
Usage at UQ:
Nb. QBI-CAI / AIBN maintain own facilities
Library holds ~ 5 TB (Apr’10)
Supercomputer Ops status http://hpcu.uq.edu.au/ganglia/ - via rcc.uq front page
Collaboration tech: AG usage
www.rcc.uq.edu.au/accessgrid/usage/btest.py at Asia Pacific AG venue server
Also desktop systems: Skype, EVO
AG - impact in Qld.
“.. in 2006 AG saved Qld. Universities $2.2m in avoided travel costs … .. and increased research productivity by 40,000 hours ..” (time saved)
Est’m CO2 emissions avoided = 1,300 tonnes in 2006, for Qld.
& guesstimate ~ 5,000 t. nationally
Ref. //flights.carbonplanet.com/ -> 0.3 kg CO2 / passenger-km
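The emissions estimate follows from a single emission factor, so it is easy to sanity-check. The sketch below uses only the slide's 0.3 kg CO2 per passenger-km and the 1,300 tonne figure. The 1,500 km round trip is a purely hypothetical example distance, and the function names are illustrative.

```python
KG_CO2_PER_PASSENGER_KM = 0.3  # emission factor quoted on the slide

def co2_tonnes(passenger_km, factor=KG_CO2_PER_PASSENGER_KM):
    """Tonnes of CO2 emitted for a given amount of air travel."""
    return passenger_km * factor / 1000.0

def passenger_km_for(tonnes, factor=KG_CO2_PER_PASSENGER_KM):
    """Air travel (passenger-km) that corresponds to a given CO2 tonnage."""
    return tonnes * 1000.0 / factor

# Travel implied by the slide's Queensland estimate for 2006.
qld_passenger_km = passenger_km_for(1300.0)

# Hypothetical example only: a 1,500 km round trip (assumed distance).
example_trip_tonnes = co2_tonnes(1500.0)

print(f"1,300 t CO2 corresponds to {qld_passenger_km:,.0f} passenger-km")
print(f"A 1,500 km round trip emits about {example_trip_tonnes:.2f} t CO2")
```

At this emission factor, 1,300 tonnes corresponds to a little over 4.3 million passenger-km of avoided flying.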
UQ RCC concept diagram (bap, early draft, 2010)
[Layered diagram; labels as recovered:]
- Research Users: Data intensive (Genomics, Ecol, Soc Sci.); Compute (Eng, Sciences)
- Units shown: Soc, Econ, IMB bioinfm, TERN ecol, Imaging Sci, DI, TRI, Eng, SMI, AIC, AIBN, QBI, CMM, Earth Sci
- R&D programs; R&D liaison
- User Support, Services; Training & Education
- Skills
- Software: Algorithms, Codes; Software packages
- Hardware: Infrastructure - Supercomputers, Data, Clusters, network
UQ’s Research Computing Centre
Local drivers:
• Significant infr. investment
• Dispersed tech & support staff – coherence
• Organisational home for activities + state, national & int’l engagement
• User support needed
• Skills development
To provide:
• Management and support of UQ infrastructure: supercomputers & software, tera-scale data archives, visualisation & collaboration services, and 10G network connections;
• Research user support.
Features:
• research engaged & research led,
• usable,
• delivers value to researchers
• core Tech staff
• “Liaison” staff to major research units
Aim: Enable Research Discoveries
RCC approved by UQ Senate, Aug 2011, as a university-level Centre reporting to the DVC(R). Now underway.
NCRIS / EIF developments – rolling out in 2012; e-Research programs
• RDSI ($50m - UQ) rdsi.org.au - for data storage;
• NeCTAR ($47m - U Melb) nectar.org.au - 1st round announced - for Virtual Labs; Research Clouds
  - Genomics (UQ, U Melb + );
  - Characterisation (Monash, UQ, ANU, USyd, Synchrotron, ANSTO)
+ Matching: Operations, Facilities (machine rooms, power, etc)
Rob Cook, QCIF to brief us … soon
RCC establishment – rolling out in 2012
• Staff: approx 20 EFT - ITS HPC Unit (& some EBI Mirror); Vislab group; R&D L, e-R staff (QCIF); + student interns;
• Space – tbd.
• Budget: follows UQ model:
  - 50% Central
  - 33% Faculty / Institute subscriptions
  - 5%, rising to 15%, user charges
  - ~12% ext. (as available – here QCIF)
Training – ongoing + in development
• HPC hands-on Intro - David Green, et al. Next session: Friday 30 Mar, 1-5pm;
• Cosc3000 – SciVis (Semester I)
• Cosc3500 – HPC & Parallel Programming (Semester II)
  = students looking for projects
• Other classes: rework as ½ - 1 day workshops.
Purpose of the meeting
Seek your input: plan the next 3 years for RCC
- Researchers’ needs, priorities
- Faculty / Institute engagement
- User support: what you need, then how?
Advisory Board meets tomorrow: Management Committee at UQ – RCC Ops
- R&D Liaison positions: what, where? ….
Challenges: space, budget, engagement
Thank you for your inputs
Data Centres - using containers
Facilities:
- Microsoft, Chicago: “Container canyon” (56 containers, 700k sq.ft, 30 MW site - $500m)
- Google (2005): 45 containers, 10 MW, PUE 1.25
Containers are:
- Modular
- Contained
- More energy efficient
- Robust
- Denser
New entrants in the market: “Cloud” services
“Virtualised”, distributed resources
{Need to over-provision capacity for peak demand - could sell capacity to “passing trade”}
• Amazon S3 (Simple Storage Service) http://aws.amazon.com/s3/#pricing
  - price: US 15c/GB/mo (< 50 TB) = US$1.8k / TB pa
  - hosts 29 bn files (Nov’08); at Boardman, Oregon (Columbia River, hydro power)
• Amazon Elastic Block Store (EBS) http://aws.amazon.com/ebs/
  - ~ 1 GB – 1 TB; price: US 10c/GB/mo + I/O
  - aimed at web-site hosting ~ 100 GB
  - cf. Amazon “CloudWatch” services (monitoring)
• Google Docs http://googleenterprise.blogspot.com/2010/01/store-and-share-files-in-cloud-with.html
  - “free” 1 GB/user; files < 250 MB; then US$0.25/GB pa
• Google apps http://code.google.com/apis/storage/docs/overview.html#pricing
  - for codes; US$0.12/GB/mo + in/out = US$0.12/GB
  - & Google apps premier: US$3.5/GB pa
• Rackspace “Cloud Files” www.rackspace.com
  - US 15c/GB/mo + 8c in + 22c out / GB = US$5.4k / TB pa
• NSF / TACC study of costs: own vs. lease
• LBNL study of performance ~ “broad”, rather than deep, use
NSF quote – for research data: “US$1k / TB pa”
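The per-terabyte annual figures quoted for Amazon S3 and Rackspace can be reproduced from the per-gigabyte monthly prices. The sketch below is a back-of-envelope check, not a pricing tool. It assumes 1 TB = 1,000 GB and, for the transfer charges, that the full stored volume is moved in and out once every month; that reading is an assumption, chosen because it reproduces the slide's US$5.4k figure. The function and variable names are illustrative.

```python
GB_PER_TB = 1000        # decimal terabyte, assumed
MONTHS_PER_YEAR = 12

def annual_cost_per_tb(storage_per_gb_month, in_per_gb=0.0, out_per_gb=0.0):
    """US$ per TB per year from per-GB prices.

    Assumes every stored GB is also transferred in and out once per month
    (only relevant when transfer prices are given).
    """
    monthly_per_gb = storage_per_gb_month + in_per_gb + out_per_gb
    return monthly_per_gb * GB_PER_TB * MONTHS_PER_YEAR

# 2010-era prices as quoted on the slide (US$ per GB).
s3_per_tb_year = annual_cost_per_tb(0.15)
rackspace_per_tb_year = annual_cost_per_tb(0.15, in_per_gb=0.08, out_per_gb=0.22)

print(f"Amazon S3:  US${s3_per_tb_year:,.0f} per TB per year")
print(f"Rackspace:  US${rackspace_per_tb_year:,.0f} per TB per year")
```

On these assumptions both results match the slide; a real workload that moves less data than it stores would cost less than the Rackspace figure suggests.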
Amazon elastic cloud (EC2) quotes: 15.6c/hr (large mem [8 GB] / 1 yr contract) • (http://aws.amazon.com/ec2/#pricing)
DIY: 1-2 TByte disc drives at home
8K scientific animation
www.shopbot.com.au (retail prices at June 2010)
Dick Smith: www.dse.com.au
Western Digital: wdc.com, Lake Forest, CA.
Q’s:
- Will it last ?? MTBF: 50k hrs (5.7 yr)
- I/O bandwidth: PC bus 230 Mbps
- Software interface
- Scalability
- Digital rot
MTBF: manufacturers quote “600k – 1.5m hrs (> 100 yr) –> Annualised Failure Rate: < 1%”; CMU (B Schroeder & G A Gibson, 2007) finds 2-4%, up to 13% (x 15)
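The relation between a quoted MTBF and an annualised failure rate (AFR) can be sketched in a few lines. This is a simplified model, not the method of the CMU study: it assumes drives run continuously (8,760 hours per year) and fail at a constant rate, so that AFR = 1 - exp(-hours per year / MTBF). The function names are illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 24 x 365, assuming continuous operation

def mtbf_in_years(mtbf_hours):
    """Express an MTBF figure in years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR

def annualised_failure_rate(mtbf_hours):
    """Probability that a drive fails within one year of continuous use,
    assuming a constant (exponential) failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

# MTBF values quoted on the slide, in hours.
for mtbf_hours in (50_000, 600_000, 1_500_000):
    years = mtbf_in_years(mtbf_hours)
    afr_percent = annualised_failure_rate(mtbf_hours) * 100.0
    print(f"MTBF {mtbf_hours:>9,} h = {years:6.1f} yr, AFR = {afr_percent:.2f}%")
```

Under this model the 50k-hour figure is about 5.7 years, and the datasheet range of 600k to 1.5m hours implies an AFR of roughly 0.6% to 1.5%, well below the 2-4% replacement rates the slide attributes to field data.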