Computing for Research: simulation, data and visualisation
Bernard Pailthorpe, Research Computing Centre, UQ
bap@uq.edu.au
www.rcc.uq.edu.au
B. Pailthorpe, at UQ USMC, 16 May 2012
Outline
www.rcc.uq.edu.au
• Computer Simulation & Visualisation
  - Examples of projects supported at UQ & research impact
  - Computational Science: 3rd route to scientific discovery (also Computational Engineering: design new products, services)
• HPC infrastructure: supercomputers, data, networks, vis
• Nature of computing in universities -> RCC
• QCIF & national partners: APAC, NCRIS, EIF rounds …
Chemistry: Seth Olsen & Sean Smith (CCMS & AIBN, UQ)
Computational Modeling of Red Fluorescent Proteins (RFPs)
- Chromophore: brighter fluorescence (than GFP)
- Gaussian03, NWChem calculations: excitation energies
- Deep-tissue biomedical imaging in cell cultures
- visualization of biological processes
Chem. Phys. Lett. 420, 507 (2006).
Physics - BEC: Chao Feng & Mike Malone ... working at the OptIPortal display wall
Simulation of a Bose-Einstein Condensate (BEC), showing the density of a cylindrically symmetric BEC after a laser beam excited the condensate.
Visualised in VTK, ParaView
Movie at: www.rcc.uq.edu.au/gallery/bec/
Chem Eng: Suresh Bhatia (UQ)
Transport and Adsorption in Nanomaterials
- Pore diameters ~5-50 ångström; molecules ~3-5 Å
- Monte Carlo & molecular dynamics (MD) simulations
- Molecular sieving in microporous materials: a method of separation
- Many industrial applications: separations, catalytic processing
- Zeolite rho structure
Phys. Rev. Lett., 91, 0126102 (2003); 95, 245901 (2005).
Hypersonic flight: Multi-Disciplinary Design Optimization
CFD pressure contours in the inlet and combustion chamber
Russell Boyce & team (Mech Eng, UQ)
Ogawa, H. and Boyce, Russell R. Proc. ICFD2010, Sendai, Japan, Nov. 2010.
Medical Imaging: Improving health care - 3D MRI screening for breast cancer
• visualise 4D dynamic contrast-enhanced (DCE) MRI
  - used in the routine clinical setting (10-15 min, movement..)
• identify regions of suspicious enhancement
  - greater accuracy & speed (cf. 1 hr on PC)
  - model-fitting for each spatial slice
Current practice: manual review of the raw data & derived subtraction images.
* Large quantity of high-dimensional data
  - difficult for clinicians to review
  - cost of missed or incorrect detection is high
Injected contrast dye uptake
Prof. Stuart Crozier & Dr. Andrew Mehnert, ITEE, UQ + Dr Kerry McMahon, Qld. X-Ray
Bioinformatics: Rob Beiko & Mark Ragan (IMB, ARC Centre Bioinformatics, UQ)
Lateral Gene Transfers in Prokaryotes
- an organism transmits DNA to non-offspring
- common in bacteria: one way they develop resistance to antibiotics
• BLAST alignment: track 423,000 predicted proteins from 144 organisms
  - 22k evolutionarily related
  - build phylogenetic trees for each protein
• Bayesian sampling: search the trees
  - probable pathways of gene sharing
“.. thousand-dollar genome and million-dollar interpretation…” Nature, 483, 21 (1 March 2012).
'Highways of gene sharing in prokaryotes', Proc Natl Acad Sci USA 102, 14332-7 (2005).
Aligning Genomes … to understand molecular processes & disease
D. R. Zerbino, B. Paten, David Haussler. Review article: Science, 336, 179 (13 Apr 2012)
Ecology: new user community, ARC Centre
Benefit / Cost
Figure 1: Acacia harpophylla - in need of protection. This vegetation type is down to less than 15% of its original extent, and is an example of habitat that would benefit from the scheme of Fuller et al.
- Peter Kareiva, “Trade in to trade up”, in News & Views, p322.
Replacing underperforming protected areas achieves better conservation outcomes. Richard A. Fuller, Eve McDonald-Madden, Kerrie A. Wilson, Josie Carwardine, Hedley S. Grantham, James E. M. Watson, Carissa J. Klein, David C. Green & Hugh P. Possingham. Nature 466, 365-367 (15 July 2010).
Ecology: new user community (2)
TERN (Stuart Phinn, et al)
“We are interested in following topics:
=> visualising sparse uneven data distributed across both spatial and temporal dimension.
=> visualising heterogeneous data with inconsistent temporal and spatial interval at multiple dimension.
=> visualising different ecosystem domain datasets from past to future at different gradients, elevations and depths.
…. I am happy to discuss if you require.”
Dr. S.M. Guru, TERN (21/3/12)
Mill Point Archaeological Project
Photos: Nicole Bordes & Sean Ulm
B. Pailthorpe, QCIF, 24 Aug 2006
3D interactive model of the archaeological dig
• digital collections: dissemination and interchange of archaeological data (maps, GIS, satellite images, photos, audio, artefacts):
  - across disciplines & institutions,
  - across public & private sectors;
• enable archaeological research to reach its full potential;
• contributes to discourses about Australian history, cultural heritage & identity.
Bordes, N., S. Ulm, O. Pettersen, K. Murphy, D. Gwynne, W. Pagnon, S. Hungerford, P. Hiscock, J. Hall and B. Pailthorpe. “Data grid for the management, reconstruction, analysis and visualisation of archaeological data”, in S. Ulm and I. Lilley (eds), An Archaeological Life: Papers in Honour of Jay Hall, pp. 251-264 (UQ, 2006).
An ARC e-Research project (2005).
www.rcc.uq.edu.au/vislab/archaeology/
Computational Linguistics: digital scholarship
“culturomics”
- Analysis of 5.2M books: a “cultural genome”; 4% of books published (1500s to now)
- x1000 longer than the human genome
- + TED talk
- Extinction ->
“Tens of thousands of books appear in this photograph of the interior of the sculpture Idiom, by Matej Krén; .. in the Municipal Library of Prague.” …
E. Lieberman et al., “Quantifying the evolutionary dynamics of language”, Nature, 449, 713 (11 Oct 2007).
J-B Michel, et al., “Quantitative Analysis of Culture Using Millions of Digitized Books”, Science, 331, 176 (14 Jan 2011).
www.ted.com/speakers/jean_baptiste_michel.html (June 2011)
M. Pagel et al., “Frequency of word-use predicts rates of lexical evolution throughout Indo-European history”, Nature, 449, 717 (2007).
High Impact - scientific discovery
Science cover, Nov 10, 2000.
• Rosenfeld lab - cellular … molecular imaging, mouse pituitary gland
  Allosteric effects of Pit-1 DNA sites on long-term repression of cell type specification
  ~ gene activation AND repression
  - collaboration of 3 national centers (NSF, NIH)
Kathleen Scully, Jim Feramisco, Mark Rosenfeld, et al, UCSD Cancer Center
+ bap at SDSC / NPACI - volume visualization tools
UQ HPC infrastructure (with QCIF funding): Supercomputers (Rackable / SGI), Data
[Photo labels: Sun/Oracle StorageTek SL8500 (2009), 25 kW; 30 kW x many]
@UQ: 3,400 TB tapes; 400 TB discs; I/O (32) servers; new (2007-10)
2010: 3,000 cores, 25 TeraFlops
Data archives: StorageTek tape robots + new 500 TB disc
Power: upgrade ?
3-yr ops cost = equip cost (2000-08)
Research Data at UQ:
740 TeraBytes of research data (at Apr ’10)
- genomics, spectra & imaging
- incl HPCU-ITS, QBI/AIBN, DI, Library
- “Central” component was 234 TB
- Library was 5 TB
2 x StorageTek (Sun/Oracle) SL8500 robotic tape silos
- dual sites: robust + 2 smaller silos
- power ~25 kW each
- anticipate x2 capacity upgrade, pa
3,000 core supercomputer (25 TeraFlops)
- ~100 kW .. needs 500 kW power upgrade
10 Gbps (10,000 Mbps) network
B. Pailthorpe, UQ at DIISR RSDI mtg, 16 Sept 2010
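To give a feel for how the 740 TB archive compares with the 10 Gbps network, here is a minimal back-of-envelope sketch. It assumes decimal terabytes, a link running at its full nominal rate, and no protocol overhead; none of these assumptions come from the slides, and the function name `transfer_days` is hypothetical.

```python
def transfer_days(terabytes: float, gbps: float) -> float:
    """Idealised time, in days, to move a dataset over a network link.

    Assumptions (not from the slides): 1 TB = 1e12 bytes, the link is
    fully utilised, and there is no protocol or storage overhead.
    """
    bits = terabytes * 1e12 * 8          # dataset size in bits
    seconds = bits / (gbps * 1e9)        # link rate in bits per second
    return seconds / 86_400              # seconds per day


# Figures quoted on the slide: 740 TB of data, 10 Gbps network.
print(f"{transfer_days(740, 10):.1f} days")
```

Under these idealised assumptions the full archive would take close to a week to move, which suggests why dual-site storage and network capacity appear on the same slide.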
Update: Research Data at UQ - growing rapidly:
Genomic data is 73% & growing at 56% pa
- “Central” growth: 234 TB (April 2010) to 774 TB (Sept 2011) = 380 TB pa = doubles in 10 mo.
- Other sites not quantified as yet
740 TeraBytes of data archived online (4/’10)
- genomics, spectra & imaging
2 x StorageTek (Sun/Oracle) SL8500 robotic tape silos
- dual sites: robust + 2 smaller silos
- anticipate x2 capacity upgrade, pa
3,000 core supercomputer (25 TeraFlops)
10 Gbps (10,000 Mbps) network
B. Pailthorpe, UQ RCC mtg, Mar 2012
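The "380 TB pa" and "doubles in 10 mo." figures can be checked with a short calculation. This is a sketch under two assumptions that are not stated on the slide: the April 2010 to September 2011 interval is counted as 17 months, and the doubling time is estimated by treating the growth as exponential. The function names are hypothetical.

```python
import math


def linear_growth_per_year(start_tb: float, end_tb: float, months: float) -> float:
    """Average absolute growth over the period, scaled to 12 months."""
    return (end_tb - start_tb) / months * 12


def doubling_time_months(start_tb: float, end_tb: float, months: float) -> float:
    """Doubling time if growth over the period were exponential."""
    return months * math.log(2) / math.log(end_tb / start_tb)


# Slide figures: 234 TB (April 2010) to 774 TB (Sept 2011).
# Assumption: that interval is counted as 17 months.
START_TB, END_TB, MONTHS = 234.0, 774.0, 17

print(f"{linear_growth_per_year(START_TB, END_TB, MONTHS):.0f} TB per year")
print(f"doubling time: {doubling_time_months(START_TB, END_TB, MONTHS):.1f} months")
```

Both results land close to the rounded values quoted on the slide: roughly 380 TB per year, and a doubling time just under 10 months.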
Computing at UQ… or at many unis, for that matter
• “corporate” or business computing
  - for the organisation: finance, HR, web, email, library & scholarly holdings
  - for T&L, students: also web, email, class materials (e.g. Blackboard), records
  Established platforms (IBM, Microsoft, SAS …); often standardised, centralised
vs.
• computing for researchers
  - high performance, costly, power hungry, rapidly evolving (Top500 etc), specialised, skills intensive;
  - coupled to research - algorithms, software; often Unix, Linux, OSS; data growing rapidly
  Good models: Princeton, Purdue (cf. Educause case studies, 2005-6)
+ Computational Science - recognised as a third route to scientific discovery;
+ Computational Engineering - design new products, services (industry)
RCC approved by UQ Senate, Aug 2011, as a university-level Centre, reporting to the DVC(R).
Aim: Enable Research Discoveries
Features:
• research engaged & research led,
• usable,
• delivers value to researchers
• core Tech staff
• “Liaison” staff to major research units … now underway
UQ RCC concept diagram
[Diagram labels, in the order extracted: ? AIBN, SCI, etc.; Computing intensive; Research Users; EAIT; Chem & Ecology; TERN; SMI; Support; User; HPC; Earth Sci; Core Tech Staff (HPC Unit etc - DG); Data intensive; Admin; Management; Sci (ACE); QBI; CMM; Genomics; IMB; QBI Imaging; CAI; EBI Mirror; UQDI; TRI, UQCCR; ? ISSR, SBS; Econ, Arts; QCloud; Data Storage and e-Research Analysts (QCIF)]
Queensland Cyber Infrastructure Foundation
• Not-for-profit company since 2000 (then QPSF)
• High performance computing - research infrastructure and services
National Partners:
- AAF: access, entitlements
- ANDS: metadata, data access
- RDSI: large-scale data storage
- NeCTAR: cloud computing & data
- AURIN: urban data
- NCI: Tier 1 HPC (QCIF share)
- AARNet, NRN: networking
Supported by: Queensland Government
Members; State eResearch Partners; Associate: [not recoverable from this text]
QCIF High-Performance Infrastructure & Services
[Diagram labels, in the order extracted: Shared HPC infrastructure; Hosted Services; Applications; Tools, Data; eResearch Capability; HPC; Data storage; Research; Industry; Cloud services; User Support; Uptake; Expertise; Skills development; Facilitating collaboration]
Rob Cook: QCloud Brief, 30 Apr 2012
NCRIS / EIF Developments: ongoing; rolling out in 2012
• APAC: PMSEC (Dec ’94) -> APAC (2000-2006); & NCRIS 5.16 (Platforms for Collaboration, PfC)
• RDSI ($50m - UQ lead) www.rdsi.org.au
  - for data storage nodes … in process
• NeCTAR ($47m - U Melb) www.nectar.org.au
  - 1st round announced: for Virtual Labs; Research Clouds
  - Genomics (UQ, UMelb + );
  - Characterisation (Monash, UQ, ANU, USyd, Synchrotron, ANSTO)
+ matching Smart State funds (via QCIF).
QCIF/RCC developments underway: QCloud
Upgrade: Data & Compute facilities - at UQ
- NeCTAR: 4,000-core “Research Cloud” (nationally 4-6 nodes; 24,000 cores)
- RDSI: 30 PB datastore (nationally 4-6 nodes; 200 PB)
- Q.e-R Node
- Early cloud
- +500 TB disc already installed
- Genomics cloud
Rob Cook: at RCC All hands, 28 Mar 2012
IMPACT: grants into UQ, publications; Qld. economy
• $65m of ARC + NHMRC grants to UQ in past 2 years by Top 20 RCC Users (Groups), at Apr 2012.
• Qld. Govt. $16m investment (2000-06) -> GSP: + $12.7m pa x 10 yrs
  - NPV: $160m (discounted; to 2015)
  - Grants: $62m (fraction apportioned)
  - Employment: +50 jobs in Qld.
  - Publications: 344 (+ 165 Conf. Proc.)
  - PhD students: 121
  - Research Productivity: +40,000 hr.
  - Cost savings (as avoided costs): travel $2m pa (2006); health care $8m pa
Report (2006) available at: www.rcc.uq.edu.au/about-rcc
Summary
• UQ is a strong player in HPC & Data for Research; historically in the top 3 or 4 nationally
• Computation & Data central to much research at UQ
  - research outputs, grants
  - chem, phys, eng; genomics, ecology, econ & emerging in Social Sci & Humanities
  - large demand; rapidly growing;
  - funding secured.
• Opportunity to continue ….