A user-centric vision for future eInfrastructure and services in Norway. eSOP seminar on eInfrastructure Use Roadmap, March 11, 2011. Hans A. Eide, PhD, Group leader, Research Computing Services, USIT, University of Oslo.
University of Oslo and IT, research, HPC • Two-tier IT organization: • Local: (at institutes / faculties) • Central: University Center for Information Technology (USIT) • USIT • 240+ FTE and growing • Covers all aspects of University IT activities • Section for Education and Research Support (SUF) • Provides resources, tools, support, competence for the primary production (education and research), 40 FTE • Research Computing Services (VD – the HPC group) • Research support, competence, operations
Research Computing Services group • 14 people, 9 with research background (PhD) • “Buffer” between advanced resources and researchers • Advanced user support (e.g. parallelization, grid enabling) • Computation, storage, visualization, emerging tech. • Not limited to “hard sciences” or HPC • Multi-source funding • RCN (Notur, NorStore, Norgrid, projects) • Research projects (life sci., astro, physics, etc.) • UiO • Training, support, operations, help-desk
Tomorrow’s eInfrastructure and services …and the answer is
Tomorrow’s eInfrastructure and services • Must support all fields of research and be accessible • Help maximize science production, to the benefit of society (social, economic, ...), while • minimizing TCO (i.e. be effective) • Environmentally friendly • Quickly adapt to technology changes and new demands to give a competitive edge • Maintained at a sufficient and stable level relative to use/need That’s fine, but how to do it?
eInfrastructure really should mean the whole package, but is usually divided into two aspects: • eInfrastructure • Hardware, i.e. computing resources, storage, network, … • Services • Software • Brainware (support services)
The eInfrastructure pyramid (anno 2011) [Figure: pyramid with a capacity side and a capability side, users at every tier. Tiers from top: Multi-Petaflop (WLCG, PRACE), “Greenest” • Petaflop (NGIs, Nordic?), “Greener” • Sub-Petaflop (clouds), “Green(?)”. Base: Development, Competence, Services, Support, Training, Portals, Tools, Databases, Data sources]
Today’s situation (simplified) for computing and storage [Diagram: UiO, UiB, NTNU, UiT on top of basic infrastructure (network)]
Today’s situation (simplified) for computing and storage • End of 2010: 300 kW (maxed out) • From 2011: 900 kW (sufficient to 2013+) • Limited space (and cooling) [Diagram: UiO, UiB, NTNU, UiT on top of basic infrastructure (network)]
Alternative 1: go alone (x MW in 2015) [Diagram: UiO plus a green datacenter]
Alternative 2: together (y MW in 2015+) [Diagram: UiO, UiB, NTNU, UiT plus several green datacenters]
Alternative 3 (2020!) [Diagram: several green datacenters serving UiO, U of X, and U of Y, grouped by research domain: “Life science”, “Language technology”, “Particle physics”, “Climate”]
Ideal eInfrastructure services: • National core services together with local services • Fully financed, permanent positions • Close to local resources, users • Pool of competence (advanced user support) • Training, courses, outreach, marketing • Technology watch, early adopters • Partake in Nordic/EU/world-wide programs • Members who are experienced with ICT in the research process (have background as researchers)
The four waves of extraordinary growth in use of ICT [Figure: timeline marks 1820, 1946, 1968, 1991, 2010. Stages: Mechanical calculator • Towards the computer • A tool for many • A tool for “all” • Data systems everywhere. Layers: Mainframe computers • PC (affordable) • Internet applications • Research and development • Advanced services and infrastructures. Axis: number of users (inverse of skills needed by users)]
The evolution of the HPC computing pyramid (William Gropp, UIUC; www.zettaflops.org) [Figure: two pyramids. 1993: Tera Flop Class • Center Supercomputers • Mid-Range Parallel Processors and Networked Workstations • High Performance Workstations. 2029: Center Exascale Supercomputers • Single Cabinet Petascale Systems (or attack of the killer GPU successors) • Laptops, phones, wristwatches, eye glasses… Labels: Users needed to be “inside the box” • Users “outside the box”]
Tomorrow’s today’s (average) user • Knows little (nothing) about HPC (and has no interest in it either) • Most likely can’t program (at least not well) • Doesn’t want to spend time learning something if it can be avoided • Just wants results and to move on • Doesn’t know what is available • ...but expects to get services, resources, and support for free
SUIT 2010 – Research support • Question 12: Do you use, or are you aware of, the following services from USIT? • Response options: Use / Have used / Know of / Do not know of / Not applicable
Challenges • Even HPC for dummies is too advanced (and why should users bother?) • Knowledge about basic methodology seems to be declining in all fields, among students and researchers alike (e.g. statistics, mathematics) • Hard to reach the “customers” with passive marketing (i.e. web pages) • Late adopters of new technologies/capabilities (“don’t ask me what I need, you should tell me what I need”) • Serial jobs (not necessarily embarrassingly parallel)
(Some) solutions • Make it simple to use, e.g. computing portals (can mitigate the problem of serial jobs by e.g. using GPUs without the user even knowing!) • Emphasis on using ICT methods and eInfrastructure in the education program – part of the curriculum! • Tailored courses and training for user groups • Forward-leaning marketing of services (e.g. approach and ask “why are you not using our xyz service in your research?”) • Advanced support (enter early in the problem formulation/design process), competence
Example: Bioportal • 2659 registered users, 700+ active • 40+ applications (MrBayes, RAxML, BLAST, Paup, structure, R, BEAST and PhyML, …) • Bio (life science), chemistry, statistics • Tailored 454 sequencing work-flow • Uses nearly 3 million CPU hours in 6 months • Pre-compiled binaries allow advanced optimizations, e.g. use of GPUs and MPI, transparently to the users
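To put the Bioportal usage figure in perspective, the CPU-hour total can be converted into an average number of cores kept busy around the clock. This is only a rough sketch: it assumes the "nearly 3 million" figure is rounded to exactly 3,000,000 CPU hours and that six months is 182.5 days, neither of which is stated on the slide, and the function name is made up for illustration.

```python
# Rough estimate of the average number of CPU cores the portal keeps busy.
# Assumptions (not from the slides): exactly 3,000,000 CPU hours,
# and a six-month period of 182.5 days.
CPU_HOURS = 3_000_000
DAYS = 182.5


def average_busy_cores(cpu_hours: float, days: float) -> float:
    """Return CPU hours divided by the wall-clock hours in the period."""
    wall_clock_hours = days * 24
    return cpu_hours / wall_clock_hours


avg_cores = average_busy_cores(CPU_HOURS, DAYS)
print(f"about {avg_cores:.0f} cores busy on average")
```

Under these assumptions the portal alone corresponds to several hundred cores running continuously, which gives a sense of how much compute a simple web front end can drive.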
ICT services for hum-soc • Qualitative methods • Used extensively in humanities and social sciences • Rich media (audio, video) • Typical applications: NVIVO, HyperResearch, Transana • Quantitative methods • Statistics • Potentially huge datasets • Sometimes sensitive data • Typical applications: STATA, SPSS, R • Storage services (data intensive) • Big need for training
eInfrastructure and services for sensitive data • Sensitive data enters in many fields • Life Science • Medicine • Psychology • Social studies • Pedagogic studies • Lack of eInfrastructure and services for sensitive research data impairs ability to perform research
Sensitive research data • DNA sequencing • Industrial research • Video/audio • Patient/clinical • MRI • Questionnaires • Genetics
eInfrastructure and services in the future • This is the missing slide about clouds and virtualization
Thanks for your attention! Questions?