Supporting Research With Flexible Computation Resources
Federating Clouds in the UK NGS, Oxford e-Research Centre, and leading to EGI
David Wallom
Associate Director – Innovation, Oxford e-Research Centre
Technical Director – UK NGS
Former VP Communities – OGF
SDCD 2012: Supporting Science with Cloud Computing, 19th November 2012
UK NGS Cloud Activities
NGS Agile Deployment Environments: EPSRC funded, 2 years.
• Staff: David Wallom (OeRC, Oxford); David Fergusson (NeSC, Edinburgh); Steve Thorn (NeSC, Edinburgh); Matteo Turilli (OeRC, Oxford).
• Goals:
• an EC2-compatible, open-source solution (a connection sketch follows this slide);
• development of a dedicated pool of images, supporting both end-user and NGS requirements such as training;
• collecting data on feasibility, costs, and stability;
• identifying use cases and gathering further requirements.
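Since the clouds expose an EC2-compatible API, standard clients work against them unchanged. A minimal sketch, assuming the classic boto library; the endpoint, port, path, and credentials are placeholders rather than the actual NGS values:

```python
# Hypothetical sketch: talk to a Eucalyptus cloud through its EC2-compatible
# API with classic boto. Host, port, path, and keys are illustrative only.
import boto
from boto.ec2.regioninfo import RegionInfo

conn = boto.connect_ec2(
    aws_access_key_id="EUCA_ACCESS_KEY",        # issued by the cloud admin
    aws_secret_access_key="EUCA_SECRET_KEY",
    is_secure=False,
    region=RegionInfo(name="eucalyptus", endpoint="cloud.example.ac.uk"),
    port=8773,                                   # default Eucalyptus API port
    path="/services/Eucalyptus",                 # default Eucalyptus API path
)

# List the images in the dedicated pool, e.g. the training images.
for image in conn.get_all_images():
    print(image.id, image.location)
```

The same script runs unmodified against AWS by dropping the custom region, port, and path, which is exactly what the EC2-compatibility goal buys.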
Cloud Infrastructure for Research: Centralisation vs Federation
Evaluation criteria: funding, scalability, flexibility, maintenance, support, accountability, obsolescence, competitiveness, security.
• Centralisation: one large, dedicated datacentre that serves the national HEI demand.
• Federation: a heterogeneous set of local infrastructures, coordinated nationally to satisfy the HEI demand.
Eucalyptus vs Nimbus, OpenNebula, OpenStack
Eucalyptus pros:
• very good implementation of the EC2 and EBS APIs;
• enterprise support offered by Canonical through UEC;
• dedicated installation in UEC;
• modular design;
• Xen and KVM compatible;
• open-source and commercial editions.
Eucalyptus cons:
• design limitations;
• AAA (authentication, authorisation, accounting).
The others:
• limited EC2 API implementation;
• no native support for EBS;
• Globus WS4 dependency (Nimbus);
• early development stage;
• slow development.
To keep an eye on:
• OpenNebula 2.2 (to be tested);
• OpenStack Compute and OpenStack Object Storage.
NGS Cloud Prototypes: Oxford III
• 6 x 2 AMD 2-core; 8 GB RAM.
• 1 x 4 AMD 2-core; 32 GB RAM.
• CentOS 5.4;
• Eucalyptus 1.6.2 installed from RPM repositories;
• Ganglia and Nagios monitoring systems;
• 5 default VM templates = 44/44/22/22/11 VMs (editable);
• 2 TB EBS, 80 GB Walrus.
NGS Cloud Prototypes: Oxford IV
• 3 x 4 Xeon 6-core; 48 GB RAM.
• 2 x 1 Xeon 2-core; 32 GB RAM.
• Ubuntu 10.10;
• Ubuntu Enterprise Cloud;
• 2+2 bonded public NICs on the cluster controller;
• 12 TB EBS, 12 TB Walrus on SED disks;
• TPM on every motherboard.
NGS Cloud Prototypes: Edinburgh II
• 32 x Sun Fire X4100: dual-core 2.8 GHz Opteron, 8 GB RAM, 70 GB RAID1 (64 cores in total).
• 1 head node (cloud and cluster controllers);
• 31 nodes (node controller);
• max 2 VMs per core: 31 nodes x 2 cores x 2 = 124 slots (2 GB RAM each);
• VLANs for VM isolation.
Managing and Monitoring Tools
• Hybridfox + euca-tools: overall cloud usage, status, and testing;
• Landscape: Canonical's (not open-source) management solution for UEC. We did not try RightScale, as it is a fairly expensive, hosted service;
• Linux CLI: dedicated scripts to monitor logs and daemon status (a minimal watcher sketch follows this slide).
Issues:
• public IP database corruption (addressed in version 2);
• no user quotas in the open-source version of Eucalyptus;
• no accounting in the open-source version of Eucalyptus;
• very verbose, non-persistent logs;
• lack of error feedback in some conditions.
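A minimal sketch of such a script, assuming Python and a typical cloud-controller log location; the path and keywords are assumptions, not the exact NGS setup:

```python
# Minimal log watcher: follow a Eucalyptus log and flag error lines.
# LOG_FILE and KEYWORDS are illustrative assumptions.
import time

LOG_FILE = "/var/log/eucalyptus/cloud-output.log"  # assumed CLC log path
KEYWORDS = ("ERROR", "FATAL")

def follow(path):
    """Yield lines appended to the file, tail -f style."""
    with open(path) as handle:
        handle.seek(0, 2)               # start at the end of the file
        while True:
            line = handle.readline()
            if not line:
                time.sleep(1.0)         # wait for new output
                continue
            yield line

for line in follow(LOG_FILE):
    if any(word in line for word in KEYWORDS):
        print(line.rstrip())            # in production: raise a Nagios alert
```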
User Support Tools
• Ticketing system: web-based platform (Footprints); around 200 tickets addressed in one year;
• Web site: subscription instructions, links to the Eucalyptus documentation and to the support e-mail;
• Mailing list: used mainly to announce new services, scheduled or unscheduled downtime, and planned upgrades.
Issues:
• access through institutional firewalls via proxy;
• available resources (a limitation of the Eucalyptus design);
• instructions on how to build a dedicated image;
• almost no issues about research and cloud computing itself;
• managing user access across separate cloud systems is difficult.
NGS Cloud Usage 2010/2011
• 106 registered users: uptake was very fast and users stayed engaged throughout the whole testing period;
• 26 institutions: 23 HEIs (both universities and colleges) and 3 companies;
• 30 projects;
• 10 research areas: teaching, physics, ecology, geography, life sciences, medicine, social science, mathematics, engineering, and cloud R&D.
Exemplar Case Studies
• Evolutionary genomics: "analysis and information management of Next Generation Sequencing (NGS) genomic data pose many challenges in terms of time and size. We are exploring the translation of high-quality NGS scientific analysis pipelines to make best use of cloud infrastructure."
• Geospatial science: "geospatial data is a mix of raster and vector data. As rasterising is a CPU-hungry process, and all maps displayed on the screen of the final user are rasters, it is more efficient to do the processing on the server side. I am investigating how this process can be dispersed across many, if not unlimited, instances in a cloud."
• Agent-based modelling of crime: "at the moment I have a Tomcat server that hosts some web services used to run a social simulation model; it needs access to the file system to run Fortran scripts, create files, etc. There are loads of problems with running our own server at uni, and I think a virtual machine that I could have control over would be much better."
Flexible Services for the Support of Research (FleSSR)
• 6 partners, academic and industrial;
• 3 cloud infrastructures.
Goals: build a federated cloud infrastructure, extending the UK NGS central services with cloud brokering and accounting.
Use cases:
• multi-platform software development;
• on-demand research data storage.
FleSSR Architecture
[Architecture diagram: the Zeel/i broker and an accounting database link the cloud infrastructures at Reading, Oxford, STFC/NGS, and Eduserv.]
FleSSR Infrastructure
• Local/global: services depend on either local or global access; cloud brokering is not mandatory for AWS-like service access;
• Multiple identities: every user may have multiple identities, both local and global;
• Only personal identities: group identities are not implemented; the management of every single identity is left to the legally responsible user;
• Multiple AA technologies: authentication and authorisation may differ depending on local and global policies and technologies;
• Multiple accounting: every single identity is accounted for its usage, so each individual may receive multiple invoices (see the data-model sketch after this list).
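A hypothetical data-model sketch of these rules in Python: one person, several identities, usage accounted per identity, hence one invoice per identity. All names and fields are illustrative, not FleSSR code:

```python
# Illustrative model: users hold many identities; usage and invoicing are
# always per identity, matching the FleSSR accounting rules above.
from dataclasses import dataclass, field

@dataclass
class Identity:
    identity_id: str      # local or global identifier
    scope: str            # "local" or "global"
    aa_technology: str    # e.g. "x509", "username-password"

@dataclass
class UsageRecord:
    identity_id: str
    resource: str         # e.g. "vm-hours", "ebs-gb-days"
    quantity: float

@dataclass
class User:
    name: str
    identities: list = field(default_factory=list)

    def invoices(self, records):
        """One invoice total per identity, never aggregated across them."""
        totals = {i.identity_id: 0.0 for i in self.identities}
        for record in records:
            if record.identity_id in totals:
                totals[record.identity_id] += record.quantity
        return totals
```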
FleSSR Use Case: Multi-Platform Software Development
[Workflow diagram: the Zeel/i broker, an instance configuration manager, and a build manager take code from a CVS/SVN repository and fan the build out across build instances 1–5 on the FleSSR cloud.]
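The fan-out step of this use case reduces to launching several identical instances from one build image. A sketch through the EC2-compatible API with classic boto; the image id, key name, endpoint, and instance type are placeholders, not FleSSR values:

```python
# Launch five identical build workers from one image (hypothetical values).
import boto
from boto.ec2.regioninfo import RegionInfo

conn = boto.connect_ec2(
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    is_secure=False,
    region=RegionInfo(name="flessr", endpoint="cloud.example.ac.uk"),
    port=8773,
    path="/services/Eucalyptus",
)

reservation = conn.run_instances(
    image_id="emi-12345678",   # hypothetical build image
    min_count=5,
    max_count=5,
    key_name="build-manager",  # hypothetical SSH key for the build manager
    instance_type="m1.small",
)
for instance in reservation.instances:
    print(instance.id, instance.state)
```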
FleSSR Use Case: On-Demand Research Data Storage
[Workflow diagram: the Zeel/i broker and a volume manager provision an EBS volume and attach it to a VM exposing an EBS interface on the FleSSR cloud.]
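Behind the volume manager, the provisioning step is a create-and-attach pair of EBS calls. A sketch with classic boto; the endpoint, availability zone, and ids are placeholders:

```python
# Create a research data volume and attach it to a running VM
# (hypothetical endpoint, zone, and instance id).
import boto
from boto.ec2.regioninfo import RegionInfo

conn = boto.connect_ec2(
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    is_secure=False,
    region=RegionInfo(name="flessr", endpoint="cloud.example.ac.uk"),
    port=8773,
    path="/services/Eucalyptus",
)

volume = conn.create_volume(size=50, zone="flessr")        # 50 GB volume
conn.attach_volume(volume.id, "i-12345678", "/dev/vdb")    # hypothetical VM
print(volume.id, "attached")
```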
FleSSR Output
Code:
• Instance configuration and build manager: a Perl command-line utility plus a Java client using the Zeel/i API;
• Personal EBS volume manager: a web-based Java client for handling EBS volumes, plus a tailored VM image with multiple data interfaces (SFTP, WebDAV, GlusterFS, rsync, ssh);
• Eucalyptus open-source accounting system: Perl aggregators and parsers for the standard Eucalyptus open-source log files, a MySQL accounting database, and a PHP accounting client (an aggregation sketch follows this slide).
Use cases:
• the SKA community testing use case 1;
• an institutional ICT team testing the WebDAV, GridFTP, and GlusterFS solution as use case 2.
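The real aggregators are Perl; the following is an analogous sketch in Python of the core idea, folding per-instance start/stop events into VM-hour usage records ready for the accounting database. The event tuples are invented for illustration, not the actual Eucalyptus log format:

```python
# Fold (identity, instance, event, timestamp) tuples, as parsed from the
# cloud logs, into VM-hours per identity. Event data is illustrative.
from collections import defaultdict

events = [
    ("alice", "i-0001", "start", 1300000000),
    ("alice", "i-0001", "stop",  1300007200),  # 2 hours later
    ("bob",   "i-0002", "start", 1300000000),
    ("bob",   "i-0002", "stop",  1300003600),  # 1 hour later
]

def aggregate(events):
    """Return VM-hours per identity from paired start/stop events."""
    started = {}
    usage = defaultdict(float)
    for identity, instance, event, ts in events:
        if event == "start":
            started[instance] = ts
        elif event == "stop" and instance in started:
            usage[identity] += (ts - started.pop(instance)) / 3600.0
    return dict(usage)

print(aggregate(events))  # {'alice': 2.0, 'bob': 1.0}
```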
The EGI Federated Cloud Task Force
Aiming to support multiple heterogeneous user communities.
With thanks to Matteo Turilli, EGI FCTF Chair.
EGI New Challenges and Cloud Computing
Personalised environments for individual research communities in the European Research Area.
[Architecture diagram: community services run on a community platform (dCache, gLite, ARC, Globus, UNICORE) alongside per-NGI and commercial IaaS cloud layers (VM management, data, image sharing); EGI.eu coordination, core software and support, federated monitoring, accounting and notification, and an EGI-wide message bus tie the layers together.]
With thanks to Matteo Turilli, EGI FCTF Chair.
Task Force Members and Technologies
• Members: 63 individuals, 23 institutions, 13 countries.
• Stakeholders: 15 resource providers, 7 technology providers, 6 user communities, 3 liaisons.
• Technologies: 7 OpenNebula, 3 StratusLab, 3 OpenStack, 1 Okeanos, 1 WNoDeS.
• Institutions include: TUD, SARA, Utrecht, KTH, EGI.eu, DANTE, GWDG, TCD, FZJ, STFC, CESNET, OeRC, Cyfronet, SixSq, Masaryk, CNRS, LMU, SRCE, FCTSG, INFN, BSC, IFAE, GRNET.
With thanks to Matteo Turilli, EGI FCTF Chair.
Federation Model
• Standards and validation: emerging standards for the interfaces and images (OCCI, CDMI, OVF).
• Resource integration: cloud computing is to be integrated into the existing production infrastructure.
• Heterogeneous implementation: no mandate on the cloud technology.
• Provider agnosticism: the only condition for federating resources is to expose the chosen interfaces and services (see the OCCI sketch after this list).
[Layer diagram: user communities sit on top of federated interfaces and federated services, which in turn sit above each provider's own cloud management layer and hardware.]
With thanks to Matteo Turilli, EGI FCTF Chair.
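Provider agnosticism is what makes a plain-HTTP client sufficient: any federated provider must accept the same OCCI rendering. A hedged sketch creating a compute resource with the OCCI 1.1 HTTP text rendering; the endpoint and open authentication are placeholders (real EGI endpoints require X.509/VOMS credentials):

```python
# Create a VM via the OCCI 1.1 HTTP rendering (hypothetical endpoint,
# no authentication shown).
import requests

ENDPOINT = "https://occi.example.org:11443"  # placeholder OCCI endpoint

headers = {
    "Content-Type": "text/occi",
    "Category": ('compute; '
                 'scheme="http://schemas.ogf.org/occi/infrastructure#"; '
                 'class="kind"'),
    "X-OCCI-Attribute": "occi.compute.cores=2, occi.compute.memory=4.0",
}

# POSTing to the compute collection asks the provider to create the VM.
response = requests.post(ENDPOINT + "/compute/", headers=headers)
print(response.status_code, response.headers.get("Location"))
```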
Federation Test Bed – Sep 2012
Composed of 4 services and 2 management interfaces across 7 cloud infrastructures operated by 6 resource providers; 3 more providers are in the process of being federated.
Federation Demo – Sep 2012
[Test-bed diagram: central services (a GLUE 2.0 BDII information system queried over LDAP, Nagios monitoring, a VM-metadata Marketplace, OGF UR/StAR accounting, and a message bus) federate the resource providers GWDG (ON/OS), CESNET (ON), CYFRONET (ON), KTH (ON), CESGA (ON), FZJ (OS), and IN2P3-CC (OS), each exposing OCCI 1.1 and/or CDMI 1.0 endpoints plus Marketplace/Usage Record clients; Venus-C contributes a CDMI 1.0 endpoint.]
ON = OpenNebula; OS = OpenStack; MP = Marketplace; UR = Usage Records.
With thanks to Matteo Turilli, EGI FCTF Chair.
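The information side of the demo is an ordinary LDAP query against the GLUE 2.0 BDII. A sketch with python-ldap; the host is a placeholder, while port 2170 and the "o=glue" base follow the usual BDII conventions:

```python
# Query a GLUE 2.0 BDII for the services it publishes (placeholder host).
import ldap  # python-ldap

conn = ldap.initialize("ldap://bdii.example.org:2170")
results = conn.search_s(
    "o=glue",                      # GLUE 2.0 LDAP base
    ldap.SCOPE_SUBTREE,
    "(objectClass=GLUE2Service)",  # all published services
    ["GLUE2ServiceID", "GLUE2ServiceType"],
)
for dn, attrs in results:
    print(dn, attrs.get("GLUE2ServiceType"))
```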
Use Cases
• Structural biology – We-NMR project: Gromacs training environments.
• Musicology – Peachnote project: music score search engine and analysis platform.
• Linguistics – CLARIN project: scalable 'British National Corpus' service (BNCWeb).
• Ecology – BioVeL project: remote hosting of the OpenModeller service.
• Software development – SCI-BUS project: simulated environments for portal testing.
• Space science – ASTRA-GAIA project: data integration with scalable workflows.
With thanks to Matteo Turilli, EGI FCTF Chair.
EGI FCTF Conclusions
Output:
• adoption of standards for VM and data management;
• interoperability across multiple cloud management platforms;
• a federation model compatible and consistent with the current EGI infrastructure;
• contribution to EGI user community engagement and support;
• documentation made available to the community.
Cycle #3, Sep 2012 – Mar 2013: Integration
• focus on development tools for the management interfaces and clients for the test bed;
• integration of the test-bed services into the EGI infrastructure;
• cloud brokering evaluation and deployment;
• focus on use-case coordination and implementation;
• opening of the test bed to early adopters.
With thanks to Matteo Turilli, EGI FCTF Chair.
Usage So Far
• Compute capacity: >900 VM slots.
• Data: ~16 TB.
• Marketplace: 11 VM templates stored and available.
• VM instantiation/usage: >3,200 VMs, accounted for in the EGI central accounting facility.
With thanks to Matteo Turilli, EGI FCTF Chair.
Federation Conclusions
• Utilising virtual infrastructure is the only scalable method of supporting a large number of disparate user communities across multiple application design models.
• Federation is a robust and scalable model for a national/European cloud infrastructure for research.
• Federation is only possible thanks to the availability of open standards.
• Successful pilot tests of multiple cloud infrastructure prototypes allowed quicker development of the final model for EGI.
• Research and development played a crucial role in customising open-source cloud infrastructure solutions to the specific needs of academic research.
• Cloud is part of an ecosystem of e-infrastructures, not an e-infrastructure on its own.