
EPOS e-Infrastructure



  1. EPOS e-Infrastructure Keith G Jeffery Natural Environment Research Council keith.jeffery@stfc.ac.uk

  2. Structure of Presentation • Who? • EPOS Rationale and approach • e-Infrastructure Basics • Related Projects (Torild van Eck) • Proposed Approach • ICT Board • Conclusion

  3. Who? • STFC Director IT & International Strategy • Note STFC runs large research facilities • Author of original paper on e-Science 1999 • Led to a £500m programme in UK • Led to large EC-funded programme • Led to ERCIM-led CoreGRID NoE • Chair EC Expert Group on GRIDs (2002-2007) • Co-Convenor EC Expert Group on CLOUDs (2009-2010) • Executive Secretary of ERF (national facilities / international access) www.europeanresearchfacilities.eu • President ERCIM www.ercim.org • President euroCRIS www.eurocris.org • Chair Alliance for Permanent Access to the Records of Science www.alliancepermanentaccess.eu • Board Member EOS (Enabling Open Access) www.openscholarship.org

  4. BUT... (important for EPOS) • STFC Director IT & International Strategy • Note STFC runs large research facilities • Author of original paper on e-Science 1999 • Led to a £500m programme in UK • Led to large EC-funded programme • Led to ERCIM-led CoreGRID NoE • Chair EC Expert Group on GRIDs (2002-2007) • Co-Convenor EC Expert Group on CLOUDs (2009-2010) • Executive Secretary of ERF (national facilities / international access) • President ERCIM www.ercim.org • President euroCRIS www.eurocris.org • Chair Alliance for Permanent Access to the Records of Science www.alliancepermanentaccess.eu • Board Member EOS (Enabling Open Access) www.openscholarship.org • BSc (1968) and PhD (1971) are in Geology • (the PhD with a large IT content)

  5. STFC Rutherford Appleton Laboratory

  6. Summer vacation: Tectonics of Ticino

  7. Structure of Presentation • Who? • EPOS Rationale and approach • e-Infrastructure Basics • Related Projects (Torild van Eck) • Proposed Approach • ICT Board • Conclusion

  8. EPOS Rationale

  9. EPOS Concept Massimo Cocco

  10. Structure of Presentation • Who? • EPOS Rationale and approach • e-Infrastructure Basics • Related Projects (Torild van Eck) • Proposed Approach • ICT Board • Conclusion

  11. e-Infrastructure Basics • GRIDs • Clouds • Web 2.0 • SOA (Service-Oriented Architecture) • Research process • Fourth paradigm (Data Intensive Scientific Discovery) • Virtualisation • Autonomicity • Security, Privacy, Trust • Performance • Development • Maintenance

  12. Context • Internet: 1.5 billion fixed connections; estimated 4 billion mobile connections • Digital storage: estimated 280 billion gigabytes (280 exabytes – 280*10**18 bytes) • Expect all to grow by ~1 order of magnitude in 4 years (and accelerating) • Users: Asia 550 million (14% penetration); Europe 350 million (50% penetration); USA 250 million (70% penetration) • Last 20 years: CPU 10**16, storage 10**18, networks 10**4 • Challenges: scalability, trust & security & privacy, manageability, accessibility, usability, representativity

  13. The GRIDs Vision • The end-user interacts with the GRIDs environment to clarify the request • using a ‘device’ or ‘appliance’ • The GRIDs environment proposes a ‘deal’ to satisfy the request • which may or may not involve money • The user accepts or rejects the ‘deal’

  14. The GRIDs Vision • The GRIDs environment is such that • A user can interact with it intelligently • It provides transparent access to • data, information, knowledge • computation • instrumentation / detectors • http://epubs.cclrc.ac.uk/work-details?w=28736

  15. The GRIDs Architecture: Layering • Knowledge Layer • Information Layer • Computation / Data Layer • Cross-layer flows: data to knowledge; control

  16. The GRIDs Architecture: Components (a possible architecture) • The GRIDs Environment connects: U: User (with Um: User Metadata and Ua: User Agent) • S: Source (with Sm: Source Metadata and Sa: Source Agent) • R: Resource (with Rm: Resource Metadata and Ra: Resource Agent) • Agents communicate via brokers
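To make the agent and broker roles concrete, here is a minimal sketch of how a broker might match a user agent's request against resource metadata; the classes, fields, and resource names are illustrative assumptions, not the actual GRIDs interfaces.

```python
# Hypothetical sketch: a broker matching user requests to resources via metadata.
from dataclasses import dataclass, field

@dataclass
class ResourceMetadata:          # Rm: describes what a resource offers
    name: str
    capabilities: set[str] = field(default_factory=set)

@dataclass
class UserRequest:               # built by the user agent (Ua) from user metadata (Um)
    required_capabilities: set[str]

class Broker:
    """Mediates between user agents and resource agents."""
    def __init__(self, resources: list[ResourceMetadata]):
        self.resources = resources

    def propose_deal(self, request: UserRequest) -> list[str]:
        # Return the names of resources whose metadata satisfies the request.
        return [r.name for r in self.resources
                if request.required_capabilities <= r.capabilities]

broker = Broker([ResourceMetadata("hpc-cluster", {"simulation", "mpi"}),
                 ResourceMetadata("data-archive", {"storage", "retrieval"})])
print(broker.propose_deal(UserRequest({"simulation"})))  # ['hpc-cluster']
```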

  17. Cloud Computing: The Technology • A very large number of processors • Clustered in racks as blades • In one major computer centre • May be replicated for business continuity • With massive online storage • RAID for resilience • And excellent communications links • For access

  18. Cloud Computing: The Intention • Low cost of entry for customers • Device and location independence • Capacity at reasonable cost (performance, space) • Cloud Operator manages resource sharing balancing different peak loads • Scalable as demand rises from user • Security due to data centralisation and software centralisation • Sustainable and environmentally friendly – concentrated power •  it is a service and the user does not know or care from where, by whom, and how it is provided •  as long as the SLA (service level agreement) is satisfied

  19. Cloud Computing: What is it? • Is cluster computing • with the advantages that brings: performance, scheduling • With GRIDs features • Scheduling / resource allocation • self-* • virtualisation • ASP (Application Service Provider) service • Can be used: • for infrequently required supercomputing • for business continuity / disaster recovery • for total ICT outsourced solution

  20. Cloud Problems • Proprietary offerings (Amazon, Google, Microsoft...) so lock-in • Interoperation attempt failed • Inefficient to move data to the cloud • Despite SLA/QoS guarantees some concerns • performance • security/trust/privacy •  So maybe GRID of proprietary CLOUDs?

  21. Web 2.0 • Features: • creativity, communications, secure information sharing, collaboration and functionality • Examples: • Social networking, video-sharing, wikis, blogs, folksonomies • Crowdsourcing to gather information / knowledge / wisdom? If you don’t know what Web 2.0 is, your kids do!

  22. Web 2.0: Based on Web Services • XML datastreams – hierarchic structure linearised: inefficient, inexpressive • Mobile code (Java) – is this the best language: procedural and low-level • Plug-ins on browsers (commonly) – security implications and increasing memory requirements • An easy-to-use software development environment – usually rather informal

  23. Web 2.0: Criticism • Uses Web 1.0 technology – nothing really new • i.e. http, URI, html/xml • Ideas not new • see Lotus Notes / Domino, videoconferencing etc (CSCW) before Web 2.0 • Tim Berners-Lee dismisses it as hype

  24. Service-Oriented Architecture Services: Challenges 1 • Service elements: input parameter definitions; output parameter definitions; functional program code (to deliver the service); service description (descriptive metadata); restrictions on use of service (restrictive metadata) • Challenges: location, requirements matching, composing, utilising → metadata
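As an illustration of the service elements listed above, a service description carrying descriptive and restrictive metadata might be modelled roughly as follows; the field names and example service are assumptions for this sketch, not a published SOA standard.

```python
# Illustrative service description; field names are assumptions, not a standard.
from dataclasses import dataclass

@dataclass
class ServiceDescription:
    name: str
    input_parameters: dict[str, str]      # parameter name -> type
    output_parameters: dict[str, str]
    descriptive_metadata: dict[str, str]  # what the service does, keywords, owner...
    restrictive_metadata: dict[str, str]  # conditions of use, licence, QoS limits...
    endpoint: str = ""                    # where the functional code is invoked

catalogue: list[ServiceDescription] = []

def locate(required_output: str) -> list[ServiceDescription]:
    """Challenge 'location': find candidate services from their metadata."""
    return [s for s in catalogue if required_output in s.output_parameters]

catalogue.append(ServiceDescription(
    name="waveform-picker",
    input_parameters={"waveform": "timeseries"},
    output_parameters={"arrival_times": "list[float]"},
    descriptive_metadata={"keywords": "seismology, picking"},
    restrictive_metadata={"licence": "research use only"},
))
print([s.name for s in locate("arrival_times")])  # ['waveform-picker']
```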

  25. Service-Oriented Architecture Services: Challenges 2 • Multiple instances • Parallel execution • Composition • End-to-end FR (functional requirements) satisfaction • End-to-end NFR (non-functional requirements) satisfaction • Avoiding emergent properties • Conditions of use of services

  26. Bringing it Together: e-, i-, k-Infrastructure • k-infrastructure: deduction & induction – human or machine • i-infrastructure: information systems (servers) • e-infrastructure: servers and physical detectors

  27. Middleware – and as SOKUs (Service-Oriented Knowledge Utilities) • k-infrastructure • k-upper middleware (resolves semantic heterogeneity) • k-lower middleware (presents declared semantics) • i-infrastructure • upper middleware (hides syntactic heterogeneity) • lower middleware (hides physical heterogeneity) • e-infrastructure

  28. Research Process: 4th Paradigm – Data-Intensive Science (concept from Jim Gray, 1944-2007) • Observational science: observations, contextual metadata, pre-processing, digital preservation, availability, analysis, visualisation • Experimental science: hypothesis, experimentation, then the same observations-to-visualisation pipeline • Modelling science: hypothesis, characterisation, simulation/modelling, then the same observations-to-visualisation pipeline
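A minimal sketch of that shared observations-to-visualisation pipeline as composable steps; every stage body here is a placeholder standing in for real processing, not EPOS code.

```python
# Sketch of the common data pipeline; each stage is a placeholder.
def observe():        return {"samples": [1, 2, 3]}            # observations
def add_metadata(d):  return {**d, "instrument": "unknown"}    # contextual metadata
def preprocess(d):    return {**d, "calibrated": True}         # pre-processing
def preserve(d):      return d                                 # digital preservation (e.g. archive)
def publish(d):       return d                                 # availability
def analyse(d):       return {**d, "mean": sum(d["samples"]) / len(d["samples"])}
def visualise(d):     print(d)                                 # visualisation

data = observe()
for stage in (add_metadata, preprocess, preserve, publish, analyse):
    data = stage(data)
visualise(data)
```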

  29. Structure of Presentation • Who? • EPOS Rationale and approach • e-Infrastructure Basics • Related Projects (Torild van Eck) • Proposed Approach • ICT Board • Conclusion

  30. Related Projects EPOS e-infrastructure has to fit in with • ESFRI Roadmap projects in Environmental Cluster (Wouter Los) • ESFRI roadmap projects in other clusters • Physical sciences (STM) • Economic/social science • Arts and humanities • PRACE (supercomputing) • National e-infrastructures for e-Research • Especially geoscience • Other international projects (North America, Pacific Rim, South America...) – From Torild van Eck

  31. EPOS IT-relevant EC projects + proposals (summary) • EC projects starting 2010: GEM (hazard); SHARE (hazard) – ETHZ (D. Giardini); NERA (seismology & earthquake engineering) – ETHZ + ORFEUS/KNMI (D. Giardini, T. van Eck); EPOS PP (EPOS on the ESFRI roadmap) – INGV (Massimo Cocco); QUEST (training network, computational seismology) – LMU (H. Igel) • Project proposals 2010, INFRASTR. 2011-1 Call 8/9: VERCE (INFRA-2011-1.2.1, deadline Nov 23) – IPGP (J-P Vilotte), ORFEUS/KNMI, EMSC, INGV, LMU, Univ Liverpool, BAW, CINECA, Fraunhofer, UoE (IT); EUDAT (INFRA-2011-1.2.2, deadline Nov 23) – CSC Finland (Kimmo Koski), EPOS (GFZ, INGV), LifeWatch, …, CINECA (IT), UoE (IT), …; ENVRI (INFRA-2011-2.3.3, deadline Nov 25) – LifeWatch (Wouter Los), EPOS (ORFEUS/KNMI), with LifeWatch, EPOS, EMSO, EISCAT, ICOS, …, STFC (IT), UoE (IT), …

  32. Three Layers (slide by Peter Wittenburg and Wouter Los) • Data generators/providers & users (humans & instruments) – roles: sensors, curators, researchers, observers, aggregators, public • Community support services – functionalities: virtual environments & collaborative organisations, security & protection, data discovery & navigation, (meta)data tagging tools, data submission tools, operational semantic interoperability, workflow generator, data correlation, knowledge management, virtualisation • Data services – persistent storage capacity, 24/7 operation, preservation & sustainability, authenticity, certification & integrity, GUIDs, generic interoperability (technical, legal, semantic)

  33. Structure of Presentation • Who? • EPOS Rationale and approach • e-Infrastructure Basics • Related Projects (Torild van Eck) • Proposed Approach • ICT Board • Conclusion

  34. e-Infrastructure Requirement • Data collection, calibration, validation • Data cataloguing and indexing • Data preservation and curation • Information processing – retrieval, analysis, visualisation • Hypothesis processing – simulation, modelling, analysis, visualisation • Hypothesis generation – data mining • Knowledge processing – integration of ICT with human processing – theory processing, user interface, scholarly communication (open access) • External interoperation – physical and medical sciences, economic and social sciences, arts and humanities • Dissemination – outreach (website plus) • Education and training • Management and Coordination

  35. Key e-Infrastructure Principles • Mobile code: ability to move code to data because data large and costly to transport • Virtualisation: user neither knows nor cares where computing done or where data located as long as QoS/SLA met • Autonomicity: (self-*) because human management of ICT too expensive / slow
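A toy sketch of the "mobile code" principle: the computation is sent to the host holding the data, and only the small result travels back. The remote execution is simulated locally here; the dataset name and functions are invented for illustration.

```python
# Toy illustration of "mobile code": move the computation to where the data is,
# instead of transferring a large dataset to the user. A real deployment would
# serialise the function and ship it over the network to the data host.
def load_local(dataset_id: str) -> list[float]:
    # Stand-in for a large dataset that lives on the data host.
    return [0.1 * i for i in range(1_000_000)]

def run_at_data_host(dataset_id: str, code) -> object:
    large_dataset = load_local(dataset_id)   # data never leaves its host
    return code(large_dataset)               # only the (small) result travels back

result = run_at_data_host("seismic-waveforms-2010", lambda d: sum(d) / len(d))
print(result)  # a single number returned instead of a million samples
```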

  36. Key e-Infrastructure Challenges • Interoperation • Access to heterogeneous distributed data sources • Schema integration – syntactic and semantic • Security/privacy/trust • Identification – authentication – authorisation – accounting • Performance • Towards exascale processing (simulation/modelling) • Towards exabyte data streams (1.0*10**18)

  37. Steps to achieve EPOS e-Infrastructure1 • Define / Agree requirements of end-user (document dynamically) • Including expected future requirements • Survey available data/information sources (document dynamically) • Detector systems • Repositories / databases / file systems • Data, documents, metadata, contextual data • Conditions of use – QoS, SLA (link to governance) • Define schema mappings, convertors for interoperation (document dynamically) • Canonical interoperation standard? • Note CERIF (Common European Research Information Format)
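For the schema-mapping step above, a hedged sketch of mapping source-specific records into a single canonical form; the field names and source identifiers are invented for illustration and are not the CERIF model itself.

```python
# Illustrative mapping from source-specific records to one canonical record shape.
# Field names are examples only; they do not reproduce the CERIF schema.
def map_to_canonical(source: str, record: dict) -> dict:
    mappings = {
        "observatory_a": {"station": "site_id", "lat": "latitude", "lon": "longitude"},
        "archive_b":     {"code": "site_id", "y": "latitude", "x": "longitude"},
    }
    field_map = mappings[source]  # local field -> canonical field
    return {canonical: record[local] for local, canonical in field_map.items()}

print(map_to_canonical("observatory_a", {"station": "AQU", "lat": 42.35, "lon": 13.40}))
# -> {'site_id': 'AQU', 'latitude': 42.35, 'longitude': 13.4}
```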

  38. Steps to achieve EPOS e-Infrastructure2 • Survey available computing and computation resources (document dynamically) • Detector systems • Data servers • HPC • Conditions of use – QoS, SLA (link to governance) • Define access and utilisation of ICT (document dynamically) • User identification, authentication, authorisation, accounting (security, privacy) • Available services • Conditions of use – QoS, SLA (link to governance) • Design first-cut ICT architecture (document dynamically) • GEANT network • GRIDs (EGI) middleware • Web services software • Web portal(s) user interface

  39. Structure of Presentation • Who? • EPOS Rationale and approach • e-Infrastructure Basics • Related Projects (Torild van Eck) • Proposed Approach • ICT Board • Conclusion

  40. ICT Board The ICT Board will provide expert support, evaluate the e-infrastructure implementation plan, and oversee the development of an architectural model for the EPOS infrastructure. In particular, the ICT Board will evaluate the data flow organisation, the workflow management, and the user interface definition and services. The board will consist of experts nominated through ERCIM (www.ercim.org) to ensure quality and integration with other e-infrastructure initiatives.

  41. Structure of Presentation • Who? • EPOS Rationale and approach • e-Infrastructure Basics • Related Projects (Torild van Eck) • Proposed Approach • ICT Board • Conclusion

  42. Conclusion (take-home messages) • EPOS is a HUGE CHALLENGE • EPOS requires LEADING EDGE ICT to support LEADING EDGE GEOSCIENCE • EPOS e-Infrastructure is the ‘GLUE’ • EPOS is going to be FUN! ********* Prof Keith G Jeffery CEng, CITP, FGS, FBCS, HFICS keith.jeffery@stfc.ac.uk
