
Day 2 Tuesday 13th Sept 2011 Time Title Speaker

Day 2 Tuesday 13th Sept 2011 Time Title Speaker 9:00am – 10:30am Choosing the best e-infrastructure for your application David Wallom, NGS Technical Director, University of Oxford 10:30am – 11am Coffee break


Presentation Transcript


  1. Day 2 Tuesday 13th Sept 2011
     Time | Title | Speaker
     9:00am – 10:30am | Choosing the best e-infrastructure for your application | David Wallom, NGS Technical Director, University of Oxford
     10:30am – 11am | Coffee break |
     11am – 12:30pm | Grid & Running jobs on the NGS | John Kewley, NGS Support Centre Manager & Jon Churchill, NGS
     12:30pm – 1:30pm | Lunch |
     1:30pm – 2:30pm | HECToR national resource | Adrian Jackson, Principal Consultant, EPCC, University of Edinburgh
     2:30pm – 3:30pm | Computational science & engineering support | Ian Reid, Chief Commercial Officer, NAG
     3:30pm – 4pm | Coffee break |
     4pm – 5:30pm | Clouds Practical | Steve Thorn, Research Systems Consultant, University of Edinburgh; Richard Tarrant, Information Systems, University of Reading

  2. Software development patterns, SOA, and using different e-infrastructures David Wallom, University of Oxford

  3. Agenda • Standards • Service Oriented Architecture • HPC • Grids • Clouds

  4. Why Standards? • Essential aspect of modern industrialized society • Permeate every aspect of life: • SAE lubrication standards • standard size screws and sockets • voltage standards • standard battery sizes • Allow consumers choice • For vendors, standards commoditize the product • Facilitate the formation of markets • Allow vendors to compete effectively against others on aspects such as price, quality, or delivery time.

  5. What is Service Oriented Architecture (SOA)? • An SOA application is a composition of services • A “service” is the atomic unit of an SOA • Services encapsulate a business process • Service Providers register themselves • Service use involves: Find, Bind, Execute • Most well-known instance is Web Services (Diagram: Service Registry, Service Consumer and Service Provider, linked by arrows labelled Find, Register, and Bind, Execute)
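
The Register, Find, Bind, Execute cycle on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: `ServiceRegistry`, the service name `temperature.convert` and the conversion function are hypothetical, and the registry is an in-process dictionary rather than a networked registry such as UDDI.

```python
from typing import Callable, Dict


class ServiceRegistry:
    """Hypothetical in-process stand-in for a networked service registry."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, endpoint: Callable[..., object]) -> None:
        """Provider side: publish a service under a name."""
        self._services[name] = endpoint

    def find(self, name: str) -> Callable[..., object]:
        """Consumer side: look a service up by name."""
        try:
            return self._services[name]
        except KeyError:
            raise LookupError(f"no service registered as {name!r}") from None


def celsius_to_fahrenheit(celsius: float) -> float:
    """Illustrative service: one small, self-contained operation."""
    return celsius * 9.0 / 5.0 + 32.0


registry = ServiceRegistry()
registry.register("temperature.convert", celsius_to_fahrenheit)  # Register

convert = registry.find("temperature.convert")  # Find, then bind to a local name
result = convert(100.0)                         # Execute
print(result)  # 212.0
```

The consumer only knows the service name, not where or how the provider implements it, which is the loose coupling and late binding listed among the technical benefits.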

  6. Guiding Principles for SOA • reuse, granularity, modularity, composability, componentization and interoperability. • standards-compliance (both common and industry-specific). • services identification and categorization, provisioning and delivery, and monitoring and tracking.

  7. SOA Benefits Business Benefits • Focus on Business Domain solutions • Leverage Existing Infrastructure • Agility • Increased usability Technical Benefits • Loose Coupling • Autonomous Service • Location Transparency • Late Binding

  8. Web Services SOA Standards • WSDL • UDDI • BPEL • WS-Profile • WS-Security • WS-Choreography And many others…

  9. Service Composition

  11. Mashup • Collection of different sets of data from different sources, presented through a common interface • Currently uses either public data or specific datasets owned by the user • In the future this could extend, through specific policy and authentication, to other widespread users' secure data
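
As a rough sketch of what a mashup does, the snippet below joins two sources on a shared key and emits records ready for a common interface such as a map. Everything here is an assumption made for illustration: the function `mash_up`, the field names, and the station coordinates and statuses, which are invented placeholder values rather than real feed data.

```python
# Two hypothetical sources keyed by station name (all values are placeholders).
locations = {
    "Oxford Circus": (51.515, -0.142),
    "Bank": (51.513, -0.089),
}
line_status = [
    {"station": "Oxford Circus", "status": "Good service"},
    {"station": "Bank", "status": "Minor delays"},
    {"station": "Unknown Halt", "status": "Closed"},
]


def mash_up(locations, line_status):
    """Join the status feed to the location table on the station name."""
    combined = []
    for record in line_status:
        position = locations.get(record["station"])
        if position is None:
            continue  # no location known, so it cannot be placed on a map
        latitude, longitude = position
        combined.append({**record, "lat": latitude, "lon": longitude})
    return combined


markers = mash_up(locations, line_status)
for marker in markers:
    print(marker["station"], marker["status"], marker["lat"], marker["lon"])
```

Records without a match in the second source are dropped here; a real mashup would have to decide whether to drop, flag or geocode them.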

  12. Advanced utilisation of Google Earth • “Mapping for the masses”, according to Nature • Desktop application (Windows and Mac) for displaying geographical data • Satellite images • Earthquake locations • Live data! • All on a 3-D spinning globe • Can view data at all scales • Very easy to incorporate new data • As easy as writing a simple Web page Thanks to Jon Blower, ReSC
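
The slide does not name a file format, but Google Earth reads KML, an XML vocabulary, which is what makes adding data comparable to writing a simple Web page. The sketch below builds a single KML placemark with Python's standard library; the helper name `placemark_kml`, the label and the coordinates are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name: str, lon: float, lat: float) -> str:
    """Return a minimal KML document containing one point placemark."""
    ET.register_namespace("", KML_NS)  # serialise KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML orders coordinates as longitude,latitude[,altitude].
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical placemark; the position is an arbitrary example value.
document = placemark_kml("Example site", -1.26, 51.75)
print(document)
```

Saving the returned string to a `.kml` file should give something a KML-aware viewer can open.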

  13. Real-Time London Tube Information

  14. Choose your school

  15. Service Reuse

  16. e-Diamond • Sharing of digital mammography to allow increased utilisation of a limited pool of radiographers • Collaboration between academic computer scientists and real medics in hospitals • Results have shown the possibility for this type of scheme to work well in a real medical environment • Have also shown what a headache the current data regulations are!

  17. ENGAGE eSAD Project • Merge outputs from two different projects • Image processing • VRE for the study of ancient documents • Connect a new user community into using the NGS • Using a common problem-solving environment, MATLAB • Use open standard interfaces between components, GridSAM Thanks to Segolene Tarte, Classics/OeRC

  18. Lowering the barriers to Cancer Imaging • Develop an architecture and prototype to create dynamically configurable research environments • Develop algorithms for locating the mesorectal fascia, to improve the chances of successful surgery • Integrate with new user interfaces to allow utilisation by medical communities directly Thanks to Susana Garcia

  19. myExperiment.org is… • A market place. • A community social network. • A gateway to other publishing environments. • A federated repository • A platform for launching workflows. • Publishing self-describing Encapsulated myExperiment Objects. • Mindful publication. • Started March 2007. • Closed beta since July 2007 • Open beta November 2007

  20. Blogging the Lab

  21. NeuroHub • JISC project funded to develop an information environment for neuroscientists. • A platform which will allow neuroscientists to efficiently and effectively use ICT infrastructure and by doing so will enable a more productive research cycle. • Aims to streamline the laboratory experience from conception of experiment to publication of research results. • Supporting the Research Lifecycle: NeuroHub aims to support the entire research lifecycle

  22. NeuroHub

  23. Which Resources for which Applications?

  24. Aspects of Research Computing

  25. Aspects of Research Computing

  26. Scale of HPC Compute Resources • Departmental: single purchaser with little interaction with other possible user communities; generally a smaller facility • Institutional: shared between all relevant users within a single institution; a larger facility • Community: dedicated resource for a community that spans multiple institutions, e.g. the National Service for Computational Chemistry Software • National: scaled beyond a single institution's requirements; capability and capacity computing facility, e.g. HECToR • International: Tier 0, international-scale resource for the largest problems

  27. Parallel Computing: Definitions • HPC necessarily relies on parallel computing • Parallel computing… • Involves the use of multiple processors simultaneously to reduce the time needed to solve a single computational problem. • Examples of fields where this is important: • Climate modeling, weather forecasting • Aircraft and ship design • Cosmology, simulations of the evolution of stars and galaxies • Molecular dynamics and electronic (quantum) structure • Parallel programming… • Is writing code in a language (plus extensions) that allows you to explicitly indicate how different portions of the computation may be executed concurrently • Therefore, the first step in parallel programming is to identify the parallelism in the way your problem is being solved (algorithm)

  28. Data Parallelism Definition: when independent tasks can apply the same operation to different elements of the data set at the same time. Examples: • 2 brothers mow the lawn • 8 farmers paint a barn
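
A minimal sketch of data parallelism, borrowing the "2 brothers mow the lawn" example: two workers apply the same function to different elements of one data set. The function `mow` and the six-strip lawn are illustrative assumptions. Threads are used to keep the example self-contained; in CPython a CPU-bound workload would normally swap in `ProcessPoolExecutor`, which offers the same `map` interface.

```python
from concurrent.futures import ThreadPoolExecutor


def mow(strip: int) -> str:
    """The single operation every worker applies to its own element."""
    return f"strip {strip} mowed"


lawn = list(range(6))  # the data set: six strips of lawn (illustrative)

# Two workers ("2 brothers") apply the same function to different elements.
with ThreadPoolExecutor(max_workers=2) as pool:
    mowed = list(pool.map(mow, lawn))  # results come back in input order

print(mowed)
```

Because no strip depends on another, the work can be divided among any number of workers without changing `mow` itself.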

  29. Functional Parallelism Definition: when independent tasks can apply different operations to the same (or different) data elements at the same time. Examples: • 2 brothers do yard work (1 rakes, 1 mows) • 8 farmers build a barn
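
Functional parallelism can be sketched the same way, this time with different operations running at once on the same data, as in the "1 rakes, 1 mows" example. The functions `rake` and `mow` and the `yard` list are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def rake(yard: list) -> str:
    """First operation on the shared data."""
    return f"raked {len(yard)} patches"


def mow(yard: list) -> str:
    """Second, different operation on the same data."""
    return f"mowed {len(yard)} patches"


yard = ["front", "back", "side"]

# Different operations are submitted together and may run at the same time.
with ThreadPoolExecutor(max_workers=2) as pool:
    raking = pool.submit(rake, yard)
    mowing = pool.submit(mow, yard)
    outcomes = [raking.result(), mowing.result()]

print(outcomes)  # ['raked 3 patches', 'mowed 3 patches']
```

Here the available parallelism is bounded by the number of distinct operations, whereas in the data-parallel case it is bounded by the number of data elements.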

  30. Standards in OGF HPC Basic Profile (diagram) • HPC Specific • Job Management: HPC Basic Profile (OGF GFD 114) • Education: ISV Primer (GFD 141) • Job Definition • Security: WS-Security (OASIS) • Security: SSL/TLS (IETF) • Job Management: OGSA-BES (OGF GFD 108) • Job Description: JSDL (OGF GFD 136) • Application Description: HPC Application (OGF GFD 111) • File Transfer: HPC File Staging (OGF GFD 135) • Compute Resources • Relationship labels in the diagram: References, Combines, Uses, Provides Access To, Extend

  31. HPC-BP Implementations • Platform Computing LSF • Altair PBS • Microsoft HPC Server 2008 • UNICORE • NorduGrid ARC • (EGEE Cream-BES) • UVa GENESIS-II • OMII-UK GridSAM • eBay Research (Hadoop) • All OGF standards implementations: • http://www.ogf.org/gf/page.php?page=Standards::Implementations

  32. Questions • What type of problem am I trying to solve? • What scale is my problem? • Do I wish to share my application with collaborators? • Am I using disparate data sources or a single local set of data?

  33. Aspects of Research Computing

  34. Task Parallelism Definition: when independent “Worker” tasks can perform functions that do not need to communicate with each other, only with a “Master” or “Manager” process. Such tasks are often called “Embarrassingly Parallel” because they can be parallelized with little extra work. Examples: • Independent Monte Carlo Simulations • ATM Transactions
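
The master/worker pattern can be sketched with the slide's Monte Carlo example. Each worker runs an independent simulation with its own random generator and never talks to the others; the master only hands out seeds and combines the returned counts. The pi-estimation problem, the seeds and the sample count are assumptions chosen for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

SAMPLES = 20_000          # points per worker (illustrative)
seeds = [1, 2, 3, 4]      # one independent simulation per seed


def simulate(seed: int, samples: int = SAMPLES) -> int:
    """Worker: count random points that land inside the unit quarter circle."""
    rng = random.Random(seed)  # private generator, no shared state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


# Master: hand out independent simulations, then combine the results.
with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
    hits_per_worker = list(pool.map(simulate, seeds))

pi_estimate = 4.0 * sum(hits_per_worker) / (SAMPLES * len(seeds))
print(round(pi_estimate, 2))
```

Since the workers share nothing, the same code could be spread over separate processes or grid jobs with only the final summation done centrally.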

  35. What is Grid Computing? • “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” (“The Grid: Blueprint for a New Computing Infrastructure”, Kesselman & Foster) • Criteria for a Grid*: • Coordinates resources that are not subject to centralized control. • Uses standard, open, general-purpose protocols and interfaces. • Delivers nontrivial qualities of service. *Source: “What is the Grid? A Three Point Checklist”, Ian Foster, Argonne National Laboratory & University of Chicago

  36. Benefits of Grid Computing • Exploit underutilized resources • CPU Scavenging, Hotspot leveling • Resource balancing • ‘Virtualize’ resources across an enterprise • Data Grids, Compute Grids • Enable collaboration for virtual organizations

  37. Operational Scenario (diagram: engine flight data from London Airport and New York Airport; Airline office; Diagnostics Centre; Maintenance Centre; GRID; US data centre; European data centre) Thanks to Tom Jackson, University of York

  38. Applications • The BROADEN project supports Rolls-Royce Grid development activities in three key areas: • Computational Fluid Dynamics (CFD) computation; CFD is at the heart of engine performance development • Remote Condition Health Monitoring • Agent-based services for aftermarket sales support and optimisation Watch the BBC ‘How to build – a Jumbo Jet Engine’ to see the system in action Thanks to Tom Jackson, University of York

  39. Integrative Biology • Bringing together an international consortium of leading biomedical and computing researchers to address two of the most important problems in clinical medicine today • understanding what causes heart failure • how cancer tumors develop and grow. • Full computational simulation to lead from the cellular to the system scale • Each piece split into individual ‘experiments’

  40. Integrative Biology Virtual Research Environment • Creation of a portal to allow each group to continue using their current tools but through a common interface • Each experiment • ad-hoc collaboration = virtual organisation

  41. European Grid Infrastructure Status Jan 2011 (yearly increase) • 13800 users: +38% • 288000 LCPUs (cores): +18.5% • 117PB disk: +192.5% • 91.5PB tape: +50 % • 28 million jobs/month: +86.7% • 340 sites: +7.25% • 56 countries: +7.7% • 217 VOs: +24% • 30 active VOs: constant • Archeology • Astronomy • Astrophysics • Civil Protection • Comp. Chemistry • Earth Sciences • Finance • Fusion • Geophysics • High Energy Physics • Life Sciences • Multimedia • Material Sciences • … Cloudscape III - EGI Use Case
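
As a worked check on how to read these figures, the sketch below inverts a few of the yearly increases to get the implied values one year earlier. It assumes each percentage is growth relative to January 2010; the derived numbers are arithmetic consequences of that assumption, not figures reported on the slide.

```python
# Figures copied from the slide: (value in Jan 2011, yearly increase in percent).
status_jan_2011 = {
    "users": (13_800, 38.0),
    "sites": (340, 7.25),
    "countries": (56, 7.7),
    "VOs": (217, 24.0),
}


def previous_value(current: float, increase_percent: float) -> float:
    """Invert current = previous * (1 + increase_percent / 100)."""
    return current / (1.0 + increase_percent / 100.0)


# Implied values a year earlier, rounded to whole units.
implied_jan_2010 = {
    name: round(previous_value(current, percent))
    for name, (current, percent) in status_jan_2011.items()
}
print(implied_jan_2010)
# {'users': 10000, 'sites': 317, 'countries': 52, 'VOs': 175}
```

That the results land on or very near whole numbers is consistent with reading the percentages as year-on-year growth.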

  42. Questions • What type of problem am I trying to solve? • What scale is my problem? • Do I wish to share my application with collaborators? • Am I using disparate data sources or a single local set of data? • Do I have restrictions on where my problem can be located, inside or outside of my own institution?

  43. Aspects of Research Computing

  44. What is the Cloud? Typically three types are described • Infrastructure as a Service (IaaS) • provide (virtualized) resources on demand • Amazon AWS, Eucalyptus, Rightscale, GoGrid • Platform as a Service (PaaS) • build applications using a provided toolkit so that the application can be run in the Cloud • Google App Engine, Microsoft Azure • Software as a Service (SaaS) • offer an entire application “in the cloud” • Salesforce.com

  45. How Do Grids and Clouds Relate? • Grids came from “big science” and the desire to collaborate in a federated environment • Manage sharing of resources • New technologies developed to cope • Clouds are coming from industry and the desire to dynamically provision resources in the cloud • Simple APIs for using abstracted or virtualized resources • Already using existing technologies • Economies of scale in the data center • Aka, utility computing, internet computing, … • “Grids are an access model; Clouds are a business model” • Chris Smith, Platform Computing, OGF VP Standards • Distributed applications need and can use capabilities being developed under both grid and cloud • There is no real grid vs. cloud dichotomy

  46. Open Cloud Computing Interface • Management layer standardisation for IaaS • Based on simple abstraction of operations: create, retrieve, update and delete • Started Mar’09, standard going for recommendation next month
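
The "create, retrieve, update and delete" abstraction can be illustrated with a small in-memory manager. This is not the OCCI API or its wire format: `ComputeManager`, its method signatures and the attribute names `cores`, `memory_gb` and `state` are hypothetical, chosen only to show how little surface such a management layer needs.

```python
import uuid
from typing import Dict


class ComputeManager:
    """Hypothetical in-memory manager exposing only the four basic operations."""

    def __init__(self) -> None:
        self._resources: Dict[str, dict] = {}

    def create(self, **attributes) -> str:
        """Store a new resource and return its generated identifier."""
        resource_id = str(uuid.uuid4())
        self._resources[resource_id] = dict(attributes)
        return resource_id

    def retrieve(self, resource_id: str) -> dict:
        """Return a copy, so callers cannot mutate internal state."""
        return dict(self._resources[resource_id])

    def update(self, resource_id: str, **changes) -> None:
        """Overwrite or add attributes on an existing resource."""
        self._resources[resource_id].update(changes)

    def delete(self, resource_id: str) -> None:
        """Remove the resource; later lookups raise KeyError."""
        del self._resources[resource_id]


manager = ComputeManager()
vm_id = manager.create(cores=2, memory_gb=4, state="inactive")
manager.update(vm_id, state="active")
vm = manager.retrieve(vm_id)
manager.delete(vm_id)
print(vm)  # {'cores': 2, 'memory_gb': 4, 'state': 'active'}
```

Keeping the interface to these four verbs is what lets the same client code address different IaaS providers once each exposes them in a standard way.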

  47. Distributed Computing Patterns • “It’s déjà vu all over again” • The lessons learned from solving previous problems in Grids can be applied to Cloud computing (e.g. identity federation) • Use the well known concept of “design patterns” to help capture distributed computing best practices

  48. Patterns in Practice

  49. No Shortage of Challenges Remains • For each domain the following considerations must be made: • Data access and interoperability • Must be done at the application domain level, by the domain users • Security • Different models will expose different security threats • Reliability • Managing redundancy, live migration, etc., across the infrastructure • Frameworks • How to manage sets of resources, e.g., VMs and VOs? • Performance management • What job mix needs to be supported, e.g., e-commerce, HPC, transactional, database, data streaming? • Costing models • How to compare your own infrastructure costs with a cloud provider's?
