Grid Computing and the Gridbus Middleware: Making the Global Cyberinfrastructure for e-Science and e-Business Applications a Reality.
Grid Computing and the Gridbus Middleware: Making the Global Cyberinfrastructure for e-Science and e-Business Applications a Reality • Dr. Rajkumar Buyya • Grid Computing and Distributed Systems (GRIDS) Laboratory, Dept. of Computer Science and Software Engineering, The University of Melbourne, Australia • www.buyya.com • www.gridbus.org
GRIDS Lab @ Melbourne (R & D, Education, Community Services) • The youngest and one of the most rapidly growing research labs in our School/University: • Founded in 2002 • Houses: • Research Fellows (3) • Research Programmers (3) • PhD candidates (10) • Honours/Masters students (5+) • Funding: • National and international organizations • Australian Research Council • Many industries (Sun, StorageTek, Microsoft, IBM) • University-wide collaboration: • Faculties of Science, Engineering, and Medicine • Many national and international collaborations: • Academics • Industries • Software: • Widely used by academic and industrial users. • Publication: • My research team accounts for over 20% of our Dept’s research output. • Community Services: e.g., IEEE TC for Scalable Computing
Agenda • Introduction • Utility Networks and Grid Computing • Application Drivers and Various Types of Grid Services • Global Grids and Challenges • Security, resource management, pricing models, … • Service Oriented Grids and Grid Economy • SOGA, Grid Market Directory, Grid Bank, Broker • Grid Service Broker • Architecture, Design and Implementation • Performance Evaluation: Experiments in Creation and Deployment of Applications on Global Grids • A Case Study in High Energy Physics • Summary and Conclusion
4 Essential Utilities and Delivery Networks • (1) Water: water distribution network • (2) Electricity • (3) Gas • (4) Telephone: telecom networks
(5) Computing Grid: Delivering IT services as the 5th utility (Power Grid inspiration) • eScience • eBusiness • eGovernment • eHealth • Multilingual eEducation • …
Power Grid Inspiration: Seamlessly delivering electricity as a utility to users
Computing and Communication Technologies Evolution & Timeline (1960 to 2010) [Figure: timeline with a Control axis running from Centralised to Decentralised. Computing: Mainframes, Minicomputers, Crays, MPPs, Workstations, PCs, WS Clusters, PC Clusters, XEROX PARC worm, HTC, P2P, PDAs, Grids, Computing Utility, e-Science, e-Business. Communication: Sputnik, ARPANET, Email, Ethernet, TCP/IP, IETF, Internet Era, HTML, Mosaic, WWW Era, W3C, XML, Web Services, SocialNet.]
Computing is Scaling: Towards Inter-Planetary Level [Figure: services + performance plotted against administrative barriers (Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe). Platforms shown: Personal Device, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Grid, Inter Planet Grid. Axis tick label: 2100.]
What is a Grid? (There are several academic definitions; here is ours.) • A type of parallel and distributed system that enables the sharing, exchange, selection, & aggregation of geographically distributed “autonomous” resources, depending on their availability, capability, cost, and user QoS requirements: • Computers: PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.; • Software: e.g., ASPs renting expensive special-purpose applications on demand; • Catalogued data and databases: e.g., transparent access to the human genome database; • Special devices/instruments: e.g., radio telescope (SETI@Home searching for life in the galaxy); • People/collaborators.
What does a Grid look like? A Bird's Eye View of a Global Grid [Figure: Application, Grid Resource Brokers, Grid Information Services, a database, and resources R1 to RN.]
Classes of Grid Services / Types of Grids • Computational Services: CPU cycles • Pooling computing power: SETI@Home, TeraGrid, AusGrid, ChinaGrid, IndiaGrid, UK Grid, … • Data Services • Collaborative sharing of data generated by instruments, sensors, persons: LHC Grid, Napster • Application Services • Access to remote software/libraries and license management: NetSolve • Interaction Services • eLearning, Virtual Tables, Group Communication (Access Grid), Gaming • Knowledge Services • The way knowledge is acquired, processed and managed: data mining • Utility Computing Services • Towards market-based Grid computing: leasing and delivering Grid services as ICT utilities. [Figure: layered Grid types: Users, Utility Grid, Knowledge Grid, Interaction Grid, ASP Grid, Data Grid, Computational Grid, infrastructure.]
How Are Grids Used? • E-Science • E-Business • Utility computing • High-performance computing • Collaborative design • Financial modeling • High-energy physics • Drug discovery • Life sciences • Data center automation • Natural language processing & data mining • Collaborative data-sharing
1. [Grid Use in Science] Online Medical Instrumentation and Neuroscience • Virtual Laboratory for medicine and brain science • Knowledge sharing • MEG sharing? • Data sharing [Figure: Data Generation (Osaka Univ. Hospital; provision of MEG), DV transfer (Cybermedia Center, Osaka Univ.), Data Analysis (Life-electronics laboratory, AIST; provision of expertise in the analysis of brain function), Analysis Results.]
2. [Grid Use in Business] Enterprise Computing • Traditional Model: separate email, web, database, and apps servers; upgrade to a new server to handle more users • Grid-based Model: application service virtualization layer & load balancing; utilise IT infrastructure effectively
Some Characteristics of Grids • Numerous resources • Owned by multiple organizations & individuals • Connected by heterogeneous, multi-level networks • Different security requirements & policies • Different resource management policies • Geographically distributed • Unreliable resources and environments • Resources are heterogeneous (Slide by Hiro)
Grid Challenges • Computational Economy • Security • Data locality • Resource Allocation & Scheduling • Uniform Access • System Management • Resource Discovery • Application Construction • Network Management
Some Grid Initiatives Worldwide • Australia: Nimrod-G, Gridbus, DISCWorld, GrangeNet, APACGrid, ARC eResearch • Brazil: OurGrid, EasyGrid, LNCC-Grid + many others • China: ChinaGrid (education), CNGrid (application) • Europe: UK eScience, EU Grids, and many more • India: Garuda • Japan: NAREGI • Korea: N*Grid • Singapore: NGP • USA: Globus, GridSec, AccessGrid, TeraGrid, Cyberinfrastructure, and many more • Industry Initiatives: IBM On Demand Computing, HP Adaptive Computing, Sun N1, Microsoft .NET, Oracle 10g, Infosys Enterprise Grid, Satyam Business Grid, StorageTek Grid, and many more • Public Forums: Global Grid Forum, Australian Grid Forum • Conferences: CCGrid, Grid, HPDC, E-Science [Funding labels on the slide map: 27 million; 1.3 billion over 3 yrs; 2? billion; 120 million over 5 yrs; 450 million over 5 yrs; 486 million over 5 yrs; 1.3 billion (Rs); 1 billion over 5 yrs.] http://www.gridcomputing.com
The Gridbus Project @ Melbourne: Enable Leasing of ICT Services on Demand • Gridbus pushes Grid computing into mainstream computing [Figure label: WWG]
Why Grid Economy for Gridbus? (1) Sustained Resource Sharing and (2) Effective Management of Shared Resources
A Reference Service-Oriented Architecture for Utility Grids [Figure: three groups of components. Grid Consumer: Applications, Secure Programming Environments, Grid Resource Broker (Grid Explorer, Schedule Advisor, Trade Manager, Job Control Agent, Deployment Agent). Grid Middleware Services: Sign-on, Information Service, Health Monitor, Data Catalogue, Grid Market Services, Grid Bank. Grid Service Providers: Grid Node 1 … Grid Node N, with Trade Server, Pricing Algorithms, Trading, Accounting, QoS, Resource Reservation, Resource Allocation, JobExec, Storage, misc. services, and resources R1, R2 … Rm.]
Gridbus and Complementary Technologies: realizing Utility Grid [Figure: layered stack, top to bottom. Grid Applications: Science, Commerce, Engineering, Collaboratories, Portals, … User-Level Middleware (Grid Tools): ExcellGrid, Gridscape, Workflow, X-Parameter Sweep Lang., MPI, … Grid Brokers: Workflow Engine, Gridbus Data Broker, Nimrod-G. Core Grid Middleware: Globus, Unicore, Alchemi, NorduGrid, XGrid, … Grid Economy: Grid Market Directory, Grid Exchange & Federation, Grid Storage Economy, GridBank. Grid Fabric Software: .NET, JVM, Condor, PBS, SGE, Libra, Tomcat; Mac, Windows, Linux, AIX, IRIX, OSF1, Solaris. Grid Fabric Hardware: Worldwide Grid. Other labels: GRIDSIM, CDB, PDB.]
On Demand Assembly of Services: Putting Them All Together [Figure: a numbered flow (steps 1 to 12) linking Application Code, Visual Application Composer, Explore data, Grid Resource Broker, Data Catalogue, Grid Info Service, ASP Catalogue, Grid Market Directory, Grid Service (GS) on Globus, Alchemi GS, Cluster Scheduler with CPUs or PEs, GTS, Gridbus GridBank, and GSPs (Accounting Service; e.g., IBM, VPAC, UofM). Labelled flows: Job, Results, Results + Cost Info, Bill.]
Alchemi: .NET-based Enterprise Grid Platform & Web Services • SETI@Home-like model • General purpose • Dedicated/non-dedicated workers • Role-based security • .NET and Web Services • C# implementation • GridThread and Job Model programming • Easy to set up and use • Widely in use! [Figure: Alchemi Users, Web Services, Alchemi Manager, and Alchemi Worker Agents, connected over the Internet.]
Some Users of Alchemi • Tier Technologies, USA: large scale document processing using the Alchemi framework • Satyam Computers Applied Research Laboratory, India: micro-array data processing using the Alchemi framework • CSIRO, Australia: natural resource modeling • The University of Sao Paulo, Brazil: the Alchemi Executor as a Windows Service • stochastix GmbH, Germany: serving clients in the international banking/finance sector • The Friedrich Miescher Institute (FMI) for Biomedical Research, Switzerland: patterns of transcription factors in mammalian genes • Many users in universities: see next for an example.
Grid Service Broker (GSB) • A resource broker for scheduling task-farming data Grid applications with static or dynamic parameter sweeps on global Grids. • It uses the computational economy paradigm for optimal selection of computational and data services depending on their quality, cost, and availability, and users’ QoS requirements (deadline, budget, & T/C optimisation) • Key Features • A single window to manage & control an experiment • Programmable Task Farming Engine • Resource Discovery and Resource Trading • Optimal Data Source Discovery • Scheduling & Predictions • Generic Dispatcher & Grid Agents • Transportation of data & sharing of results • Accounting
Gridbus Broker Architecture [Figure: Gridbus Clients (Bag of Tasks Applications) pass App, T, $, Opt to the broker (Data Grid Scheduler), which contains the Gridbus Farming Engine, Schedule Advisor, Trading Manager (TM), Record Keeper, Grid Dispatcher, and Grid Explorer (GE). Below the Grid Middleware layer: Grid Info Server (GIS, NWS), Data Catalog, Data Node, and compute nodes with RM & TS. Legend: RM = Local Resource Manager, TS = Trade Server, G = Globus enabled node, U = Unicore enabled node, A = Alchemi enabled node.]
Gridbus Services for eScience applications • Application Development Environment: • XML-based language for composition of task farming (legacy) applications as parameter sweep applications. • Task Farming APIs for new applications. • Web APIs (e.g., Portlets) for Grid portal development. • Threads-based Programming Interface • Workflow interface and Gridbus-enabled workflow engine. • Resource Allocation and Scheduling • Dynamic discovery of optimal computational and data nodes that meet user QoS requirements. • Hide Low-Level Grid Middleware interfaces • Globus (v2, v4), SRB, Alchemi, Unicore, and ssh-based access to local/remote resources managed by XGrid, Condor, SGE.
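To make the parameter sweep idea in the first bullet concrete, here is a minimal sketch in Python of how a sweep description expands into independent task-farming jobs. It is not the Gridbus XML language or its API; the function name `expand_sweep`, the command template, and the parameter names are all hypothetical.

```python
from itertools import product


def expand_sweep(command_template, parameters):
    """Return one concrete command string per combination of parameter values."""
    names = list(parameters)
    jobs = []
    # The cross product of all value lists gives every point in the sweep.
    for values in product(*(parameters[name] for name in names)):
        binding = dict(zip(names, values))
        jobs.append(command_template.format(**binding))
    return jobs


# Hypothetical sweep: 2 energies x 3 seeds = 6 independent jobs.
jobs = expand_sweep(
    "simulate --energy {energy} --seed {seed}",
    {"energy": [10, 20], "seed": [1, 2, 3]},
)
```

Each resulting command is independent of the others, which is what lets a broker farm them out to whichever Grid nodes it selects.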
Adaptive Scheduling Steps • Discover Resources • Establish Rates • Compose & Schedule • Distribute Jobs • Evaluate & Reschedule • Meet requirements? (Remaining Jobs, Deadline, & Budget?) • Discover More Resources
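The loop on this slide can be sketched roughly as follows. This is an illustrative simulation, not the Gridbus broker's code: the `Resource` fields, the flat per-job price, the notion of a fixed "round", and the cheapest-first order are all assumptions, and the "Discover More Resources" step is left out.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    price_per_job: float   # G$ per job (hypothetical flat rate)
    jobs_per_round: int    # jobs finished per scheduling round (hypothetical)


def adaptive_schedule(resources, total_jobs, deadline_rounds, budget):
    """Repeat schedule/distribute/evaluate until jobs, deadline, or budget run out."""
    remaining, spent, rounds = total_jobs, 0.0, 0
    while remaining > 0 and rounds < deadline_rounds:
        dispatched = 0
        # Establish rates, then compose a schedule (cheapest resource first here).
        for res in sorted(resources, key=lambda r: r.price_per_job):
            affordable = int((budget - spent) // res.price_per_job)
            n = min(res.jobs_per_round, remaining, affordable)
            if n > 0:
                # Distribute jobs and record what they cost.
                remaining -= n
                spent += n * res.price_per_job
                dispatched += n
        # Evaluate: if nothing could be dispatched, the budget is exhausted.
        if dispatched == 0:
            break
        rounds += 1
    return remaining, spent, rounds


pool = [Resource("a", 2.0, 3), Resource("b", 5.0, 2)]
done = adaptive_schedule(pool, total_jobs=10, deadline_rounds=3, budget=100)
short = adaptive_schedule(pool, total_jobs=10, deadline_rounds=3, budget=10)
```

With a generous budget all 10 jobs finish in two rounds; with a budget of 10 the loop stops early with 5 jobs left, which is the point where a real broker would look for more (or cheaper) resources.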
Deadline (D) and Budget (B) Constrained Scheduling Algorithms
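As a rough illustration of the two optimisation strategies named later in the deck (minimise cost, minimise time), the sketch below allocates N jobs under a deadline D and budget B. It is a static simplification and not the published Gridbus or Nimrod-G algorithms, which adapt at runtime; the `Resource` fields and the example numbers are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    cost_per_job: float   # G$ per job (hypothetical)
    secs_per_job: float   # seconds per job (hypothetical)


def cost_optimise(resources, n_jobs, deadline, budget):
    """Fill the cheapest resources first, up to what each can finish by the deadline."""
    plan, remaining, spent = {}, n_jobs, 0.0
    for r in sorted(resources, key=lambda r: r.cost_per_job):
        capacity = int(deadline // r.secs_per_job)
        affordable = int((budget - spent) // r.cost_per_job)
        n = min(capacity, remaining, affordable)
        if n > 0:
            plan[r.name] = n
            remaining -= n
            spent += n * r.cost_per_job
    return plan, spent, remaining


def time_optimise(resources, n_jobs, deadline, budget):
    """Give each job to the resource that would finish it earliest, if affordable."""
    # Heap entries: (finish time of the resource's next job, name, resource).
    heap = [(r.secs_per_job, r.name, r) for r in resources]
    heapq.heapify(heap)
    plan, spent, remaining = {}, 0.0, n_jobs
    while remaining > 0 and heap:
        finish, name, r = heapq.heappop(heap)
        if finish > deadline or spent + r.cost_per_job > budget:
            continue  # this resource can take no further jobs
        plan[name] = plan.get(name, 0) + 1
        spent += r.cost_per_job
        remaining -= 1
        heapq.heappush(heap, (finish + r.secs_per_job, name, r))
    return plan, spent, remaining


pool = [Resource("cheap", 2.0, 60.0), Resource("fast", 6.0, 20.0)]
cost_plan = cost_optimise(pool, n_jobs=10, deadline=300, budget=100)
time_plan = time_optimise(pool, n_jobs=10, deadline=300, budget=100)
```

In this toy pool, cost optimisation spends G$40 but uses the full 300-second deadline on the cheap node, while time optimisation spends G$52 and pushes most jobs to the fast node.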
Case Study: High Energy Physics and Data Grid • The Belle Experiment • KEK B-Factory, Japan • Investigating fundamental violation of symmetry in nature (Charge Parity), which may help explain “why do we have more matter than antimatter in the universe?”. • Collaboration: 1000 people, 50 institutes • 100s of TB of data currently
Case Study: Event Simulation and Analysis (B0 -> D*+ D*- Ks) • Simulation and analysis package: Belle Analysis Software Framework (BASF) • Experiment in 2 parts: generation of simulated data and analysis of the distributed data • Analyzed 100 data files (30MB each) that were distributed among the five nodes within the Australian Belle DataGrid platform.
Australian Belle Data Grid Testbed [Figure: map of the testbed sites, including VPAC, Melbourne.]
Belle Data Grid (GSP CPU Service Price: G$/sec) [Figure: testbed map with per-site CPU price labels G$4, NA, G$4, G$6, and G$2; sites shown include VPAC, Melbourne, and a Datanode marker.]
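Belle Data Grid (Bandwidth Price: G$/MB) [Figure: the same testbed map with link bandwidth price labels 32, 33, 36, 31, 30, 34, 38, and 31, alongside the CPU price labels G$4, NA, G$4, G$6, and G$2; sites shown include VPAC, Melbourne, and a Datanode marker.]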
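The two price maps (CPU in G$/sec, bandwidth in G$/MB) suggest a per-job cost with a compute term and a data-transfer term. The sketch below is one plausible reading, not the broker's actual accounting: the function `job_cost`, the 120-second runtime, the pairing of a G$4/sec node with a G$31/MB link, and the rule that local data is free are all assumptions. Only the 30MB file size and the price values come from the slides.

```python
def job_cost(cpu_price, cpu_secs, bw_price, data_mb, remote=True):
    """Cost in G$ of one job: CPU charge plus, for remote data, a transfer charge."""
    # Assumption: data already on the compute node incurs no transfer charge.
    transfer = bw_price * data_mb if remote else 0
    return cpu_price * cpu_secs + transfer


# Hypothetical job: 120 s of CPU on a G$4/sec node, 30MB input over a G$31/MB link.
remote_cost = job_cost(cpu_price=4, cpu_secs=120, bw_price=31, data_mb=30)
local_cost = job_cost(cpu_price=4, cpu_secs=120, bw_price=31, data_mb=30, remote=False)
```

Under these assumed numbers the transfer term (G$930) outweighs the compute term (G$480), which illustrates why a data Grid scheduler has to weigh data location as well as CPU price.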
Deploying Application Scenario • A data grid scenario with 100 jobs, each accessing remote data of ~30MB • Deadline: 3 hrs. • Budget: G$ 60K • Scheduling Optimisation Scenario: • Minimise Time • Minimise Cost • Results:
Results: Time Minimization in Data Grids [Figure: number of jobs completed (y-axis, 0 to 80) against time in mins. (x-axis, 1 to 42) for fleagle.ph.unimelb.edu.au, belle.anu.edu.au, belle.physics.usyd.edu.au, and brecca-2.vpac.org.]
Results: Cost Minimization in Data Grids [Figure: number of jobs completed (y-axis, 0 to 100) against time in mins. (x-axis, 1 to 63) for fleagle.ph.unimelb.edu.au, belle.anu.edu.au, belle.physics.usyd.edu.au, and brecca-2.vpac.org.]