
Virtual Grid Execution System: Research Plans & Updates

Explore research plans, achievements, and advancements of the Virtual Grid Execution System (vgES) for application-driven adaptation in complex grid resource environments.


Presentation Transcript


  1. vgES Update and Research Plans • Andrew A. Chien, Henri Casanova, Fran Berman, Yang-suk Kee, Ken Yocum, Richard Huang, Dionysis Logothetis, and Jerry Chou • CSE, SDSC, and CNS, University of California, San Diego • VGrADS Workshop, Sept 12-13, 2005

  2. Lessons from GrADS • Approach Vindicated: Application Driven Adaptation is Crucial • Doing this well is DIFFICULT • Specifically: • Implicit Coupling of Application, Programming Tools, and Runtime requires dealing with complexity of all levels simultaneously • Lack of Explicit Resource Abstraction inhibits expressing and exploiting application domain knowledge • Closed World Selection Model does not extend to larger, shared, competitive Grid resource environments [Figure: GrADS architecture diagram. Labels: Program Preparation System; Execution Environment; Source Application; Whole-Program Compiler; Libraries; Software Components; Configurable Object Program; Resource Negotiator; Negotiation; Scheduler; Binder; Grid Runtime System; Real-time Performance Monitor; Performance Feedback; Performance Problem]

  3. Research Activities • vgES Selection • Scheduling • vgES-Pegasus • Finding & Binding • Rsc Mgrs & VG’s • vgMON • Fault Tolerance • Dynamic VG’s Virtual Grid Approach • Separation of Concerns • Application Planning and Management • Complex Grid Resource Environment Management => vgDL and Virtual Grid • Scalable Selection and Binding • Large Resource Pools • Competitive, Dynamic Environments => Finding and Binding • Application-Driven Resource Management • Application-level Abstraction • Grid Information => Virtual Grid Explicit Resource Abstraction

  4. Separation of Concerns: vgDL and VG • Virtual Grid Description Language (vgDL) • Applications Describe their resource Needs at Application-level Abstraction • Virtual Grid (VG) • Resources Selected, Bound, and Organized into Application-level Abstraction • Adaptive Applications (Future) • Monitoring, Management, and Modification of Virtual Grids [Figure labels: Application / PPS; vgDL Resource Description; Virtual Grid Abstraction; Virtual Grid Resources; Virtual Grid Execution System (vgES); Complex Grid Resource Environment]

  5. Life Cycle of Virtual Grids • Life-cycle of a Virtual Grid • Application sends vgES a vgDL request • vgES (vgFAB) creates VG and returns to Application • Application Uses VG (runs jobs, reads resource attributes, gets notifications from monitors, adapts VG, eventually done) • Application terminates VG [Figure labels: Application; vgDL; Virtual Grid; vgES]
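
The life cycle on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the vgES interface: the names `ToyVgES`, `VirtualGrid`, `create_vg`, `run_job`, `read_attribute`, and `terminate` are all hypothetical, and the vgDL string is only modeled on the syntax shown elsewhere in the deck.

```python
from dataclasses import dataclass, field
from enum import Enum


class VGState(Enum):
    ACTIVE = "active"
    TERMINATED = "terminated"


@dataclass
class VirtualGrid:
    """Hypothetical stand-in for the VG handle returned to the application."""
    request: str                      # vgDL text that produced this VG
    nodes: dict                       # node name -> attribute dict
    state: VGState = VGState.ACTIVE
    jobs: list = field(default_factory=list)

    def run_job(self, node: str, command: str) -> None:
        if self.state is not VGState.ACTIVE:
            raise RuntimeError("VG has been terminated")
        if node not in self.nodes:
            raise KeyError(node)
        self.jobs.append((node, command))

    def read_attribute(self, node: str, name: str):
        return self.nodes[node][name]

    def terminate(self) -> None:
        self.state = VGState.TERMINATED


class ToyVgES:
    """In-memory stub (assumption): a real vgFAB would parse and match the vgDL."""

    def __init__(self, pool: dict):
        self.pool = pool

    def create_vg(self, vgdl: str, count: int) -> VirtualGrid:
        if count > len(self.pool):
            raise RuntimeError("not enough resources to satisfy request")
        chosen = dict(list(self.pool.items())[:count])
        return VirtualGrid(request=vgdl, nodes=chosen)


# 1. request, 2. VG returned, 3. use, 4. terminate
pool = {"n1": {"Clock": 2.4}, "n2": {"Clock": 3.0}, "n3": {"Clock": 1.8}}
vges = ToyVgES(pool)
vg = vges.create_vg("TightBagOf(node)[2:2] { node = [ Clock >= 1.0 ] }", count=2)
vg.run_job("n1", "hostname")
clock = vg.read_attribute("n2", "Clock")
vg.terminate()
```

Once `terminate()` has been called, further `run_job` calls raise an error, which mirrors the slide's point that the VG is an explicit entity with a bounded lifetime.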

  6. System Architecture: vgES implements Virtual Grid [Figure labels: Application; vgDL; VG; vgES APIs; Information Services; vgFAB; vgAgent; vgMON; vgLAUNCH; Resource Managers; DVCW; Grid Resources]

  7. Achievements to Date (Year 1) • Application Studies to understand Application Resource Abstractions • Design and Implementation of vgDL Language (Application-level Resource Abstraction) • Design and Implementation of Synthetic Resource Generator for Grids (Size, Time, etc.)

  8. Achievements to Date (Year 2) • Design and Implementation of Integrated “Finding and Binding” Algorithms • Simulation Experiments for “Finding and Binding” Effectiveness under Various Resource Environments • Design and Implementation of a Working Infrastructure • Release vgES 0.7, March 2005 • Realizes the Key VG Ideas • Enables Modular Exploration of Research Issues • Enables Experimentation with Large Applications • Leverages and Integrates with Globus/MDS/Production Grid Resource Infrastructures • Several other VGrADS Sites Using vgES • => Have Exposed Many Hard Research Questions!

  9. Leveraging vgES • Many at UCSD VGrADS • Others at UCSD Sysnet (PLUSH project), Jeannie, Amin • Scheduling Experiments (U Hawaii), Henri • GridSolve (UTK), Asim, Jack • Pegasus Integration Evaluation (USC/ISI), Gurmeet, Ewa, Carl • Scheduling Experiments (Rice), Ryan, Anirban, Chuck, Ken • Anyone else? • On Deck • LEAD at UNC? • On grid resources at UH? Teragrid? • What do we need to do to move this forward?

  10. Application-Driven Design of vgDL • vgDL design based on Extensive Grid Application Studies • EMAN: Single Particle Analysis and Electron Micrograph Analysis • Encyclopedia of Life (Bioinformatics) • LEAD: Linked Environments for Atmospheric Discovery • GridSAT: Boolean Satisfiability Solver (Logic and Test Design) • Findings • Small number of Resource Abstractions • Application Mapping and Resource Ignores Detail • vgDL Research Hypotheses… • A Simple Application-level Description is better for Applications! • Simplicity supports Description Portability and Robustness • Loose Specification enables “Finding and Binding”

  11. Separation of Concerns: Research Questions • vgDL Language • Does the language support good expression of application needs? (deep studies, breadth studies) • Does the language have clear meaning and reasonable syntax? • What constitutes a good realization of a VG for a vgDL request? (semantics) • VG Selection • What are good techniques for selecting resources for a particular vgDL description? • How robust are the descriptions across a range of resource environments? • VG Abstraction (utility,power) • Does the VG abstraction (hierarchy, aggregators, attribute system) enable application reasoning and management about resources? • Do the notification and attribute interfaces provide good access to dynamic information and filtering? • Can a reasonable class of applications achieve a separation of concerns with vgDL and VG?

  12. Two-level Scheduling and more… • VG’s Separation of Concerns limits visibility and choice available to an application scheduler • Obtain a Set of Resources for a Virtual Grid • Schedule the Tasks of a Workflow (or Application) on specific physical resources in Virtual Grid • Research Questions • How does the two-level scheduling affect performance? • Pros: Simpler set of resources to consider, offline optimization, reduced scheduling cost, binding cost • Cons: Smaller set of resource choices • How to generate an appropriate VG (vgDL specification) for a Workflow Graph? • Structure, Performance Model, Resource Estimates • Is it possible to design vgDL specifications and resulting VG’s so that very simple schedulers can perform nearly as well as sophisticated schedulers?
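
A toy illustration of the two levels may help frame the research questions above. The sketch below is an assumption-laden example, not the VGrADS schedulers: `select_vg` stands in for VG creation by taking the fastest hosts, and `greedy_schedule` is a simple list scheduler. Host speeds and task sizes are invented.

```python
def select_vg(pool, size):
    """Level 1: pick the `size` fastest hosts from the pool (the 'VG')."""
    ranked = sorted(pool.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:size])


def greedy_schedule(tasks, hosts):
    """Level 2: place each task (largest first) on the host that finishes it earliest.

    tasks: {task name: work units}; hosts: {host name: speed in work units per second}.
    Returns (placement, makespan).
    """
    finish = {h: 0.0 for h in hosts}
    placement = {}
    for name, work in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        best = min(hosts, key=lambda h: finish[h] + work / hosts[h])
        finish[best] += work / hosts[best]
        placement[name] = best
    return placement, max(finish.values())


pool = {"a": 4.0, "b": 2.0, "c": 1.0, "d": 1.0}   # hypothetical host speeds
tasks = {"t1": 8, "t2": 4, "t3": 4, "t4": 2}      # hypothetical task sizes

vg_hosts = select_vg(pool, size=2)
_, makespan_vg = greedy_schedule(tasks, vg_hosts)     # scheduler sees only the VG
_, makespan_full = greedy_schedule(tasks, pool)       # scheduler sees the whole pool
```

In this particular instance the scheduler restricted to the two-host VG reaches the same makespan as the one that sees all four hosts, while considering half as many resources. Whether that holds for realistic workflows is exactly the open question the slide raises.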

  13. Large-Scale Experiments: vgES-Pegasus Integration • Many Benefits • Pegasus: More flexible Resource Control and Management • Scheduling: 2-level Scheduling, Dynamic Scheduling, Large Experiments – workload and resources • Virtual Grid: vgDL workload and experiments, Large Experiments – workload and resources • Questions: What? Who? When? How? -- a Concrete Plan [Figure labels: Abstract workflow; Abstract DAX; Controller; Scheduler; Planned DAX + VG; Pegasus; DAGMan; EXEC; Resource Description; Query Agent; Pools/VG; XML-RPC Layer/Network; vgES; vgFAB; vgLAUNCH]

  14. Scalable Selection and Binding • Traditional Model: Separate Selection and Binding • Select Resources; Plan Execution; Bind Resources; Execute • Doesn’t work for competitive Grid resource environments • Problem: • Selection Ignores Resource Competition • Binding May not Succeed • Research Questions • What algorithms or techniques work well for combined finding and binding? • Are distinct techniques required for each vgDL construct or idiom? • How does the level of contention (binding success rate) influence the effectiveness of these techniques? Relative? Absolute? • How does the structure of vgDL requests influence the effectiveness of these techniques? Relative? Absolute?

  15. vgFAB: “Finding and Binding” • Use vgDL (and rank) to enumerate a number of candidates for each part of request (Overselection) • Attempt to bind candidates for each part based on vgDL ranking • Compose successfully bound parts into a VG and return it to the application (Dynamic Composition) • If binding did not succeed for all parts, try iteratively with more candidates or fail • Example vgDL request: g1 = { rsc1 = LooseBagOf(c1)[2:4] { c1 = ClusterOf(node1)[4:8] { node1 = [ (Processor == Pentium4) && (Memory >= 4096) ] } } FAR rsc2 = LooseBagOf(tb1)[2:4] { tb1 = TightBagOf(node2)[4:8] { node2 = [ Clock >= 2.048 ] } } FAR rsc3 = LooseBagOf(c2)[2:4] { c2 = ClusterOf(node3)[4:8] { node3 = [ Processor == Pentium4 ] [ Rank = Memory ] } } } [Figure labels: Virtual Grid; Successfully Bound Candidates]
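
The four steps on this slide (overselect, bind in rank order, compose, retry or fail) can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the vgFAB implementation: `find_and_bind`, `try_bind`, `overselect`, and `max_rounds` are hypothetical names, candidates are assumed to arrive already ranked, and releasing partially bound resources on failure is omitted.

```python
def find_and_bind(parts, try_bind, overselect=2, max_rounds=3):
    """Combined selection and binding over the parts of a request.

    parts: {part name: candidate list, best ranked first}.
    try_bind: callable(candidate) -> bool, True if the resource was acquired.
    Returns {part name: bound candidate}, or None if some part could not be bound.
    """
    bound = {}
    offset = {name: 0 for name in parts}
    for _ in range(max_rounds):
        for name, candidates in parts.items():
            if name in bound:
                continue
            # Overselection: consider several ranked candidates per part per round.
            window = candidates[offset[name]:offset[name] + overselect]
            offset[name] += len(window)
            for cand in window:
                if try_bind(cand):
                    bound[name] = cand
                    break
        # Dynamic composition: succeed once every part has a bound candidate.
        if len(bound) == len(parts):
            return bound
    return None


busy = {"c1", "t1", "t2"}                 # resources taken by competing users
parts = {"rsc1": ["c1", "c2", "c3"], "rsc2": ["t1", "t2", "t3"]}
vg = find_and_bind(parts, try_bind=lambda c: c not in busy)
failed = find_and_bind({"rsc1": ["c1"]}, try_bind=lambda c: c not in busy)
```

Here `rsc1` binds its second-ranked candidate in the first round, while `rsc2` needs a second round with more candidates. A production version would also have to release already bound parts when the overall request fails.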

  16. vgFAB enables Synchronous Resource Use in Competitive Environments • FAB Scalable and Satisfies Complex Requests in Competitive Resource Environments • Combined Better in All Cases; Much Better in Competitive Environments • Tolerates Double Binding Failure Rates => Makes Synchronous Use Practical • Separate: 30% for 8 and 16 component descriptions • Combined: 60-70% for 8 and 16 component descriptions [Figure annotation: 16 candidates]

  17. Resource Managers and VG’s • Resource Manager Diversity is a Reality • Dedicated, Volatile/Desktop, Timeshared, Sliceshared, Spaceshared • Immediate, Queued, Advance Reservation • Research Questions • How can we support vgDL features? (now, synchronous, @time) • Over what RMs and how can we construct virtual grids? (co-allocation…) • How do these capabilities influence what assumptions applications (and planners/schedulers) can make about vgDL? Breakout? [Figure labels: horizon.sdsc.edu (AIX/LoadLeveler); saxicolous.sdsc.edu (Linux/PBS); morpheus.engin.umich.edu (Linux/PBS); {multivac/nbcr3/nbcr4/nbcr5/nbcr6}.sdsc.edu (Solaris)]

  18. Application-Driven Resource Management • Virtual Grid Provides an Application-Level Resource Abstraction • Structure of VG <-> Available Resource Structure <-> Application’s vgDL • Presents Information and Enables Use • VG Organization for Resource Information • VG Organization for Computation Launching, Data Movement, and Monitoring • Enables Monitoring and Modification of Available Resources • vgMON Customizable Monitoring; Follows VG/vgDL structure • Operations to Augment or Prune a Virtual Grid • We have a lot of mechanism here, but not much policy or understanding of how to use.

  19. Application-Driven Resource Management (cont.) • Research Questions • Are VG interfaces a good way to present static and dynamic information to applications? • Do VG interfaces enable convenient and effective modification of VGs (and thereby the available resources)? • What properties do applications want to monitor on their resources? • Can vgMON provide efficient (low overhead) monitoring of desired properties with few false positives? • And of course, the Rescheduling Question • How would an application use these features to adapt intelligently in the face of resource performance change? Resource failure?

  20. Virtual Grid: Explicit Resource Abstraction • vgDL Request Creates Virtual Grid (VG) • Virtual Grid (VG) is an Explicit Entity • Each VG Resource corresponds to part of the Application’s vgDL • VG Node Attributes present Resource Information • Static Information (proc type, speed, location, etc.) • Dynamic Information (load, mem, uptime, prediction, NWS, Ganglia, etc.) • Characterization / Classification Information • Example vgDL request: BagOfClusters = LooseBagOf(N)[10:100] [Rank = LooseBag.Nodes] { N = ClusterOf(M)[8:32] { M = [ (Memory >= 1024) && (Disk > 2048) ] [Rank = Clock] } } [Figure labels: vgDL Request; Virtual Grid (VG); sample node attributes: Rname: foo.ucsd.edu, Load: 0.6, Pred Load: 2.0, …]
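
To make the request on this slide concrete, here is a rough Python sketch of how such a description might be realized against a resource pool. It rests on assumptions the slide does not confirm: that `[min:max]` bounds a count, that `Rank = Clock` prefers faster nodes, and that `Rank = LooseBag.Nodes` prefers clusters with more nodes. The function name and the scaled-down bounds in the example are hypothetical.

```python
def realize_bag_of_clusters(clusters, node_ok, node_rank, nodes_range, clusters_range):
    """Filter and rank nodes per cluster, then rank the clusters themselves.

    clusters: {cluster name: list of node attribute dicts}.
    nodes_range / clusters_range: (min, max) counts, read here as inclusive bounds.
    Returns {cluster name: chosen nodes}, or None if too few clusters qualify.
    """
    lo_n, hi_n = nodes_range
    lo_c, hi_c = clusters_range
    usable = {}
    for cname, nodes in clusters.items():
        matching = sorted((n for n in nodes if node_ok(n)), key=node_rank, reverse=True)
        if len(matching) >= lo_n:
            usable[cname] = matching[:hi_n]        # keep the best-ranked nodes
    ranked = sorted(usable, key=lambda c: len(usable[c]), reverse=True)
    if len(ranked) < lo_c:
        return None
    return {c: usable[c] for c in ranked[:hi_c]}


clusters = {
    "A": [
        {"name": "a1", "Memory": 2048, "Disk": 4096, "Clock": 2.0},
        {"name": "a2", "Memory": 512, "Disk": 4096, "Clock": 3.5},   # too little memory
        {"name": "a3", "Memory": 1024, "Disk": 3000, "Clock": 3.0},
    ],
    "B": [
        {"name": "b1", "Memory": 4096, "Disk": 2048, "Clock": 3.0},  # Disk must exceed 2048
        {"name": "b2", "Memory": 4096, "Disk": 8192, "Clock": 2.2},
    ],
    "C": [
        {"name": "c1", "Memory": 2048, "Disk": 4096, "Clock": 1.0},
        {"name": "c2", "Memory": 2048, "Disk": 4096, "Clock": 2.5},
        {"name": "c3", "Memory": 2048, "Disk": 4096, "Clock": 2.0},
    ],
}

vg = realize_bag_of_clusters(
    clusters,
    node_ok=lambda n: n["Memory"] >= 1024 and n["Disk"] > 2048,
    node_rank=lambda n: n["Clock"],
    nodes_range=(2, 2),        # scaled down from the slide's [8:32]
    clusters_range=(1, 2),     # scaled down from the slide's [10:100]
)
```

Each entry in the result keeps the full attribute dictionary of its nodes, which is the sense in which the VG exposes static and dynamic resource information back to the application.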

  21. vgMON and Monitoring • Couples Applications and Resource Requirements (expectations) to Dynamic Environment • Simple language for expressing expectations • Query across time or tuple-based windows (streaming databases) • Language supports aggregate functions: sum, min, or max • Build tree-based overlays (sensor networks) • Long-lived queries/expectations • Installed across monitored nodes (distributed XML query processing) • Continues vgES philosophy • Thin, lightweight, modular layer • Research Questions • Can eDL capture application expectations? What are typical application expectations? • How to cope with scale and heterogeneity of VG resources? • How to trigger effectively to enable Application-level decision-making? • How to filter/organize information to support good Application-level decisions
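
The idea of an expectation evaluated over a tuple-based window with an aggregate function can be sketched as follows. This is not the vgMON language or its API: the class `WindowedExpectation` and its parameters are hypothetical, and the load samples are invented.

```python
from collections import deque


class WindowedExpectation:
    """Evaluate an aggregate over the last `size` samples and flag violations."""

    def __init__(self, size, aggregate, violated):
        self.window = deque(maxlen=size)   # tuple-based sliding window
        self.aggregate = aggregate         # e.g. sum, min, or max
        self.violated = violated           # predicate on the aggregate value

    def observe(self, value):
        """Add a sample; return the aggregate if the expectation is violated, else None."""
        self.window.append(value)
        if len(self.window) < self.window.maxlen:
            return None                    # not enough samples yet
        agg = self.aggregate(self.window)
        return agg if self.violated(agg) else None


# Notify only when load stays above 1.5 for three consecutive samples.
expectation = WindowedExpectation(size=3, aggregate=min, violated=lambda v: v > 1.5)
loads = [0.6, 2.0, 2.1, 1.9, 0.4]
notifications = [expectation.observe(x) for x in loads]
```

Using `min` over the window means a single load spike does not trigger a notification, only sustained overload does. That is one simple way to trade detection latency for fewer false positives.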

  22. Fault Tolerance and Virtual Grids • Application Creates a Virtual Grid and Uses it for some time • Receives Monitoring notifications • Eventually one is significant enough to cause the application to decide to… • Modify Virtual Grid • Based on Notification, Determine which Resources are Implicated • Remove or Replace those that are Failing or Failure Prone or simply Low Performing • Meet Evolving Needs [Figure labels: Application; vgDL; Virtual Grid; vgES]
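
The modify step can be illustrated with a small helper that replaces implicated resources when a spare is available and prunes them otherwise. The function `repair_vg` and the node names are hypothetical. How spares would actually be obtained (for example, by a further finding-and-binding request) is outside this sketch.

```python
def repair_vg(vg_nodes, implicated, spares):
    """Return a new node list with implicated nodes replaced or removed.

    vg_nodes: current nodes of the VG, in order.
    implicated: set of nodes a notification marked as failing or slow.
    spares: replacement nodes, used in the order given.
    """
    spares = list(spares)                  # copy so the caller's list is untouched
    repaired = []
    for node in vg_nodes:
        if node not in implicated:
            repaired.append(node)          # healthy node stays
        elif spares:
            repaired.append(spares.pop(0)) # replace with the next spare
        # otherwise: no spare left, so the node is simply pruned
    return repaired


new_nodes = repair_vg(["n1", "n2", "n3", "n4"], implicated={"n2", "n4"}, spares=["s1"])
```

With one spare and two implicated nodes, the first failure is replaced and the second is pruned, so the VG shrinks from four nodes to three.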

  23. vgES Research with UCSD • Separation of Concerns • Grid Resource Generation (YSK) • Scalable, Robust, Accurate Large-scale Grids (now and future) • Integration of vgES and Pegasus (YSK,RYH,…) • With Gurmeet, Ewa, Carl @ ISI • With Anirban, Ken @ Rice • Application Scheduling and Virtual Grids (RYH,YSK, HC) • With Ryan, Chuck, Ken @ Rice

  24. vgES Research with UCSD (cont.) • Scalable Selection and Binding • Finding and Binding (YSK) • Good Selection, Clear Semantics • Resource Contention and Successful VG Construction • Resource Management Models and Virtual Grids (JC) • Batch, Adv Res, Timesharing, Auction, Best Effort, Desktop Grid • VG Construction • Statistical Guarantees with RW @ UCSB

  25. vgES Research with UCSD (cont.) • Application-Driven Resource Management (Dynamic Virtual Grids) • Monitoring (KY,DL) • With Rich,Graziano @ UCSB • Fault Tolerance and Adaptation • With Lavanya, Dan @ UNC, Asim, Jack @ UTK

  26. Future Achievements? (Year 3+) • Separation of Concerns • Understand Two-level Benefits/Disadvantages • Understand how to Schedule Apps on VG’s, how to specify VG’s • Large-scale Experiments on Real Resources • Scalable Selection and Binding • Understand the limits of Constructing VG’s on Shared Resources • Understand how to Construct VG’s with real Resource Management Systems • Large-scale Experiments on Real Resources • Application-Driven Resource Management • Understand how to Efficiently Monitor for “custom” VG’s and Properties • Understand Monitoring Requirements of Workflow Applications • Tie Monitoring to Management of VG’s to Optimize Performance and Tolerate Failures • All VGrADS Sites using vgES!

  27. vgES Plans • vgES 1.0: vgES’ + vgMON Release (Nov 2005) • Smaller Changes • Client-server vgES (support vgES-Pegasus Integration) • vgES-Pegasus Integration: Team Deadlines • Operate with Separate Resource Managers + Private IP’s • Major Changes • Dynamic Virtual Grids • Operate on Variety of Resource Managers

  28. Research Activities • vgES Selection • Scheduling • vgES-Pegasus • Finding & Binding • Rsc Mgrs & VG’s • vgMON • Fault Tolerance • Dynamic VG’s Virtual Grid Approach • Separation of Concerns • Application Planning and Management • Complex Grid Resource Environment Management => vgDL and Virtual Grid • Scalable Selection and Binding • Large Resource Pools • Competitive, Dynamic Environments => Finding and Binding • Application-Driven Resource Management • Application-level Abstraction • Grid Information => Virtual Grid Explicit Resource Abstraction

  29. For More Information • Andrew A. Chien, Henri Casanova, Yang-Suk Kee, Richard Huang, Dionysis Logothetis, and Ken Yocum, The Virtual Grid Description Language: vgDL, UCSD Technical Report CS2005-0817. And Update to The Virtual Grid Description Language: vgDL, Version 0.96, March 16, 2005. • Yang-Suk Kee, Dionysios Logothetis, Richard Huang, Henri Casanova, and Andrew A. Chien, Efficient Resource Description and High Quality Selection for Virtual Grids, In Proceedings of the IEEE Conference on Cluster Computing and the Grid (CCGrid 2005). • Yang-Suk Kee, Henri Casanova, and Andrew A. Chien, Realistic Modeling and Synthesis of Resources for Computational Grids, In Proceedings of the ACM Conference on High Performance Computing and Networking, SC2004, Pittsburgh, Pennsylvania, November 2004. • Yang-Suk Kee, Henri Casanova, and Andrew A. Chien, Combined Selection and Binding for Competitive Resource Environments, submitted for publication.
