
GRUBER: A Grid Resource Usage SLA Broker

Explore the architecture and toolkit for enforcing service level agreements (SLAs) in grid environments, with implementations for GT3 and GT4. Evaluate novel capabilities and research problems addressed by GRUBER.


Presentation Transcript


  1. GRUBER: A Grid Resource Usage SLA Broker • Catalin L. Dumitrescu, The University of Chicago • Ian Foster, Argonne National Laboratory & The University of Chicago

  2. Introduction • Large distributed Grid systems pose new challenges • Overwhelming resource characteristics • Complex workload characteristics • Complex interactions and resource allocations • Automated resource discovery and usage SLA enforcement represent important elements

  3. Talk Outline / Part I • Part I: • Introduction • Our Approach: GRUBER • Motivating Scenarios • Architecture • Part II: • Evaluation Metrics • Experimental Results • Conclusions and Questions

  4. Our Approach: GRUBER • GRUBER: an architecture and toolkit for resource usage service level agreement (SLA) specification and enforcement in a grid environment • GT3- and GT4-based implementations • Able to handle as many clients (submission hosts) as the GTx container’s performance permits

  5. A Bit of History • Started in the context of Grid3 as a monitoring engine • Evolved into a simple site recommendation engine • Later, additional capabilities were added, such as: • Enforcement components • Complex usage SLAs and specification interfaces

  6. GRUBER Novelty • Handles: • Sites with RMs • VOs and groups • Submission hosts • Models usage allocations (SLAs) at several levels • Capacity: • to collect monitoring metrics from a grid • to make various decisions based on this information • to enforce complex SLAs by various means

  7. Environment Overview

  8. Environment Details • Target environments with: • large number of resources • resource owners • VOs where usage SLAs are required to handle resource utilisations • A few examples are: • Grid3 • OSG • TeraGrid • DataGrid

  9. Research Problems • “How are usage SLAs handled in grid environments?” • “What is the gain from taking such usage SLAs into account?”

  10. Motivating Scenario • Controlled resource sharing is important because each participant wants to ensure that its goals are achieved • Three dimensions in the usage policy space: • resource providers (sites, VOs, groups) • resource consumers (VOs, groups, users) • time • Provider policies make resources available to consumers for specified time periods.

  11. Main Players & Elements • Owners: want convenient and flexible mechanisms for expressing the policies that determine how many resources are allocated to different purposes • User and group jobs: the main parties interested in the resources that sites provide • Algorithms and policies: capture how jobs are assigned to host machines

  12. Problem Domain • A grid consists of: • a set of resource provider sites: each contains a number of processors and some amount of disk space • a three-level hierarchy of users, groups, and VOs: each user is a member of exactly one group, and each group is a member of exactly one VO • a set of submission hosts and jobs: each job is specified by four attributes: VO, Group, Required-Processor-Time, Required-Disk-Space

  13. Problem Domain – cont. • A grid consists of (cont.): • Usage SLAs: • site policy statement: defines site usage SLAs by specifying the number of processors and amount of disk space that sites make available to different VOs; • VO policy statement: defines VO usage SLAs by specifying the fraction of the VO’s total processor and disk resources (i.e., the aggregate of contributions to that VO from all sites) that the VO makes available to different groups.
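
The problem domain on slides 12 and 13 can be sketched as a small data model. This is only an illustration, not GRUBER's actual code: the names `Job`, `vo_totals`, and `group_share`, the dictionary layouts, and the units are all assumptions made for the example. Site policies map each site to the processors and disk space it offers each VO; VO policies map each VO to the fractions of its aggregate resources that it offers each group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A job with the four attributes named on slide 12 (units are assumed)."""
    vo: str
    group: str
    required_processor_time: float  # e.g. CPU-hours
    required_disk_space: float      # e.g. gigabytes


def vo_totals(site_policy, vo):
    """Aggregate the processors and disk space that all sites grant to one VO.

    site_policy: {site: {vo: (processors, disk)}}  (hypothetical layout)
    """
    cpus = 0.0
    disk = 0.0
    for grants in site_policy.values():
        if vo in grants:
            cpus += grants[vo][0]
            disk += grants[vo][1]
    return cpus, disk


def group_share(site_policy, vo_policy, vo, group):
    """Resources a group may use: a fraction of its VO's aggregate grant.

    vo_policy: {vo: {group: (cpu_fraction, disk_fraction)}}  (hypothetical layout)
    """
    total_cpus, total_disk = vo_totals(site_policy, vo)
    cpu_fraction, disk_fraction = vo_policy[vo][group]
    return total_cpus * cpu_fraction, total_disk * disk_fraction


# Example: two sites contribute to VO0; group G1 gets half the CPUs
# and a quarter of the disk space.
site_policy = {
    "siteA": {"VO0": (100, 500.0)},
    "siteB": {"VO0": (50, 250.0), "VO1": (20, 100.0)},
}
vo_policy = {"VO0": {"G1": (0.5, 0.25)}}
job = Job(vo="VO0", group="G1", required_processor_time=1.0, required_disk_space=0.5)
print(group_share(site_policy, vo_policy, job.vo, job.group))
```

Keeping the VO policy as fractions of the aggregate, rather than absolute numbers, mirrors the slide's wording: a group's entitlement changes automatically when a site changes its grant to the VO.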

  14. GRUBER Architecture • Engine: implements various algorithms for detecting available resources and maintains a generic view of resource utilization in the grid • Site monitoring component: one of the data providers for the GRUBER engine • Site selectors: tools that communicate with the GRUBER engine and provide answers to the question: “which is the best site at which I can run this job?” • Queue manager: a complex GRUBER client that must reside on a submitting host

  15. GRUBER Picture

  16. GRUBER Engine • If a site has fewer waiting jobs than available CPUs and an extensible usage SLA is in place, GRUBER assumes the job will start right away • If the site has more waiting jobs than available CPUs, or if an extensible usage SLA is not in place, then it considers: • if the VO is under its allocation, GRUBER assumes that a new job can be started (in a time that depends on the local resource manager type) • if the VO is over its allocation, GRUBER assumes that a new job cannot be started (the running time is unknown for the jobs already running)
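
The decision rule on slide 16 can be written down as a short function. This is a minimal sketch under stated assumptions, not the engine's real interface: the names `Estimate` and `estimate_start` and all parameter names are hypothetical, and the slide does not say what happens when a VO sits exactly at its allocation, so the sketch treats that case as "cannot start".

```python
from enum import Enum


class Estimate(Enum):
    STARTS_NOW = "job is assumed to start right away"
    CAN_START = "job can start, after a delay that depends on the local resource manager"
    CANNOT_START = "job cannot be started"


def estimate_start(waiting_jobs, available_cpus, extensible_sla,
                   vo_usage, vo_allocation):
    """Apply the slide's rule for one VO at one site.

    vo_usage and vo_allocation must use the same unit (for example,
    a percentage of the site's CPUs).
    """
    # Case 1: idle capacity plus an extensible usage SLA.
    if waiting_jobs < available_cpus and extensible_sla:
        return Estimate.STARTS_NOW
    # Case 2: fall back to comparing usage against the allocation.
    if vo_usage < vo_allocation:
        return Estimate.CAN_START
    # Assumption: usage equal to the allocation is treated like "over".
    return Estimate.CANNOT_START


print(estimate_start(2, 10, True, vo_usage=60.0, vo_allocation=50.0).name)
print(estimate_start(2, 10, False, vo_usage=20.0, vo_allocation=50.0).name)
print(estimate_start(20, 10, True, vo_usage=60.0, vo_allocation=50.0).name)
```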

  17. GRUBER QM/SiteSel • The queue manager (QM) is responsible for determining how many jobs per VO or VO group can be scheduled at a certain moment in time and when to release them • Job assignment and enforcement components are part of GRUBER • The site selector component answers: “Where is it best to run next?”, while the queue manager answers: “How many jobs should group G_m of VO V_n be allowed to run?” and “When to start these jobs?”

  18. Disk Space Considerations • Disk space introduces additional complexities • A file that has been staged to a site cannot be “delayed”; it can only be deleted. Yet deleting a file that has been staged for a job can result in livelock if a job’s files are repeatedly deleted before the job runs • So far, we have considered a UNIX quota-like approach

  19. Usage SLA Language • Based on Maui’s semantics and WS-Agreement syntax • Allocations are made for processor time, permanent storage, or network bandwidth resources, and there are at least two levels of resource assignments: to a VO, by a resource owner, and to a VO user or group, by a VO. • e.g., VO0 15.5, VO1 10.0+, VO2 5.0-.
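
The example entries on slide 19 can be read with a few lines of parsing code. The sketch below handles only the comma-separated form shown on the slide, not the WS-Agreement syntax, and the names `Allocation` and `parse_allocations` are hypothetical. The slide does not define the `+` and `-` suffixes; the sketch assumes the Maui-style reading in which a bare number is a target, `+` marks a lower bound, and `-` marks an upper bound, so treat those labels as an assumption.

```python
import re
from typing import NamedTuple


class Allocation(NamedTuple):
    consumer: str  # e.g. a VO name
    share: float   # the numeric value as written
    kind: str      # "target", "lower_bound", or "upper_bound" (assumed meanings)


# name, whitespace, number, optional + or - suffix
_ENTRY = re.compile(r"^(\S+)\s+(\d+(?:\.\d+)?)([+-]?)$")

# Assumed interpretation of the suffixes, following Maui-style notation.
_KINDS = {"": "target", "+": "lower_bound", "-": "upper_bound"}


def parse_allocations(text):
    """Parse entries such as 'VO0 15.5, VO1 10.0+, VO2 5.0-'."""
    allocations = []
    for raw in text.split(","):
        entry = raw.strip()
        match = _ENTRY.match(entry)
        if match is None:
            raise ValueError(f"cannot parse allocation entry: {entry!r}")
        name, value, suffix = match.groups()
        allocations.append(Allocation(name, float(value), _KINDS[suffix]))
    return allocations


for allocation in parse_allocations("VO0 15.5, VO1 10.0+, VO2 5.0-"):
    print(allocation)
```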

  20. Screenshot: Site Selection

  21. Screenshot: VO Usage SLA

  22. Screenshot: VO Verifier

  23. Talk Outline / Part II • Part I: • Introduction • Our Approach: GRUBER • Motivating Scenarios • Architecture • Part II: • Evaluation Metrics • Experimental Results • Conclusions and Questions

  24. Evaluation Metrics • Comp: percentage of jobs completed successfully • Replan: number of re-planning operations • Time: total execution time for the workload • Util: average resource utilization: Util = (Σ_{i=1..N} ET_i) / (#cpus × Δt) × 100 • Delay: average delay time per job: Delay = (Σ_{i=1..N} DT_i) / #jobs
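
The Util and Delay formulas on slide 24 are simple enough to compute directly. The sketch below assumes that ET_i is the execution time of job i, DT_i is the delay experienced by job i, and Δt is the length of the measurement interval, all in the same time unit; the slide does not spell these out, and the function names are hypothetical.

```python
def utilization(execution_times, cpus, interval):
    """Util = sum(ET_i) / (#cpus * delta_t) * 100, as a percentage."""
    if cpus <= 0 or interval <= 0:
        raise ValueError("cpus and interval must be positive")
    return sum(execution_times) / (cpus * interval) * 100.0


def average_delay(delays):
    """Delay = sum(DT_i) / #jobs, with one delay value per job."""
    if not delays:
        raise ValueError("at least one job is required")
    return sum(delays) / len(delays)


# Two jobs ran for 3600 s and 1800 s on 2 CPUs during a 3600 s window.
print(utilization([3600, 1800], cpus=2, interval=3600))
# Three jobs waited 10 s, 20 s, and 30 s.
print(average_delay([10, 20, 30]))
```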

  25. Experimental Settings • A single job type in all experiments: the sequence analysis program BLAST • A single BLAST job has: • execution time of about an hour • about 10-33 kilobytes of input reads • about 0.7-1.5 megabytes of output • Various configurations: • 1x1K: 1000 independent BLAST jobs • 4x1K: the 1x1K workload is run in parallel from four hosts • each job can be re-planned at most four times

  26. Experimental Environment • All experiments on Grid3 (December 2004) • Comprises around 30 sites across the U.S., of which we used 15 • Each site is autonomous and managed by different local resource managers, such as Condor, PBS, and LSF • Each site enforces different usage policies, which are collected by our site SLA observation point and used in scheduling workloads

  27. Results: Least Used Site Assignment Policy

  28. 4x1K – Completion vs. Time

  29. Variance of Results

  30. SiteSel Comparisons

  31. Related Work • Fair share scheduling strategies developed for mainframes • SHARP • SPHINX • CREMONA

  32. Conclusions about GRUBER • Our experiments with several task assignment policies gave an initial picture of GRUBER's performance in scheduling jobs • GRUBER is an architecture and toolkit for resource usage SLA specification and enforcement in a grid-like environment • Open Problems: • over-subscribed local resources, for example a local policy stating that 40% of the local CPU power is available to VO1 and 80% is available to VO2 • hierarchical grouping and allocation of resources based on policy
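
The over-subscription example on slide 32 (40% for VO1 plus 80% for VO2) can at least be detected mechanically, even though how to resolve it is left open. The sketch below only measures the excess; the function name `oversubscription` and the dictionary layout are assumptions made for the example.

```python
def oversubscription(shares):
    """Return by how many percentage points the shares exceed 100%.

    shares: {vo_name: percentage of the local CPU power}  (hypothetical layout)
    A result of 0.0 means the policy is not over-subscribed.
    """
    total = sum(shares.values())
    return max(0.0, total - 100.0)


# The policy from the slide promises 120% of the CPUs in total.
print(oversubscription({"VO1": 40.0, "VO2": 80.0}))
```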

  33. Addressed Questions • “How are usage SLAs handled in grid environments?” • “What is the gain from taking such usage SLAs into account?”

  34. Thanks • Questions?
