
Resource Management in Data-Intensive Systems

Bernie Acs, Magda Balazinska, John Ford, Karthik Kambatla, Alex Labrinidis, Carlos Maltzahn, Rami Melhem, Paul Nowoczynski, Matthew Woitaszek, Mazin Yousif.


Presentation Transcript


  1. Resource Management in Data-Intensive Systems
     Bernie Acs, Magda Balazinska, John Ford, Karthik Kambatla, Alex Labrinidis, Carlos Maltzahn, Rami Melhem, Paul Nowoczynski, Matthew Woitaszek, Mazin Yousif

  2. Resource Utilization Problem
     • Resource Management Perspectives
       • User:
         • Application performance, cost
         • QoS (deadlines for interactivity)
         • Need metering tools, job description language (e.g. JDL, developed in grid computing)
       • Provider:
         • Power, physical space
         • Network bandwidth, memory, CPU power
         • Disk I/O, space
         • Cost of metering

  3. Resource Utilization Problem (cont’d)
     • Overall Management Goals of Provider
       • Most efficient allocation of resources to meet service level agreements
       • Pricing model that drives users towards more efficient/predictable usage
       • Maintain a certain envelope of resource utilization
     • Difference from conventional supercomputing centers:
       • Not only cores but network bandwidth, memory, disk
       • Scheduling preference based on data locality
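The slide's point about scheduling preference based on data locality can be illustrated with a minimal sketch. It is not taken from the presentation: the `Node` class, the `place_task` function, and the block identifiers are hypothetical names, and the policy (prefer a node that already stores the input block, otherwise fall back to the node with the most free slots) is only one simple way to express such a preference.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical compute node that also stores some data blocks locally."""
    name: str
    free_slots: int
    blocks: set = field(default_factory=set)


def place_task(block_id, nodes):
    """Pick a node for a task that reads block_id; return None if no slot is free."""
    candidates = [n for n in nodes if n.free_slots > 0]
    if not candidates:
        return None
    # Locality first (True sorts above False), then the most free slots.
    chosen = max(candidates, key=lambda n: (block_id in n.blocks, n.free_slots))
    chosen.free_slots -= 1
    return chosen


nodes = [
    Node("n1", free_slots=1, blocks={"b1"}),
    Node("n2", free_slots=4, blocks={"b2"}),
]
first = place_task("b1", nodes)   # n1 holds b1 and has a free slot
second = place_task("b1", nodes)  # n1 is now full, so the task runs remotely on n2
```

The second placement shows the trade-off the slide hints at: once the local node is busy, the scheduler must either wait for it or pay for network bandwidth, which is why cores alone are not the only resource to manage.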

  4. Common Challenges
     • What should be guaranteed?
       • Example: SimpleDB returns whatever can be retrieved in 5s. Not applicable for science applications
       • Network bandwidth, storage throughput
     • Management of Resources: Hardware
       • 3-4 year cycle, 20%/year
       • Resource discovery
       • Mapping optimized to user demand:
         • Upgrade based mapping history
         • Requires workload profiles -> elastic clustering, virtualization essential, application servers
     • Management of Resources: Centralized Services/Software
       • Big databases
       • Visualization
       • Virtualization: as a packaging and delivery service (Testing/staging environment)
       • Licensing
       • Applications (Hadoop, R, …)
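The SimpleDB example describes a guarantee on response time rather than on completeness. A minimal sketch of that kind of time-bounded retrieval follows; `fetch_within` is a hypothetical helper written for illustration and does not reflect how SimpleDB is implemented.

```python
import time


def fetch_within(source, budget_s):
    """Collect items from an iterable until it ends or the time budget is spent.

    Returns (items, complete). complete is False when the budget ran out
    before the source was exhausted, so the caller knows the result is partial.
    """
    deadline = time.monotonic() + budget_s
    items = []
    it = iter(source)
    while True:
        # The budget is only checked between items, so one slow item can overrun it.
        if time.monotonic() >= deadline:
            return items, False
        try:
            items.append(next(it))
        except StopIteration:
            return items, True


rows, complete = fetch_within(range(5), budget_s=5.0)
```

A science application that needs every record would have to treat `complete == False` as a failure and retry or resume, which is presumably why the slide calls this style of guarantee not applicable there.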

  5. Hard Problems
     • Failure & Recovery Resource Management
       • Cannot prevent, but estimate, over-provisioning
       • What level of failure protection is adequate?
       • Creeping failures
       • Real-time triage: extra cost -> often sampling only
       • Possible benefit: smaller set of libraries/apps
       • Two-tier approach?
       • Combined with security and other safety mechanisms
     • Interactivity (Paradigm shift for batch environment)
       • Def: want to see what is happening right now, or in regular intervals
       • Intelligent placement of data
       • Reserve resources -> over-provisioning/waste
       • Different scheduling time scale: seconds to minutes vs ms
     • SLAs for DIC workloads
       • Incorporating Power
       • Framework of SLAs for Science different than for commercial
       • Not clear whether that’s an agreement or optimization thing

  6. Hard Problems (cont’d)
     • Provisioning Framework
       • DIC application -> what resources am I going to need?
       • Hadoop friendly science applications
       • DIC framework configuration to adapt to user & HW profiles
     • Performance Management
       • Granularity of Prediction (if predictable)
       • Co-location of workloads for efficiency
       • Real-time end-to-end scheduling (sometimes too costly)
       • Metrics, instrumentation
       • Black box vs grey vs transparent box alternatives
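Co-location of workloads for efficiency can be sketched as a packing problem over several resources at once. The example below is an illustration under stated assumptions, not the presenters' method: it uses a first-fit-decreasing heuristic, assumes identical nodes, and the workload names and percentage demands are invented.

```python
def colocate(workloads, capacity):
    """Pack workloads onto identical nodes with a first-fit-decreasing heuristic.

    workloads: dict mapping name -> dict of resource -> demand
    capacity:  dict mapping resource -> per-node capacity (same units as demand)
    Returns a list of nodes, each a list of the workload names placed on it.
    """
    def dominant_share(name):
        # Largest fraction of any single resource the workload needs.
        return max(workloads[name][r] / capacity[r] for r in capacity)

    nodes = []  # each entry: {"used": {resource: amount}, "members": [names]}
    for name in sorted(workloads, key=dominant_share, reverse=True):
        demand = workloads[name]
        if any(demand[r] > capacity[r] for r in capacity):
            raise ValueError(f"{name} does not fit on a single node")
        for node in nodes:
            if all(node["used"][r] + demand[r] <= capacity[r] for r in capacity):
                break  # first node with room on every resource
        else:
            node = {"used": {r: 0 for r in capacity}, "members": []}
            nodes.append(node)
        for r in capacity:
            node["used"][r] += demand[r]
        node["members"].append(name)
    return [n["members"] for n in nodes]


# Invented demands, in percent of one node's CPU and disk I/O.
placement = colocate(
    {
        "sim": {"cpu": 80, "io": 10},
        "scan": {"cpu": 10, "io": 80},
        "train": {"cpu": 70, "io": 20},
        "etl": {"cpu": 20, "io": 70},
    },
    capacity={"cpu": 100, "io": 100},
)
```

In this invented case the CPU-heavy and I/O-heavy workloads pair up on two nodes instead of occupying four. The sketch relies on demands being known in advance, which ties back to the slide's questions about prediction granularity and instrumentation.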
