Cloud Computing

Presenters: Abhishek Verma, Nicolas Zea Cloud Computing - I

Cloud Computing • MapReduce • Clean abstraction • Extremely rigid two-stage group-by aggregation • Code reuse and maintenance difficult • Google → MapReduce, Sawzall • Yahoo → Hadoop, Pig Latin • Microsoft → Dryad, DryadLINQ • Improving MapReduce in heterogeneous environments

MapReduce: A group-by-aggregate • [Figure: input records are split across map tasks, sorted locally (QSort), shuffled by key, and reduced into output records]
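The group-by-aggregate shape of the figure can be sketched in a few lines of single-process Python. This is only an illustration of the split, map, local sort, shuffle, and reduce steps, not how any real framework is implemented; `map_reduce`, `word_map`, and `count_reduce` are hypothetical names, and word counting is an assumed example task.

```python
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn, num_splits=2):
    # Split: deal the input records across the map tasks.
    splits = [records[i::num_splits] for i in range(num_splits)]
    # Map + local sort: each map task emits (key, value) pairs sorted by key.
    mapped = [sorted(kv for rec in split for kv in map_fn(rec)) for split in splits]
    # Shuffle: bring all values that share a key together.
    groups = defaultdict(list)
    for task_output in mapped:
        for key, value in task_output:
            groups[key].append(value)
    # Reduce: aggregate the values of each key.
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}

def word_map(line):
    # Emit (word, 1) for every word in the input line.
    return [(word, 1) for word in line.split()]

def count_reduce(key, values):
    # Sum the counts collected for one key.
    return sum(values)

counts = map_reduce(["a b a", "b c", "a"], word_map, count_reduce)
```

Anything that is not a single group-by followed by an aggregate has to be expressed as a chain of such calls, which is the rigidity the deck criticizes.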

Shortcomings • Extremely rigid data flow • Other flows (stages, joins, splits) must be hacked in • Common operations must be coded by hand • Join, filter, projection, aggregates, sorting, distinct • Semantics hidden inside map-reduce functions • Difficult to maintain, extend, and optimize • [Figure: chained M → R stages]

Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins Pig Latin: A Not-So-Foreign Language for Data Processing Research

Pig Philosophy • Pigs Eat Anything • Can operate on data without metadata: relational, nested, or unstructured • Pigs Live Anywhere • Not tied to one particular parallel framework • Pigs Are Domestic Animals • Designed to be easily controlled and modified by its users • UDFs: transformation functions, aggregates, grouping functions, and conditionals • Pigs Fly • Processes data quickly(?)

Features • Dataflow language • Procedural : different from SQL • Quick Start and Interoperability • Nested Data Model • UDFs as First-Class Citizens • Parallelism Required • Debugging Environment

Pig Latin • Data Model • Atom: 'cs' • Tuple: ('cs', 'ece', 'ee') • Bag: { ('cs', 'ece'), ('cs') } • Map: [ 'courses' → { ('523', '525', '599') } ] • Expressions • Fields by position: $0 • Fields by name: f1 • Map lookup: #
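As a rough analogy only, the four data types and the three kinds of field expressions can be mimicked with Python built-ins. The schema `(f1, f2, f3)` and the helper `field` are assumptions made for this sketch; Pig itself does not work this way internally.

```python
# Atom: a simple value.
atom = "cs"
# Tuple: an ordered sequence of fields.
tup = ("cs", "ece", "ee")
# Bag: a collection of tuples; the tuples may differ in length.
bag = [("cs", "ece"), ("cs",)]
# Map: keys mapped to data items; here one key maps to a bag.
mp = {"courses": [("523", "525", "599")]}

# A record with a hypothetical schema (f1, f2, f3).
schema = ("f1", "f2", "f3")
record = (atom, bag, mp)

def field(rec, ref):
    """Resolve '$N' by position, or any other string by name via the schema."""
    if ref.startswith("$"):
        return rec[int(ref[1:])]
    return rec[schema.index(ref)]

first = field(record, "$0")               # field by position
named = field(record, "f2")               # field by name
courses = field(record, "f3")["courses"]  # map lookup, the role of # in Pig Latin
```

The nesting (a bag inside a map inside a tuple) is the point: the model is not restricted to flat rows.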

Example Data Analysis Task • Find the top 10 most visited pages in each category • Input tables: Visits, URL Info
URL Info:
URL          Category   PageRank
cnn.com      News       0.9
bbc.com      News       0.8
flickr.com   Photos     0.7
espn.com     Sports     0.9

Data Flow • Load Visits → Group by url → Foreach url generate count • Load Url Info • Join on url → Group by category → Foreach category generate top10 urls

In Pig Latin
visits = load '/data/visits' as (user, url, time);
gVisits = group visits by url;
visitCounts = foreach gVisits generate url, count(visits);
urlInfo = load '/data/urlInfo' as (url, category, pRank);
visitCounts = join visitCounts by url, urlInfo by url;
gCategories = group visitCounts by category;
topUrls = foreach gCategories generate top(visitCounts, 10);
store topUrls into '/data/topUrls';
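For readers who do not know Pig Latin, the same steps can be traced in plain Python on a toy input. The visit rows below are made up for illustration, the URL rows are the ones from the example table, and `top_urls` is a hypothetical helper; the sketch runs in memory on one machine and says nothing about how Pig executes the script.

```python
from collections import Counter, defaultdict

# Made-up visit rows: (user, url, time).
visits = [("amy", "cnn.com", "8:00"), ("amy", "bbc.com", "10:00"),
          ("fred", "cnn.com", "12:00"), ("amy", "flickr.com", "14:00")]
# URL info rows: (url, category, pRank).
url_info = [("cnn.com", "News", 0.9), ("bbc.com", "News", 0.8),
            ("flickr.com", "Photos", 0.7), ("espn.com", "Sports", 0.9)]

def top_urls(visits, url_info, n=10):
    # group visits by url; foreach url generate count
    visit_counts = Counter(url for _user, url, _time in visits)
    # join visitCounts by url, urlInfo by url (inner join)
    joined = [(url, visit_counts[url], category)
              for url, category, _prank in url_info if url in visit_counts]
    # group by category
    by_category = defaultdict(list)
    for url, count, category in joined:
        by_category[category].append((url, count))
    # foreach category generate the top n urls by visit count
    return {category: sorted(rows, key=lambda r: r[1], reverse=True)[:n]
            for category, rows in by_category.items()}

result = top_urls(visits, url_info)
```

Each Pig Latin statement maps to one comment-marked step, which is the step-by-step dataflow style the deck contrasts with SQL.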

Quick Start and Interoperability • Operates directly over files (same script as under "In Pig Latin")

Optional Schemas • Schemas are optional and can be assigned dynamically (same script as under "In Pig Latin")

UDFs as First-Class Citizens • UDFs can be used in every construct (same script as under "In Pig Latin")

Operators • LOAD: specifying input data • FOREACH: per-tuple processing • FLATTEN: eliminate nesting • FILTER: discarding unwanted data • COGROUP: getting related data together • GROUP, JOIN • STORE: asking for output • Other: UNION, CROSS, ORDER, DISTINCT

COGROUP Vs JOIN
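The difference can be shown on a toy input: COGROUP keeps the matching tuples of each input in separate nested bags per key, while JOIN corresponds to a COGROUP followed by flattening, i.e. the cross product of the two bags for each key. The sketch below is a single-process illustration; the function names and the sample rows are assumptions.

```python
from collections import defaultdict

def cogroup(left, right, key=lambda t: t[0]):
    """Return {key: (bag of left tuples, bag of right tuples)}."""
    groups = defaultdict(lambda: ([], []))
    for t in left:
        groups[key(t)][0].append(t)
    for t in right:
        groups[key(t)][1].append(t)
    return dict(groups)

def join(left, right, key=lambda t: t[0]):
    """JOIN as COGROUP plus flattening: cross product of the two bags per key."""
    return [l + r
            for lbag, rbag in cogroup(left, right, key).values()
            for l in lbag for r in rbag]

# Made-up sample rows keyed by their first field.
results = [("lakers", "nba.com"), ("lakers", "espn.com"), ("kings", "nhl.com")]
revenue = [("lakers", "top", 50), ("lakers", "side", 20), ("kings", "top", 30)]

grouped = cogroup(results, revenue)
joined = join(results, revenue)
```

`grouped["lakers"]` holds two bags of two tuples each, whereas `joined` contains the four flattened combinations for that key plus one for `kings`. Keeping the bags separate lets a UDF process them together before any cross product is formed.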

Compilation into MapReduce • Every group or join operation forms a map-reduce boundary • Other operations pipelined into map and reduce phases • [Figure: the example data flow (Load Visits, Group by url, Foreach url generate count, Load Url Info, Join on url, Group by category, Foreach category generate top10 urls) divided into Map1/Reduce1, Map2/Reduce2, Map3/Reduce3]
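The boundary rule can be sketched as a small function that cuts a list of operators into jobs. This is a deliberately simplified model under several assumptions: the plan is treated as a linear list of strings, operators before the first boundary go into the first map phase, operators after a boundary are pipelined into that job's reduce phase, and plans with no group or join are ignored. `compile_to_jobs` is a hypothetical name, not part of Pig.

```python
def compile_to_jobs(ops):
    """Cut a linear operator list into map-reduce jobs at group/join boundaries."""
    jobs = []
    pending = []  # operators seen since the last boundary
    for op in ops:
        if op.split()[0] in ("group", "join"):
            if not jobs:
                # Operators before the first boundary run in the first map phase.
                jobs.append({"map": pending, "boundary": op, "reduce": []})
            else:
                # Operators after a boundary are pipelined into that job's reduce.
                jobs[-1]["reduce"] = pending
                jobs.append({"map": [], "boundary": op, "reduce": []})
            pending = []
        else:
            pending.append(op)
    if jobs:
        jobs[-1]["reduce"] = pending
    return jobs

plan = ["load visits", "group by url", "foreach url generate count",
        "join on url", "group by category",
        "foreach category generate top10 urls"]
jobs = compile_to_jobs(plan)
```

For the example plan this yields three jobs, matching the three map-reduce pairs in the figure; where exactly each pipelined operator lands in the real compiler may differ from this sketch.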

Debugging Environment • Write-run-debug cycle • Sandbox dataset • Objectives: • Realism • Conciseness • Completeness • Problems: • UDFs

Future Work • Optional “safe” query optimizer • Performs only high-confidence rewrites • User interface • Boxes and arrows UI • Promote collaboration, sharing code fragments and UDFs • Tight integration with a scripting language • Use loops, conditionals of host language

Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Ulfar Erlingsson, Pradeep Kumar Gunda, Jon Currey DryadLINQ: A System for General Purpose Distributed Data-Parallel Computing Using a High-Level Language

Dryad System Architecture • [Figure: control plane: the job manager, NS, and PD processes schedule the job on the cluster; data plane: vertices (V) exchange data over files, TCP, FIFO, network]

LINQ
Collection<T> collection;
bool IsLegal(Key k);
string Hash(Key k);
var results = from c in collection
              where IsLegal(c.key)
              select new { hash = Hash(c.key), c.value };
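The query is a filter followed by a projection. A Python analogue, offered only as a reading aid, is a list comprehension. The slide declares `IsLegal` and `Hash` without bodies, so `is_legal` and `hash_key` below are placeholder implementations, and the `Record` type and sample data are assumptions as well.

```python
from dataclasses import dataclass
import hashlib

@dataclass
class Record:
    key: str
    value: int

def is_legal(key: str) -> bool:
    # Placeholder predicate (assumption): the slide only declares IsLegal.
    return key.isalnum()

def hash_key(key: str) -> str:
    # Placeholder hash (assumption): the slide only declares Hash.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

collection = [Record("abc", 1), Record("not legal!", 2), Record("xyz9", 3)]

# from c in collection where IsLegal(c.key) select new { Hash(c.key), c.value }
results = [(hash_key(c.key), c.value) for c in collection if is_legal(c.key)]
```

Because the query is declarative, a system is free to decide where and in how many pieces the filter and projection run, which is what DryadLINQ exploits.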

DryadLINQ Constructs • Partitioning: Hash, Range, RoundRobin • Apply, Fork • Hints • [Figure: a collection of C# objects divided into partitions]
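The three partitioning schemes named on the slide can be sketched generically as follows. These are textbook versions written for illustration, not DryadLINQ's operators; the function names, the use of CRC32 as the hash, and the boundary convention for ranges are assumptions.

```python
import bisect
import zlib

def hash_partition(items, n, key=lambda x: x):
    """Send each item to partition hash(key) mod n; equal keys land together."""
    parts = [[] for _ in range(n)]
    for item in items:
        # crc32 is stable across runs, unlike the built-in hash() on strings.
        index = zlib.crc32(str(key(item)).encode("utf-8")) % n
        parts[index].append(item)
    return parts

def range_partition(items, boundaries, key=lambda x: x):
    """boundaries must be sorted; partition i holds keys k with
    boundaries[i-1] <= k < boundaries[i]."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for item in items:
        parts[bisect.bisect_right(boundaries, key(item))].append(item)
    return parts

def round_robin_partition(items, n):
    """Deal items to partitions in turn, ignoring their keys."""
    return [items[i::n] for i in range(n)]

by_hash = hash_partition(["a", "b", "a", "c"], 3)
by_range = range_partition([5, 1, 9, 3, 7], [4, 8])
by_turn = round_robin_partition([1, 2, 3, 4, 5], 2)
```

Hash partitioning suits group-by and join, range partitioning suits sorting, and round-robin only balances load.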

Dryad + LINQ = DryadLINQ • [Figure: the query from the LINQ slide is turned into a query plan (Dryad job) and vertex code; C# vertices read the data collection and produce results]

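DryadLINQ Execution Overview • [Figure: on the client machine, DryadLINQ turns a C# query expression into a distributed query plan and invokes it (ToDryadTable); in the data center, the JM runs the Dryad execution from input tables to output tables; the results come back as a DryadTable of C# objects that the client iterates over with foreach]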

System Implementation • LINQ expressions converted to execution plan graph (EPG) • similar to database query plan • DAG • annotated with metadata properties • EPG is skeleton of Dryad DFG • as long as native operations are used, properties can propagate helping optimization

Static Optimizations • Pipelining • Multiple operations in a single process • Removing redundancy • Eager Aggregation • Move aggregations in front of partitionings • I/O Reduction • Try to use TCP and in-memory FIFO instead of disk space

Dynamic Optimizations • As information from job becomes available, mutate execution graph • Dataset size based decisions • Intelligent partitioning of data

Dynamic Optimizations • Aggregation can turn into a tree to improve I/O based on locality • Example: part of the computation is done locally, and the partial results are aggregated before being sent across the network
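A minimal sketch of the idea, assuming a per-key sum as the aggregate and a two-level tree (machine, then rack): each level combines partial results before anything crosses the next network boundary. The grouping into racks, the function names, and the sample data are assumptions for illustration and do not describe Dryad's actual graph rewriting.

```python
from collections import Counter

def local_aggregate(records):
    """Partial aggregation on one machine: sum the values of each key."""
    partial = Counter()
    for key, value in records:
        partial[key] += value
    return partial

def tree_aggregate(racks):
    """racks: list of racks, each a list of per-machine record lists.
    Returns the final sums and the number of records sent across racks."""
    total = Counter()
    sent_across_racks = 0
    for rack in racks:
        rack_partial = Counter()
        for machine_records in rack:
            # Combine within the rack before crossing the network.
            rack_partial.update(local_aggregate(machine_records))
        sent_across_racks += len(rack_partial)
        total.update(rack_partial)
    return dict(total), sent_across_racks

racks = [
    [[("a", 1), ("a", 2)], [("a", 3), ("b", 1)]],  # rack 0: two machines
    [[("b", 4)], [("a", 5)]],                      # rack 1: two machines
]
totals, sent = tree_aggregate(racks)
```

In this toy input six raw records shrink to four partial results before crossing racks; the saving grows with the number of repeated keys per rack.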

Evaluation • TeraSort - scalability • 240-computer cluster of 2.6 GHz dual-core AMD Opterons • Sort 10 billion 100-byte records on 10-byte key • Each computer stores 3.87 GB

Evaluation • DryadLINQ vs Dryad - SkyServer • Dryad is hand optimized • No dynamic optimization overhead • DryadLINQ is 10% native code

Main Benefits • High level and data type transparent • Automatic optimization friendly • Manual optimizations using Apply operator • Leverage any system running LINQ framework • Support for interacting with SQL databases • Single computer debugging made easy • Strong typing, narrow interface • Deterministic replay execution

Discussion • Dynamic optimizations appear data intensive • What kind of overhead? • EPG analysis overhead -> high latency • No real comparison with other systems • Progress tracking is difficult • No speculation • Will Solid State Drives diminish advantages of MapReduce? • Why not use Parallel Databases? • MapReduce Vs Dryad • How different from Sawzall and Pig?

Comparison

Matei Zaharia, Andy Konwinski, Anthony Joseph, Randy Katz, Ion Stoica University of California at Berkeley Improving MapReduce Performance in Heterogeneous Environments

Hadoop Speculative Execution Overview • Speculative tasks are executed only if no failed or waiting tasks are available • Notion of progress • 3 phases of execution • Copy phase • Sort phase • Reduce phase • Each phase weighted by % data processed • Determines whether a task has failed or is a straggler and available for speculation
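The progress notion can be sketched as below, assuming each of the three reduce phases counts for one third of the score and that a task is flagged when its score falls more than 0.2 below the average of its category. The 0.2 value is an assumption chosen to be consistent with the later note that tasks above 80% are never speculated; the function names are hypothetical and this is not Hadoop's code.

```python
REDUCE_PHASES = ("copy", "sort", "reduce")

def map_progress(fraction_input_read):
    """Progress of a map task: fraction of its input processed (0.0 to 1.0)."""
    return fraction_input_read

def reduce_progress(phase, fraction_done_in_phase):
    """Progress of a reduce task: each of the three phases counts for 1/3."""
    completed_phases = REDUCE_PHASES.index(phase)
    return (completed_phases + fraction_done_in_phase) / 3.0

def is_straggler(score, category_scores, threshold=0.2):
    """Flag a task whose score is more than `threshold` below the average
    score of its category (maps or reduces). threshold=0.2 is an assumption."""
    average = sum(category_scores) / len(category_scores)
    return score < average - threshold

halfway_through_copy = reduce_progress("copy", 0.5)
just_started_sort = reduce_progress("sort", 0.0)
```

Because the average is at most 1.0, a task with a score above 0.8 can never satisfy the condition, however slow it actually is.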

Hadoop’s Assumptions • Nodes can perform work at exactly the same rate • Tasks progress at a constant rate throughout time • There is no cost to launching a speculative task on an idle node • The three phases of execution take approximately the same time • Tasks with a low progress score are stragglers • Maps and Reduces require roughly the same amount of work

Breaking Down the Assumptions • Virtualization breaks down homogeneity • Amazon EC2 - multiple VMs on the same physical host • Compete for memory/network bandwidth • Ex: two map tasks can compete for disk bandwidth, causing one to be a straggler

Breaking Down the Assumptions • Progress threshold in Hadoop is fixed and assumes low progress = faulty node • Too many speculative tasks executed • Speculative execution can harm running tasks

Breaking Down the Assumptions • Task’s phases are not equal • Copy phase typically the most expensive due to network communication cost • Causes rapid jump from 1/3 progress to 1 of many tasks, creating fake stragglers • Real stragglers get usurped • Unnecessary copying due to fake stragglers • Progress score means anything with >80% never speculatively executed

LATE Scheduler • Longest Approximate Time to End • Primary assumption: the best task to speculatively execute is the one that will finish furthest into the future • Secondary assumption: tasks make progress at an approx. constant rate • Progress Rate = ProgressScore / T • T = time the task has run for • Estimated time to completion = (1 - ProgressScore) / ProgressRate
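A minimal sketch of the estimate, taking the time left as (1 - ProgressScore) / ProgressRate with ProgressRate = ProgressScore / T. It relies on the constant-rate assumption stated on the slide; the function names and the handling of a task with zero progress are assumptions.

```python
def progress_rate(progress_score, elapsed):
    """Progress made per unit of time; elapsed must be > 0."""
    return progress_score / elapsed

def estimated_time_left(progress_score, elapsed):
    """Remaining work divided by the observed progress rate."""
    rate = progress_rate(progress_score, elapsed)
    if rate == 0:
        # No progress yet: treat the finish time as unbounded (assumption).
        return float("inf")
    return (1.0 - progress_score) / rate

# A task that is 25% done after 10 time units has about 30 units left.
remaining = estimated_time_left(0.25, 10.0)
```

Ranking by this estimate rather than by the raw progress score is what lets the scheduler tell a slow task from one that merely started late.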

LATE Scheduler • Launch speculative tasks on fast nodes • Best chance to overcome a straggler vs. using the first available node • Cap on total number of speculative tasks • ‘Slowness’ minimum threshold • Does not take into account data locality
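The selection rules can be combined into one decision function, sketched below under simplifying assumptions: the slowness threshold is an absolute progress rate, whether a node counts as fast is decided by the caller, and each task is a plain dictionary. The names `pick_speculative_task`, `speculative_cap`, and `slow_task_threshold` are hypothetical and this is not the scheduler's actual code.

```python
def pick_speculative_task(tasks, node_is_fast, running_speculative,
                          speculative_cap, slow_task_threshold):
    """tasks: dicts with 'id', 'progress', 'elapsed', 'speculated'.
    Returns the id of the task to speculate on this node, or None."""
    # Only fast nodes run speculative copies, and only up to the cap.
    if not node_is_fast or running_speculative >= speculative_cap:
        return None
    candidates = []
    for task in tasks:
        if task["speculated"] or task["progress"] >= 1.0:
            continue
        rate = task["progress"] / task["elapsed"]
        if rate >= slow_task_threshold:
            continue  # not slow enough to be worth a second copy
        time_left = float("inf") if rate == 0 else (1.0 - task["progress"]) / rate
        candidates.append((time_left, task["id"]))
    if not candidates:
        return None
    # Longest approximate time to end wins.
    return max(candidates)[1]

tasks = [
    {"id": "t1", "progress": 0.9, "elapsed": 10.0, "speculated": False},
    {"id": "t2", "progress": 0.2, "elapsed": 20.0, "speculated": False},
    {"id": "t3", "progress": 0.1, "elapsed": 5.0, "speculated": False},
]
choice = pick_speculative_task(tasks, node_is_fast=True, running_speculative=0,
                               speculative_cap=2, slow_task_threshold=0.05)
```

Here `t1` is excluded as fast enough, and `t2` is chosen over `t3` because its estimated time left is longer (80 versus 45 time units). Data locality plays no part in the choice, as the slide notes.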

Performance Comparison Without Stragglers • EC2 test cluster • 1.0-1.2 GHz Opteron/Xeon w/ 1.7 GB mem • [Chart: Sort]

Performance Comparison With Stragglers • Manually slowed down 8 VMs with background processes • [Chart: Sort]

Performance Comparison With Stragglers • [Charts: WordCount, Grep]

Sensitivity

Takeaways • Make decisions early • Use finishing times • Nodes are not equal • Resources are precious

Further questions • Is focusing the work on small VMs fair? • Would it be better to pay for a large VM and implement a system with more customized control? • Could this be used in other systems? • Progress tracking is key • Is this a fundamental contribution? Or just an optimization? • “Good” research?
