Papers on Storage Systems

1) Purlieus: Locality-aware Resource Allocation for MapReduce in a Cloud, SC 2011. 2) Making Cloud Intermediate Data Fault-Tolerant, SOCC 2010. Presented by: Qiangju Xiao

Purlieus: Locality-aware Resource Allocation for MapReduce in a Cloud, SC ’11, 2011 • Authors: Balaji Palanisamy, Aameek Singh, Ling Liu, Bhushan Jain

Introduction (1) • What does the paper present? • Purlieus, a MapReduce resource allocation system that aims to improve the performance of MapReduce jobs in the cloud. • How does Purlieus work? • It provisions virtual MapReduce clusters in a locality-aware manner; • It lets MapReduce VMs access input data (map phase) and intermediate data (reduce phase) from local or close-by physical machines.

Introduction (2) • What improvements does Purlieus deliver? • Reduces cumulative data center network traffic; • 50% reduction in job execution time for a variety of workloads, because network transfer time is a large component of total execution time

Impact of Reduce Locality

System Model (1) – Current Cloud Infrastructure Data Load

System Model (2) – Purlieus Infrastructure • Data is broken into chunks • Blocks are stored on the distributed file system of the physical machines • VMs access data on the physical machines

System Model (3) – Dataflow from physical to virtual machines

Two Key Questions • Data Placement • Which physical machines should be used for each dataset? • VM Placement • Where should the VMs be provisioned to process these data blocks?

Purlieus’ Solution – Principles (1) • Job-specific Locality-awareness • Data placement in the MapReduce cloud service should incorporate job characteristics, such as the amount of data accessed in the map and reduce phases. • Three distinct classes of jobs – (1) Map-input heavy; (2) Map-and-reduce-input heavy; (3) Reduce-input heavy.
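
The three job classes above can be illustrated with a minimal sketch. The ratio-based rule and the `low`/`high` cut-offs are assumptions made for illustration; the slides do not say how Purlieus decides which class a job belongs to.

```python
from enum import Enum


class JobClass(Enum):
    MAP_INPUT_HEAVY = "map-input heavy"
    MAP_AND_REDUCE_INPUT_HEAVY = "map-and-reduce-input heavy"
    REDUCE_INPUT_HEAVY = "reduce-input heavy"


def classify_job(map_input_bytes, reduce_input_bytes, low=0.1, high=10.0):
    """Classify a job by the ratio of reduce input to map input.

    Hypothetical rule: `low` and `high` are illustrative thresholds,
    not values taken from the Purlieus paper.
    """
    if map_input_bytes <= 0:
        raise ValueError("map_input_bytes must be positive")
    ratio = reduce_input_bytes / map_input_bytes
    if ratio < low:
        # Large input, little intermediate data.
        return JobClass.MAP_INPUT_HEAVY
    if ratio > high:
        # Small input, large intermediate data.
        return JobClass.REDUCE_INPUT_HEAVY
    # Both phases move comparable amounts of data.
    return JobClass.MAP_AND_REDUCE_INPUT_HEAVY
```

For instance, `classify_job(100, 1)` falls in the map-input heavy class, while `classify_job(1, 100)` falls in the reduce-input heavy class.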

Purlieus’ Solution – Principles (2) • Load Awareness • Placing data in a MapReduce cloud should also account for computational load (CPU, memory) on the physical machines. • Ensure that the expected load on the servers does not exceed a configurable threshold.
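
A rough sketch of the threshold check described above. The function name, the normalized load values in [0, 1], and the default threshold of 0.8 are all assumptions for illustration.

```python
def eligible_machines(expected_load, added_load, threshold=0.8):
    """Return machines whose expected load stays within the threshold.

    expected_load: {machine: current expected load in [0, 1]} (hypothetical units)
    added_load:    load the new data/VM is expected to add
    threshold:     configurable cap; 0.8 is an illustrative default
    """
    return sorted(
        machine
        for machine, load in expected_load.items()
        if load + added_load <= threshold
    )
```

With `expected_load = {"m1": 0.5, "m2": 0.75}` and `added_load = 0.1`, only `m1` stays under the 0.8 cap.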

Purlieus’ Solution – Principles (3) • Job-specific Data Replication • Replicas of the data set are placed based on the type and frequency of jobs. For example, if an input dataset is used by three sets of MapReduce jobs, two of which are reduce-input heavy and one map-input heavy, Purlieus places two replicas of the data blocks in a reduce-input heavy fashion and the third using a map-input heavy strategy.
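
The replication example above can be reproduced with a small sketch that splits replicas across placement strategies in proportion to job frequency. The largest-remainder rounding is an assumption; the slides only give the three-job example.

```python
from collections import Counter


def plan_replicas(job_classes, num_replicas=3):
    """Split replicas across placement strategies by job-class frequency.

    job_classes: one entry per job (or job set) that reads the dataset.
    Rounding uses the largest-remainder method, which is an illustrative
    choice and not something the slides specify.
    """
    counts = Counter(job_classes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one job is required")
    quotas = {c: num_replicas * n / total for c, n in counts.items()}
    plan = {c: int(q) for c, q in quotas.items()}
    leftover = num_replicas - sum(plan.values())
    # Hand any remaining replicas to the classes with the largest remainders.
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - plan[c], reverse=True)
    for c in by_remainder[:leftover]:
        plan[c] += 1
    return plan


# The example from the slide: two reduce-input heavy job sets, one map-input heavy.
plan = plan_replicas(["reduce-input heavy", "reduce-input heavy", "map-input heavy"])
# plan == {"reduce-input heavy": 2, "map-input heavy": 1}
```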

Purlieus – Placement Techniques (1) • Map-input heavy jobs • Data placement • These jobs do not require reducers to be executed close to each other; • Purlieus chooses machines that have the least expected load. • VM placement • Attempt to place VMs on the physical machines that contain the input data chunks for the map phase; if those machines are too heavily loaded, the VM may instead be placed close to the node that stores the actual data chunk. • Among the physical machines at the same network distance, the one with the least load is chosen.
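
The VM placement rule for map-input heavy jobs can be sketched as a sort on (network distance, load). The distance values (0 = same machine, 1 = same rack, 2 = different rack), the load threshold, and all names are assumptions for illustration.

```python
def network_distance(a, b, rack_of):
    """Hypothetical distance metric: 0 same machine, 1 same rack, 2 otherwise."""
    if a == b:
        return 0
    return 1 if rack_of[a] == rack_of[b] else 2


def place_map_vm(data_node, rack_of, load, threshold=0.8):
    """Pick a host for a map VM that reads a chunk stored on `data_node`.

    Machines at or above `threshold` are skipped. Among the rest, the
    closest machine wins, and ties at the same distance go to the
    least-loaded machine. Returns None if every machine is overloaded.
    """
    candidates = [m for m in rack_of if load[m] < threshold]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda m: (network_distance(data_node, m, rack_of), load[m]),
    )


rack_of = {"m1": "r1", "m2": "r1", "m3": "r1", "m4": "r2"}
load = {"m1": 0.9, "m2": 0.5, "m3": 0.3, "m4": 0.1}
host = place_map_vm("m1", rack_of, load)
# m1 is overloaded, so the least-loaded machine in the same rack (m3) is chosen.
```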

Purlieus – Placement Techniques (2) • Map and Reduce-input heavy jobs • Data placement • Should support reduce-locality: VMs should be hosted on machines close to each other; • Data blocks are placed on a set of closely connected physical machines. • VM placement • Ensure that VMs are placed either on the physical machines storing the input data or on close-by ones. • Map tasks use local reads and reduce tasks also read within the same rack, maximizing reduce locality
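
One way to picture "closely connected" data placement is to keep all blocks of a dataset inside a single rack. The sketch below makes that simplifying assumption; the single-rack restriction, the greedy choice of machine, and the names are illustrative and not taken from the paper.

```python
def place_blocks_closely(num_blocks, free_slots):
    """Place all blocks of a dataset inside one rack.

    free_slots: {rack: {machine: free block slots}} (hypothetical layout,
                assumed non-empty).
    Picks the rack with the most free slots, then fills its machines
    greedily, always using the machine with the most remaining space.
    Returns (rack, [machine per block]) or None if the rack cannot fit them.
    """
    rack = max(free_slots, key=lambda r: sum(free_slots[r].values()))
    remaining = dict(free_slots[rack])
    if sum(remaining.values()) < num_blocks:
        return None
    placement = []
    for _ in range(num_blocks):
        machine = max(remaining, key=remaining.get)
        placement.append(machine)
        remaining[machine] -= 1
    return rack, placement
```

VMs for the job would then be provisioned on the returned machines or on others in the same rack, so that both map reads and the shuffle stay within the rack.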

Purlieus – Placement Techniques (3) • Reduce-input heavy jobs • Data placement • Map-locality is not so important; • Purlieus chooses the physical machine with the maximum free storage. • VM placement • Network traffic for transferring intermediate data among MapReduce VMs is intense in reduce-input heavy jobs, and hence the set of VMs for the job should be placed close to each other.
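
A minimal sketch of both halves of this slide, under stated assumptions: data goes to the machine with the most free storage, and the job's VMs go to a single rack that can hold them all. The best-fit rack choice and all names are hypothetical.

```python
def place_block_max_free(free_storage, block_size):
    """Store a block on the machine with the most free storage.

    free_storage: {machine: free bytes}; updated in place.
    Returns the chosen machine, or None if no machine can hold the block.
    """
    machine = max(free_storage, key=free_storage.get)
    if free_storage[machine] < block_size:
        return None
    free_storage[machine] -= block_size
    return machine


def pick_rack_for_vms(num_vms, free_vm_slots):
    """Choose one rack that can host all VMs of a reduce-input heavy job.

    free_vm_slots: {rack: free VM slots}.
    Uses the tightest-fitting rack (an illustrative choice) so that
    larger racks stay available for larger jobs.
    """
    fitting = [r for r, slots in free_vm_slots.items() if slots >= num_vms]
    return min(fitting, key=free_vm_slots.get) if fitting else None
```

Keeping the VMs in one rack means the heavy shuffle traffic never crosses rack boundaries, which is the point the slide makes about reduce locality.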

Experiments • Data placement techniques • Locality and load-aware data placement (LLADP), proposed by Purlieus • Random data placement (RDP) • VM placement techniques • Locality-unaware VM placement (LUAVP) • Map-locality aware VM placement (MLVP) • Reduce-locality aware VM placement (RLVP) • Map and reduce-locality aware VM placement (MRLVP) • Hybrid locality-aware VM placement (HLVP): the proposed HLVP technique adaptively picks the placement strategy based on the type of the input job. It uses MLVP for map-input heavy jobs, RLVP for reduce-input heavy jobs, and MRLVP for map and reduce-input heavy jobs.
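
The HLVP dispatch described above amounts to a lookup from job class to placement strategy. The sketch below only encodes the mapping stated on the slide; the string keys and the function name are hypothetical.

```python
# Mapping taken from the slide: job class -> VM placement technique.
HLVP_STRATEGY = {
    "map-input heavy": "MLVP",
    "reduce-input heavy": "RLVP",
    "map and reduce-input heavy": "MRLVP",
}


def hlvp_pick(job_class):
    """Return the VM placement technique HLVP would use for a job class."""
    try:
        return HLVP_STRATEGY[job_class]
    except KeyError:
        raise ValueError(f"unknown job class: {job_class!r}") from None
```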

Results – Map and Reduce-input heavy workload

Results – Map-input heavy workload

Results – Reduce-input heavy workload

Results – Macro analysis using MapReduce simulator, PurSim (1)

Results – Macro analysis using MapReduce simulator, PurSim (2)

Conclusions • Purlieus’ proposed placement techniques optimize for data locality during both map and reduce phases of the job by considering VM placement, MapReduce job characteristics and load on the physical cloud infrastructure at the time of data placement. • Purlieus’ evaluation shows significant performance gains with some scenarios showing close to 50% reduction in the cross-rack network traffic.

Making Cloud Intermediate Data Fault-Tolerant, SOCC 2010 • Authors: • Steven Y. Ko • Imranul Hoque • Brian Cho • Indranil Gupta

MapReduce • Phases • Map • Shuffle • Reduce • Data • Input • Intermediate • Output

Intermediate Data • Short lived • Used immediately • Discarded on completion • Write once / read bounded • Large • Many blocks

Intermediate Data – Failures • Cascaded re-execution

Intermediate Data Loss requires recomputation

Intermediate Data – Behavior breakdown 0f-10min 1f-30sec

Intermediate Data – Replication • Traditional replication is expensive

Can replication be accomplished without significantly affecting execution speed?

Extend HDFS • Asynchronous replication • Replicate within rack • Minimize replicated data

Asynchronous Replication • HDFS replication is usually pessimistic • It blocks until the replicas are made • Instead, do not block (asynchronous) • Loss of consistency is not a problem because there is only one writer
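
A toy in-memory sketch of the non-blocking idea above: the writer returns as soon as the primary copy is stored, and a background thread makes the replica. This is not HDFS code; the class and its methods are hypothetical, and plain dictionaries stand in for storage nodes.

```python
import queue
import threading


class AsyncReplicatingStore:
    """Toy store: writes return immediately, replication happens in the background."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._replicate_loop, daemon=True)
        worker.start()

    def write(self, block_id, data):
        # Single writer, write-once data: no consistency protocol is needed.
        self.primary[block_id] = data
        self._pending.put(block_id)  # enqueue and return without waiting

    def _replicate_loop(self):
        while True:
            block_id = self._pending.get()
            self.replica[block_id] = self.primary[block_id]
            self._pending.task_done()

    def wait_for_replication(self):
        """Block until every queued block has been replicated (for tests/demos)."""
        self._pending.join()


store = AsyncReplicatingStore()
store.write("block-1", b"intermediate data")
store.wait_for_replication()
```

A pessimistic design would perform the copy inside `write` before returning; here the copy is deferred, so the task producing the data is not slowed by replication.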

Replicate within Rack • HDFS replicates to a different rack for greater availability • Lifespan of intermediate data short • “Safe” to replicate to machine in same rack
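
The difference between the two replica-target policies above can be sketched as a filter on rack membership. The function name, the `rack_of` mapping, and the random choice among candidates are assumptions for illustration, not HDFS behavior.

```python
import random


def pick_replica_target(source, rack_of, same_rack=True, rng=random):
    """Pick a machine to hold the replica of a block written on `source`.

    same_rack=True  -> in-rack replication, as proposed for intermediate data
    same_rack=False -> a machine in a different rack
    Returns None if no machine satisfies the policy.
    """
    candidates = [
        m
        for m in rack_of
        if m != source and (rack_of[m] == rack_of[source]) == same_rack
    ]
    return rng.choice(candidates) if candidates else None


rack_of = {"m1": "r1", "m2": "r1", "m3": "r2"}
in_rack_target = pick_replica_target("m1", rack_of)                      # "m2"
cross_rack_target = pick_replica_target("m1", rack_of, same_rack=False)  # "m3"
```

In-rack copies avoid the cross-rack links, which are typically the most contended part of the network during the shuffle.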

Minimize Data Replicated • HDFS replication • Shuffle phase replicates most data as side effect • Only data used locally is not copied • ISS • Replicate only local data
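
The "replicate only local data" rule can be sketched as follows: a map output partition that is fetched by a reducer on another machine already gets a second copy through the shuffle, so only partitions consumed on the same machine need explicit replication. The data layout and names are hypothetical.

```python
def partitions_needing_replication(map_machine, reducer_machine_of):
    """Return the map-output partitions that the shuffle will not copy elsewhere.

    map_machine:        machine that produced the map output
    reducer_machine_of: {partition id: machine running that partition's reducer}
    """
    return [
        partition
        for partition, reducer_machine in sorted(reducer_machine_of.items())
        if reducer_machine == map_machine
    ]


reducers = {0: "m1", 1: "m2", 2: "m1"}
local_only = partitions_needing_replication("m1", reducers)
# Partitions 0 and 2 are consumed on m1 itself, so only they need a replica.
```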

ISS under failure

Conclusion • Intermediate data properties allow a tailored replication strategy to outperform a traditional one • Replication improves MapReduce performance in the case of failure

References 1) Balaji Palanisamy, Aameek Singh, Ling Liu, Bhushan Jain; Purlieus: Locality-aware Resource Allocation for MapReduce in a Cloud, SC 2011. 2) Steven Y. Ko, Imranul Hoque, Brian Cho, Indranil Gupta; Making Cloud Intermediate Data Fault-Tolerant, SOCC 2010.
