
Twister4Azure : Iterative MapReduce for Azure Cloud

CCA 2011, April 12 – 13, 2011. Thilina Gunarathne, Judy Qiu, Geoffrey Fox {tgunarat, xqiu, gcf}@indiana.edu


Presentation Transcript


  1. Twister4Azure: Iterative MapReduce for Azure Cloud • CCA 2011, April 12 – 13, 2011 • Thilina Gunarathne, Judy Qiu, Geoffrey Fox {tgunarat, xqiu, gcf}@indiana.edu

  2. MapReduceRoles for Azure • Familiar MapReduce programming model • Built using highly available and scalable Azure cloud services • Coexists with the eventual consistency and high latency of cloud services • Decentralized control • No single point of failure • Supports dynamically scaling the compute resources up and down • MapReduce fault tolerance
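
One way to picture decentralized control with fault tolerance is a task queue whose messages are hidden for a while after a worker reads them and reappear if the worker never deletes them. The sketch below is a toy in-memory model, not the MapReduceRoles implementation: the `ToyQueue` class, the logical `now` clock, and the 30-unit timeout are all assumptions made for illustration.

```python
class ToyQueue:
    """In-memory stand-in for a cloud queue with a visibility timeout (hypothetical)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> (payload, time at which it is visible again)
        self._next_id = 0

    def put(self, payload):
        self._messages[self._next_id] = (payload, 0.0)
        self._next_id += 1

    def get(self, now):
        """Return the first visible message and hide it for the timeout period."""
        for msg_id, (payload, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[msg_id] = (payload, now + self.visibility_timeout)
                return msg_id, payload
        return None

    def delete(self, msg_id):
        """A worker deletes a message only after finishing the task."""
        self._messages.pop(msg_id, None)


def run_demo():
    queue = ToyQueue(visibility_timeout=30)
    for task in ["map-0", "map-1"]:
        queue.put(task)
    completed = []

    # Worker A reads map-0 at t=0 and then crashes without deleting it.
    queue.get(now=0)

    # Worker B reads the next visible task at t=1, finishes it, and deletes it.
    msg_id, task = queue.get(now=1)
    completed.append(task)
    queue.delete(msg_id)

    # At t=2 the crashed worker's task is still hidden.
    assert queue.get(now=2) is None

    # After the timeout the task reappears and another worker can rerun it.
    msg_id, task = queue.get(now=31)
    completed.append(task)
    queue.delete(msg_id)
    return completed
```

No central master tracks the workers here: any worker that polls the queue after the timeout picks up the abandoned task.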

  3. MapReduceRoles for Azure

  4. Twister for Azure • Merge step • In-memory caching of static data • Cache-aware hybrid scheduling using queues as well as a bulletin board (special table)
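
The cache-aware hybrid scheduling idea can be sketched as follows: workers first scan a bulletin board and claim tasks whose input they already hold in memory, and whatever nobody claims falls back to a shared queue. This is only an illustration under assumptions; the function names, the `(task_id, data_id)` tuple layout, and the claiming rule are hypothetical and not taken from Twister4Azure.

```python
def pick_cached_tasks(worker_id, cache, bulletin_board, claimed):
    """A worker claims unclaimed board tasks whose input data it has cached."""
    picked = []
    for task_id, data_id in bulletin_board:
        if task_id not in claimed and data_id in cache:
            claimed[task_id] = worker_id
            picked.append(task_id)
    return picked


def leftover_for_queue(bulletin_board, claimed):
    """Tasks no worker had cached data for go to the ordinary task queue."""
    return [task_id for task_id, _ in bulletin_board if task_id not in claimed]


# Example iteration: three tasks, two workers with different cached blocks.
board = [("t0", "block-0"), ("t1", "block-1"), ("t2", "block-2")]
caches = {"w1": {"block-0"}, "w2": {"block-1"}}
claimed = {}
picked = {w: pick_cached_tasks(w, cache, board, claimed) for w, cache in caches.items()}
queued = leftover_for_queue(board, claimed)
```

In this run each worker picks the task matching its cached block, and `t2` is left for the queue, where any free worker could take it and load `block-2` from storage.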

  5. Twister for Azure

  6. Performance – KMeans Clustering • Performance with/without data caching • Speedup gained using data cache • Increasing number of iterations • Scaling speedup
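
To make the role of the data cache concrete, KMeans can be written as an iterative map, reduce, and merge loop in which the data partitions stay fixed across iterations and only the centroids change. The following is a minimal single-process sketch on 1-D points; the function names, the tolerance, and the in-process "partitions" list standing in for cached static data are assumptions for illustration, not the Twister4Azure API.

```python
def kmeans_map(cached_points, centroids):
    """Map: for one cached partition, emit a per-centroid (sum, count)."""
    partial = {}
    for p in cached_points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        s, c = partial.get(nearest, (0.0, 0))
        partial[nearest] = (s + p, c + 1)
    return partial


def kmeans_reduce(partials):
    """Reduce: combine the per-partition sums and counts for each centroid."""
    totals = {}
    for partial in partials:
        for idx, (s, c) in partial.items():
            ts, tc = totals.get(idx, (0.0, 0))
            totals[idx] = (ts + s, tc + c)
    return totals


def kmeans_merge(totals, old_centroids, tolerance):
    """Merge: compute new centroids and decide whether another iteration is needed."""
    new_centroids = [
        totals[i][0] / totals[i][1] if i in totals else old_centroids[i]
        for i in range(len(old_centroids))
    ]
    converged = all(abs(a - b) <= tolerance for a, b in zip(new_centroids, old_centroids))
    return new_centroids, converged


def run_kmeans(partitions, centroids, max_iterations=20, tolerance=1e-9):
    """Iterate; partitions are reused unchanged (the static, cacheable data)."""
    for iteration in range(1, max_iterations + 1):
        partials = [kmeans_map(part, centroids) for part in partitions]
        totals = kmeans_reduce(partials)
        centroids, converged = kmeans_merge(totals, centroids, tolerance)
        if converged:
            break
    return centroids, iteration


final_centroids, iterations_used = run_kmeans(
    partitions=[[1.0, 2.0], [9.0, 10.0], [1.5, 9.5]],
    centroids=[0.0, 5.0],
)
```

Because only the small centroid list changes between iterations, keeping the partitions in worker memory avoids re-reading the large static input on every pass, which is the effect the caching comparison measures.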

  7. Performance Comparisons • BLAST Sequence Search • Smith-Waterman Sequence Alignment • Cap3 Sequence Assembly

  8. Conclusion • Enables users to easily and efficiently perform large-scale iterative data analysis and scientific computations on the Azure cloud. • Utilizes a novel hybrid scheduling mechanism to provide the caching of static data across iterations. • Utilizes cloud infrastructure services effectively to deliver robust and efficient applications. • http://salsahpc.indiana.edu/twister4azure
