An introduction to Apache Whirr

A short introduction to Apache Whirr. What is it, and how does it relate to the cloud? How can it be used with Hadoop?

semtechs

Presentation Transcript


  1. Apache Whirr
     • What is it?
     • How does it work?
     • The Cloud
     • Architecture
     • Use with Hadoop
     www.semtech-solutions.co.nz info@semtech-solutions.co.nz

  2. Apache Whirr – What is it?
     • A library-based cloud service system
     • API libraries for cloud providers
     • A configuration file defines each cluster
     • High-level interaction
     • Services are based on roles

  3. Apache Whirr – How does it work?
     • Libraries offer a high-level API
     • Based on jclouds
     • Recipe-based approach for cloud providers, e.g. EC2
     • It is cloud neutral
     • Create clusters as you need them, for dev / test etc.

  4. Apache Whirr – How does it work?
     • Automatically start instances on the Cloud
     • Configure and start Hadoop
     • Add applications like
         • Hive
         • HBase
         • YARN / MapReduce

  5. Apache Whirr – Why go virtual?
     • Whirr gives independence from the Cloud vendor
     • Makes it easier to move vendors later
     • Save money by only using what you need
     • Expand the cluster as demand requires
     • Reduce the cluster when possible
     • Compress data as much as possible to reduce costs
     • Virtual cost < physical cost (or should be) until data sizes reach the high-terabyte to low-petabyte range

  6. Apache Whirr – Supported
     • Which cloud suppliers are available?
         • Amazon EC2
         • Rackspace Cloud Services
     • Which services do they support?
         • Cassandra
         • Hadoop
         • ZooKeeper
         • HBase
         • ElasticSearch
         • Voldemort
         • Hama

  7. Apache Whirr – Example config
     Example Whirr configuration - .whirr/credentials
         whirr.provider=aws-ec2
         whirr.identity=your-aws-key
         whirr.credential=your-aws-secret
     Whirr Hadoop configuration - hadoop.properties
         whirr.cluster-name=myhadoopcluster
         whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
         whirr.provider=aws-ec2
     Start Whirr
         whirr launch-cluster --config hadoop.properties

  8. Contact Us
     • Feel free to contact us at
         • www.semtech-solutions.co.nz
         • info@semtech-solutions.co.nz
     • We offer IT project consultancy
     • We are happy to hear about your problems
     • You can just pay for those hours that you need to solve your problems
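
The role-based template string in the slide 7 configuration can be hard to read at a glance: each comma-separated group is an instance count followed by the roles that share a machine, joined with "+". The sketch below is only an illustration of that format, not part of Whirr itself. The helper names instance_templates and cluster_properties are hypothetical, and the script simply renders the three hadoop.properties lines shown on the slide.

```python
# Hypothetical helpers (not part of Whirr): they only build the text of the
# hadoop.properties example from slide 7.

def instance_templates(groups):
    """Build a whirr.instance-templates value from (count, roles) pairs.

    Each group becomes "<count> <role>+<role>...", and groups are joined
    with commas, matching the format used on the slide.
    """
    parts = []
    for count, roles in groups:
        if count < 1 or not roles:
            raise ValueError("each group needs a positive count and at least one role")
        parts.append(f"{count} {'+'.join(roles)}")
    return ",".join(parts)


def cluster_properties(name, provider, groups):
    """Render the three properties from the slide as properties-file text."""
    lines = [
        f"whirr.cluster-name={name}",
        f"whirr.instance-templates={instance_templates(groups)}",
        f"whirr.provider={provider}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # One master node (jobtracker + namenode) and one worker node
    # (datanode + tasktracker), as in the slide.
    groups = [
        (1, ["hadoop-jobtracker", "hadoop-namenode"]),
        (1, ["hadoop-datanode", "hadoop-tasktracker"]),
    ]
    print(cluster_properties("myhadoopcluster", "aws-ec2", groups), end="")
```

Changing the count in the second group (for example, from 1 to 5) is how a larger worker pool would be expressed in the same format. The output could be saved as hadoop.properties and passed to the whirr launch-cluster command shown on the slide.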
