
Hadoop

Hadoop. Ali Sharza Khan, High Performance Computing.


Presentation Transcript


  1. Hadoop • Ali Sharza Khan • High Performance Computing

  2. Table of Contents • Hadoop • Where did Hadoop come from? • What problems can Hadoop solve? • Where does Hadoop apply? • How is Hadoop architected? • Two main parts of Hadoop • Conclusion

  3. Hadoop • What is Hadoop? • An open source project • Processes large data sets in parallel

  4. Where did Hadoop come from? • Google: Hadoop's design follows Google's published papers on MapReduce and the Google File System • Yahoo, Facebook, Twitter and LinkedIn are actively contributing to Hadoop.

  5. What problems can Hadoop solve? • Problems where you have a lot of data • Running analytics that are deep and computationally intensive

  6. Where does Hadoop apply? • Search engines • Finance • Online retail • Government • Media and entertainment • Research institutions and other markets

  7. How is Hadoop architected? • Every server has 2, 4 or 8 CPUs. • Each server operates on its own little piece of the data. • Hadoop clusters at Yahoo span 25,000 servers and store 25 petabytes of application data. • The largest cluster has 3,500 servers.
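
The point that each server operates on its own piece of data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `split_into_chunks`, `process_chunk` and `parallel_sum` are hypothetical, local threads stand in for servers, and none of this is Hadoop's own API. The last lines work out the average implied by the Yahoo figures on the slide, assuming decimal units (1 PB = 10^15 bytes).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_servers):
    """Divide data into num_servers roughly equal contiguous pieces."""
    size, extra = divmod(len(data), num_servers)
    chunks, start = [], 0
    for i in range(num_servers):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Work done by one 'server' on its own piece only (here: a local sum)."""
    return sum(chunk)


def parallel_sum(data, num_servers=4):
    """Sum data by letting each stand-in server sum its own chunk."""
    chunks = split_into_chunks(data, num_servers)
    # Threads are an assumption standing in for separate machines.
    with ThreadPoolExecutor(max_workers=num_servers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    # Combine the per-server partial results into the final answer.
    return sum(partial_results)


# Back-of-the-envelope figure from the slide: 25 PB spread over 25,000 servers.
PETABYTE = 10**15  # decimal units, an assumption
TERABYTE = 10**12
avg_bytes_per_server = 25 * PETABYTE // 25_000  # about 1 TB per server
```

Under these assumptions the Yahoo numbers average out to roughly one terabyte of application data per server, which is why no single machine ever needs to see the whole data set.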

  8. Cloudera CEO Interview http://www.youtube.com/watch?v=qNP4_ICDeqE

  9. Two main parts of Hadoop • HDFS (Hadoop Distributed File System) • MapReduce framework • Map phase • Reduce phase • JobTracker (the master) • TaskTracker (the slave)
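
The map and reduce phases listed above can be illustrated with a word count, the usual introductory MapReduce example. The sketch below is plain Python running in a single process and does not use any Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical. In a real cluster, the JobTracker would hand map and reduce tasks to TaskTrackers on many servers, and the grouping step in between would move data across the network.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield (word, 1)


def shuffle(pairs):
    """Group all emitted values by key, so each key reaches one reducer."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return (key, sum(values))


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

For example, `word_count(["hadoop stores data", "hadoop processes data"])` counts `hadoop` and `data` twice each and the other two words once. Because each call to `map_phase` looks at only one document and each call to `reduce_phase` looks at only one key, both phases can be spread over many machines.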

  10. MapReduce Framework

  11. Conclusion • Why is Hadoop able to deal with lots of data? • Why is Hadoop able to answer complicated computational questions?
