Hadoop Ali Sharza Khan High Performance Computing
Table of Contents • Hadoop • Where did Hadoop come from? • What problems can Hadoop solve? • Where does Hadoop apply? • How is Hadoop architected? • Two main parts of Hadoop • Conclusion
Hadoop • What is Hadoop? • Open source project • Processes large data sets in parallel
Where did Hadoop come from? • Inspired by Google's published work on MapReduce and the Google File System • Yahoo, Facebook, Twitter and LinkedIn actively contribute to Hadoop.
What problems can Hadoop solve? • Problems where you have a lot of data • Analytics that are deep and computationally intensive
Where does Hadoop apply? • Search engines • Finance • Online retail • Government • Media and entertainment • Research institutions and other markets
How is Hadoop architected? • Every server has 2, 4 or 8 CPUs. • Each server operates on its own small piece of the data. • Hadoop clusters at Yahoo span 25,000 servers and store 25 petabytes of application data. • The largest cluster has 3,500 servers.
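The idea that each server works only on its own piece of the data can be sketched in plain Python. This is an illustration under stated assumptions, not Hadoop code: the function names `split_into_chunks`, `process_chunk` and `run_cluster` are hypothetical, worker threads stand in for servers, and a simple sum stands in for the real analysis.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, num_servers):
    """Divide records into num_servers contiguous, nearly equal pieces."""
    size, extra = divmod(len(records), num_servers)
    chunks, start = [], 0
    for i in range(num_servers):
        # The first `extra` chunks take one additional record each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Work done by one simulated server; it only ever sees its own chunk."""
    return sum(chunk)


def run_cluster(records, num_servers=4):
    """Process every chunk independently, then combine the partial results."""
    chunks = split_into_chunks(records, num_servers)
    with ThreadPoolExecutor(max_workers=num_servers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


total = run_cluster(list(range(1, 101)), num_servers=4)
print(total)
```

Because no chunk depends on any other, adding servers mostly means cutting the data into more pieces, which is the property that lets a cluster grow to thousands of machines.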
Cloudera CEO Interview http://www.youtube.com/watch?v=qNP4_ICDeqE
Two main parts of Hadoop • HDFS (Hadoop Distributed File System) • MapReduce framework • Map phase • Reduce phase • JobTracker (the master) • TaskTracker (the slave)
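The map and reduce phases listed above can be illustrated with a word count written in plain Python. This is a minimal single-process sketch, not the Hadoop API: the names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, the grouping step between the two phases is simulated with a dictionary, and the JobTracker and TaskTracker roles are not modeled.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key so each key can be reduced on its own."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count([
    "Hadoop stores data",
    "Hadoop processes data in parallel",
])
print(counts)
```

Each call to `map_phase` needs only its own line and each call to `reduce_phase` needs only the values for its own key, so both phases can in principle be spread across many machines.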
Conclusion • Why is Hadoop able to deal with lots of data? • Why is Hadoop able to answer complicated computational questions?