Apache Hadoop is an open source framework that lets users write and run distributed applications that process large amounts of data. Distributed computing is a wide and varied field. Hadoop is closely associated with Big Data, that is, collections of very large data sets.
Apache Hadoop – The Big Name In The Big Data World
Java/J2EE Capabilities
What is Apache Hadoop?
• A data management framework for Big Data
• Open source software for distributed processing of large data sets
• Offers distributed parallel processing that scales from a single server to many machines
• Supports processing and analysis of thousands of terabytes of data
• A suitable framework for increasing business efficiency and maximizing ROI
• Latest release at the time of this presentation: Release 2.6.0, on 18 November 2014
Main Modules of Hadoop
Main Modules of Hadoop (contd.)
• Hadoop Common
  • Common utilities that support the other Hadoop modules and subprojects
  • Includes the file system, RPC and serialization libraries
• Hadoop Distributed File System (HDFS)
  • A distributed file system that gives applications access to their data
  • Spans all nodes in a Hadoop cluster and links them into one large file system
  • Java based, providing scalable and reliable data storage
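To make the idea of one file system spanning many nodes more concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API. It assumes a block size of 128 MB and a replication factor of 3, which are the usual defaults in Hadoop 2.x but are configurable per cluster. The class name `HdfsBlockSketch` and the round-robin placement are hypothetical simplifications for illustration; real HDFS uses a rack-aware placement policy.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: this is NOT the HDFS implementation or its API.
class HdfsBlockSketch {
    // Assumed values; real clusters set these in their configuration.
    static final long BLOCK_SIZE = 128L * 1024 * 1024; // 128 MB
    static final int REPLICATION = 3;

    /** Number of blocks needed to store a file of the given size (ceiling division). */
    static long blockCount(long fileSizeBytes) {
        if (fileSizeBytes <= 0) {
            return 0;
        }
        return (fileSizeBytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    /**
     * Hypothetical round-robin placement: replica r of block b is stored on
     * node (b + r) % nodes, so each block ends up on REPLICATION distinct nodes.
     */
    static List<List<Integer>> placeReplicas(long blocks, int nodes) {
        if (nodes < REPLICATION) {
            throw new IllegalArgumentException("need at least " + REPLICATION + " nodes");
        }
        List<List<Integer>> placement = new ArrayList<>();
        for (long b = 0; b < blocks; b++) {
            List<Integer> replicas = new ArrayList<>();
            for (int r = 0; r < REPLICATION; r++) {
                replicas.add((int) ((b + r) % nodes));
            }
            placement.add(replicas);
        }
        return placement;
    }

    public static void main(String[] args) {
        long fileSize = 1024L * 1024 * 1024; // a 1 GB file
        long blocks = blockCount(fileSize);
        System.out.println("Blocks: " + blocks);
        System.out.println("Placement on 5 nodes: " + placeReplicas(blocks, 5));
    }
}
```

Under these assumptions a 1 GB file is split into 8 blocks, and every block is held by three different nodes, so the loss of a single node does not make any block unavailable.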
Main Modules of Hadoop (contd.)
• Hadoop YARN
  • Used for job scheduling and cluster resource management
  • Splits the two roles of the JobTracker, namely resource management and job scheduling, into separate components
• Hadoop MapReduce
  • A system for parallel processing of large data sets
  • A framework that assigns work to the nodes in a cluster
  • Lets developers write applications that reliably process large amounts of data across many hardware nodes
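The MapReduce programming model can be illustrated with the classic word count. The sketch below is plain, single-process Java and deliberately does not use the Hadoop MapReduce API. The class `WordCountSketch` and its methods are hypothetical names chosen for illustration. In a real cluster the map and reduce steps run in parallel on different nodes, and the framework performs the shuffle between them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a single-process imitation of the map, shuffle and reduce phases.
class WordCountSketch {
    /** Map phase: emit a (word, 1) pair for every word in one line of input. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    /** Shuffle phase: group all emitted values by key (sorted by key here). */
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    /** Reduce phase: sum the values collected for a single key. */
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    /** Runs the three phases in sequence over all input lines. */
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        shuffle(pairs).forEach((word, ones) -> result.put(word, reduce(ones)));
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Hadoop stores big data", "Hadoop processes big data");
        // prints {big=2, data=2, hadoop=2, processes=1, stores=1}
        System.out.println(wordCount(lines));
    }
}
```

Because each call to `map` looks at only one line and each call to `reduce` looks at only one key, both steps can be distributed across nodes without coordination, which is what lets the model scale out.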
Other Hadoop Related Projects at Apache
• Avro
• Cassandra
• HBase
• Hive
• Pig
• Spark
• Ambari
• Chukwa
• Mahout
• Tez
• ZooKeeper
Why Hadoop?
• Next generation analytics
• Rich ecosystem
• Scale-out storage
• Reduced cost of ownership
• Scalability, flexibility and reliability
• Fault tolerance
• Simple programming models
THANK YOU
Looking forward to a mutually beneficial association. Assuring you of our best services always.
SPEC INDIA
"SPEC House", Parth Complex, Swastik Cross Road, Navrangpura, Ahmedabad-380 009, INDIA.
Tel.: +91-79-26404031 to 34
VoIP: +1-908-450-9862
Instant Messengers: spec.bd | spec_india | bd.spec specindia2009 | specindia.bd
e-mail: lead@spec-india.com
URL: http://www.spec-india.com