Organizations have been accumulating data for years. With the recent enthusiasm for ‘Big Data’ and ‘Business Intelligence’, they have become more aware of the value of that information and of the tools available to store it reliably, which is why they now keep large volumes of data and extract the information they need in the format they require.
Apache Hadoop – The Big Name In The Big Data World Java/J2EE Capabilities
What is Apache Hadoop?
• A data management framework for Big Data
• Open source software for distributed processing of large data sets
• Offers distributed parallel processing that scales from a single server to many machines
• Processes and analyzes thousands of terabytes of data
• Helps increase business efficiency and maximize ROI
• Latest release: 2.6.0 (18 November 2014)
Main Modules of Hadoop
Main Modules of Hadoop (contd.)
• Hadoop Common
  • Common utilities that support the other Hadoop modules and subprojects
  • Includes file system, RPC, and serialization libraries
• Hadoop Distributed File System (HDFS)
  • Distributed file system that gives access to application data
  • Spans all nodes in a Hadoop cluster, linking them into one large file system
  • Java based, providing scalable and reliable data storage
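To make the idea of one file system spanning many nodes more concrete, here is a minimal sketch of how a file might be cut into blocks and copied to several nodes. It is not HDFS code. The 128 MB block size and replication factor of 3 match the Hadoop 2.x defaults, but the node names, the `plan_blocks` helper, and the round-robin placement are assumptions made for illustration; real HDFS uses a rack-aware placement policy.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the default HDFS block size in Hadoop 2.x
REPLICATION = 3                 # default HDFS replication factor


def plan_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return a list of (block_index, block_length, nodes_holding_a_replica)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    block_count = -(-file_size // block_size)  # integer ceiling division
    plan = []
    for i in range(block_count):
        # Every block is full-sized except possibly the last one.
        length = min(block_size, file_size - i * block_size)
        # Round-robin placement: an illustrative stand-in for HDFS's real policy.
        holders = [nodes[(i + r) % len(nodes)] for r in range(replication)]
        plan.append((i, length, holders))
    return plan


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical DataNode names
plan = plan_blocks(300 * 1024 * 1024, nodes)  # a 300 MB file
for index, length, holders in plan:
    print(index, length, holders)
```

Because each block lives on several nodes in this sketch, losing one node still leaves other copies of every block, which is the intuition behind the "reliable data storage" bullet.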
Main Modules of Hadoop (contd.)
• Hadoop YARN
  • Used for job scheduling and cluster resource management
  • Splits the two roles of the JobTracker, resource management and job scheduling, into separate components
• Hadoop MapReduce
  • System for parallel processing of large data sets
  • A framework that assigns work to the nodes in a cluster
  • Lets applications process large amounts of data reliably across many hardware nodes
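The MapReduce programming model can be illustrated with the classic word count. The sketch below runs in a single Python process and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions chosen for illustration. On a real cluster the framework would run many mappers and reducers in parallel on different nodes and perform the grouping step itself.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


lines = ["big data big cluster", "data node"]  # made-up sample input
counts = word_count(lines)
print(counts)
```

The mapper and reducer each look at only one line or one key at a time, which is what lets a framework spread them across nodes without the functions needing to know about each other.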
Other Hadoop Related Projects at Apache
• Avro
• Cassandra
• HBase
• Hive
• Pig
• Spark
• Ambari
• Chukwa
• Mahout
• Tez
• ZooKeeper
Why Hadoop?
• Next-generation real-time analytics
• Rich ecosystem
• Scale-out storage
• Reduced cost of ownership
• Scalability, flexibility, and reliability
• Fault tolerance
• Simple programming models
THANK YOU
Looking forward to a mutually beneficial association. Assuring you of our best services always.
SPEC INDIA, "SPEC House", Parth Complex, Swastik Cross Road, Navrangpura, Ahmedabad-380 009, INDIA.
Tel.: +91-79-26404031 to 34
VoIP: +1-908-450-9862
Instant Messengers: spec.bd | spec_india | bd.spec specindia2009 | specindia.bd
e-mail: lead@spec-india.com
URL: http://www.spec-india.com