Explore the impact of big data, Hadoop technology, challenges faced by relational databases, and emerging trends in data analytics. Discover how Hadoop revolutionizes data processing and storage in the distributed computing environment. Learn about NoSQL databases, MapReduce, and more.
A Seminar On Big Data Analytics Using Hadoop
Presented by Sarita Bagul, TE Computer, Seat No. T120414208
Under the guidance of Asst. Prof. B. A. Khivsara
Big data: “A massive volume of both structured and unstructured data that is so large that it's difficult to process with traditional database and software techniques.”
Big data analytics is the process of collecting, organizing and analyzing large sets of data (called big data) to discover patterns and other useful information.
Earlier, data volumes were small enough for RDBMS tools to handle easily. Today the volume of data is far larger and difficult to manage with an RDBMS; such data is referred to as “big data”. Limitations of relational databases:
• Not designed to handle change
• Cost
• No support for complex objects such as documents, video, and images
• Limits on field lengths
• No support for unstructured data
• 2006: Yahoo! created Hadoop, based on GFS and MapReduce (with Doug Cutting and team)
• 2007: Yahoo! started using Hadoop on a 1000-node cluster
• Jan 2008: Hadoop became a top-level Apache project
• Jul 2008: A 4000-node cluster was tested successfully with Hadoop
• 2009: Hadoop sorted a petabyte of data in less than 17 hours, to handle billions of searches and index millions of web pages
• Dec 2011: Hadoop version 1.0 released
• Aug 2013: Version 2.0.6 available
• Nov 2014: Release 2.6.0 available
• Dec 2015: Release 2.6.3 available
• Oct 2016: Release 2.6.5 available
Limitations of Hadoop:
• Limited scalability
• Availability issues
• Poor resource utilization
• Limited ability to run non-MapReduce applications
• 25 January 2017: Release 3.0.0-alpha2 available
• This is the second alpha in a series of planned alphas and betas leading up to a 3.0.0 GA release. The intention is to "release early, release often" to quickly iterate on feedback collected from downstream users.
• Hadoop was introduced to overcome the limitations of RDBMS.
• Hadoop is an open-source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment.
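The processing side of Hadoop follows the MapReduce model mentioned in the history above. As a rough illustration only, the sketch below imitates the three MapReduce stages (map, shuffle, reduce) for a word count in plain Python on a single machine. It does not use Hadoop or its Java API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample text are assumptions made for this example, and each input string merely stands in for a data split that a real cluster would hand to a separate node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the list of values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the combined output."""
    mapped = []
    for chunk in chunks:  # on a real cluster, chunks would be mapped in parallel
        mapped.extend(map_phase(chunk))
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Hypothetical sample input: two "splits" of a larger text.
    sample = ["big data needs big tools", "Hadoop handles big data"]
    print(word_count(sample))
```

Because the map step looks at each chunk independently, the chunks could be processed on different machines; only the shuffle step needs to bring results for the same word together.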
Many technologies are already in use for handling big data, and each has its own advantages and disadvantages. A few of them are listed below:
• Column-oriented databases
• NoSQL databases
• MapReduce
• Hive
• Pig
• WibiData
• PLATFORA
• Apache Zeppelin
• Hadoop
• A NoSQL (originally referring to "non SQL" or "non relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases (RDBMS).
• NoSQL databases are often used as the back-end database in Hadoop deployments; Apache HBase, for example, runs on top of Hadoop's file system.
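To make "modeled in means other than tabular relations" more concrete, here is a minimal sketch of a document-style store in Python. It is a toy, in-memory illustration and not the API of any real NoSQL product; the class name `DocumentStore`, its methods, and the sample records are all hypothetical. The point it tries to show is that two records can carry different fields, including nested lists, without a fixed table schema.

```python
import json


class DocumentStore:
    """Toy in-memory document store (hypothetical, for illustration only)."""

    def __init__(self):
        self._docs = {}  # maps key -> JSON text of the document

    def put(self, key, document):
        # Store a serialized copy so later changes to the caller's dict
        # do not silently alter what is stored.
        self._docs[key] = json.dumps(document)

    def get(self, key):
        # Return the stored document as a dict, or None if the key is absent.
        raw = self._docs.get(key)
        return None if raw is None else json.loads(raw)

    def find(self, field, value):
        # Return the sorted keys of documents whose top-level field equals value.
        return sorted(
            key
            for key, raw in self._docs.items()
            if json.loads(raw).get(field) == value
        )


if __name__ == "__main__":
    store = DocumentStore()
    # Hypothetical records with different shapes: no shared schema is required.
    store.put("u1", {"name": "Asha", "tags": ["video", "images"]})
    store.put("u2", {"name": "Ravi", "city": "Pune"})
    print(store.get("u1"))
    print(store.find("city", "Pune"))
```

In a relational table, adding the `tags` list or the `city` field would normally require a schema change or an extra table; here each document simply carries whatever fields it has.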
Hadoop is a popular open-source framework for handling big data and performing big data analytics.