Karthik provided a comprehensive overview of the available ecosystem tools and how they can be used for data engineering and data analytics. His presentation covers the following topics:
• Establishing a complete data pipeline using big data ecosystem tools.
• Tackling high-velocity streams using various stream processing engines on the cloud and performing real-time analytics.
• Integrating the big data ecosystem for data analysis using SAMOA, R and Mahout.
• Deploying big data environments on the cloud.
See more at https://www.slideshare.net/machinepulse/managing-your-assets-with-big-data-tools-45931405
Managing your Assets with Big Data Tools KarthigaiMuthu, MachinePulse
Agenda
- Big Data value proposition
- Big Data Technology Stack
Hype Cycle for Emerging Technologies Source: Wikipedia
Sources of data
- 30 billion RFID tags today (1.3B in 2005)
- 4.6 billion camera phones worldwide
- 12+ TBs of tweet data every day
- 100s of millions of GPS-enabled devices sold annually
- ? TBs of data every day
- 2+ billion people on the Web by end 2011
- 25+ TBs of log data every day
- 76 million smart meters in 2009… 200M by 2014
What’s driving Big Data
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More of a real-time
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets
The Evolution of Business Intelligence (timeline: 1990’s, 2000’s, 2010’s; axes: Speed, Scale)
- BI Reporting: OLAP & Data warehouse (Business Objects, SAS, Informatica, Cognos, other SQL reporting tools)
- Interactive Business Intelligence & In-memory RDBMS (QlikView, Tableau, HANA)
- Big Data: Batch Processing & Distributed Data Store (Hadoop/Spark; HBase/Cassandra/MongoDB)
- Big Data: Real Time & Single View (Graph Databases)
Are you aware of the risk of not implementing Big Data in your company?
Big Data changed connected things into the Internet of Everything (IoE)
How do companies get MORE from big data? Merge, Optimize, Respond, Empower
Customer 360° (diagram labels: Social Media, Banking, Finance, Our Known History, Gaming, Purchase, Entertain, Customer)
Real-Time Analytics/Decision Requirement (diagram labels: Influence Behavior, Customer)
- Product Recommendations that are Relevant & Compelling
- Friend Invitations to join a Game or Activity that expands business
- Improving the Marketing Effectiveness of a Promotion while it is still in Play
- Learning why Customers Switch to competitors and their offers; in time to Counter
- Preventing Fraud as it is Occurring & preventing more proactively
Role of Big Data in M2M/IoT
- Big Data is a factor that will, to a large extent, determine the future growth rate of the M2M industry.
- M2M will connect increasingly more nodes that provide data from endpoints.
- Data will be more granular, more frequent, and more accurate, with bigger data sets or even live data streams.
- Large volume of endpoint connections: the IPv4 addressing scheme can’t accommodate everything (sensors, smart phones, smart factories, smart grids, smart vehicles, controllers, meters), so IPv6 is required.
- IoE: the convergence of IoT, Big Data Analytics, Cloud Computing and other technologies is collectively called the Internet of Everything.
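The IPv4 limitation mentioned above comes down to address width: 32 bits versus 128 bits for IPv6. The following is a minimal arithmetic sketch using Python's standard `ipaddress` module. The endpoint count is an assumption chosen only for illustration (it reuses the 30 billion figure quoted earlier in the deck for RFID tags), and the helper `fits` is hypothetical.

```python
import ipaddress

# Total address counts follow from the address widths (32 vs 128 bits).
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses  # 2**32
ipv6_total = ipaddress.ip_network("::/0").num_addresses       # 2**128

# Assumed number of connected endpoints, for illustration only.
endpoints = 30_000_000_000


def fits(total_addresses: int, devices: int) -> bool:
    """Return True if every device could get its own unique address."""
    return devices <= total_addresses


print("IPv4 addresses:", ipv4_total)
print("IPv6 addresses:", ipv6_total)
print("Endpoints fit in IPv4:", fits(ipv4_total, endpoints))
print("Endpoints fit in IPv6:", fits(ipv6_total, endpoints))
```

The sketch ignores reserved ranges and NAT, so the real usable IPv4 space is smaller than the raw total shown here.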
Challenges of Big Data in M2M/IoT
- Meeting the need for speed
- Understanding the data
- Maintaining data quality
- Displaying meaningful results
Big Data Use Cases – IoT/M2M
- Personal IoT: the scope is a single person, such as a smartphone equipped with a GPS sensor or a fitness device that measures the heart rate. This is one of the fastest growing, consumer-oriented areas of IoT.
- Group IoT: the scope is a fairly small group of people, such as a family in a smart house, co-workers in a van or a group of tourists. This is one of the most challenging areas and is still in its early phase.
- Community IoT: the scope is a large group of people, potentially thousands and more; usually this is in a public infrastructure context, such as smart cities or smart roads. This is a young and potentially promising IoT area.
- Industrial IoT: the scope can be within an organization (smart factory) or between organizations (retailer supply chain). This is arguably the most established and mature part of IoT.
Big Data Use Cases – IoT/M2M
- Agriculture: sensors can be deployed on farm machinery to provide data about the equipment, soil temperature, moisture, etc.
- Buildings/Smart Homes: building sensors can be used to help facility managers become more proactive about ensuring that their buildings operate at peak efficiency.
- Communities: smart cities make use of parking space availability systems, intelligent traffic monitoring systems, intelligent highways, weather-adaptive street lighting, and more.
- Healthcare: infant monitors, smart diapers and pills with ingestible sensors are just some of the IoT-based devices.
- Manufacturing: factories with sensors can improve operations and product quality, and decrease safety hazards.
- Smartphones: can control everything from door locks, thermostats and light bulbs to vacuum cleaners, and more.
- Utilities: smart water meters can be used to reduce water leaks. Smart electric grids can adjust rates depending on usage.
- Wearables: smart watches, fitness trackers and health monitors may become a primary source of human-related data, and can also be used in sports, retail, travel and manufacturing.
Benefits of Big Data Analytics in M2M/IoT
1. Device Maintenance: a. Time for next patch upgrade b. Energy management c. Inventory management and track replacement
2. Proactive Healthcare: Capture and analyze real-time data from medical monitors to predict potential health problems before patients manifest clinical signs of infection.
3. Monetize Machine Data: a. Monitor performance, usage and capacity details to uncover up-sell and cross-sell opportunities b. Maximize the lifespan and performance of high-value medical assets
Benefits of Big Data Analytics (cont.)
4. Optimize Support Operations: a. Reduce MTTR and support escalations b. Preempt failures with proactive support c. Troubleshoot with accurate information d. Proactive consultation to customers on approaching expiry dates
Batch vs. Real-Time processing
Batch processing - Data is gathered and processed as a group at one time. - Jobs run to completion. - Data might be out of date.
Real-time processing - Data is processed as the information is being entered. - Jobs run forever.
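The contrast above can be sketched in a few lines of plain Python. This is an illustrative toy, not part of the original deck: the sensor readings and the function names `batch_average` and `streaming_average` are assumptions. The batch function needs the whole data set before it produces one answer, while the streaming function emits an updated answer after every reading and would keep going for as long as readings arrive.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch: gather all readings first, process them once, then finish."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Real-time: update the running average as each reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


# Hypothetical sensor readings, for illustration only.
sensor_readings = [20.0, 22.0, 24.0, 26.0]

batch_result = batch_average(sensor_readings)
stream_results = list(streaming_average(sensor_readings))

print("batch:", batch_result)
print("stream:", stream_results)
```

In a real deployment the streaming input would be an unbounded source (a queue or socket) rather than a finite list; the list is used here only so the example terminates.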
Storm
Apache Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing.
Storm Is - Stream Processing - Fast - Scalable - Fault Tolerant - Reliable
Stream Grouping
Groupings are used to decide to which task in the subscribing bolt (group) a tuple is sent.
Possible Groupings: - Shuffle - Fields - All - Global - None - Direct - Local or Shuffle
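To make the routing idea concrete, here is a small simulation of four of the groupings listed above. It is a sketch in plain Python, not Storm's API: the task count, the tuple layout and the function names are all assumptions, and Storm's own hashing and load balancing differ in detail. It only mirrors the documented behavior: shuffle picks a task at random, fields sends equal field values to the same task, all replicates to every task, and global sends everything to the lowest-numbered task.

```python
import random
import zlib
from typing import Dict, List

NUM_TASKS = 3  # assumed parallelism of the subscribing bolt


def shuffle_grouping(tup: Dict, rng: random.Random) -> int:
    """Shuffle: pick a task at random so load spreads evenly over time."""
    return rng.randrange(NUM_TASKS)


def fields_grouping(tup: Dict, field: str) -> int:
    """Fields: tuples with the same field value always reach the same task."""
    key = str(tup[field]).encode("utf-8")
    return zlib.crc32(key) % NUM_TASKS  # stable hash, illustrative only


def all_grouping(tup: Dict) -> List[int]:
    """All: the tuple is replicated to every task."""
    return list(range(NUM_TASKS))


def global_grouping(tup: Dict) -> int:
    """Global: the whole stream goes to the task with the lowest id."""
    return 0


# Hypothetical smart-meter tuples, for illustration only.
tuples = [
    {"device": "meter-1", "kwh": 1.2},
    {"device": "meter-2", "kwh": 0.7},
    {"device": "meter-1", "kwh": 1.4},
]

rng = random.Random(42)
shuffle_routes = [shuffle_grouping(t, rng) for t in tuples]
fields_routes = [fields_grouping(t, "device") for t in tuples]

print("shuffle:", shuffle_routes)
print("fields :", fields_routes)
print("all    :", all_grouping(tuples[0]))
print("global :", global_grouping(tuples[0]))
```

Fields grouping is the usual choice when a bolt keeps per-key state (for example a running total per device), because every reading for one device lands on the same task.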