Big Data. SUNY IITG. Outline. Introduction Similar Topics What Does Big Data Look Like? Why Use Big Data? How Is It Useful? What Companies Rely On Big Data? Summary Questions References. Where (big) data starts to play a role. What exactly is Big Data.
Big Data SUNY IITG
Outline • Introduction • Similar Topics • What Does Big Data Look Like? • Why Use Big Data? • How Is It Useful? • What Companies Rely On Big Data? • Summary • Questions • References
What exactly is Big Data? • Extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions. • Q: What exactly is Big Data, and why does the University need an institute? • A: Currently, many domains, including science, engineering, health care, environmental science, e-commerce and, increasingly, the humanities, generate massive amounts of data. These data are accumulating to the point where making sense of them is a huge challenge. And it’s not just about size, speed and variety, but also the complexity of the data sets, the enormous numbers of variables and the uncertainty in measurements from global environmental monitoring systems, studies of gene expression and others. Because data can come from multiple sources, they also must be integrated, creating additional complications. • So we need a Big Data Institute at the University because many of the problems we face in science, engineering, health care and the humanities require very powerful tools for answering questions across the domains. And because U.Va. is a complete university, we can combine our efforts to solve some of the most challenging problems.
History of data • Human beings have recorded data since ancient times • Experts generated data • Average people consumed data • In the last century, computers made recording easier • Experts or specialized software still generated the data • Average human beings consumed data • Nowadays, human beings generate data using various devices and software • By everybody
“Big” data • Data on paper • The use of computers creates the big data problem, with the following characteristics • Faster: high speed, the rapidity with which data comes in • Easier • Varieties • Large amount • Various resources
Multifaceted • “Big data” is something that has multiple definitions.
Resources • Sensor networks • High precision devices • Surveillance • …
More accurate analyses may lead to more confident decision making. And better decisions can mean greater operational efficiencies, cost reductions and reduced risk.
(Predictive Analytics, Big Data) • What is Data Mining?
Who is in the jungle? • SAS
Introduction to Big Data • Extremely large data sets which grow exponentially • Difficult to process with traditional methods • Reveal patterns, trends & associations • What is Big Data and how does it work?
Introduction (continued) • Unimportant: Size of data • Important: Ability to analyze such a data set
Graphical Representation • “There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days.” • Eric Schmidt • Google CEO (2010)
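To put the quoted figure in more familiar units, the short sketch below converts "5 exabytes every 2 days" into a per-second rate. It assumes a decimal (SI) exabyte of 10^18 bytes; the function name `creation_rate_bytes_per_second` is made up for this illustration and takes the quote's numbers at face value.

```python
# Assumption: decimal (SI) units, 1 exabyte = 10**18 bytes.
EXABYTE = 10**18
SECONDS_PER_DAY = 24 * 60 * 60


def creation_rate_bytes_per_second(exabytes, days):
    """Average bytes created per second if `exabytes` are produced every `days` days."""
    return exabytes * EXABYTE / (days * SECONDS_PER_DAY)


# Figures taken directly from the quote: 5 exabytes every 2 days.
rate = creation_rate_bytes_per_second(5, 2)
print(f"{rate / 10**12:.1f} TB per second")
```

Under these assumptions the quote works out to roughly 29 terabytes of new information every second.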
Confusion on Big Data • Not usually confused with another concept • Confusion occurs when trying to understand why
Why Use Big Data? • Better management of data • Speed, capacity & scalability benefits • Better visualization of data • Data analysis capabilities will evolve
Big Data Analysis Software • Amazon • Amazon DynamoDB • Amazon Redshift • DataStax • Cassandra • Developed at Facebook, inspired by Amazon's Dynamo
Who Relies on Big Data? • Majority of companies • Retail • Amazon • Entertainment • Netflix • Health • MyFitnessPal
Amazon • Predicts what customers want before they start their search • “Frequently Bought Together” • “Customers Who Bought This Item Also Bought”
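A feature like “Customers Who Bought This Item Also Bought” can be approximated by counting which items show up together in past orders. The sketch below is only an illustration of that co-occurrence idea, not Amazon's actual method; the function names and the sample orders are invented for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_purchase_counts(orders):
    """Count, for every item, how often each other item appears in the same order."""
    counts = defaultdict(Counter)
    for order in orders:
        # set() ignores duplicate items within a single order
        for item, other in permutations(set(order), 2):
            counts[item][other] += 1
    return counts


def also_bought(counts, item, top_n=3):
    """Return up to top_n items most often bought together with `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]


# Hypothetical order history, invented for illustration.
orders = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "stove"],
    ["novel"],
]

counts = build_co_purchase_counts(orders)
print(also_bought(counts, "tent", top_n=1))
```

Real recommendation systems work on far larger order histories and typically normalize for item popularity, but the basic signal is the same: items that are frequently bought together.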
Netflix • Analyzes viewing habits • Improves suggestions
MyFitnessPal • Extensive database with information immediately available
Big Data’s Influence on the Future • Healthcare industry could save $300 billion a year by using big data analytics • Big Data has helped predict crimes three times more accurately than current forecasting • Companies involved in retail could increase profit by more than 60%
Fun Facts • 1.9 million IT jobs will be created by 2015 to work with Big Data projects • Data transferred via mobile networks increased by 81% every month between 2012 and 2014 • NSA is only capable of analyzing 1.6% of all internet traffic per day
Summary • Big Data has a much larger impact on daily life than most of society realizes. Whether people are browsing the internet or shopping at the grocery store, data about them is being recorded and analyzed instantaneously. Without Big Data, the technology available to us today would not be as reliable or successful.
Questions • What is more important, the amount of data or the process by which the data is analyzed? • The process of analyzing the data • What is the pattern of growth for Big Data? • Exponential growth
References • http://news.virginia.edu/content/uva-appoints-engineering-professor-don-brown-lead-new-big-data-institute • http://www.sas.com/en_us/insights/big-data/what-is-big-data.html • http://oxfamblogs.org/fp2p/what-is-the-future-impact-of-big-data/#prettyPhoto • http://smartdatacollective.com/bernardmarr/232941/top-10-big-data-quotes-all-time • http://www.cio.com/article/2385690/big-data/5-reasons-to-move-to-big-data--and-1-reason-why-it-won-t-be-easy-.html • http://www.informationweek.com/big-data/big-data-analytics/13-big-data-vendors-to-watch-in-2013/d/d-id/1107738?page_number=1 • http://www.slideshare.net/BernardMarr/big-data-25-facts