Big-Data Computing: Creating revolutionary breakthroughs in commerce, science, and society
Presented by: Motasim Albdarneh
Outline
• Introduction
• Other forms of big-data
• Big-Data Technology
• Technology and Application Challenges
• Leadership
• Recommendations
• Conclusion
• Reference
Introduction
• Data-Driven World
• Advances in digital sensors, communications, computation, and storage have created huge collections of data, capturing information of value to business, science, government, and society.
• For example, search engine companies such as Google, Yahoo!, and Microsoft collect trillions of bytes of data every day and use it to offer services such as satellite images, driving directions, and image retrieval.
Other forms of big-data
• Wal-Mart contracted with HP to construct a data warehouse storing 4 petabytes of data, representing every purchase recorded by its point-of-sale terminals worldwide.
• By applying machine learning to this data, the company can detect patterns indicating the effectiveness of pricing strategies, advertising campaigns, etc.
• The Large Synoptic Survey Telescope (LSST) will scan the sky, recording 30 trillion bytes of image data every day.
• Astronomers will apply massive computing power to this data to probe the origins of our universe.
Cont.
• Modern medicine collects huge amounts of information about patients through CAT scans, MRI, and other forms of diagnostic equipment.
• By applying data mining to data sets for large numbers of patients, medical researchers are gaining fundamental insights into the genetic and environmental causes of diseases, and creating more effective means of diagnosis.
• These are a small sample of the ways that commerce, science, and society are being transformed by the availability of large amounts of data and the means to extract new forms of understanding from this data.
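
To put the data volumes quoted above in perspective, here is a rough back-of-envelope sketch. It uses only the two figures from the slides (30 trillion bytes per day for LSST, 4 petabytes for the Wal-Mart warehouse). The decimal unit definitions and the helper names are assumptions made for illustration, not part of the original talk.

```python
# Back-of-envelope arithmetic on the figures quoted in the slides.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 PB = 10**15 bytes).

TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 24 * 60 * 60

lsst_bytes_per_day = 30 * TB        # "30 trillion bytes of image data every day"
walmart_warehouse_bytes = 4 * PB    # "4 petabytes of data"


def sustained_rate_mb_per_s(bytes_per_day: int) -> float:
    """Average throughput (MB/s) needed to keep up with a daily data volume."""
    return bytes_per_day / SECONDS_PER_DAY / 10**6


def days_to_accumulate(target_bytes: int, bytes_per_day: int) -> float:
    """Days of collection needed to reach a target volume at a fixed daily rate."""
    return target_bytes / bytes_per_day


if __name__ == "__main__":
    rate = sustained_rate_mb_per_s(lsst_bytes_per_day)
    days = days_to_accumulate(walmart_warehouse_bytes, lsst_bytes_per_day)
    per_year = lsst_bytes_per_day * 365 / PB
    print(f"LSST sustained rate: about {rate:.0f} MB/s")
    print(f"Days for LSST to match the 4 PB warehouse: about {days:.0f}")
    print(f"LSST volume per year: about {per_year:.2f} PB")
```

Under these assumptions, the telescope alone would need a sustained ingest rate of a few hundred megabytes per second and would match the size of the retail warehouse in roughly four to five months.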
Big-Data Technology: Sense, Collect, Store, and Analyze
The rising importance of big-data computing stems from advances in many different technologies:
• Sensors: digital imagers (telescopes, video cameras, MRI machines), chemical and biological sensors (microarrays, environmental monitors), and even web pages.
• Computer networks: via localized sensor networks, as well as the Internet.
• Data storage: advances in magnetic disk technology have dramatically decreased the cost of storing data.
Cont.
• Cluster computer systems: provide both the storage capacity for large data sets and the computing power to organize the data, to analyze it, and to respond to queries about the data from remote users.
• Cloud computing facilities: businesses and individuals can rent storage and computing capacity, for example from Amazon Web Services (AWS), rather than building their own installations.
• Data analysis algorithms: the enormous volumes of data require automated or semi-automated analysis – techniques to detect patterns, identify anomalies, and extract knowledge.
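
As a small illustration of the "identify anomalies" step listed above, the sketch below flags values that lie far from the mean of a series, using a plain z-score test. This is one simple technique chosen for illustration, not the method used by any company named in the slides. The function name, the threshold of 2.0, and the sample sales figures are all hypothetical.

```python
import statistics
from typing import List, Sequence


def find_anomalies(values: Sequence[float], threshold: float = 2.0) -> List[int]:
    """Return the indexes of values whose z-score exceeds the threshold.

    The z-score is |value - mean| / population standard deviation.
    A constant series has no spread, so nothing is flagged.
    """
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / spread > threshold]


if __name__ == "__main__":
    # Hypothetical daily sales totals with one unusual day.
    daily_sales = [100, 102, 98, 101, 99, 250, 100, 97]
    print(find_anomalies(daily_sales))
```

A test like this is only a starting point. On very small samples a single outlier inflates the standard deviation, which is why the threshold here is set lower than the value of 3 often used on large data sets.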
Technology and Application Challenges
• High-speed networking: bandwidth limitations. We need a “Moore’s Law” technology for networking, where declining costs for networking infrastructure combine with increasing bandwidth.
• Cluster computer programming: hardware and software errors. Major innovations have been made in methods to organize and program such systems, including the MapReduce programming framework introduced by Google.
• Extending the reach of cloud computing: bandwidth limitations and cost for tasks that require extensive computation over large amounts of data, in addition to the bandwidth limitations of getting data in and out of a cloud.
Cont.
• Machine learning and other data analysis techniques: more work is needed to develop algorithms that apply in real-world situations and on data sets of trillions of elements.
• Widespread deployment: we expect "big-data science" – often referred to as eScience – to be pervasive.
• Security and privacy: unauthorized access and use. Along with developing technology to enable useful capabilities, we must create safeguards to prevent abuse.
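
The MapReduce programming framework mentioned among the challenges above can be illustrated with a word count, the customary introductory example. The sketch below runs in a single process and only mimics the shape of the model (map, group by key, reduce). It is not Google's implementation and does none of the distribution or fault handling that a real cluster framework provides. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(document: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(
    inputs: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Apply the mapper to every input, group values by key, then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)   # the "shuffle" step, done in memory here
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    documents = ["big data big", "data computing"]
    print(run_mapreduce(documents, map_words, reduce_counts))
```

The appeal of the model is that the programmer writes only the two small functions. In a real framework, the runtime is responsible for splitting the input across machines and for re-running work when hardware or software fails.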
Leadership
• Industry has been in the lead, especially Internet-enabled service companies such as Google, Yahoo!, and Amazon. Other companies, from retailers to financial services, are taking notice of the business advantages.
• University researchers have been relatively late to this game, due to a lack of access to large-scale cluster computing facilities and a lack of appreciation for the new insights that can be gained by scaling up to terabyte-scale data sets.
• Government agencies are making heavy investments in traditional high-performance computing infrastructure and approaches, but very little in new eScience facilities and technologies.
Recommendations
• Investments in big-data computing will have extraordinary near-term and long-term benefits.
• The technology has already been proven in some industry sectors.
• The challenge is to extend the technology and to apply it more widely.
• Immediate actions: specific funding over the next two years could greatly stimulate the development, deployment, and application of big-data computing.
Cont.
• Longer-term actions:
• Provide sufficient budget.
• Construct special-purpose data centers for the major eScience programs.
• Treat cloud computing as a strategic resource.
• Look beyond traditional high-performance computing. Many needs could be addressed better and more cost-effectively by cluster computing systems, possibly making use of cloud facilities.
Cont.
• Encourage the deployment and application of big-data computing in all facets of government.
• Make fundamental investments in our networking infrastructure to provide ubiquitous, broadband access to end users and to cloud facilities.
Conclusion
• Big-data computing is perhaps the biggest innovation in computing in the last decade. We have only begun to see its potential to collect, organize, and process data in all walks of life.
• Hello, governments!
Reference
• Randal E. Bryant (Carnegie Mellon University), Randy H. Katz (University of California, Berkeley), and Edward D. Lazowska (University of Washington). Big-Data Computing: Creating revolutionary breakthroughs in commerce, science, and society. Version 8: December 22, 2008.