Introduction to Big Data
Antonis Koukourikos
NCSR “Demokritos”
2nd SemaGrow Hackathon (in conjunction with IRSS14)
Big Data Is…
Data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it.
Big Data Sources
• Biomedical Information
• Sensor Data
• Logs
• E-mails
• Satellite images
• Audio and Video Streams
• Social Networks
Big Data Types
• Relational
• Text
• Semi-structured
• Graph data
• Streams
Big Data Challenges: “The Three Vs”
• Volume
• Variety
• Velocity
…or is it 4…?
• Veracity
…or is it 6…??
• Visualization
• Value
Big Data demand…
• Storage
  • Impractical or impossible to use centralized storage
  • Distribution
  • Federation
  • Indexing is a problem in itself
• Computational power
  • For discovering
  • For searching / retrieving
  • For joining
• Human effort and expertise
  • Querying can become complex
  • Are you sure you exploit all this information?
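One common way to distribute storage when a single central store is impractical is hash partitioning: each record key is hashed and assigned to one of several shards. The sketch below is only an illustration. The function names `shard_for` and `partition`, the number of shards, and the sample records are assumptions made for this example and do not come from the talk.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard number in [0, num_shards)."""
    # A cryptographic digest gives a hash that is stable across runs and machines,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(records, num_shards):
    """Split (key, value) records into num_shards lists by hashed key."""
    shards = [[] for _ in range(num_shards)]
    for key, value in records:
        shards[shard_for(key, num_shards)].append((key, value))
    return shards


# Hypothetical sample records, for illustration only.
records = [
    ("sensor-001", 21.5),
    ("sensor-002", 19.8),
    ("log-2014-07-01", "startup"),
    ("mail-42", "hello"),
]
shards = partition(records, num_shards=3)
```

Because the shard is derived from the key alone, any node can locate a record without consulting a central catalogue. The trade-off is that changing the number of shards moves most keys, which is why real systems often prefer schemes such as consistent hashing.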
Big Data Applications
• Aggregation and Statistics
  • Data warehouse and OLAP
• Indexing, Searching, and Querying
  • Keyword-based search
  • Pattern matching (XML/RDF)
• Knowledge discovery
  • Data Mining
  • Statistical Modeling
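Keyword-based search is usually built on an inverted index, which maps each term to the set of documents containing it. The following is a minimal sketch under simplifying assumptions: whitespace tokenization, lowercase matching, and AND semantics for multi-word queries. The names `build_index` and `search` and the sample documents are hypothetical and not taken from the talk.

```python
from collections import defaultdict


def build_index(docs):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the postings of the first term, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, for illustration only.
docs = {
    1: "sensor data from weather stations",
    2: "social networks and behavioral data",
    3: "satellite images and sensor logs",
}
index = build_index(docs)
hits = search(index, "sensor data")
```

At scale the index itself has to be partitioned and kept up to date, which is one reason indexing becomes a problem of its own.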
Big Data Applications
• In the Health Sector
  • Analysis of medical data and patient records to estimate readmission likelihood and take preemptive measures
  • Drug Discovery and Repurposing
• In Consumer Markets
  • Recommendation services
  • Behavioral patterns in Social Networks
  • Traffic pattern recognition
• In the Government Sector
  • Threat prediction
  • Social Program management
  • Crime prediction and prevention
No matter how big…
• You can get bigger!
  • By combining Big Data with other sources
• Several new challenges arise
  • How can I find new, relevant, useful data sources?
  • How can I use them?
  • How can I link them to my information?
• Two major contributing initiatives
  • Open Data
  • Linked Data
Thank you
Questions?