Explore the impact of big data in the aviation industry, from optimizing capital investments to detecting potential threats in real-time. Learn how to analyze and integrate large volumes of data for improved decision-making.
Big Data Technologies for Civil and Defense Aviation. Bruce Brown, Big Data Technical Specialist, brownb@us.ibm.com
“Data is the new Oil” “At the World Economic Forum last month in Davos, Switzerland, Big Data was a marquee topic. A report by the forum, ‘Big Data, Big Impact,’ declared data a new class of economic asset, like currency or gold.” “Big Data has arrived at Seton Health Care Family, fortunately accompanied by an analytics tool that will help deal with the complexity of more than two million patient contacts a year…” “Increasingly, businesses are applying analytics to social media such as Facebook and Twitter, as well as to product review websites, to try to ‘understand where customers are, what makes them tick and what they want’, says Deepak Advani, who heads IBM’s predictive analytics group.” “Companies are being inundated with data, from information on customer-buying habits to supply-chain efficiency. But many managers struggle to make sense of the numbers.” “…now Watson is being put to work digesting millions of pages of research, incorporating the best clinical practices and monitoring the outcomes to assist physicians in treating cancer patients.” “The Oscar Senti-meter, a tool developed by the L.A. Times, IBM and the USC Annenberg Innovation Lab, analyzes opinions about the Academy Awards race shared in millions of public messages on Twitter.” “Data is the new oil.” (Clive Humby) In its raw form, oil has little value. Once processed and refined, it helps power the world.
The Characteristics of Big Data • Volume: cost-efficiently processing the growing volume (50x growth from 2010 to 2020, to 35 ZB) • Velocity: responding to the increasing velocity (30 billion RFID sensors and counting) • Variety: collectively analyzing the broadening variety (80% of the world’s data is unstructured) • Veracity: establishing the veracity of big data sources (1 in 3 business leaders don’t trust the information they use to make decisions)
Vestas optimizes capital investments based on 2.5 Petabytes of information. • Model the weather to optimize placement of turbines, maximizing power generation and longevity. • Reduce time required to identify placement of turbine from weeks to hours. • Incorporate 2.5 PB of structured and semi-structured information flows. Data volume expected to grow to 6 PB.
University of Ontario Institute of Technology (UOIT) Detects Neonatal Patient Symptoms Sooner Capabilities Utilized: Stream Computing • Performing real-time analytics using physiological data from neonatal babies • Continuously correlates data from medical monitors to detect subtle changes and alert hospital staff sooner • Early warning gives caregivers the ability to proactively deal with complications Significant benefits: • Helps detect life threatening conditions up to 24 hours sooner • Lower morbidity and improved patient care “Helps detect life threatening conditions up to 24 hours sooner”
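The idea of continuously comparing monitor readings against a recent baseline can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `detect_deviations`, the window size, the threshold, and the sample heart-rate values are all hypothetical, and the rolling z-score rule is a generic technique, not the method UOIT or IBM stream computing actually uses.

```python
from collections import deque
from statistics import mean, stdev
from typing import Iterable, Iterator, Tuple


def detect_deviations(
    readings: Iterable[float], window: int = 5, threshold: float = 3.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for readings far from the recent rolling baseline."""
    recent = deque(maxlen=window)  # only the last `window` readings are kept
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag the reading if it lies more than `threshold` standard
            # deviations away from the mean of the preceding window.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        recent.append(value)


# Hypothetical heart-rate samples (beats per minute), one per time step.
heart_rate = [120, 121, 119, 120, 122, 121, 120, 150, 121, 120]
alerts = list(detect_deviations(heart_rate))
print(alerts)
```

Because the detector keeps only a fixed-size window in memory, it can run over an unbounded stream, which is the property the slide attributes to stream computing.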
TerraEchos Turns to IBM Big Data for Low Latency Surveillance Data Analysis Capabilities Utilized: Stream Computing • Deployed security surveillance system to detect, classify, locate, and track potential threats at highly sensitive national lab • Stream computing collects and analyzes acoustic data from fiber-optic sensor arrays • Analyzed acoustic data fed into TerraEchos intelligence platform for threat detection, classification, prediction & communication Significant benefits: • Enables TerraEchos solution to analyze and classify streaming acoustic data in real-time • Provides lab & security staff with holistic view of potential threats & non-issues • Enables a faster and more intelligent response to any threat “Identifies and classifies potential security threats – miles away”
Pacific Northwest Smart Grid Demonstration Project Capabilities: • Stream Computing – real-time control system • Data Warehouse Appliance – analyze massive data sets • Demonstrates scalability from 100 to 500K homes while retaining 10 years’ historical data • 60k metered customers in 5 states • Accommodates ad hoc analysis of price fluctuation, energy consumption profiles, risk, fraud detection, grid health, etc.
In Order to Realize New Opportunities, You Need to Think Beyond Traditional Sources of Data. Transactional and Application Data: • Volume • Structured • Throughput. Machine Data: • Velocity • Semi-structured • Ingestion. Social Data: • Variety • Highly unstructured • Veracity. Enterprise Content: • Variety • Highly unstructured • Volume
Leveraging Big Data Requires Multiple Platform Capabilities • Federated Discovery and Navigation: understand and navigate federated big data sources • Hadoop File System, MapReduce: manage & store huge volume of any data • Data Warehousing: structure and control data • Stream Computing: manage streaming data • Text Analytics Engine: analyze unstructured data • Integration, Data Quality, Security, Lifecycle Management, MDM: integrate and govern all data sources
Business-centric Big Data enables you to start with a critical business pain and expand the foundation for future requirements • “Big data” isn’t just a technology; it’s a business strategy for capitalizing on information resources • Getting started is crucial • Success at each entry point is accelerated by products within the Big Data platform • Build the foundation for future requirements by expanding further into the big data platform
1 – Unlock Big Data Customer Need: • Understand existing data sources • Expose the data within existing content management and file systems for new uses, without copying the data to a central location • Search and navigate big data from federated sources Value Statement: • Get up and running quickly and discover and retrieve relevant big data • Use big data sources in new information-centric applications Customer examples: • Procter & Gamble – Connect employees with a 360° view of big data sources Get started with: IBM Vivisimo Velocity
2 – Analyze Raw Data Customer Need: • Ingest data as-is into Hadoop and derive insight from it • Process large volumes of diverse data within Hadoop • Combine insights with the data warehouse • Low-cost ad hoc analysis with Hadoop to test new hypotheses Value Statement: • Gain new insights from a variety and combination of data sources • Overcome the prohibitively high cost of converting unstructured data sources to a structured format • Extend the value of the data warehouse by bringing in new types of data and driving new types of analysis • Experiment with analysis of different data combinations to modify the analytic models in the data warehouse Customer examples: • Financial Services Regulatory Org – managed additional data types and integrated with their existing data warehouse Get started with: InfoSphere BigInsights
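To make "ingest data as-is and derive insight from it" concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce steps that Hadoop MapReduce distributes across a cluster. It does not use any Hadoop or BigInsights API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample log lines are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one raw text record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a single machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical unstructured log lines, stored as-is with no schema.
logs = ["engine fault reported", "engine check passed", "fault cleared"]
counts = run_job(logs)
print(counts["engine"], counts["fault"])
```

The raw records are never converted to a structured format up front; structure is imposed only inside `map_phase`, which is what keeps this style of ad hoc analysis cheap to try.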
3 – Simplify your Warehouse Customer Need: • Business users are hampered by the poor performance of analytics of a general-purpose enterprise warehouse – queries take hours to run • Enterprise data warehouse is encumbered by too much data for too many purposes • Need to ingest huge volumes of structured data and run multiple concurrent deep analytic queries against it • IT needs to reduce the cost of maintaining the data warehouse Value Statement: • Speed and Simplicity for deep analytics (Netezza) • 100s to 1000s users/second for operational analytics (IBM Smart Analytics System) Customer examples: • Catalina Marketing – executing 10x the amount of predictive workloads with the same staff Get started with: IBM Warehouse Solutions
4 – Reduce costs with Hadoop Customer Need: • Reduce the overall cost to maintain data in the warehouse – often it’s seldom used and kept ‘just in case’ • Lower costs as data grows within the data warehouse • Reduce expensive infrastructure used for processing and transformations Value Statement: • Support existing and new workloads on the most cost-effective alternative, while preserving existing access and queries • Lower storage costs • Reduce processing costs by pushing processing onto commodity hardware and the parallel processing of Hadoop Customer examples: • Financial Services Firm – move processing of applications and reports to Hadoop HBase while preserving existing queries Get started with: IBM InfoSphere BigInsights
5 – Analyze Streaming Data Customer Need: • Harness and process streaming data sources • Select valuable data and insights to be stored for further processing • Quickly process and analyze perishable data, and take timely action Value Statement: • Significantly reduced processing time and cost – process and then store what’s valuable • React in real-time to capture opportunities before they expire Customer examples: • Ufone – Telco Call Detail Record (CDR) analytics for customer churn prevention Get started with: InfoSphere Streams
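The "process and then store what's valuable" pattern can be sketched as a single pass over a stream that discards most records and keeps only a small derived result. The sketch below is hypothetical throughout: the `CallRecord` fields, the `select_at_risk` function, and the dropped-call threshold are invented for illustration, and it is plain Python rather than InfoSphere Streams code or Ufone's actual churn model.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CallRecord:
    """A heavily simplified, hypothetical call detail record."""
    customer_id: str
    dropped: bool


def select_at_risk(stream: Iterable[CallRecord], max_dropped: int = 2) -> List[str]:
    """Return customers whose dropped-call count exceeds `max_dropped`.

    Each record is inspected once as it arrives; only a per-customer
    counter and the flagged IDs are retained, never the raw records.
    """
    dropped_counts: Counter = Counter()
    flagged: List[str] = []
    for record in stream:
        if not record.dropped:
            continue  # successful calls are discarded immediately
        dropped_counts[record.customer_id] += 1
        # Flag exactly once, at the moment the threshold is first exceeded.
        if dropped_counts[record.customer_id] == max_dropped + 1:
            flagged.append(record.customer_id)
    return flagged


stream = [
    CallRecord("a", True),
    CallRecord("b", False),
    CallRecord("a", True),
    CallRecord("a", True),
    CallRecord("b", True),
]
at_risk = select_at_risk(stream)
print(at_risk)
```

Flagging at the moment the threshold is crossed, rather than after a batch load, is what lets an operator act on perishable data while the customer can still be retained.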
Entry Points are Accelerated by Products Within the Big Data Platform • 1 – Unlock Big Data: IBM Vivisimo • 2 – Analyze Raw Data: InfoSphere BigInsights • 3 – Simplify your warehouse: IBM Warehouse Solutions • 4 – Reduce costs with Hadoop: InfoSphere BigInsights • 5 – Analyze Streaming Data: InfoSphere Streams. Analytic Applications: BI / Reporting, Exploration / Visualization, Functional App, Industry App, Predictive Analytics, Content Analytics. IBM Big Data Platform: Visualization & Discovery, Application Development, Systems Management, Accelerators, Hadoop System, Stream Computing, Data Warehouse, Information Integration & Governance
Big Data Platform: Video/Imagery Analytics [diagram labels: Operating System; Transport; System S Data Fabric; X86 Blade; X86 Blade; X86 Box; FPGA Blade; Cell Blade; Real-time Events Tracking and Linking (Actionable Intelligence); Cognos/i2/BigSheets/Browser Visualization; Historical View; Broadcast Video; Visual Semantic Classification; Machine Learning; User-Generated Content Sites; InfoSphere BigInsights; Bootstrap and Enrich; Video Blogs; Offline Video Analytics; Real-Time Video Analytics]
Automatic Semantic Classification of Virat Data http://www.viratdata.org/ http://ibm64c.watson.ibm.com/imars/virat/
The Grand Challenge: Analyze a Large Volume and Variety of Streaming and Static Data to Produce Actionable Intelligence Complex Event Processing and Inter-correlated effects to other aspects of ABA Patterns of Life and Behavior Modeling Social Networks Open Source News Entity Relationships and Contextual Relevance Find the relevant dots, connect them, tell me what I don’t know, keep it up to date. System of Reference: Social, Political, Weather, etc., influences and constraints on Observation Space Video Activity Detection and Tracking Historical Data Predictive Modeling and Cognitive Awareness Cell Phone Anomaly Detection
The IBM Big Data Platform Enables Complex ABA Architectures! Volumes of raw data (structured and unstructured) in file systems (often highly distributed) Data Mining, Data Exploration, and Predictive Analytics Data Warehouse Real-Time Analytics Event Detection Situational Awareness InfoSphere BigInsights InfoSphere Information Server InfoSphere Identity Insights Global Name Recognition InfoSphere Sensemaking (Future) Operational Data Store Relationship Resolution Entity Analytics IBM Confidential InfoSphere Streams Text Analytics & Natural Language Processing Traditional data sources (ERP, CRM, databases, etc.) Real-time streaming data (structured and unstructured)
SMARC Big Data Solution Architecture InfoSphere Streams Online Flow: Data-in-motion analysis Social Media Data Entity Analytics: Profile Resolution Text Analytics: Timely Insights Predictive Analytics: Action Determination Timely Decisions Data Ingest & prep. Comprehensive Social Media Customer profiles Social Media Data Customer Models Text Analytics Entity Analytics and Integration Predictive Analytics InfoSphere BigInsights Offline Flow: Data-at-rest analysis • Large-scale data-at-rest analysis using InfoSphere BigInsights • Large-scale data-in-motion analysis using InfoSphere Streams • Advanced text analysis, entity integration, and predictive modeling using common analytics infrastructure across Streams and BigInsights
Flight Data Solution [diagram labels: Operating System; Transport; System S Data Fabric; X86 Blade; X86 Blade; X86 Box; FPGA Blade; Cell Blade; Cognos Visualization and Intelligence; High Performance Analysis; Data Mining; ARINC 717 Flight Data; Text Analytics; Data Archival (PB+); Analytics; Pilot unstructured Data (e.g. Emails, Text Files); Real-time Flight Data Monitoring; InfoSphere BigInsights]