Cross-Platform Aviation Analytics Using Big-Data Integration Methods
2013 Integrated Communications Navigation and Surveillance (ICNS) Conference, April 25, 2013
Dr. Tulinda Larsen, Vice President
tulinda@masflight.com
Mobile: +1 (443) 510-3566
4833 Rugby Avenue, Suite 301, Bethesda, Maryland 20814
www.masflight.com
THE ANALYSIS CHALLENGE
The scale and complexity of aviation data limit research applications
Problems Acquiring Data
• Obtaining radar and airport data, schedules, weather maps and forecasts, fleet information
• Real-time transmission of very large data
• Proprietary and inconsistent formats
• No conditioning or validation
Problems Analyzing Information
• Using data for strategic planning and recovery, cost improvement and new market opportunities
• Goes beyond desktop capability
• Time-consuming manual slicing of data
• Need weather and competitor information to answer key operational questions
Diagram labels: Fuel and Oil Conservation; Gate and Terminal Use; Weather Plan & Ops Recovery; Pilot and Crew Staffing; Operational Optimization
Big-data analytical methods can address these challenges
CLOUD COMPUTING
What is Cloud Computing?
The cloud consists of terrestrial servers across the Internet that collectively store, manage and process data
• Figurative “Cloud”: the term comes from the common use of a cloud-shaped symbol as an abstraction for the Internet, but its application to virtual servers is as recent as 2006
• Cloud computing is the use of resources (hardware and software) that are delivered as a service over the Internet or other network
Figure: Cloud Computing Architecture. Layers: Application, Platform, Infrastructure. Other labels: Identity, Network, Monitoring, Content, Financial Metrics, Object Storage, Communication, Computation, Collaboration, Databases, Storage
CLOUD COMPUTING
What are Cloud Architectures?
• In aviation, cloud resources can be customized and shared among consortiums of customers (community cloud) or shared with customers in other industries (public cloud)
• Cloud computing services can be delivered by an internal IT organization (company-owned private cloud) or by an external service provider (managed services private cloud or public cloud provider)
Diagram labels: Shared between users (Public Cloud Providers, Community Cloud Providers); Private Infrastructure (Managed Services Private Providers, Company Private Clouds); axis from Low Industry Expertise to High Industry Focus
BIG-DATA ANALYTICS
What is Big-Data Analytics?
• The process of examining diverse, large-scale data sets to uncover patterns, unknown correlations and other useful information
• Organizations have different levels of (1) database management expertise and (2) knowledge to process and analyze big data sets
• “Big data” is a relative term based on the user
• Data tables in excess of ten terabytes (10 TB) are difficult to work with using most relational database management systems, and particularly using desktop statistics and visualization packages, including Microsoft Excel and Access
• Unstructured data sources in the operational world simply do not fit into desktop or small-scale database structures
• These sources can be hosted using cloud computing at lower cost, and mined more efficiently, than with on-premises database architectures
BIG-DATA ANALYTICS
What are Big-Data Analytics Tools?
• Big-data analytics employs software tools from advanced analytics disciplines such as data mining and predictive analytics
• Mining data, trends or analysis of these multi-terabyte data sets requires parallel software running on tens, hundreds, or even thousands of servers to keep pace with user demands and processing expectations
• A new class of big-data methods has emerged to address user demands for horizontal scaling and availability of underlying data
• Hadoop and MapReduce, among others, offer fast processing speed
• Great for large-scale static data sets, but not so great for real-time data
• Most organizations employ a hybrid method combining technologies
• A robust open source framework supports processing in clustered systems
• Platform-as-a-service vendors (Microsoft, Amazon, Google) offer turn-key solutions for analysts to simply upload, link and compute basic data sets
• Great for simple historical analysis; bad for real-time or diverse data sets
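The MapReduce pattern named above can be illustrated on a single machine. The sketch below is not Hadoop code and not masFlight's implementation; it only mimics the map, shuffle and reduce phases in plain Python. The flight records, airport codes and function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical records: (origin airport, departure delay in minutes).
FLIGHTS = [("IAD", 12), ("IAH", 0), ("IAD", 45), ("ORD", 30), ("IAH", 5)]


def map_phase(record):
    """Emit one (key, value) pair per record: airport -> (delay, 1)."""
    airport, delay = record
    return airport, (delay, 1)


def shuffle(pairs):
    """Group emitted values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Combine all values for one key into (total delay, flight count)."""
    return reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), values)


def mean_delay_by_airport(records):
    """Run map, shuffle and reduce, then derive the mean delay per airport."""
    grouped = shuffle(map(map_phase, records))
    totals = {key: reduce_phase(values) for key, values in grouped.items()}
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    print(mean_delay_by_airport(FLIGHTS))
```

Because each key is reduced independently, the per-airport work could in principle be spread across many servers, which is the horizontal scaling the slide refers to.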
MASFLIGHT
masFlight: A Global Aviation Data Warehouse and Big-Data Analytics Platform
• Hybrid Architecture
  • Physical architecture for secure data feeds
  • Cloud-based instances for linking
  • Managed cloud data tables
  • Integrates with local BI and warehouses
• Redundancy
  • Multi-source data acquisition
  • Real-time validation and processing
  • Replication across cloud infrastructure
  • Load balancing and parallel processing
• Backup
  • Cluster processing to reduce dependencies
  • Monitored data integrity and performance
  • Multiple geographic zones and clusters
  • Imaging of tables for replication
• Customization
  • Customizable for specific user requirements
  • Dashboards and web templates
  • Integrated internal data in warehouse
  • Connect to local BI systems
DATA AND APPLICATIONS
masFlight’s Data and Applications Platform
OUR CLOUD-BASED DATA WAREHOUSE
• Data Input Feeds
  • Reference and Static Data: geospatial, airline, airport info
  • Current Weather: global hourly conditions
  • Forecast Weather: standard and severe forecasts
  • Flight Schedules: what’s planned to operate
  • Airport & Gate Status: multisource, real-time feeds
  • U.S./Canada Radar: authorized direct access
  • Other Airspace Data: satellite and transponder info
  • Government Economic Data: revenue and audited data
• In-House Servers: for private gov’t feeds
• Cloud Warehouse: linked information, 60 TB structured data
OUR CUSTOMER APPLICATIONS
• Web Application (masflight.com): HTML 5 / Ruby, analyst focused, customizable, fast deployment, SaaS revenue model
• Dashboards & Web Services: REST web services, feed internal systems, custom dashboards, flexible interfaces
• Cloud Managed Database Hosting: virtual tables, updated in real time, bypass constraints, ultimate customization
Other diagram labels: Secure External Network; Secure; Robots and Java Applications; Automated collection
MASFLIGHT PLATFORM
masFlight Platform: multisource, integrated airline operations data
Our platform shows where, when and why problems occur
• Examine diversions, cancellations, delays and determine root causes
• Deep-dive into airport gates, taxi times, and runway patterns
• Analyze air space usage and air traffic management
Data sources shown: Planned Flight Schedules; Airport Runway Data; Airport Gate & Terminal Data; Airline Ops Data; Multisource Flight Status; U.S. Radar Data; Airline Fleet Information; Global Weather Data and Maps
Key Partners and Suppliers: [logos]
END TO END CAPABILITY
Big-Data Analytics Facilitates End-to-End Analysis
A full picture of each flight is critical for analyzing operations
• Query flights from planned schedule through post-operation recovery
• Up to 500 data points per flight
Example flight diagram labels: KIAD, V268, SWANN, 1502Z, 1550Z, 1620Z
Data points shown:
• Origin weather
• Origin information
• Operating airline
• Scheduled times
• Departure gate/time
• Taxi-out/takeoff times
• Arrival weather
• Destination information
• Landing/taxi times
• Arrival gate/time
• Diversion data
• Aircraft information
• Flight plan filed
• Actual path flown
• Congestion
• Weather diversions
• En-route times and fixes
Other sources only offer limited, disaggregated and unformatted regional data
COVERAGE
A Global Solution
masFlight tracks flights, airports and weather around the world
• Global daily flight information capture
  • 82,000 flights
  • 350 airlines
  • 1,700 airports
• Integrated weather data for 6,000 stations
  • Match weather to delays
  • Validate block forecasts at granular level
  • Add weather analytics to IRROPS review and scenario planning
Map panels: North and South America; EMEA and Asia
White lines are flights in the masFlight platform from February 8, 2013. Yellow pins are weather stations feeding hourly data to our platform. Maps from Google Earth / masFlight
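Matching weather to delays amounts to joining each flight to the hourly observation for its airport. A minimal sketch follows, assuming weather is keyed by station identifier and the start of the observation hour; the flight numbers, delay values, weather values and field names are hypothetical, and the actual masFlight schema is not described in the slides.

```python
from datetime import datetime

# Hypothetical hourly observations keyed by (station, start of hour).
WEATHER = {
    ("KIAD", datetime(2013, 2, 8, 15)): "snow",
    ("KIAH", datetime(2013, 2, 8, 15)): "clear",
}

# Hypothetical flights; "dep" is the actual departure time.
FLIGHTS = [
    {"flight": "XX100", "origin": "KIAD",
     "dep": datetime(2013, 2, 8, 15, 50), "delay_min": 62},
    {"flight": "XX200", "origin": "KIAH",
     "dep": datetime(2013, 2, 8, 15, 2), "delay_min": 4},
    {"flight": "XX300", "origin": "KORD",
     "dep": datetime(2013, 2, 8, 16, 20), "delay_min": 0},
]


def attach_weather(flights, weather):
    """Return copies of the flights with the origin's hourly weather added.

    Flights with no matching observation get weather=None, so missing
    coverage stays visible instead of being silently dropped.
    """
    matched = []
    for flight in flights:
        hour = flight["dep"].replace(minute=0, second=0, microsecond=0)
        observation = weather.get((flight["origin"], hour))
        matched.append({**flight, "weather": observation})
    return matched


if __name__ == "__main__":
    for row in attach_weather(FLIGHTS, WEATHER):
        print(row["flight"], row["delay_min"], row["weather"])
```

Truncating the departure time to the hour is one simple matching rule; nearest-observation matching would be an alternative when reports are not aligned to the hour.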
TOWER CLOSINGS
Example 1: Proposed FAA Tower Closures
masFlight used big-data methods to link airport operations across three large data sets:
• Current and historical airline schedules
• Raw Aircraft Situation Display to Industry (ASDI) radar data from the FAA
• Enhanced Traffic Management System Counts (ETMS), including airport operations counts by type (commercial, freight, etc.), departure and arrival
Findings: Proposed Tower Closings
Map: dots indicate closures; red dots have scheduled service
• From schedules database: 55 airports with scheduled passenger airline service
  • 14 EAS airports
• From ASDI & ETMS: 10,600 weekly flights on a flight plan (excluding VFR and local traffic)
  • 6,500 Part 91/125 weekly flights
  • 4,100 Part 135/121 weekly flights
Based on scheduled service March 1 – 7, 2013; scheduled service includes scheduled charter flights, cargo flights, and passenger flights
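The linking step can be sketched as a set intersection plus a grouped count. The example below assumes each radar flight has already been tagged with an airport and an operating rule (Part 91, 121, 125 or 135); how masFlight derives that tag is not stated in the slides. The airport codes and records are placeholders, not real data.

```python
from collections import Counter

# Placeholder inputs; none of these are real airports or counts.
CLOSURE_LIST = {"AAA", "BBB", "CCC"}
SCHEDULED_AIRPORTS = {"BBB", "CCC", "ZZZ"}
RADAR_FLIGHTS = [
    ("AAA", "91"), ("BBB", "121"), ("BBB", "135"),
    ("CCC", "125"), ("ZZZ", "121"),
]

# Grouping used on the slide: Part 91/125 versus Part 135/121.
GROUPS = {
    "91": "Part 91/125", "125": "Part 91/125",
    "135": "Part 135/121", "121": "Part 135/121",
}


def summarize(closures, scheduled, flights):
    """Link the closure list to schedules and radar flights.

    Returns the closure airports that have scheduled service and the
    number of flights per rule group at closure airports only.
    """
    with_service = sorted(closures & scheduled)
    counts = Counter(
        GROUPS[part]
        for airport, part in flights
        if airport in closures and part in GROUPS
    )
    return with_service, dict(counts)


if __name__ == "__main__":
    print(summarize(CLOSURE_LIST, SCHEDULED_AIRPORTS, RADAR_FLIGHTS))
```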
TOWER CLOSINGS
Example 1: Big-Data Analytics Applied to ASDI and ETMS to Analyze Operations
Source: ASDI radar data, Part 91/125 flying and Part 135/121 flying, March 1-7, 2013; masFlight analysis
Note: Average “daily” operations based on 5-day week
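The 5-day convention in the note is a simple division of a weekly count. As a worked illustration only, the sketch below applies it to the system-wide weekly totals quoted in the tower-closure findings (10,600, 6,500 and 4,100); the chart's own per-airport values are not reproduced in this transcript, and the function name is hypothetical.

```python
def average_daily_operations(weekly_operations, days_per_week=5):
    """Average operations per day, given a weekly count and a day basis."""
    if days_per_week <= 0:
        raise ValueError("days_per_week must be positive")
    return weekly_operations / days_per_week


if __name__ == "__main__":
    # Weekly totals as quoted in the findings; the division is illustrative.
    for label, weekly in [("All flight-plan flights", 10_600),
                          ("Part 91/125", 6_500),
                          ("Part 135/121", 4_100)]:
        print(label, average_daily_operations(weekly))
```

Dividing by five rather than seven raises the daily figure by 40 percent, so the day basis should be stated whenever such averages are compared.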
CAUSAL FACTORS
Example 2: Aviation Safety Causal Factor
Data-mining algorithms can mine the text of safety reports to obtain specific data that can be used to analyze causal factors. For example, consider the following ASRS report (ACN 1031837):
“Departing IAH in a 737-800 at about 17,000 FT, 11 miles behind a 737-900 on the Junction departure over CUZZZ Intersection. Smooth air with wind on the nose bearing 275 degrees at 18 KTS. We were suddenly in moderate chop which lasted 4 or 5 seconds then stopped and then resumed for another 4 or 5 seconds with a significant amount of right rolling… I selected a max rate climb mode in the FMC in order to climb above the wake and flight path of the leading -900. We asked ATC for the type ahead of us and reported the wake encounter. The -900 was about 3,300 FT higher than we were.”
Synopsis
• B737-800 First Officer reported wake encounter from preceding B737-900 with resultant roll and moderate chop.
What causal factors can be identified from this narrative that could be applied to future predictive applications?
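One simple way to pull structured values out of such a narrative is pattern matching. The sketch below uses regular expressions on an excerpt of the report quoted above; it is an illustration of the idea, not the data-mining method masFlight used, and the patterns and field names are assumptions tuned to this one narrative.

```python
import re

# Excerpt of the ASRS narrative quoted above (first two and last sentences).
NARRATIVE = (
    "Departing IAH in a 737-800 at about 17,000 FT, 11 miles behind a "
    "737-900 on the Junction departure over CUZZZ Intersection. Smooth air "
    "with wind on the nose bearing 275 degrees at 18 KTS. "
    "The -900 was about 3,300 FT higher than we were."
)


def to_int(text):
    """Convert a number written with thousands separators to an int."""
    return int(text.replace(",", ""))


def extract_factors(narrative):
    """Extract aircraft types, altitudes, trail distance and wind."""
    aircraft = re.findall(r"\b7\d7-\d{3}\b", narrative)
    altitudes = [
        to_int(value)
        for value in re.findall(r"(\d{1,3}(?:,\d{3})*)\s*FT\b", narrative)
    ]
    trail = re.search(r"(\d+)\s+miles behind", narrative)
    wind = re.search(r"(\d{3}) degrees at (\d+) KTS", narrative)
    return {
        "aircraft_types": aircraft,
        "altitudes_ft": altitudes,
        "trail_distance_miles": int(trail.group(1)) if trail else None,
        "wind_degrees_knots": (
            (int(wind.group(1)), int(wind.group(2))) if wind else None
        ),
    }


if __name__ == "__main__":
    print(extract_factors(NARRATIVE))
```

Fixed patterns like these break easily when reporters phrase things differently, so a production pipeline would likely need broader patterns or a trained text-mining model.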
CAUSAL FACTORS
Example 2: Identifying Causal Factors
Big data gives us visibility into contextual factors even when specific data points, such as a date or route, are missing. Big-data analytics gives us insight into unreported factors as well.
COMPARING OTP AND UTILIZATION
Example 3: Correlating Utilization and Delays
Daily Utilization vs. On-time Departures, January 2013 System Operations
Correlation coefficient: -0.53
Chart panels: Narrowbodies by Day of Week; Widebodies by Day of Week
Includes AA, AC, AS, B6, F9, FL, NK, UA, US, VX and WN
SOURCE: masFlight (masflight.com)
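The slide does not say which correlation measure produced -0.53; the sketch below assumes the common Pearson coefficient and computes it with the standard library only. The utilization and on-time values are invented for illustration and are not masFlight data.

```python
from math import sqrt

# Hypothetical daily values: aircraft utilization (hours per day) and
# share of departures that left on time.
UTILIZATION_HOURS = [9.5, 10.0, 10.5, 11.0, 11.5]
ON_TIME_SHARE = [0.84, 0.80, 0.82, 0.76, 0.78]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    print(round(pearson_r(UTILIZATION_HOURS, ON_TIME_SHARE), 2))
```

A negative coefficient means days with higher utilization tended to have fewer on-time departures; it does not by itself show that one causes the other.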
UTILIZATION BY HUB
Example 4: Daily Utilization of Gates, by Hub
Big-data analysis of different carriers: daily departures per gate used, June 1 through August 31, 2012. Gates with minimum 1x daily use
SOURCE: masFlight (masflight.com)
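A metric like "daily departures per gate used" can be computed by counting departures and distinct gates per hub and day. The sketch below assumes one reading of the slide's filter, namely that a gate counts on a given day only if it had at least one departure that day; the slide does not define the filter precisely. Hub names, gate names and records are placeholders.

```python
from collections import defaultdict

# Placeholder departure records: (hub, gate, date).
DEPARTURES = [
    ("HUB1", "A1", "2012-06-01"), ("HUB1", "A1", "2012-06-01"),
    ("HUB1", "A2", "2012-06-01"), ("HUB1", "A1", "2012-06-02"),
    ("HUB2", "B1", "2012-06-01"),
]


def departures_per_gate(records):
    """Mean daily departures per gate used, for each hub."""
    departures = defaultdict(int)  # (hub, day) -> departure count
    gates_used = defaultdict(set)  # (hub, day) -> gates with >= 1 departure
    for hub, gate, day in records:
        departures[(hub, day)] += 1
        gates_used[(hub, day)].add(gate)

    daily_ratios = defaultdict(list)  # hub -> one ratio per observed day
    for (hub, day), count in departures.items():
        daily_ratios[hub].append(count / len(gates_used[(hub, day)]))

    return {hub: sum(r) / len(r) for hub, r in daily_ratios.items()}


if __name__ == "__main__":
    print(departures_per_gate(DEPARTURES))
```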
CONCLUSIONS
Conclusions for Big Data in Aviation
• Big data transforms operational and commercial problems that were practically unsolvable using discrete data and on-premises hardware
• Big data offers new insight into existing data by centralizing data acquisition and consolidation in the cloud and mining data sets efficiently
• There is a rich portfolio of information that can feed aviation data analytics
  • Flight position, schedules, airport/gate, weather and government data sets offer insight into the underlying causes of aviation inefficiency
  • The excessive size of each set forces analysts to consider cloud-based architectures to store, link and mine the underlying information
  • When structured, validated and linked, these data sources become significantly more compelling for applied research than they are individually
• Today’s cloud-based technologies offer a solution
CONCLUSIONS
Conclusions: Our Approach
• masFlight’s data warehouse and analysis methods provide a valuable example for others attempting cloud-based analytics of aviation data sets
• masFlight’s hybrid architecture, consolidating secure data feeds in on-premises server installations and feeding structured data into the cloud for distribution, addresses the unique format, security and scale requirements of the industry
• masFlight’s method is well suited for airline performance review, competitive benchmarking, airport operations and schedule design, and has demonstrated value in addressing real-world problems in airline and airport operations as well as government applications