The United States air transportation network analysis Dorothy Cheung
Introduction • The problem and its importance • Missing Pieces • Related works in summary • Methodology • Data set • Network Generation • Network Analysis • Conclusion
Outline • The problem and its importance • Missing Pieces • Related works • Methodology • Data set • Network Generation • Network Analysis • Conclusion
The problem and its importance • Problem • Analyze the air transportation network in the U.S. • Network driven by profits and politics • Goal: better understand the network structure, not maximize utility • Importance • Economy: transport of goods and services • Air traffic flow: convenience • Health studies: propagation of diseases
Missing pieces • Ample research on the network exists, but it focuses on utility optimization. • Commercial enterprises: OAG and Innovata • But … there is a lack of research analyzing the network features studied in class.
Related works – Air transportation network analysis • WAN – World-wide Airport Network • ANI – Airport Network of India • ANC – Airport Network of China
Related works – Summary: Features of air transportation networks • Small-world network (compared with random graphs) • Small average shortest path • High average clustering coefficient • Degree mixing differs • Scale-free, power-law degree distribution
Methodology • Data Set • Network Generation • Network Analysis
Methodology – Data Set • Data source (diagram): T100, OAI, RITA, BTS database, my data • Legend • OAI: Office of Airline Information • RITA: Research and Innovative Technology Administration • BTS: Bureau of Transportation Statistics
Methodology – Data Set Domestic Air Traffic Hubs [1]
Methodology – Data Set • Domestic scheduled flights • Passengers, cargo, and mail • Military excluded • Market data vs. segment data • Market: used • Counts a passenger once for as long as they stay on the same flight number • Segment: not used • Counts a passenger once per leg, so more than once on a multi-leg flight • Month specific: July 2011
Methodology – Data Set • Relevant information • Number of passengers • Amount of cargo: freight and mail • Origin city • Destination city • Figure: sample .csv from BTS
Methodology – Network Generation • Network • 850 nodes: airports • 21405 entries • Weighted edges: sum of passengers and cargo • Directed and undirected network input files for Pajek [2] and GUESS [5].
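The edge weighting described on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the row layout (origin, destination, passengers, cargo) and the sample values are made up, not taken from the BTS file, and the name build_edges is hypothetical. Each row adds passengers plus cargo to its origin-destination pair; the undirected version merges both directions.

```python
from collections import defaultdict

def build_edges(rows):
    """Aggregate (origin, destination, passengers, cargo) rows into weighted edges.

    The row layout is an assumption for illustration, not the BTS schema.
    """
    directed = defaultdict(float)
    for origin, dest, passengers, cargo in rows:
        # Edge weight = sum of passengers and cargo, as stated on the slide.
        directed[(origin, dest)] += passengers + cargo

    undirected = defaultdict(float)
    for (origin, dest), weight in directed.items():
        # Merge A->B and B->A under one order-independent key.
        key = tuple(sorted((origin, dest)))
        undirected[key] += weight
    return dict(directed), dict(undirected)

# Made-up sample rows, for illustration only.
rows = [
    ("SFO", "LAX", 100, 5),
    ("SFO", "LAX", 50, 0),
    ("LAX", "SFO", 80, 2),
    ("JFK", "SFO", 30, 1),
]
directed, undirected = build_edges(rows)
print(directed)
print(undirected)
```

Several rows for the same city pair collapse into one edge, which is why the number of edges can be smaller than the number of entries in the data set.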
Methodology – Network Generation • Network generation tool (GenerateNwk) written in C# using LINQ (Language Integrated Query) • Pipeline (diagram): .CSV -> ParseCSV (Microsoft.Jet.OLEDB.4.0 provider) -> Data Table -> LINQ -> output files • Output files: PajekDirected.net, PajekUndirected.net, GUESSDirected.gdf, GUESSUndirected.gdf
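The original tool is written in C#; as a rough Python sketch of the last stage only, the function below writes a weighted edge dictionary in the Pajek .net text layout (a *Vertices section followed by *Arcs for directed or *Edges for undirected links). The function name to_pajek and the sample weights are hypothetical and are not part of the author's tool.

```python
def to_pajek(edges, directed=True):
    """Render {(source, target): weight} as Pajek .net text.

    Vertices are numbered from 1 in alphabetical order of their labels.
    """
    nodes = sorted({name for pair in edges for name in pair})
    index = {name: i for i, name in enumerate(nodes, start=1)}

    lines = [f"*Vertices {len(nodes)}"]
    lines += [f'{index[name]} "{name}"' for name in nodes]
    # Pajek uses *Arcs for directed links and *Edges for undirected ones.
    lines.append("*Arcs" if directed else "*Edges")
    for (source, target), weight in sorted(edges.items()):
        lines.append(f"{index[source]} {index[target]} {weight}")
    return "\n".join(lines) + "\n"

# Made-up weights, for illustration only.
sample = {("SFO", "LAX"): 155.0, ("JFK", "SFO"): 31.0}
text = to_pajek(sample, directed=True)
print(text)
```

Writing the returned string to a file such as PajekDirected.net gives an input that Pajek can open directly.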
Methodology – Network Generation The U.S. Air Transportation Network drawn in Pajek
Methodology – Network Analysis • Metrics • Degree distributions and correlations • Top 10 most connected cities • Top 10 most central cities • Small-world network? • Shortest path length • Clustering coefficient • Compare against WAN, ANI, and ANC • Cumulative degree distribution and the power law • Resilience • Assortativity: rich-club? • Random graph • Z-score: TBD?
Methodology – Network Analysis • Degree distributions and correlations • Directed network • Pajek: • In degree: Net -> Partitions -> Degree -> Input • Out degree: Net -> Partitions -> Degree -> Output • Both: Net -> Partitions -> Degree -> All • Shortest path length • Directed network • Pajek: • Net -> Paths between 2 vertices -> Diameter • Clustering coefficient • Directed network • Pajek: • Net -> Vector -> Clustering Coefficients -> CC1
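For readers without Pajek, the three measures on this slide can be approximated in plain Python. The sketch below is an assumption-laden illustration, not a reproduction of Pajek's algorithms: path lengths are averaged over reachable ordered pairs only, and the clustering coefficient ignores arc direction. The toy arcs are invented.

```python
from collections import deque

def degrees(arcs):
    """Return (in_degree, out_degree) dictionaries for a list of (u, v) arcs."""
    indeg, outdeg = {}, {}
    for u, v in arcs:
        outdeg[u] = outdeg.get(u, 0) + 1
        indeg[v] = indeg.get(v, 0) + 1
        indeg.setdefault(u, 0)
        outdeg.setdefault(v, 0)
    return indeg, outdeg

def average_path_length(arcs):
    """Mean shortest path length over reachable ordered pairs (BFS per node)."""
    succ = {}
    for u, v in arcs:
        succ.setdefault(u, set()).add(v)
        succ.setdefault(v, set())
    total = pairs = 0
    for source in succ:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in succ[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def average_clustering(arcs):
    """Mean local clustering coefficient, treating arcs as undirected."""
    nbrs = {}
    for u, v in arcs:
        if u != v:
            nbrs.setdefault(u, set()).add(v)
            nbrs.setdefault(v, set()).add(u)
    if not nbrs:
        return 0.0
    coeffs = []
    for node, ns in nbrs.items():
        k = len(ns)
        if k < 2:
            coeffs.append(0.0)
            continue
        # Count links among the neighbours of this node.
        links = sum(1 for a in ns for b in ns if a < b and b in nbrs[a])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

# Invented toy network, for illustration only.
arcs = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "D")]
indeg, outdeg = degrees(arcs)
print(indeg, outdeg)
print(average_path_length(arcs), average_clustering(arcs))
```

A small average path length together with a clustering coefficient well above that of a comparable random graph is the usual small-world signature.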
Methodology – Network Analysis • Cumulative degree distribution and the power law • Directed network • Step 1 in Pajek: • Create a partition of all degrees • Export the partition to a tab-delimited file • Tools -> Export to Tab Delimited File -> Current Partition • Step 2 in MATLAB [6]: • Generating a power law integer distribution: X = GetInput.m reads the partition from the tab-delimited file (X => X.name, X.label, X.degree) • Calculating the cumulative distribution with cumulativecounts.m [4]: [xlincumulative, ylincumulative] = cumulativecounts(X.degree)
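The cumulative step can be sketched in Python as well. This is an analogous calculation written for illustration, not the cumulativecounts.m script itself, and the degree list is invented. For each distinct degree k it returns the fraction of nodes with degree at least k.

```python
from collections import Counter

def cumulative_distribution(degree_list):
    """Return (ks, fractions), where fractions[i] = P(degree >= ks[i])."""
    counts = Counter(degree_list)
    n = len(degree_list)
    ks = sorted(counts)
    fractions = []
    remaining = n  # nodes with degree >= current k
    for k in ks:
        fractions.append(remaining / n)
        remaining -= counts[k]
    return ks, fractions

# Invented degree sequence, for illustration only.
ks, fractions = cumulative_distribution([1, 1, 2, 3, 3, 3, 8])
print(ks)
print(fractions)
```

Plotted on log-log axes, a power-law degree distribution shows up as an approximately straight line in this cumulative curve.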
Methodology – Network Analysis • Resilience • What percentage of nodes must be removed to reduce the size of the giant component by half? • Consider: • Random attack • Targeted attack: remove nodes with the highest degree and betweenness centrality measures • Undirected network with 850 nodes • GUESS toolbars: resiliencedegree.py and resiliencebetweenness.py, downloaded from CTools [4] • Compare against a random network (random and targeted attacks) • GUESS: makeSimpleRandom(numberOfNodes, numberOfEdges) with numberOfNodes = 850 and numberOfEdges = 21405
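The resilience question can be illustrated with a small standalone Python sketch. It is not the GUESS resiliencedegree.py script: it assumes an undirected edge list, covers only random and degree-targeted removal (betweenness is omitted), and recomputes degrees after every removal. The function names and the star-shaped toy network are hypothetical.

```python
import random

def giant_component_size(nbrs, removed):
    """Size of the largest connected component among nodes not in `removed`."""
    seen = set(removed)
    best = 0
    for start in nbrs:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in nbrs[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best

def fraction_to_halve(edges, targeted, seed=0):
    """Fraction of nodes removed before the giant component is at most half its size."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    if not nbrs:
        return 0.0
    target = giant_component_size(nbrs, set()) / 2
    rng = random.Random(seed)
    removed = set()
    while giant_component_size(nbrs, removed) > target:
        alive = sorted(n for n in nbrs if n not in removed)
        if targeted:
            # Remove the surviving node with the highest current degree.
            victim = max(alive, key=lambda n: len(nbrs[n] - removed))
        else:
            victim = rng.choice(alive)
        removed.add(victim)
    return len(removed) / len(nbrs)

# Invented star network: one hub and nine leaves.
star = [("H", f"L{i}") for i in range(9)]
targeted_fraction = fraction_to_halve(star, targeted=True)
random_fraction = fraction_to_halve(star, targeted=False, seed=1)
print(targeted_fraction, random_fraction)
```

On a hub-dominated network, the targeted attack needs far fewer removals than a random one, which is the contrast the comparison against a random graph is meant to expose.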
Methodology – Network Analysis • Assortativity: rich-club? • Draw conclusions from graphical analysis in GUESS • Random graph • Difficulty in constructing a realistic random network that models the real network [3]. • Z-score? • To be determined.
Methodology – Network Analysis • Expectations/Predictions • Larger-degree nodes are more central (betweenness). Consider LAX, SFO, HOU, JFK, etc. • Small world as compared to WAN, ANI, and ANC • Scale-free, power-law degree distribution • Disassortative
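The slide plans a graphical check in GUESS and leaves the z-score open. As a possible numerical complement, and purely as an assumption about how one might proceed, the sketch below computes the standard rich-club coefficient phi(k) = 2 E_k / (N_k (N_k - 1)), where N_k is the number of nodes with degree above k and E_k the number of links among them, plus a plain z-score of an observed value against values from random graphs. The function names and toy numbers are hypothetical.

```python
from statistics import mean, pstdev

def rich_club(edges, k):
    """Rich-club coefficient for nodes with degree > k, or None if fewer than two."""
    nbrs = {}
    for u, v in edges:
        if u != v:
            nbrs.setdefault(u, set()).add(v)
            nbrs.setdefault(v, set()).add(u)
    rich = {n for n, ns in nbrs.items() if len(ns) > k}
    if len(rich) < 2:
        return None
    # Each link between rich nodes is seen from both ends, hence the halving.
    links = sum(1 for u in rich for v in nbrs[u] if v in rich) // 2
    return 2 * links / (len(rich) * (len(rich) - 1))

def z_score(observed, random_values):
    """Standardize an observed metric against values from random graphs.

    Uses the population standard deviation; that choice is an assumption.
    """
    spread = pstdev(random_values)
    if spread == 0:
        return None
    return (observed - mean(random_values)) / spread

# Invented toy network: a triangle A-B-C with one leaf on A and one on B.
toy = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D"), ("B", "E")]
print(rich_club(toy, 1), rich_club(toy, 2), rich_club(toy, 3))
print(z_score(5, [1, 3]))
```

A coefficient that stays high as k grows, and a large positive z-score against random graphs, would suggest that the best-connected airports link preferentially to one another.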
Conclusion: The United States air transportation network analysis • The problem and its importance • Missing pieces • Related works: WAN, ANI, ANC • Methodology • Data set: BTS (Bureau of Transportation Statistics) • Network generation: directed and undirected network input files • Network analysis: • Degree distribution • Small-world network as compared to WAN, ANI, and ANC • Cumulative degree distribution and power law • Resilience • Assortativity • Z-score: TBD?
References for this presentation • [1] T-100 reporting guide, RITA, http://www.rita.dot.gov/, www.transtats.bts.gov, http://www.bts.gov/programs/airline_information/. • [2] Pajek, program for large network analysis, http://vlado.fmf.uni-lj.si/pub/networks/pajek/. • [3] Albert-Laszlo Barabasi and Reka Albert, “Emergence of Scaling in Random Networks”, Department of Physics, University of Notre Dame, October 1999. • [4] CTools, https://ctools.umich.edu/portal. • [5] GUESS, graph exploration system, http://graphexploration.cond.org/. • [6] MATLAB, the language of technical computing, http://www.mathworks.com/products/matlab/index.html.