MINFS544: Network-based Business Intelligence (BI) Feb 19th, 2013 Daning Hu, Ph.D., Department of Informatics, University of Zurich F. Schweitzer et al., Science 2009
Stop Contagious Failures in Banking Systems • During the 2008 financial tsunami, which bank(s) should we inject capital into first to stop contagious failures in bank networks?
Utilize Peer Influence in Online Social Networks • Intelligent Advertising, Product Recommendation • Who are the most influential people? • What are the patterns of information diffusion?
Develop Strategies to Attack Terrorist Networks A Global Salafi Jihad Terrorist Network Hu et al. JHSEM 2009 • How to effectively break down a terrorist network?
Network-based Business Intelligence Network-based (Modeling and Analysis) • Modeling and analyzing various real-world social and organizational networks to understand: • the cognitive and economic behaviors of the network actors; and • the dynamic processes behind the network evolution Based on the above… Business Intelligence (BI) • Design network-based BI algorithms and information systems to provide decision support in various application domains • Financial Risk Management, Security Informatics, and Knowledge Management, etc. • Network Analysis, Simulation of Network Evolution, Data Mining, etc.
MINFS544: Network-based Business Intelligence • Lecturer: Dr. Daning Hu; Teaching Assistant: Dr. Jiaqi Yan • Email: hdaning@ifi.uzh.ch, jackiejqyan@gmail.com • Credits: 3 ECTS credits • Class Meetings: Tue 14:00–15:45 or Thu 10:15–12:00 (please see the schedule) • Language: English • Audience: Master and doctoral students • Office Hours: Tue 13:00–14:00, Room 2.A.12 • Grading: Course report (term paper) 70%, presentation 20%, participation 10%
Grading • 1. A full research paper (70%). The format of this paper can be found at: http://icis2012.aisnet.org/index.php/submissions • * If possible, get it published in ICIS 2013 and get it cited. • This paper should include answers to the following questions: • What is the problem? • Why is it interesting and important? • Why is it hard? Why have previous approaches failed? • What are the key components of your approach? • What 1) models, 2) data sets and 3) metrics will be used to validate the approach?
Grading • 2. Oral presentation of the paper (using slides) + Q&A (20%) • For presentations, please see the slides on "How to give a good research talk" at: • http://research.microsoft.com/en-us/um/people/simonpj/papers/giving-a-talk/giving-a-talk.htm • 3. Active participation and interaction (10%)
A Brief History of Network Science • Mathematical foundation – Graph Theory 1736 1930 • Social Network Analysis and Theories • Sociogram: Network visualization • Six degrees of separation • Structural hole: Source of innovation 1990 • (Physicists) Complex Network Topologies • Small-world model (e.g., WWW) • Scale-free model (“Rich get richer”) 2000 • Network Science • Economic networks (Agent modeling & simulation) • Dynamic network analysis • BI applications: product diffusion in social media, recommendation systems 2012 • ?
Outline • Introduction • Dynamic Analysis of Dark Networks • A Global Salafi Jihad (GSJ) Terrorist Network • A Narcotic Criminal Network • A Network Approach to Managing Bank Systemic Risk • Ongoing Work • Conclusion
Dynamic Network Analysis (DNA) • Studying dynamic link formation processes behind network evolution. • Nodes forming links Network Evolution How What Why • Simulate the evolution of networks • Agent-based Modeling and Simulation • Examine network robustness • Model the changes in network evolution • Temporal changes in network topological measures • Dynamic network recovery on longitudinal data • Statistical analysis of determinants behind link formation • Homophily • Preferential attachment • Shared affiliations
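Preferential attachment, listed above as one determinant of link formation, can be illustrated with a small growth simulation. The following is a minimal sketch, not the method used in the lecture: the function name `grow_preferential`, the seed graph, and the parameters `n_nodes` and `m` are all assumptions made for illustration. Each new node links to `m` existing nodes chosen with probability proportional to their current degree ("rich get richer").

```python
import random

def grow_preferential(n_nodes, m, seed=0):
    """Grow an undirected network by preferential attachment (illustrative sketch).

    Starts from a complete graph on m + 1 nodes; every later node attaches
    to m distinct existing nodes, picked proportionally to their degree.
    """
    rng = random.Random(seed)
    # Seed graph: complete graph on nodes 0..m.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `endpoints` once per link it holds, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = grow_preferential(n_nodes=50, m=2)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print("nodes:", len(degree), "links:", len(edges), "max degree:", max(degree.values()))
```

Early nodes tend to accumulate many links in such a run, which is the mechanism usually associated with scale-free topologies.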
Research Testbed: A Global Terrorist Network • The Global Salafi Jihad (GSJ) network data was compiled by a former CIA operations officer, Dr. Marc Sageman: 366 terrorists • friendship, kinship, same religious leader, operational interactions, etc. • geographical origins, socio-economic status, education, etc. • when they joined and left GSJ • The goal of dynamic analysis • gain insights about the evolution of the GSJ network • develop effective attack strategies to break down the GSJ network Sample data of GSJ terrorists
Dynamic Network Analysis • Studying dynamic processes (i.e., link formation) behind network evolution. • Nodes’ behaviors Network Evolution How What Why • Simulate the evolution of networks • Agent-based Modeling and Simulation • Examine network robustness • Model the changes in network evolution • Temporal changes in network topological measures • Dynamic network recovery on longitudinal data • Statistical analysis of determinants behind link formation • Homophily • Preferential attachment • Shared affiliations
Temporal Changes in Network-level Measures • Fig. 1. Temporal changes in (a) the average degree and (b), (c) the degree distribution • Degree = number of links a node has
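The two network-level measures named in Fig. 1 can be computed per yearly snapshot as sketched below. This is a minimal illustration under stated assumptions: the `snapshots` data is a made-up toy example, not GSJ data, and `degree_stats` is a hypothetical helper name.

```python
from collections import Counter

def degree_stats(edges):
    """Return (average degree, degree distribution) for an undirected edge list.

    Only nodes that appear in at least one link are counted; isolated nodes
    would have to be supplied separately.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    avg = sum(degree.values()) / n if n else 0.0
    counts = Counter(degree.values())
    # Fraction of nodes having each degree k.
    dist = {k: c / n for k, c in sorted(counts.items())}
    return avg, dist

# Toy yearly snapshots (hypothetical data for illustration only).
snapshots = {
    1989: [("a", "b")],
    1990: [("a", "b"), ("a", "c"), ("b", "c")],
    1991: [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")],
}
for year, edges in snapshots.items():
    avg, dist = degree_stats(edges)
    print(year, round(avg, 2), dist)
```

Tracking these values year by year gives the kind of temporal curve the figure describes.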
Findings • There are three stages in the evolution of the GSJ network: • 1989 - 1993 The emerging stage: • The network grows in size • Accelerated growth - No. of edges increases faster than nodes • Random network topology (Poisson degree distribution) • 1994 - 2000 The mature stage: • The size of the network reached its peak in 2000 • Scale-free topology (Power-law degree distribution) • 2001 - 2003 The disintegration stage: • Falling into small disconnected components after 9/11
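For reference, the two degree distributions named in these stages have the following standard forms. The symbols are assumptions introduced here, not notation from the slides: $P(k)$ is the fraction of nodes with degree $k$, $\langle k \rangle$ is the average degree, and $\gamma$ is the power-law exponent.

```latex
% Random (Erdos-Renyi-like) topology: Poisson degree distribution
P(k) = \frac{e^{-\langle k \rangle}\,\langle k \rangle^{k}}{k!}

% Scale-free topology: power-law degree distribution
P(k) \propto k^{-\gamma}
```

A Poisson distribution is concentrated around the average degree, whereas a power law has a heavy tail, so a few nodes (hubs) hold a large share of the links.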
Temporal Changes in Node Centrality Measures • Fig. 2. Temporal changes in the degree and betweenness centrality of Osama Bin Laden • Degree: No. of links a node has • Betweenness of a node i • No. of shortest paths from all nodes to all others that pass through node i • Measures i’s influence on the traffic (information, resources) flowing through it
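The betweenness definition above can be sketched with Brandes' algorithm for unweighted, undirected graphs. This is an illustrative implementation, not code from the lecture. It assumes the common convention that, when several shortest paths connect a pair of nodes, each path contributes a fractional share. The function name `betweenness` and the toy graph are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Betweenness centrality via Brandes' algorithm (unweighted, undirected).

    adj maps each node to an iterable of its neighbours.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

# Toy path graph a - b - c - d (hypothetical data).
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(betweenness(path))
```

In the path graph, the two inner nodes carry all traffic between the ends, so they score higher than the end nodes even though no node has a high degree.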
Findings and Possible Explanations • 1994 – 1996: A sharp decrease in Bin Laden’s betweenness • 1994: Saudi Arabia revoked his citizenship and expelled him • 1995: Went to Sudan and was expelled again under U.S. pressure • 1996: Went to Afghanistan and established camps there • 1998 – 1999: Another sharp decrease in his betweenness • After the 1998 bombings of U.S. embassies, Bill Clinton ordered a freeze on assets linked to bin Laden (top 10 most wanted) • August 1998: A failed U.S. assassination attempt on him • 1999: The UN imposed sanctions against Afghanistan to force the Taliban to extradite him
Research Testbed: A Narcotic Criminal Network • The COPLINK dataset contains 3 million police incident reports from the Tucson Police Department (1990 to 2006). • 3 million incident reports and 1.44 million individuals • Their personal and sociological information (age, ethnicity, etc.) • Time information: when two individuals co-offend • AZ Inmate affiliation data: when and where an inmate was housed • A Narcotic Criminal Network • 19,608 individuals involved in organized narcotic crimes • 29,704 co-offending pairs (links) Table 1. Summary of the COPLINK dataset and the Arizona inmate dataset
Statistical Analysis of Determinants for Link Formation • Proportional hazards model (Cox Regression Analysis) • Homophily in age (group) and race • Shared affiliations: • Mutual acquaintances (through crimes) • Vehicle affiliation (same vehicle used by two in different crimes) Fig. 3. Results of multivariate survival (Cox regression) analysis of triadic closure (link formation).
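The general form of the proportional hazards model referred to above is given below. The notation is an assumption introduced for illustration: $h_{ij}(t)$ is the hazard that individuals $i$ and $j$ form a link at time $t$, $h_0(t)$ is an unspecified baseline hazard, and $x_{ij}$ collects pair-level covariates such as age homophily, race homophily, mutual acquaintances, and vehicle affiliation.

```latex
h_{ij}(t \mid x_{ij}) = h_0(t)\,\exp\!\left(\beta^{\top} x_{ij}\right)
                      = h_0(t)\,\exp\!\left(\beta_1 x_{ij,1} + \dots + \beta_p x_{ij,p}\right)
```

Under this form, $\exp(\beta_k)$ is the hazard ratio for a one-unit increase in covariate $k$: a value above 1 means the covariate is associated with faster link formation, holding the other covariates fixed.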
BI Application: Co-offending Prediction in COPLINK • IBM’s COPLINK is an intelligent police information system that aims to help speed up the crime detection process. • COPLINK calculates the co-offending likelihood score based on the proportional hazards model. • It produces a ranked list of individuals based on their predicted likelihood of co-offending with the suspect under investigation. Fig. 4. Screenshots of the COPLINK system
Simulate Attacks on Dark Networks • Three attack (i.e., node removal) strategies: • Attack on hubs (highest degree) • Attack on bridges (highest betweenness) • Real-world attack (attack order based on real-world data) • Simulate two types of attacks to examine the robustness of the dark networks • Simultaneous attacks (the degree/betweenness of nodes is NOT updated after each removal) – Static • Progressive attacks (the degree/betweenness of nodes is updated after each removal) – Dynamic
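The difference between simultaneous (static) and progressive (dynamic) attacks can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation used in the study: it covers only the hub (degree-based) strategy, measures damage by the size of the largest connected component, and runs on a made-up toy graph. The function names are hypothetical. A bridge attack would follow the same loop with a betweenness score in place of degree.

```python
def build_graph(edges):
    """Build an undirected adjacency map (node -> set of neighbours)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def largest_component(adj):
    """Size of the largest connected component (depth-first search)."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

def remove_node(adj, node):
    """Return a copy of the graph without `node` and its links."""
    return {v: {w for w in nbrs if w != node}
            for v, nbrs in adj.items() if v != node}

def hub_attack(adj, k, progressive):
    """Remove k highest-degree nodes; return largest-component size after each step.

    progressive=False: ranking fixed before any removal (simultaneous/static).
    progressive=True:  degrees recomputed after every removal (dynamic).
    Ties are broken by node name to keep the run deterministic.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    static_ranking = sorted(adj, key=lambda v: (-len(adj[v]), v))
    sizes = []
    for step in range(k):
        if progressive:
            target = min(adj, key=lambda v: (-len(adj[v]), v))
        else:
            target = static_ranking[step]
        adj = remove_node(adj, target)
        sizes.append(largest_component(adj))
    return sizes

# Toy network (hypothetical data).
toy = build_graph([("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
                   ("B", "C"), ("B", "D"), ("E", "F"), ("F", "G"), ("F", "H")])
print("simultaneous:", hub_attack(toy, 2, progressive=False))
print("progressive: ", hub_attack(toy, 2, progressive=True))
```

In this toy graph, removing the top hub lowers the degree of the node that ranked second, so the progressive attack picks a different second target and fragments the network further than the static ranking does.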
Hub vs. Bridge Attacks • Both hub and bridge attacks are far more effective than real-world arrests – Policy implications? • Both dark networks are more vulnerable to bridge attacks than hub attacks. • Bridge (highest betweenness): Field lieutenants, operational leaders, etc. • Hub (highest degree): e.g., Bin Laden
Summary and Contributions • We developed a set of Dynamic Network Analysis (DNA) methods that are effective in • Linking network topological changes to analytical insights • Systematically capturing the link formation processes • Examining the determinants of link formation • Dark networks are • robust against real-world attacks • but vulnerable to targeted bridge attacks • COPLINK provides real-time decision support for fighting crimes.
Research Readings and Resources • 1. Networks Overview: • * Statistical mechanics of complex networks, Section III, VI • http://rmp.aps.org/abstract/RMP/v74/i1/p47_1 • * Networks, Crowds, and Markets: • http://www.cs.cornell.edu/home/kleinber/networks-book/ • 2. Networks in Finance: • * Financial Networks blog and research databases: • WRDS database • http://www.financialnetworkanalysis.com/research-database/ • http://www.stern.nyu.edu/networks/electron.html • * Company Board Social Networks
Research Readings and Resources (cont.) • 3. Networks in Marketing: • * Sinan Aral’s research in networks and marketing • Peer influence • http://web.mit.edu/sinana/www/ • * Social Media based Marketing: • http://searchengineland.com/guide/what-is-social-media-marketing • 4. Recommender Systems: • http://www-cs-students.stanford.edu/~adityagp/recom.html • 5. Word-of-Mouth Effects in Social Networks: • http://papers.ssrn.com/sol3/papers.cfm?abstract_id=393042&