Application of Confidence Intervals to Text-based Social Network Construction By CDT Julie Jorgensen, 06, G4 Advisors: MAJ Ian McCulloh, D/MATH LTC John Graham, D/BS&L
Agenda • The Real-World Problem • Text Analysis/Social Network Analysis Solution • Social Network Analysis • Simple Text Analysis • A Better Solution • Themed Analysis • Example Case – Jihadist Texts • Theme Scores • Network Construction Procedure • Jihadist Network • Results • Importance and Conclusions
The Real-World Problem • Commanders need to understand “Human Terrain” • Majority of ‘HT’ information is in text form • The Combating Terrorism Center receives volumes of data every day. • Harmony Database is being rapidly declassified • Need an efficient way to plow through large amounts of text data and see the linkages. • Solution: Text Analysis Displayed in Social Network Analysis
Social Network Analysis • A mathematical method of quantifying connections between individuals or groups and drawing conclusions from those connections • Assumes rational beings are interdependent • Nodes • Key Actors • Links • Relationships between Nodes
[Figure; recoverable labels: Iraq Elections, Barzani, Khamenei]
Demonstration Data Set: Jihadist Texts • Approx. 250 translated texts • MEMRI • FBIS • Other Sources • 15 Authors • More than 1 text • Not well known
Simple Text Analysis: The Plagiarism Check Problem • Word matching is overly simple. • Ignores context • Actors can be overly weighted by writing more
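The last point, that an author can gain weight simply by writing more, can be illustrated with a small sketch. The slides do not say which word-matching scheme was used, so the function `raw_overlap` and the sample sentences below are hypothetical.

```python
def raw_overlap(text_a, text_b):
    """Naive word matching: count the words of text_a that occur anywhere in text_b."""
    vocabulary_b = set(text_b.lower().split())
    return sum(1 for word in text_a.lower().split() if word in vocabulary_b)

# Hypothetical sample texts, for illustration only.
reference = "the election will decide the future"
short_text = "the election matters"
long_text = "the election matters " * 5  # same content, repeated five times

short_score = raw_overlap(short_text, reference)
long_score = raw_overlap(long_text, reference)
print(short_score, long_score)
```

The repeated text adds no new information, yet its raw match count is five times higher, and the count says nothing about the context in which the shared words appear.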
Alternative: Themed Analysis • Traditional Network Analysis Methods • Citation Analysis • Physical Network • Communication or Financial Network • Themed Analysis • Relates nodes across multiple fields • One similar theme versus many similar themes
Theme Scores • Problem • Commander needs information in representations he/she understands. • Networks can compare authors across single themes • But difficult to compare authors across multiple themes *Theme Score is the sum of each word’s score per text
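A minimal sketch of the footnote's definition, a theme score as the sum of each word's score within one text. The lexicon `THEME_WORDS`, its weights, and the tokenizer are assumptions made for illustration; the slides do not list the actual theme words or their scores.

```python
import re

# Hypothetical theme lexicon (word -> score); not taken from the slides.
THEME_WORDS = {"election": 1.0, "vote": 1.0, "ballot": 0.5}

def theme_score(text, lexicon):
    """Sum the lexicon score of every word occurrence in a single text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0.0) for word in words)

# Hypothetical sample texts.
texts = [
    "The election is near; vote early.",
    "No ballot, no vote, no election.",
]
scores = [theme_score(text, THEME_WORDS) for text in texts]
print(scores)
```

Each author with more than one text then has several scores per theme, which is what makes an interval estimate per author and theme possible.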
Constructing a Network Across Multiple Themes • Scrub Texts • Construct Theme Scores • Construct Confidence Intervals • Discern Similarity between Nodes • Binary or Standardized Difference of Means • Create Square Matrix • Draw Network *why not ANOVA?
Confidence Intervals • 95% Confidence Interval = • Each Author, Each Theme • Example:
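The formula on this slide did not survive in the transcript. As a sketch only, the code below uses the textbook normal-approximation interval for a mean (sample mean plus or minus z times s over the square root of n); whether the original work used this form or a t-based interval is not stated. The function name and the sample scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(scores, level=0.95):
    """Normal-approximation confidence interval for the mean of one author's theme scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two texts to estimate a standard deviation")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    center = mean(scores)
    half_width = z * stdev(scores) / sqrt(n)
    return center - half_width, center + half_width

# Hypothetical theme scores for one author on one theme.
low, high = confidence_interval([2.0, 2.5, 3.0, 1.5])
print(round(low, 3), round(high, 3))
```

With only a handful of texts per author, a t quantile would give a wider and more cautious interval than the z value assumed here.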
Relationship Scores • Each possible pair of authors per theme • Overlapping Confidence Intervals • Disparate Confidence Intervals
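One way to turn a pair of intervals into a relationship score is sketched here. The binary rule (1 for overlapping intervals, 0 for disparate ones) follows the slide's wording; the exact form of the standardized difference of means is not given in the slides, so the unpooled two-sample version below is an assumption, as are the function names.

```python
from math import sqrt

def intervals_overlap(interval_a, interval_b):
    """Binary relationship score: 1 if two (low, high) intervals overlap, else 0."""
    return 1 if interval_a[0] <= interval_b[1] and interval_b[0] <= interval_a[1] else 0

def standardized_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Assumed form: absolute difference of means over its unpooled standard error."""
    return abs(mean_a - mean_b) / sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

# Hypothetical intervals for two authors on one theme.
linked = intervals_overlap((1.0, 3.0), (2.5, 4.0))
unlinked = intervals_overlap((1.0, 2.0), (3.0, 4.0))
difference = standardized_difference(2.0, 1.0, 4, 3.0, 1.0, 4)
print(linked, unlinked, round(difference, 3))
```

The binary score discards how close two authors are, while the standardized difference keeps that information at the cost of needing a rule for converting it into a link weight.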
Matrix Construction • Multiplication of Scores for each author and each theme • Geometric Mean = • Resultant Square Matrix
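The geometric-mean formula is missing from the transcript. The sketch below assumes the standard definition: for each pair of authors, multiply the per-theme relationship scores and take the root of order equal to the number of themes. The function name and the sample matrices are hypothetical.

```python
def combine_themes(theme_matrices):
    """Element-wise geometric mean of square author-by-author score matrices."""
    n_themes = len(theme_matrices)
    size = len(theme_matrices[0])
    combined = [[1.0] * size for _ in range(size)]
    for matrix in theme_matrices:
        for i in range(size):
            for j in range(size):
                combined[i][j] *= matrix[i][j]  # running product across themes
    return [[value ** (1.0 / n_themes) for value in row] for row in combined]

# Hypothetical relationship scores for two authors on two themes.
theme_one = [[1.0, 0.25], [0.25, 1.0]]
theme_two = [[1.0, 1.0], [1.0, 1.0]]
network = combine_themes([theme_one, theme_two])
print(network)
```

Because the scores are multiplied, a score of 0 on any single theme removes the link between two authors entirely, so only pairs that are similar on every theme stay connected in the resulting square matrix.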
Theme Analysis: Confidence Interval vs Average • Able to look at each theme individually. • Average Rank does not account for connection importance, weighting, or predictors • Themes are combined • Can see connections between authors across a combination of themes.
Conclusions • Socially Engineered Algorithms involve extensive tradeoffs and decisions by the mathematician that can significantly impact a commander’s decision-making. • Multiple views of the same data are a critical requirement. • Find Linkages in large amounts of data • Find Connections across multiple fields • Non-Tangible Relationships • Real World: Track / Catch criminals / radical ideologues • Representation of Human Terrain
Future Work • Publish method in the journal Computational and Mathematical Organization Theory • Integration into ORA (Organizational Risk Analyzer) Statistical Software: In use by Intelligence Analysts. • Analysis of change over time