
Application of Confidence Intervals to Text-based Social Network Construction



Presentation Transcript


  1. Application of Confidence Intervals to Text-based Social Network Construction • By CDT Julie Jorgensen, 06, G4 • Advisors: MAJ Ian McCulloh, D/MATH; LTC John Graham, D/BS&L

  2. Agenda • The Real-World Problem • Text Analysis/Social Network Analysis Solution • Social Network Analysis • Simple Text Analysis • A Better Solution • Themed Analysis • Example Case – Jihadist Texts • Theme Scores • Network Construction Procedure • Jihadist Network • Results • Importance and Conclusions

  3. The Real-World Problem • Commanders need to understand “Human Terrain” • Majority of ‘HT’ information is in text form • The Combating Terrorism Center receives volumes of data every day. • Harmony Database is being rapidly declassified • Need an efficient way to plow through large amounts of text data and see the linkages. • Solution: Text Analysis Displayed in Social Network Analysis

  4. Social Network Analysis • A mathematical method of quantifying connections between individuals or groups and drawing conclusions from those connections • Assumes rational beings are interdependent • Nodes: key actors • Links: relationships between nodes
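
The node and link vocabulary on this slide can be sketched as a small data structure. This is only an illustration: the actor labels and link weights below are hypothetical, and the talk does not say how its networks were stored.

```python
from collections import defaultdict


def build_network(links):
    """Return an undirected adjacency map from (node_a, node_b, weight) triples."""
    adjacency = defaultdict(dict)
    for node_a, node_b, weight in links:
        # A link is a relationship between two nodes, so store it in both directions.
        adjacency[node_a][node_b] = weight
        adjacency[node_b][node_a] = weight
    return dict(adjacency)


def degree(adjacency, node):
    """Number of distinct nodes linked to `node`."""
    return len(adjacency.get(node, {}))


# Hypothetical actors and link weights, for illustration only.
links = [("A", "B", 1.0), ("A", "C", 0.5), ("B", "C", 0.2)]
network = build_network(links)
```

Counting a node's links with `degree(network, "A")` is one of the simplest ways to flag a candidate key actor.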

  5. “Human Terrain” Example: 9/11 Hijacker Network

  6. Iraq Elections (network figure; node labels include Barzani and Khamenei)

  7. Demonstration Data Set: Jihadist Texts • Approx. 250 translated texts (MEMRI, FBIS, other sources) • 15 authors (more than 1 text each; not well known)

  8. Simple Text Analysis: The Plagiarism Check Problem • Word matching is overly simple. • Ignores context • Actors can be overly weighted by writing more
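
A toy sketch of the weakness described on this slide, assuming the simplest form of word matching (counting distinct shared words). The sample strings are invented and the function name is hypothetical.

```python
def shared_word_count(text_a, text_b):
    """Count distinct words that appear in both texts (naive word matching)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    return len(words_a & words_b)


# Invented sample texts: the longer one simply contains more words to match.
short_text = "the river is wide"
long_text = "the river is wide and the bank is far and the market is open today"
query = "the market bank is open"

short_score = shared_word_count(query, short_text)
long_score = shared_word_count(query, long_text)
```

The longer text scores higher mainly because it contains more words, and "bank" matches whether it means a riverbank or a financial institution, which is the context problem the slide points to.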

  9. Alternative: Themed Analysis • Traditional Network Analysis Methods: citation analysis; physical network; communication or financial network • Themed Analysis: relates nodes across multiple fields; one similar theme versus many similar themes

  10. Demonstration: Text Analysis

  11. Theme Scores • Problem: the commander needs information in representations he/she understands. • Networks can compare authors across single themes, but it is difficult to compare authors across multiple themes. • *Theme Score is the sum of each word’s score per text
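
The footnote definition (a theme score is the sum of each word's score per text) might look like the sketch below. The lexicon, its weights, and the choice to count every repeated occurrence are assumptions; the slide does not say how word scores were assigned.

```python
def theme_score(text, word_scores):
    """Sum the theme score of every word occurrence in `text`.

    Words missing from the theme lexicon contribute zero. Counting repeated
    occurrences separately is an assumption of this sketch.
    """
    tokens = text.lower().split()
    return sum(word_scores.get(token, 0.0) for token in tokens)


# Hypothetical theme lexicon and text, for illustration only.
theme_lexicon = {"alpha": 2.0, "beta": 1.0}
text = "alpha beta gamma alpha"
score = theme_score(text, theme_lexicon)
```

Here `score` is 5.0: two occurrences of "alpha" at 2.0 each, one "beta" at 1.0, and nothing for "gamma".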

  12. Constructing a Network Across Multiple Themes • Scrub Texts • Construct Theme Scores • Construct Confidence Intervals • Discern Similarity between Nodes (Binary or Standardized Difference of Means) • Create Square Matrix • Draw Network • *Why not ANOVA?

  13. Confidence Intervals • 95% Confidence Interval = • Each Author, Each Theme • Example:
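
The formula on this slide did not survive in the transcript. As a stand-in, the sketch below uses the textbook interval for a mean: sample mean plus or minus a critical value times s / sqrt(n), computed for one author on one theme. Using a normal critical value is an assumption; with only a few texts per author, a Student t critical value would give a wider interval. The scores are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(scores, level=0.95):
    """Return (lower, upper) bounds for the mean of `scores`.

    Uses a normal approximation; this is an assumption, not necessarily
    the formula shown on the original slide.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to estimate spread")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    centre = mean(scores)
    half_width = z * stdev(scores) / sqrt(len(scores))
    return centre - half_width, centre + half_width


# Invented theme scores for one author's texts on one theme.
scores = [4.0, 6.0, 5.0, 7.0, 3.0]
lower, upper = confidence_interval(scores)
```

For these five scores the interval is roughly 3.61 to 6.39, centred on the mean of 5.0.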

  14. Relationship Scores • Each possible pair of authors per theme • Overlapping Confidence Intervals • Disparate Confidence Intervals
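
One way to turn a pair of intervals into a relationship score, sketched under assumptions: the binary version scores 1 when two authors' intervals on a theme overlap and 0 when they are disparate, and the standardized version divides the difference of means by its standard error. The slides do not show how the standardized difference was mapped to a link weight, so that step is left out. All numbers are invented.

```python
from math import sqrt
from statistics import mean, variance


def intervals_overlap(ci_a, ci_b):
    """True when two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


def binary_relationship(ci_a, ci_b):
    """1 for overlapping confidence intervals, 0 for disparate ones."""
    return 1 if intervals_overlap(ci_a, ci_b) else 0


def standardized_difference(scores_a, scores_b):
    """Absolute difference of means divided by its standard error."""
    standard_error = sqrt(
        variance(scores_a) / len(scores_a) + variance(scores_b) / len(scores_b)
    )
    difference = abs(mean(scores_a) - mean(scores_b))
    if standard_error == 0:
        # No spread in either sample: identical means give 0, otherwise infinity.
        return 0.0 if difference == 0 else float("inf")
    return difference / standard_error


# Invented intervals and theme scores for two authors on one theme.
overlapping = binary_relationship((3.6, 6.4), (4.6, 7.4))
disparate = binary_relationship((1.0, 2.0), (3.0, 4.0))
gap = standardized_difference([4.0, 6.0, 5.0, 7.0, 3.0], [5.0, 7.0, 6.0, 8.0, 4.0])
```

A small standardized difference suggests the two authors treat the theme similarly; a large one suggests they do not.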

  15. Matrix Construction • Multiplication of Scores for each author and each theme • Geometric Mean = • Resultant Square Matrix
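
The geometric mean formula is missing from the transcript. The sketch below assumes the standard definition: for each pair of authors, multiply the relationship scores across the K themes and take the K-th root, giving one author-by-author square matrix. The matrices are invented, and scores are assumed to be non-negative.

```python
from math import prod


def combine_themes(theme_matrices):
    """Combine per-theme square score matrices with a geometric mean.

    `theme_matrices` is a list of author-by-author matrices (lists of lists),
    one per theme, all the same size and holding non-negative scores.
    """
    n_themes = len(theme_matrices)
    size = len(theme_matrices[0])
    combined = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            product = prod(matrix[i][j] for matrix in theme_matrices)
            combined[i][j] = product ** (1.0 / n_themes)
    return combined


# Invented relationship scores for three authors on two themes.
theme_a = [[1.0, 0.25, 0.0], [0.25, 1.0, 1.0], [0.0, 1.0, 1.0]]
theme_b = [[1.0, 1.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
combined = combine_themes([theme_a, theme_b])
```

Because the scores are multiplied, a zero on any single theme removes the link between that pair of authors in the combined matrix, so only pairs that are similar on every theme stay connected.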

  16. Themed Network

  17. Theme Analysis: Confidence Interval vs Average • Able to look at each theme individually. • Average Rank does not account for connection importance, weighting, or predictors • Themes are combined • Can see connections between authors across a combination of themes.

  18. Method Comparison

  19. Conclusions • Socially Engineered Algorithms involve extensive tradeoffs and decisions by the mathematician that can significantly impact a commander’s decision-making. • Having multiple views of the same data is a critical requirement. • Find Linkages in large amounts of data • Find Connections across multiple fields • Non-Tangible Relationships • Real World: Track / Catch criminals / radical ideologues • Representation of Human Terrain

  20. Future Work • Publish method in the journal Computational and Mathematical Organization Theory • Integration into ORA (Organizational Risk Analyzer) statistical software, in use by intelligence analysts • Analysis of change over time

  21. Questions?

  22. References • Dr. Jaret Brachman. Combating Terrorism Center, USMA. • Dr. Steven Corman. Hugh Downs School of Human Communication, Arizona State University. • http://www.checkpoint-online.ch/CheckPoint/Images/N-HusseinCapture.jpg • http://www.salmac.co.za/profile-writing-arabic.gif • Wasserman, Stanley and Katherine Faust. Social Network Analysis: Methods and Applications. New York: Cambridge University Press, 1994, 4.
