Impact of different relation extraction methods on network analysis results

Presentation Transcript


  1. Impact of different relation extraction methods on network analysis results
     Jana Diesner

  2. Motivation
     • Need: scalable, reliable, robust methods & tools
     • Stages: Text Data, Network Data, Applications
     • Text data: unstructured, at any scale
     • Network analysis: answer substantive and graph-theoretic questions; develop and test hypotheses and theories
     • Visualizations
     • Populate databases
     • Input to further computations, e.g. simulations, machine learning

  3. Research Questions and Relevance
     • How do network data and analysis results obtained by using different relation extraction methods compare to each other?
     • Why does it matter?
       • Increased comparability, generalizability, and transparency of methods and tools
       • Increased control and power for developers and users
       • Supports drawing reasonable and valid conclusions

  4. Relation Extraction Methods
     • Methods compared: Meta-data (META); Subject Matter Experts (SME); Text, automated (TextA); Text, manual (TextM)
     • Diagram labels: Meta-data; Database query; Codebook; Proximity-based linkage of nodes (shown three times)
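The transcript does not define "proximity-based linkage of nodes". One common reading is that two node terms are linked when they occur within a fixed number of tokens of each other. The following is a minimal sketch under that assumption; the function name `link_by_proximity`, the `window` parameter, and the toy sentence are hypothetical and not taken from the talk.

```python
from collections import Counter


def link_by_proximity(tokens, node_terms, window=5):
    """Count undirected links between node terms whose token positions
    are fewer than `window` apart (an illustrative assumption)."""
    # Keep only the tokens that are known node terms, with their positions.
    positions = [(i, t) for i, t in enumerate(tokens) if t in node_terms]
    links = Counter()
    for a in range(len(positions)):
        i, u = positions[a]
        for b in range(a + 1, len(positions)):
            j, v = positions[b]
            if j - i >= window:
                break  # positions are sorted, so later ones are even farther
            if u != v:
                links[tuple(sorted((u, v)))] += 1
    return links


# Toy input, invented for illustration only.
tokens = "alice met bob in paris before bob flew to cairo with carol".split()
nodes = {"alice", "bob", "paris", "cairo", "carol"}
links = link_by_proximity(tokens, nodes, window=4)
```

A larger window yields denser networks, so the window size is one of the choices that can change downstream analysis results.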

  5. Data
     • Large-scale, over-time, open-source data from different domains

  6. Results I
     • Text, automated vs. manual: the total number of nodes of sub-type “generic” is far higher than of sub-type “specific”
       • Rethink the focus of network analysis: collectives vs. individuals
       • Importance of detecting unnamed entities
     • Ground truth data (SME) are hardly resembled by networks built from text bodies, and not at all by meta-data networks
       • In the most ideal case, 50% of nodes and 20% of links
     • Agreement in structure and key entities depends on the type of network
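The transcript reports node and link agreement as percentages but does not say how they were computed. A plausible minimal sketch, assuming agreement means the share of reference (e.g. SME) nodes or undirected links that also appear in another network, is shown here. The helper names and the toy data are hypothetical and do not reproduce the talk's data or results.

```python
def overlap_share(reference, candidate):
    """Fraction of items in `reference` that also appear in `candidate`."""
    reference, candidate = set(reference), set(candidate)
    if not reference:
        return 0.0
    return len(reference & candidate) / len(reference)


def normalize_links(links):
    """Treat links as undirected by sorting each endpoint pair."""
    return {tuple(sorted(link)) for link in links}


# Toy data, invented for illustration only.
sme_nodes = {"a", "b", "c", "d"}
sme_links = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
text_nodes = {"a", "b", "c", "x"}
text_links = [("b", "a"), ("a", "x"), ("c", "b")]

node_share = overlap_share(sme_nodes, text_nodes)
link_share = overlap_share(normalize_links(sme_links), normalize_links(text_links))
```

Link agreement can be at most as high as the node agreement allows, since a link can only match when both of its endpoints match.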

  7. Results II
     • Agreement among text-based networks, and between text-based and meta-data networks, depends on the type of network
     • For a more complete view, combine the automated text-based network with the meta-data network
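How the networks would be combined is not specified in the transcript. One simple option, sketched here as an assumption, is to take the union of undirected links and record which source contributed each link. The function name `combine_networks` and the toy links are hypothetical.

```python
def combine_networks(**networks):
    """Union of undirected link sets. Maps each link to the set of
    source names (the keyword names) in which it was found."""
    combined = {}
    for source, links in networks.items():
        for u, v in links:
            key = tuple(sorted((u, v)))
            combined.setdefault(key, set()).add(source)
    return combined


# Toy data, invented for illustration only.
text_links = [("a", "b"), ("b", "c")]
meta_links = [("b", "a"), ("c", "d")]
combined = combine_networks(text=text_links, meta=meta_links)

# Links confirmed by more than one source.
shared = {link for link, sources in combined.items() if len(sources) > 1}
```

Keeping the source labels makes it possible to analyze the merged network while still separating links that only one extraction method found.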

  8. Acknowledgements
     • This work was supported by the National Science Foundation (NSF) IGERT 9972762, the Army Research Institute (ARI) W91WAW07C0063, the Army Research Laboratory (ARL/CTA) DAAD19-01-2-0009, the Air Force Office of Scientific Research (AFOSR) MURI FA9550-05-1-0388, the Office of Naval Research (ONR) MURI N00014-08-11186, and a Siebel Scholarship. Additional support was provided by the CASOS Center at Carnegie Mellon University. The views and conclusions contained in this talk are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of the NSF, ARI, ARL, AFOSR, ONR, or the United States Government.
     • Thank you! Questions, comments, feedback: jdiesner@illinois.edu
