Statistical and Graph-Theoretical Approaches to Time-Varying Multigraphs
Robert Bell, Colin Goodall (PI), Sylvia Halasz
AT&T Labs–Research

Goal of Project
• To analyze dynamic multigraphs in telecom, blog, and intelligence data and apply automated anomaly detection to them
• Have communication patterns changed?
• Volume of communications
• Types of communications
• New connections

Builds on Biosurveillance Work
• Ongoing work of Goodall, Halasz, et al.
• Timely, automated detection of outbreaks
• Flu or other illnesses
• Pinpoint location
• Data from Emergency Departments or other sources
• Novel method for preprocessing free-form text data
• Kalman Filter for Contingency Tables (KFC)
• Looks for changes from historic behavior
• Cross-classified data streams
• Handles many outcomes and locations simultaneously
• Visualization tools play a central role

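The slides do not describe how KFC works internally, so the following is only a minimal sketch of the general idea of "looking for changes from historic behavior" with a Kalman filter. It uses a generic scalar local-level (random walk plus noise) filter on a single count series, not the contingency-table method itself. The function name `flag_anomalies` and the parameters `process_var`, `obs_var`, and `threshold` are illustrative assumptions.

```python
import math


def flag_anomalies(counts, process_var=1.0, obs_var=4.0, threshold=3.0):
    """Return indices whose one-step-ahead forecast error is unusually large.

    Assumed model (not taken from the slides):
        level[t] = level[t-1] + noise with variance process_var
        count[t] = level[t]   + noise with variance obs_var
    """
    level = float(counts[0])  # initial estimate of the baseline level
    var = obs_var             # initial uncertainty about that estimate
    flagged = []
    for t in range(1, len(counts)):
        # Predict: under a random walk, the forecast is the previous level.
        pred_var = var + process_var
        innovation = counts[t] - level
        innovation_var = pred_var + obs_var
        z = innovation / math.sqrt(innovation_var)
        if abs(z) > threshold:
            # Flag the period and skip the update so the outlier does not
            # contaminate the historic baseline.
            flagged.append(t)
            var = pred_var
            continue
        # Update: move the level toward the observation by the Kalman gain.
        gain = pred_var / innovation_var
        level += gain * innovation
        var = (1.0 - gain) * pred_var
    return flagged
```

For example, `flag_anomalies([10, 11, 9, 10, 12, 10, 40, 11, 10])` flags only the spike at index 6. Handling many outcomes and locations at once, as the slide describes, would require a multivariate state, which this sketch does not attempt.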
2004 NJ Meningitis Scare: Hospital Admits
Where do the patients live?

Time-Varying Multigraphs (TVMGs)
• Graphs depict relationships among entities (nodes)
• Edges represent relationships between entities
• Direct communications such as calls, e-mails, etc.
• Indirect communications, e.g., visiting the same blog
• Relationships may vary over time
• Multigraph refers to additional complexity
• Entities of multiple types
• Distinct types of communication (cell, land line, etc.)
• Various attributes of edges (e.g., call volume)

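To make the definition concrete, here is a minimal sketch of one way a TVMG could be held in memory: typed nodes plus a list of time-stamped, typed edges carrying a volume attribute. This is an illustration only, not the storage scheme used in the project; the names `Edge`, `TimeVaryingMultigraph`, `snapshot`, and `new_connections` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    """One time-stamped communication between two entities."""
    source: str
    target: str
    time: int        # index of the time period (assumed unit, e.g. a day)
    kind: str        # communication type, e.g. "cell" or "land line"
    volume: int = 1  # edge attribute, e.g. number of calls


@dataclass
class TimeVaryingMultigraph:
    node_types: dict = field(default_factory=dict)  # node name -> entity type
    edges: list = field(default_factory=list)       # parallel edges allowed

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_edge(self, edge):
        self.edges.append(edge)

    def snapshot(self, start, end):
        """Total volume per (source, target, kind) for start <= time < end."""
        totals = defaultdict(int)
        for e in self.edges:
            if start <= e.time < end:
                totals[(e.source, e.target, e.kind)] += e.volume
        return dict(totals)

    def new_connections(self, start, end):
        """Directed pairs active in [start, end) that were never seen before start."""
        earlier = {(e.source, e.target) for e in self.edges if e.time < start}
        current = {(e.source, e.target)
                   for e in self.edges if start <= e.time < end}
        return current - earlier
```

Comparing `snapshot` results for two windows gives changes in volume and type of communication, and `new_connections` gives node pairs that first appear in a window.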
Analysis of TVMGs Poses Additional Challenges
• Data do not fit usual rectangular structure expected by most statistical procedures
• Nodes and edges are fundamentally different
• Complicated dependencies are common
• Requires new paradigms for data storage
• Requires adapting existing methods for anomaly detection

2007 Summer Project: Cagatay Bilgin, RPI
• Developed methods for storing and manipulating graphs
• Built Communities of Interest
• Created Java Time Varying Data Analysis Toolkit and Visualization (jTVDATV)
• Adapted KFC method for anomaly detection in time varying multigraphs

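The slide does not say how the Communities of Interest were built. One common formulation keeps, for each node, its top-k partners ranked by an exponentially smoothed edge weight. The sketch below assumes that formulation; the function names and the parameters `theta` and `k` are hypothetical, and the code is written in Python for brevity rather than in the Java of the jTVDATV toolkit.

```python
def update_weights(weights, period_counts, theta=0.85):
    """Blend old edge weights with the newest period's counts.

    new_weight = theta * old_weight + (1 - theta) * count_this_period
    Edges missing from either input are treated as having value 0.
    """
    updated = {}
    for edge in set(weights) | set(period_counts):
        old = weights.get(edge, 0.0)
        new = period_counts.get(edge, 0.0)
        updated[edge] = theta * old + (1 - theta) * new
    return updated


def community_of_interest(weights, node, k=2):
    """Return the k outgoing partners of `node` with the largest weights.

    Edges are directed (source, target) pairs; ties break alphabetically.
    """
    partners = [(target, w) for (source, target), w in weights.items()
                if source == node]
    partners.sort(key=lambda pair: (-pair[1], pair[0]))
    return [target for target, _ in partners[:k]]
```

Calling `update_weights` once per time period lets old contacts fade gradually while new ones rise, so a change in a node's community from one period to the next is one possible signal to pass to an anomaly detector.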
Illustration of Tools for Anomaly Detection
• Fictitious data from the VAST competition
• Data consist of news stories and blog entries
• Entities include people, places, organizations, species, and dates
[Figure labels: FBI Washington, DC FBI-Washington, DC]