
An Event-based Framework for Characterizing the Evolutionary Behavior of Interaction Graphs

Sitaram Asur, Srinivasan Parthasarathy and Duygu Ucar. Department of Computer Science, The Ohio State University.


Presentation Transcript


  1. An Event-based Framework for Characterizing the Evolutionary Behavior of Interaction Graphs • Sitaram Asur, Srinivasan Parthasarathy and Duygu Ucar • Department of Computer Science, The Ohio State University

  2. Motivation • Interaction Networks • Represent scientific data from various domains • Nodes represent entities • Edges represent interactions among entities • Examples: • Biological Networks - Protein-Protein Interaction (PPI) networks, gene expression networks • Collaboration networks • Social networks, online communities, blog networks • Figures: protein-protein interactions in yeast (Jeong et al, 2001); physicist collaboration network (Newman and Girvan, 2004)

  3. Motivation • Mining interaction networks is important • Gain insight into structure, properties and behavior of these networks [Newman, 2001] • Modular nature of interaction networks is important • Co-expression networks: dense components -> functional modules • Social networks: clusters -> community structure

  4. Motivation • A large number of earlier approaches focused on mining static interaction networks • Many important real-world networks are dynamic • Figure: temporal protein interaction network of the yeast mitotic cell cycle (Ulrik de Lichtenberg, et al. Science 307, 724 (2005))

  5. Motivation • Dynamic Interaction Networks • Nodes and interactions change over time • Structure changes in the network • Need for a structured method to characterize and model evolution • Understand nature of change (evolution) in networks • Consider evolution of individuals and communities • Develop models for reasoning and inference of future events

  6. Workflow • Evolving Graph • Temporal Snapshots (Si, Si+1) • Clustering (Ci, Ci+1), iterating over i • Event Detection • Behavioral Patterns • Analysis and Inference

  7. Temporal Snapshots • Split the graph data into non-overlapping temporal snapshots • Each snapshot corresponds to a graph • Consists of all nodes and interactions active in that time period • Nodes are active if they have an interaction in a particular time period • Figure: example snapshots at T1 and T2 over nodes A to G
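The snapshot construction described on this slide can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function name `temporal_snapshots`, the `(u, v, t)` input format, and the fixed-width window parameters `start` and `width` are all assumptions made for the example.

```python
from collections import defaultdict


def temporal_snapshots(interactions, start, width):
    """Bucket timestamped interactions into non-overlapping snapshots.

    interactions: iterable of (u, v, t) tuples (hypothetical input format).
    start, width: origin and length of each time window (assumed fixed-width).
    Returns {snapshot_index: {"nodes": set, "edges": set}}.
    """
    buckets = defaultdict(lambda: {"nodes": set(), "edges": set()})
    for u, v, t in interactions:
        index = (t - start) // width  # which window this interaction falls in
        snapshot = buckets[index]
        snapshot["edges"].add(frozenset((u, v)))  # undirected interaction
        snapshot["nodes"].update((u, v))  # a node is active only if it interacts
    return {index: buckets[index] for index in sorted(buckets)}


# Example: yearly snapshots of a tiny collaboration history.
history = [("A", "B", 1997), ("B", "C", 1997), ("A", "B", 1998), ("D", "E", 1999)]
snaps = temporal_snapshots(history, start=1997, width=1)
```

Because a node enters a snapshot only through an interaction, node C is absent from the 1998 snapshot even though it was present in 1997.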

  8. Clustering • Represent the snapshot graphs using clusters • Clusters of a graph can provide structure information • Examine the evolution of clusters over time • Can provide insight on corresponding changes to the graph • MCL clustering algorithm employed in this work • Ensemble clustering approaches can be employed to obtain robust clusters (Asur et al, ISMB 2007) • Figure: example clusterings of the snapshots at T1 and T2 over nodes A to G

  9. Community-based Event Detection • Continue • Merge • Split • Form • Dissolve • Figure: example clusters across snapshots T=1 to T=6

  10. Entity-based Event Detection • Appear • Disappear • Join • Leave • Figure: example clusters containing nodes A and B across snapshots T=1 to T=4

  11. Event Detection • Represent each set of snapshot clusters as a k × N binary cluster-membership matrix • Use bitwise operators to compute the events between each successive pair of matrices (snapshots) • Example: Continue event • Continue(Cj, Ck) holds when AND(Si(j), Si+1(k)) == OR(Si(j), Si+1(k)), where Si(j) is row j of the membership matrix for snapshot i • Event detection algorithm is linear in the number of nodes in the graph, O(N)
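The Continue test above (AND of two membership rows equals their OR, i.e. both clusters hold exactly the same nodes) can be illustrated with a minimal sketch. It stores each row of the binary membership matrix as a Python integer bitmask; the helper names `membership_rows` and `continue_events` are hypothetical, and the all-pairs loop is chosen for clarity, so it does not reproduce the linear-time bound claimed on the slide.

```python
def membership_rows(clusters, node_index):
    """Encode each cluster as one bitmask row of the k x N membership matrix."""
    rows = []
    for members in clusters:
        bits = 0
        for node in members:
            bits |= 1 << node_index[node]  # set the bit for this node's column
        rows.append(bits)
    return rows


def continue_events(rows_i, rows_next):
    """Return (j, k) pairs where cluster j at snapshot i continues as cluster k at i+1."""
    events = []
    for j, row_j in enumerate(rows_i):
        for k, row_k in enumerate(rows_next):
            # AND == OR holds exactly when both rows have identical bits set.
            if row_j and (row_j & row_k) == (row_j | row_k):
                events.append((j, k))
    return events


# Example with seven nodes A..G (assumed labels) and two snapshots.
node_index = {name: pos for pos, name in enumerate("ABCDEFG")}
rows_t1 = membership_rows([{"A", "B", "C"}, {"D", "E"}], node_index)
rows_t2 = membership_rows([{"D", "E"}, {"A", "B", "C", "F"}], node_index)
events = continue_events(rows_t1, rows_t2)
```

Here the cluster {D, E} is unchanged between the two snapshots, so it is the only pair reported; {A, B, C} gained node F and therefore fails the equality test.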

  12. Temporal Analysis • Use critical events for analysis • Form and Dissolve events • Used to study group formation and dissipation • Merge and Split events • Evolution of groups • Continue events • Stability of clusters/groups • Evolution of topics in a collaboration network

  13. Behavioral Analysis • Use entity-based critical events discovered to compose incremental measures for capturing behavioral patterns • Behavioral measures can then be used to analyze evolutionary behavior of nodes and clusters • Four Behavioral measures • Stability Index • Sociability Index • Popularity Index • Influence Index

  14. Case Study 1 : DBLP Collaboration network • Data from 28 key conferences in databases/data mining/AI over 10 years • Authors (nodes) connected by collaborations (edges) • 23136 nodes and 54989 edges • Collaboration networks display many of the structural features of social networks (Kempe, Kleinberg and Tardos 2003, Newman 2001)

  15. Case Study 2 : Clinical Trials Network • Clinical Trials • Can provide information on risks, benefits and optimal dosage levels. • Consists of observations of patients under drug use as well as some under placebo • Generally represented as a set of multivariate time series • Evolving clinical trials network • Nodes representing patients • Correlations among patients modeled as edges • Edges change over time as correlations change • Motivation: Use evolution of correlation to identify potential toxic effects of drugs

  16. Stability Index • Propensity of a node to interact with the same group of people over time • Stability for a node over time incrementally computed based on the stability of the clusters it belongs to

  17. Stability for Clinical Trials data • Nodes with low Stability Index values represent patients with fluctuating correlation values (outliers) • Null Hypothesis: • If the drug does not result in toxicity, then outliers are likely to be flagged at random from each group (drug and placebo). • Experiment on clinical trials network for diabetes patients • 19 nodes (patients) found having Stability Index below threshold. • The drug under study was discontinued due to possible toxic effects. 18 out of the 19 were on the drug!!!

  18. Sociability Index • Incremental measure of the different interactions a node participates in • Opposite of the Stability Index • Does not represent degree!

  19. Sociability Index for Community Prediction • Goal : To identify future cluster co-occurrences based on history data for the DBLP dataset • Key Intuition: If two authors have high sociability, and they have not yet collaborated (not been clustered together), there is a high chance they will. • Setup : Use the data for 1997-2001 to predict cluster co-occurrences for 2002-2006

  20. Experimental Results • Comparison with other measures (Liben-Nowell and Kleinberg, CIKM 2003) • Common Neighbor • Adamic-Adar • Jaccard
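For reference, the three baseline link-prediction measures named on this slide (common neighbors, Adamic-Adar, and the Jaccard coefficient) have standard textbook definitions, sketched below. The adjacency-dictionary representation and the function names are assumptions for illustration; the sketch assumes a simple undirected graph and two distinct query nodes, and it does not reproduce the experimental setup of the talk.

```python
import math


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Shared neighbors weighted by 1 / log(degree), favoring rare neighbors.

    A shared neighbor of two distinct nodes has degree >= 2, so log(degree) > 0.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])


# Toy undirected graph as a symmetric adjacency dictionary (hypothetical data).
adj = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"A", "B"},
    "D": {"A", "B", "E"},
    "E": {"D"},
}
```

All three scores are computed from the neighborhoods of a single static graph, whereas the Sociability Index on the previous slide is built incrementally from events across snapshots.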

  21. Popularity Index • Measure of attraction of nodes to a cluster • Influence measure of a cluster • Does not reflect the size of the cluster • DBLP dataset • Can be used to identify hot topics • If a large number of nodes join a cluster and they are all working on a similar topic, it indicates a buzz around that topic for that year

  22. Application of Popularity Index • Example : XML • Year 1999 : 3 authors (XML and web applications) • Year 2000 : 50 joins • 30 of these authors published papers on XML

  23. Influence Index • Measure of influence of a node on others • Influence in terms of participation in critical events • Influence of a node initially computed as [formula not preserved in transcript] • Follower nodes need to be pruned, unless [condition not preserved in transcript]

  24. Top Influential authors – DBLP dataset

  25. Diffusion Models • Study the spread of information in an evolving interaction network (Kempe et al, 2003, 2005) • Nodes activated with information • Newly activated nodes become contagious briefly • Information propagates through the network • Activation function maps weights of the links of a node to determine if it is activated • SUM Activation: If sum of weights > threshold, activate • MAX Activation: If any single weight > threshold, activate • Figure: diffusion across time steps t1 to t4
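The SUM and MAX activation rules can be sketched as below. This is a hedged illustration rather than the model used in the talk: it assumes that only links from currently contagious neighbors count toward activation, and the names `is_activated`, `diffusion_step`, and the nested-dictionary weight format are invented for the example.

```python
def is_activated(weights, threshold, mode="sum"):
    """Apply the SUM or MAX activation rule to a list of incoming link weights."""
    weights = list(weights)
    if not weights:
        return False  # no contagious neighbor, nothing to activate the node
    if mode == "sum":
        return sum(weights) > threshold
    if mode == "max":
        return max(weights) > threshold
    raise ValueError("mode must be 'sum' or 'max'")


def diffusion_step(incoming, contagious, active, threshold, mode="sum"):
    """Return the set of nodes newly activated in one time step.

    incoming: {node: {neighbor: weight}} (hypothetical format).
    contagious: nodes activated in the previous step.
    active: all nodes activated so far.
    """
    newly_active = set()
    for node, links in incoming.items():
        if node in active:
            continue  # already activated nodes are not re-activated
        weights = [w for source, w in links.items() if source in contagious]
        if is_activated(weights, threshold, mode):
            newly_active.add(node)
    return newly_active


# Example: C hears from two weak links, D from one strong link.
incoming = {"C": {"A": 0.3, "B": 0.4}, "D": {"A": 0.6}}
seeds = {"A", "B"}
by_sum = diffusion_step(incoming, seeds, seeds, threshold=0.5, mode="sum")
by_max = diffusion_step(incoming, seeds, seeds, threshold=0.5, mode="max")
```

With these toy weights, node C is activated under SUM (0.3 + 0.4 exceeds 0.5) but not under MAX, which shows how the two rules differ when influence is spread over several weak links.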

  26. Diffusion Models – Influence Maximization • Influence Maximization Problem : Find initial set of nodes that can activate the most number of nodes over a time period • Critical in applications such as viral marketing and for epidemiological research • Complicated in the case of dynamic interaction networks as the network changes over time • Need for dynamic measures that reflect the current status of the network • Sociability Index used to weight links • Highly sociable nodes have high propensity to pass on information • Influence Index to determine initial set of active nodes • Comparison with random choice of nodes and degree-based selection (Wasserman and Faust, 1994)

  27. Conclusions • Most real-world graphs are dynamic in nature • Need for analysis, reasoning and inference • Proposed an event-based framework • Clusters to capture structure at different snapshots • Critical events over clusters to identify dynamic properties of graphs • Behavioral patterns incrementally composed from critical events • Proposed method useful in many application domains • Protein function prediction, drug design, recommender systems, viral marketing, epidemiology

  28. Future Directions • Extensions to large interaction graphs • Use of semantic information for reasoning and inference • Merge and Split Events • If two clusters have high semantic similarity, probability of a Merge is high • Continue events • Track the evolution of topics • Sequences of Form, Continue, Continue … • Multi-scale temporal modeling • Analyze snapshots of different granularity

  29. Thanks! • Poster # 36, this evening (Mon 13th Aug, 6:15 – 9:15 pm) • This work was supported by the following grants: • DOE Early Career Principal Investigator Award No. DE-FG02-04ER25611 • NSF CAREER Grant IIS-0347662 • Contacts: • Sitaram Asur : asur@cse.ohio-state.edu • Dr Srinivasan Parthasarathy : srini@cse.ohio-state.edu • Duygu Ucar : ucar@cse.ohio-state.edu • Group Webpage : http://dmrl.cse.ohio-state.edu

  30. Event Detection

  31. Event Detection
