Work partially funded by Rhône-Alpes and Saint-Étienne Métropole. August 2012.

Combining relations and text in scientific network clustering

David COMBE (1), Christine LARGERON (1), Előd EGYED-ZSIGMOND (2), Mathias GÉRY (1)

1. CNRS, UMR 5516, Laboratoire Hubert Curien, Université de Saint-Étienne, Jean-Monnet, Saint-Étienne, France
   Email: {david.combe, christine.largeron, mathias.gery}@univ-st-etienne.fr
2. Université de Lyon, UMR 5205 CNRS, LIRIS
   Email: elod.egyed-zsigmond@insa-lyon.fr

Aim of the work

• Detection of communities in social networks using the relationships between the actors and the attributes describing them.
• Construction of a dataset with ground truth: a scientific social network in which textual data is associated with each vertex and the classes are known.
• Proposal of different scenarios for the detection of communities, and evaluation of their performance.

Problem statement

Let G = (V, E) be a graph describing a social network, where V is the set of vertices and E is the set of edges. Each vertex v_i ∈ V is associated with a vector of textual attributes, where w_ij is the tf-idf weight of the term t_j in the document d_i of that vertex.

The attributed graph clustering problem is to build a partition of V such that:
• there are many edges within each cluster and relatively few between clusters;
• two vertices belonging to the same cluster are more similar in terms of attributes than two vertices belonging to different clusters.

Dataset

• Vertices: 99 authors from 2 conferences, SAC’09 and IJCAI’09.
• Edges: co-participation network generated from DBLP.
• Vertex documents: the abstracts and titles of the articles, extracted from the conference websites.
• Ground truth partitions:
  • P_S = {A ∪ B, C ∪ D} (conferences)
  • P_T = {A, B ∪ C, D} (research areas)
  • P_TS = {A, B, C, D} (sessions)

[Figure: the four sessions A, B, C and D, grouped by conference (SAC, IJCAI) and by research area (Bioinformatics, Robots, Constraints).]

Clustering scenarios:
• TS1: Structure-based clustering on attribute-weighted graph
• TS2: Attribute-based clustering on structural distance
• TS3: Linear combination

[Figure: processing chains of the three scenarios. Labels: information network; textual distance (cosine) matrix; graph valued with textual distance; weighted-graph clustering algorithm; shortest path processing; shortest path distance matrix; vertices distance matrix; combined distances matrix; hierarchical agglomerative clustering.]

Results

[Figure: results under the accuracy criterion.]

Conclusion and perspectives

• The simplest methods, with no parameter, tend to achieve a good classification.
• The scenarios consider different aspects of the data and give different results.
• The properties of the produced classes depend on the scenario (production of disconnected classes…).
• We intend to help choose the best clustering scenario for a given dataset.
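To make the linear-combination scenario (TS3) more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The poster only names the building blocks (tf-idf, cosine distance, shortest-path distance, a combined distance matrix, hierarchical agglomerative clustering), so the following are assumptions made for illustration: the weight name `alpha`, the idf variant log(n / df), the normalization of path lengths by the longest finite path, the value 1 for unreachable pairs, average linkage, and the toy data.

```python
import math
from collections import deque


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse {term: tf-idf weight} vectors."""
    n = len(docs)
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    vectors = []
    for doc in docs:
        tf = {}
        for term in doc:
            tf[term] = tf.get(term, 0) + 1
        # Assumption: raw term frequency times log(n / df); other variants exist.
        vectors.append({t: f * math.log(n / df[t]) for t, f in tf.items()})
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity of two sparse vectors (1.0 if either is all zero)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def shortest_path_matrix(adj):
    """Unweighted shortest-path lengths via BFS; unreachable pairs are inf."""
    n = len(adj)
    dist = [[math.inf] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[source][v] == math.inf:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist


def combined_distances(docs, adj, alpha):
    """alpha * textual distance + (1 - alpha) * normalized structural distance."""
    vectors = tfidf_vectors(docs)
    paths = shortest_path_matrix(adj)
    finite = [d for row in paths for d in row if d != math.inf]
    longest = max(finite) or 1  # avoid dividing by zero on an edgeless graph
    n = len(docs)
    combined = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            text = cosine_distance(vectors[i], vectors[j])
            # Assumption: unreachable pairs get the maximum structural distance 1.
            struct = 1.0 if paths[i][j] == math.inf else paths[i][j] / longest
            combined[i][j] = alpha * text + (1 - alpha) * struct
    return combined


def agglomerative(dist, k):
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        _, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


# Hypothetical toy data: four authors on a path graph 0 - 1 - 2 - 3.
docs = [
    "protein gene sequence".split(),
    "gene protein alignment".split(),
    "robot planning motion".split(),
    "robot motion control".split(),
]
adj = [[1], [0, 2], [1, 3], [2]]
clusters = agglomerative(combined_distances(docs, adj, alpha=0.5), k=2)
print(clusters)
```

On this toy input, authors 0 and 1 share vocabulary and an edge, as do authors 2 and 3, so the two pairs end up in separate clusters. Setting `alpha` to 1 uses only the text and setting it to 0 uses only the graph, which is one way to see how the scenarios weigh different aspects of the data.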

