Extracting and Utilizing Social Networks from Log Files of Shared Workspaces Peyman Nasirifard, Vassilios Peristeras, Conor Hayes and Stefan Decker 10th IFIP Working Conference on VIRTUAL ENTERPRISES Thessaloniki, Greece, 7-9 October 2009
Outline • Introduction and Problem Definition • Object-centric social network for extracting expertise • User-centric social network for calculating the cooperation index • Prototypes • Expert Finder • Holmes • Evaluation • Conclusion • Q and A
Introduction and Problem Definition • Online shared workspaces provide various services for online collaboration • BSCW, SharePoint • Difficult to find people with appropriate expertise in intra- and inter-organizational settings • People do not update their profiles regularly • Difficult to spot "who works with whom" or "who the senior within a community is" • People do not maintain their social networks frequently
Problem Definition • To find people with specific expertise • To understand who works with whom and to what extent
Our approach We use: • Log files from collaborative working environments (CWEs) • Social Network Analysis • Semantic technologies (RDF) to represent the extracted Social Network
Social Network Analysis • Social Network Analysis has a lot of potential • Overt and latent social networks exist among professionals • Online social networks can be divided into two main types • Object-centric (e.g., based on videos, music) • User-centric • We use both types in our work • We use the object-centric SN for extracting expertise • We use the user-centric SN for calculating the cooperation index • Cooperation index: an index that determines how closely two people work together
Log files • Log files of shared workspaces contain rich information and can be further analyzed • A log record contains at minimum a Subject (e.g., user), an Object (e.g., document) and an Action/Verb (e.g., read, revise) • Example: the person with ID 123 revised the document with ID 456 • We use these three elements to generate RDF triples for processing
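The mapping from a log record to an RDF triple can be sketched as follows. This is a minimal illustration, not the prototypes' actual code: the `http://example.org/cwe/` namespace, the `LogRecord` type and the `to_ntriple` function are all hypothetical names, since the slides do not say which vocabulary or serialization was used. The sketch emits one N-Triples statement per record.

```python
from typing import NamedTuple

# Hypothetical namespace: the slides do not name the vocabulary actually used.
BASE = "http://example.org/cwe/"


class LogRecord(NamedTuple):
    """The three minimal elements of a log record."""
    user_id: str      # Subject
    action: str       # Action/Verb
    document_id: str  # Object


def to_ntriple(record: LogRecord) -> str:
    """Serialize one log record as a single N-Triples statement."""
    subject = f"<{BASE}user/{record.user_id}>"
    predicate = f"<{BASE}action/{record.action}>"
    obj = f"<{BASE}document/{record.document_id}>"
    return f"{subject} {predicate} {obj} ."


# "Person with ID 123 revised the document with ID 456"
record = LogRecord(user_id="123", action="revise", document_id="456")
print(to_ntriple(record))
```

Keeping the verb as the predicate means the later steps can select interactions by action type with a simple triple pattern.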
Object-centric Social Networks for extracting expertise
Finding Experts • First step: Key-phrase Extraction • Documents are analysed based on NLP techniques to identify phrases that occur frequently • Second step: Log File Analysis • To identify the documents a user interacts with and how • Third step: Assigning Expertise • A user is expert in topic X, if s/he created or revised a document that contains topic X. • A user is familiar with topic Y, if s/he just read a document that contains topic Y.
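The third step can be sketched as a small function over the outputs of the first two steps. This is an illustration under stated assumptions: the function name `assign_expertise`, the event tuples and the `doc_topics` mapping are hypothetical, and the key phrases are assumed to have been extracted already. The reading of "just read" as "read but never created or revised" is also an assumption.

```python
from collections import defaultdict

EXPERT_ACTIONS = {"create", "revise"}  # actions that imply expertise
FAMILIAR_ACTIONS = {"read"}            # actions that imply familiarity only


def assign_expertise(events, doc_topics):
    """Derive expert and familiar topics per user.

    events: iterable of (user, action, document) tuples from the log file.
    doc_topics: mapping document -> set of key phrases found in it.
    """
    expert = defaultdict(set)
    familiar = defaultdict(set)
    for user, action, doc in events:
        topics = doc_topics.get(doc, set())
        if action in EXPERT_ACTIONS:
            expert[user] |= topics
        elif action in FAMILIAR_ACTIONS:
            familiar[user] |= topics
    # Assumption: a topic the user is already expert in is not also
    # reported as merely familiar.
    for user in familiar:
        familiar[user] -= expert.get(user, set())
    return dict(expert), dict(familiar)


events = [
    ("alice", "create", "d1"),
    ("bob", "read", "d1"),
    ("alice", "read", "d2"),
]
doc_topics = {"d1": {"social network"}, "d2": {"social network", "rdf"}}
expert, familiar = assign_expertise(events, doc_topics)
print(expert)
print(familiar)
```

In this toy input, alice counts as an expert in "social network" and as familiar with "rdf", while bob is only familiar with "social network".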
From Object-centric to User-centric Action Relationship
Assigning weights to social networks • First step: Build the user-centric social network • See the previous slide • Depth is also considered (e.g., depth one means that a single document connects two persons) • Second step: Assign weights to relationships • User-defined weights with default values (e.g., Read-Read is a low-weighted relationship, Create-Create a high-weighted one) • Third step: Calculate the cooperation index • Sum up the weights
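The three steps can be sketched together for depth one. This is only an illustration: the weight values, the fallback weight and the name `cooperation_indices` are assumptions, because the slides state that Read-Read is low-weighted and Create-Create high-weighted but give no actual numbers.

```python
from collections import defaultdict
from itertools import combinations

# Illustrative defaults only: the slides give no concrete weight values.
DEFAULT_WEIGHTS = {
    ("create", "create"): 5,  # high-weighted relationship
    ("read", "read"): 1,      # low-weighted relationship
}
FALLBACK_WEIGHT = 2           # assumed weight for any other action pair


def cooperation_indices(events, weights=DEFAULT_WEIGHTS, fallback=FALLBACK_WEIGHT):
    """Return {(user_a, user_b): index} for users linked by a shared document."""
    # Step 1: group interactions by document (the object-centric view).
    by_doc = defaultdict(set)
    for user, action, doc in events:
        by_doc[doc].add((user, action))

    index = defaultdict(float)
    for interactions in by_doc.values():
        # Sorting puts users in alphabetical order, so each pair has one key.
        for (u1, a1), (u2, a2) in combinations(sorted(interactions), 2):
            if u1 == u2:
                continue  # two actions by the same user are not a relationship
            # Step 2: look up the weight of the unordered action pair.
            pair_weight = weights.get(tuple(sorted((a1, a2))), fallback)
            # Step 3: sum up the weights per user pair.
            index[(u1, u2)] += pair_weight
    return dict(index)


events = [
    ("alice", "create", "d1"),
    ("bob", "read", "d1"),
    ("alice", "read", "d2"),
    ("bob", "read", "d2"),
]
print(cooperation_indices(events))
```

Here the Create-Read relationship on `d1` falls back to the assumed weight 2 and the Read-Read relationship on `d2` adds 1, so the pair (alice, bob) ends up with an index of 3.0.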
Prototypes • Expert Finder • http://purl.oclc.org/projects/expertui • Holmes (Cooperation Index calculator) • http://purl.oclc.org/projects/holmes • The prototypes are SOA-based • The prototypes use the BSCW shared workspace • The prototypes use BSCW log files, in particular those of the Ecospace project over a period of three years • Around 183 users and several thousand events were extracted from the log file • Expert Finder uses around 50 deliverables of the Ecospace project
Evaluation with 12 participants • We asked participants to review their cooperation indices • All participants confirmed that the presented results were relevant to them • So far, we have considered four main document events (Create, Revise, Delete, and Read) and only relationships at a depth of one. The model can easily be extended to cover more document events and greater depths. • Combining events and assigning weights to them can create overhead for users. • In a more complex model for calculating cooperation indices, different weights could be assigned to documents based on their importance for the collaboration process.
Tools and technology overview • Social Network Analysis • Log files from CWEs • NLP techniques for Phrase Extraction • RDF for representing object-centric and user-centric Social Networks • Web Services for exposing functionalities
Conclusion and Future Work • We presented our approach for extracting expertise from online shared workspaces • We also presented our approach for calculating an index that determines how closely two people worked together in the past • Addressing the points (and shortcomings) mentioned in the evaluation is one of our future directions • Using the temporal aspects of log files is another future direction • Calculating the cooperation index for a given period of time
Thank You! Q and A