Transfer Learning for Enhancing Information Flow in Organizations and Social Networks
Chris Pal, Xuerui Wang & Andrew McCallum
University of Massachusetts, Amherst
Summary
• New Topic Models, Start Simple & Build
 - Compare with related model structures
 - Precision vs. Recall on 20 Newsgroups
• Add Authors + Discriminative Methods
 - Predict NIPS Authors & Email Recipients
• Authors + Recipients & Creating DARTs
 - Transfer Learning in Social Networks
 - Experiments with Enron Email
New Continuous Topic Models
• Undirected (Random Field) Joint Model
• Conditionally log-Normal Topics
• Conditionally Multinomial Words
[Figure: plate notation for the model, contrasted with LDA; Nt topics, Nw words]
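As a rough sketch of what "Conditionally log-Normal Topics" and "Conditionally Multinomial Words" mean for this undirected joint model (the notation below is an illustrative assumption, not copied from the paper), the two conditionals might look like:

```latex
% Assumed notation: t is the N_t-dimensional topic vector, w_i the i-th word,
% n(w) the vector of word counts, W a topic-word weight matrix, b and c biases.
p(w_i = v \mid \mathbf{t}) \propto \exp\!\left(b_v + \mathbf{t}^\top W_{\cdot v}\right)
  \quad \text{(conditionally multinomial words)}
\\[4pt]
\mathbf{t} \mid \mathbf{w} \sim \mathrm{logNormal}\!\left(\mathbf{c} + W\,\mathbf{n}(\mathbf{w}),\, \Sigma\right)
  \quad \text{(conditionally log-normal topics)}
```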
Further Contrast: MCA, PCA, RAP
• Multinomial Component Analysis (MCA)
• Principal Component Analysis (PCA)
• Rate Adapting Poisson (RAP) Model
[Figure: plate diagrams for MCA, PCA and RAP; labels include Nz unobserved Gaussian variables, Nb binary topics, Nv Poisson counts for each word in the vocabulary, Nx observed Gaussian variables of fixed dimension, and Nw draws from a discrete distribution (the words in a doc)]
Our Model (MCA) vs. TFIDF vs. RAP
MRR  Method
.45  Our Model
.37  TFIDF
.33  RAP
• Precision vs. Recall on 20 Newsgroups, 100 word vocabulary
• 20 dimensional hidden topic space
• Cosine Distance Comparisons (.9, .1 Train, Test Split)
• Compared with TFIDF and Rate Adapting Poisson (RAP) Model
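A minimal sketch of the cosine-distance comparison behind these numbers (illustrative code, not the authors' implementation; function and variable names are assumptions): embed documents into the 20-dimensional hidden topic space, rank training documents for each test document by cosine similarity, and score precision/recall against the newsgroup labels.

```python
import numpy as np

def cosine_rank(test_vecs, train_vecs):
    """Rank training documents for each test document by cosine similarity
    in the hidden topic space (rows = documents, cols = topic dimensions)."""
    def normalize(x):
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)
    sims = normalize(test_vecs) @ normalize(train_vecs).T   # (n_test, n_train)
    return np.argsort(-sims, axis=1)                         # best match first

def precision_recall_at_k(ranked, test_labels, train_labels, k):
    """Precision/recall at k; a retrieved doc counts as relevant if it
    shares the query document's newsgroup label."""
    precisions, recalls = [], []
    for i, order in enumerate(ranked):
        relevant = (train_labels == test_labels[i])
        hits = relevant[order[:k]].sum()
        precisions.append(hits / k)
        recalls.append(hits / max(relevant.sum(), 1))
    return float(np.mean(precisions)), float(np.mean(recalls))
```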
20 Newsgroups
• 10,000 word vocab. (highest MI with class)
• 18,796 documents
• Downcased, no stopwords, Porter stemmed
• comp., rec., sci., .forsale, .politics, .religion

NIPS
• 13,649 word vocab.
• 1,740 papers
• Downcased, no stopwords, no stemming
• 13 years of NIPS proceedings, 1987-1999
Discriminative Training, MCL and a Richer Model
[Figure: plate diagrams contrasting Maximum Likelihood, Discriminative and 'Multi-conditional' training of the topic model (Nt topics, Nw words), and discriminative training of a richer model that also includes authors and year alongside the Nw words]
The Main Equations
• The conditionals for Gibbs sampling
• Optimize the marginal likelihood or the marginal conditionals
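The equations themselves appeared as images on the slide; a hedged reconstruction of the two training objectives referred to here (marginal likelihood of the joint model vs. marginal conditionals, with notation assumed for illustration) is:

```latex
% d indexes documents, w_d the words, a_d the attributes (authors, recipients, year);
% the continuous topic vector t is integrated out in both objectives.
\mathcal{L}_{\mathrm{joint}}(\theta) = \sum_d \log \int p_\theta(\mathbf{w}_d, \mathbf{a}_d, \mathbf{t})\, d\mathbf{t}
\\[4pt]
\mathcal{L}_{\mathrm{disc}}(\theta)  = \sum_d \log p_\theta(\mathbf{a}_d \mid \mathbf{w}_d)
```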
NIPS Topics, Multi-conditional Learning
Optimize an objective based on the product of the conditional probabilities of each word given all the others.
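In symbols, a hedged sketch of that multi-conditional objective (notation assumed for illustration, not copied from the paper):

```latex
% w_{d,i} is the i-th word of document d and w_{d,-i} all of its other words;
% the topic vector is marginalized inside each conditional.
\mathcal{L}_{\mathrm{MC}}(\theta) = \sum_d \sum_{i} \log p_\theta\!\left(w_{d,i} \mid \mathbf{w}_{d,-i}\right)
```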
Predicting NIPS Authors
• Comparing Models, Mean Reciprocal Rank (MRR)
• Cosine Distance Comparisons (.9, .1 Train, Test Split)
MRR  Method
.88  Discriminative
.46  Joint
.25  Joint, Words only
Academic Email
• 4,643 emails
• 190 recipients
• 8,693 word vocabulary
• Downcased, no stopwords, no stemming

Mean Reciprocal Rank (MRR) Evaluation
• Reciprocal of the rank at which the first relevant response was returned
• Method 1: Use the cosine of all previous sent email; obtain authors from the ordered closest matches
• Method 2: Use the model to make predictions; obtain the ordered list from the probability distribution
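A minimal sketch of the MRR computation described above (illustrative code, not from the paper): for each test email, take the ranked list of predicted recipients, score the reciprocal rank of the first correct one, and average over queries.

```python
def mean_reciprocal_rank(ranked_predictions, true_sets):
    """ranked_predictions: list of ranked candidate lists (best first);
    true_sets: list of sets of correct answers, one per query."""
    rr = []
    for ranking, truth in zip(ranked_predictions, true_sets):
        score = 0.0
        for rank, candidate in enumerate(ranking, start=1):
            if candidate in truth:
                score = 1.0 / rank
                break
        rr.append(score)
    return sum(rr) / len(rr)

# Example: first relevant recipient at ranks 1 and 3 -> MRR = (1 + 1/3) / 2
print(mean_reciprocal_rank([["alice", "bob"], ["carol", "dave", "alice"]],
                           [{"alice"}, {"alice"}]))
```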
Predicting Email Recipients
• Comparing Models, Mean Reciprocal Rank (MRR)
• Cosine Distance Comparisons (.9, .1 Train, Test Split)
• 20 dimensional hidden 'topic' space
MRR  Method
.60  Discriminative
.30  Joint
.21  Joint, Words only
Summary of Results so Far • Richer model with authors included helps • Discriminative optimization helps a lot
Undirected, Continuous Author Recipient Topic Models
• A continuous topic model
• Author recipient topic model
• Plated version of same model
[Figure: plate notation; Nt topics over Nw words, with author, Nr recipients and Nw words in the full model]
Enron Email
• 150 employees
• 250,000 emails
• Avg. of 1,400 sent emails [200 – 4,800]
• Experiments with .9, .1 test-train split
• Use model to make predictions & the cosine method
• Explore two types of transfer learning:
 1. Shared hidden variables
 2. Group and local models & coupled parameters
1. Transfer Using Shared Topics
• Use the model with the shared latent space for predictions
MRR  Method
.68  Transfer DART
.62  TFIDF
Discriminative Author Recipient Topic (DART) Model
[Figure: the directed, discrete ART model alongside the undirected, continuous-topic DART model]
Transfer Learning with DARTs
1. Train DART on the organization's entire email
2. Adapt DART to a given user's email
3. Major advantage for new users
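As a hedged illustration of step 2 (this is a generic parameter-transfer sketch, not the authors' estimator; the function names and coupling term are assumptions), a per-user model can be warm-started from the organization-wide parameters and then updated on that user's comparatively small amount of email, with an L2 tie keeping it close to the group model:

```python
import numpy as np

def adapt_user_params(org_params, user_grad_fn, coupling=1.0, lr=0.1, steps=100):
    """Start from organization-wide parameters and adapt them to one user.

    org_params:   np.ndarray of group-level parameters (e.g. topic weights)
    user_grad_fn: function returning the gradient of the user's data
                  log-likelihood at the current parameters (assumed supplied)
    coupling:     strength of the L2 tie back to the group parameters
    """
    user_params = org_params.copy()                     # warm start = transfer
    for _ in range(steps):
        grad = user_grad_fn(user_params)                # fit the user's own email
        grad -= coupling * (user_params - org_params)   # stay near the group model
        user_params += lr * grad
    return user_params
```

A brand-new user with little or no email stays close to `org_params`, which is the intuition behind the "major advantage for new users" in step 3.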
2. Transfer Parameters & Adapt
• Topics with Transfer vs. No Transfer
Transfer Parameters & Adapt
• 200 Topic Models, Transfer vs. No Transfer
Summary, Conclusions & Discussion
• New, rich topic models for text & attributes
• Discriminative methods: dramatic increase in task performance
• Two types of transfer learning
 - Each leverages social / organizational networks
• Dramatic benefit for a new model/user
• Question: Can similar users be identified for more sophisticated transfer?
• Practical Issues: information sharing, etc.