Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns

Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns • Lisa Friedland and David Jensen • Presented by Nick Mattei

Introduction • Tribes – groups with similar traits in a large graph • Distinguish those that work together and move together intentionally

Relationship Knowledge Discovery • Exploit connections among individuals to identify patterns and make predictions • Discover underlying dependencies • Links must be inferred

Graph Mining • Discover Hidden Group Structures • Animal Herds, Webpages, Employees • Time Series Analysis • Co-integration (Economics) • Security and Intrusion Detection • Dynamic Networks

Motivation • National Association of Securities Dealers • Fraud • Collusion • 4.8 Million Records • 2.5 Million Reps at 560,000 Firms • 100 Years of Data

Complications • Jobs not necessarily in order (or singletons) • 20% of employees hold more than one job at a time • 10% begin multiple jobs (up to 16) on one day • Leave gaps between employment • Mergers and acquisitions

Model

Finding Anomalously Related Entities • Input: • Bipartite Graph: G = (R ∪ A, E) • Entities: R = {r1, r2, …, rn} (People) • Attributes: A = {a1, a2, …, am} (Orgs.) • Entities should connect several attributes • Model co-occurrence rates of pairs of attributes
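The input described on this slide can be sketched in a few lines of Python. This is only an illustration: the edge list, the helper names `build_bipartite` and `attribute_cooccurrence`, and the toy data are assumptions, not part of the original talk. It builds both sides of the bipartite graph and counts how many entities link each pair of attributes.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical edge list E: (entity, attribute) pairs, i.e. (rep, organization).
edges = [
    ("r1", "a1"), ("r1", "a2"), ("r1", "a3"),
    ("r2", "a1"), ("r2", "a2"),
    ("r3", "a2"), ("r3", "a3"),
]

def build_bipartite(edges):
    """Return adjacency maps for both sides of the bipartite graph."""
    attrs_of = defaultdict(set)     # entity -> attributes it links to
    entities_of = defaultdict(set)  # attribute -> entities linked to it
    for entity, attr in edges:
        attrs_of[entity].add(attr)
        entities_of[attr].add(entity)
    return attrs_of, entities_of

def attribute_cooccurrence(attrs_of):
    """Count how many entities link to each unordered pair of attributes."""
    counts = defaultdict(int)
    for attrs in attrs_of.values():
        for pair in combinations(sorted(attrs), 2):
            counts[pair] += 1
    return dict(counts)

attrs_of, entities_of = build_bipartite(edges)
cooccur = attribute_cooccurrence(attrs_of)
```

In this toy data, two entities connect `a1` and `a2`, so that pair has a co-occurrence count of 2.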

Algorithm

Simple Model Measures • JOBS = (Number of shared Jobs in the sequence) • YEARS = (Number of Years of overlap)
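One possible reading of these two measures, as a sketch: each rep's history is assumed to be a list of hypothetical `(branch, start_year, end_year)` tuples with at most one stint per branch. JOBS counts the branches both reps worked at, and YEARS sums the overlap in years at those branches. The exact definitions used in the original work may differ.

```python
def jobs_and_years(hist_a, hist_b):
    """Return (JOBS, YEARS) for two reps under the assumptions above."""
    # Index the second history by branch (assumes one stint per branch).
    b_by_branch = {branch: (start, end) for branch, start, end in hist_b}
    jobs = 0
    years = 0
    for branch, start_a, end_a in hist_a:
        if branch not in b_by_branch:
            continue
        start_b, end_b = b_by_branch[branch]
        jobs += 1
        # Overlap is zero when the two stints do not intersect in time.
        years += max(0, min(end_a, end_b) - max(start_a, start_b))
    return jobs, years

# Toy histories (invented for illustration).
hist_a = [("A", 1990, 1994), ("B", 1994, 1999), ("C", 1999, 2003)]
hist_b = [("A", 1992, 1994), ("B", 1994, 1997), ("D", 1997, 2001)]
jobs, years = jobs_and_years(hist_a, hist_b)
```

Here the two reps share branches A and B, overlapping for 2 years at A and 3 years at B.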

Example Sequences

Probabilistic Model • X = P(BrA -> BrB -> BrC -> BrD) • = pA * tAB * tBC * tCD • Estimate: • pi = P(start at branch i) • = (# reps ever at i) / (# reps in database) • tij = P(move from i to j | currently at i) • = (# reps who leave i to go to j) / (# reps ever at i)
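The estimates on this slide can be sketched as follows, assuming each rep's career is available as an ordered list of branch names. The `careers` dictionary and the function names are hypothetical. The code estimates the start probabilities and direct-transition probabilities by counting reps, then multiplies them along a job sequence.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical data: rep -> ordered list of branches worked at.
careers = {
    "r1": ["A", "B", "C"],
    "r2": ["A", "B", "C"],
    "r3": ["A", "D"],
    "r4": ["B", "C"],
}

def estimate(careers):
    """Estimate p_i and t_ij by counting reps, as on the slide."""
    ever_at = Counter()  # branch -> number of reps ever at it
    moves = Counter()    # (i, j) -> number of reps who go directly from i to j
    for seq in careers.values():
        ever_at.update(set(seq))
        moves.update(set(zip(seq, seq[1:])))
    n = len(careers)
    p = {i: Fraction(c, n) for i, c in ever_at.items()}
    t = {(i, j): Fraction(c, ever_at[i]) for (i, j), c in moves.items()}
    return p, t

def sequence_probability(seq, p, t):
    """P(seq) = p[first] * product of t over consecutive branches."""
    prob = p.get(seq[0], Fraction(0))
    for i, j in zip(seq, seq[1:]):
        prob *= t.get((i, j), Fraction(0))
    return prob

p, t = estimate(careers)
x = sequence_probability(["A", "B", "C"], p, t)
```

With this toy data, pA = 3/4, tAB = 2/3 and tBC = 1, so the sequence A -> B -> C has probability 1/2. A low probability for a sequence shared by two reps suggests the match is unlikely under independent movement.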

Probabilistic Model • Null Hypothesis of Independent Movement • Movement Not Random • Split and Merge • Markov Chains

Probabilistic Model (Different Paths) • tij becomes vij • vij = P(move to branch j at any point after branch i | currently at i) • = (# reps who go to branch j at any point after working at i) / (# reps ever at i) • Now each vij >= tij, and the probabilities out of a branch no longer sum to 1
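A small sketch of the "any point after" estimate, under the same assumption that careers are ordered lists of branch names (the data and `estimate_v` are hypothetical). It also shows why the values out of one branch can sum to more than 1: a single rep contributes to several destinations.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical data: rep -> ordered list of branches worked at.
careers = {
    "r1": ["A", "B", "C"],
    "r2": ["A", "B", "C"],
    "r3": ["A", "D"],
    "r4": ["B", "C"],
}

def estimate_v(careers):
    """v_ij = (# reps at j at any point after i) / (# reps ever at i)."""
    ever_at = Counter()
    later = Counter()
    for seq in careers.values():
        ever_at.update(set(seq))
        # Every later branch counts, not just the next one; each rep counts once.
        pairs = {
            (seq[k], j)
            for k in range(len(seq))
            for j in seq[k + 1:]
            if j != seq[k]
        }
        later.update(pairs)
    return {(i, j): Fraction(c, ever_at[i]) for (i, j), c in later.items()}

v = estimate_v(careers)
total_from_a = sum(prob for (i, _), prob in v.items() if i == "A")
```

Here no rep moves directly from A to C, yet vAC = 2/3 because two of the three reps at A reach C later, and the values out of A sum to 5/3.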

Probabilistic Model (Different Paths) • vij becomes wij • wij = P(move to branch j at any point simultaneous to or after branch i | currently at i) • = (# reps who start at j at any point simultaneous to or after starting at i) / (# reps ever at i) • Now less precise with respect to direct transitions, but more general
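To illustrate how simultaneous jobs enter the estimate, the sketch below assumes each job is recorded as a hypothetical `(branch, start_year)` pair. A rep who starts at two branches in the same year contributes to the estimate in both directions.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical records: rep -> list of (branch, start_year).
starts = {
    "r1": [("A", 2000), ("B", 2000), ("C", 2003)],
    "r2": [("A", 1998), ("B", 2001)],
}

def estimate_w(starts):
    """w_ij = (# reps starting j at the same time as or after i) / (# reps ever at i)."""
    ever_at = Counter()
    reach = Counter()
    for jobs in starts.values():
        ever_at.update({branch for branch, _ in jobs})
        # Compare start dates only; each rep counts once per (i, j) pair.
        reach.update({
            (bi, bj)
            for bi, si in jobs
            for bj, sj in jobs
            if bi != bj and sj >= si
        })
    return {(i, j): Fraction(c, ever_at[i]) for (i, j), c in reach.items()}

w = estimate_w(starts)
```

Rep `r1` starts at A and B in the same year, so `r1` counts toward both wAB and wBA, while `r2` counts only toward wAB.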

PROB-TIMEBINS • Bins of 1 year or more • 10 people worked at each branch in a bin period • piX = (# reps ever at i during time X) / (# reps in DB) • yiXjY = (# reps ever at i during time X and at j during time Y, where Y >= X) / (# reps ever at i during time X)
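A rough sketch of the binned estimates follows. It simplifies the slide by using fixed-width bins of `width` years for every branch, whereas the slide suggests bin sizes depend on how many people worked at each branch. The record format `(branch, start_year, end_year)` with inclusive years, and all names, are assumptions.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical records: rep -> list of (branch, start_year, end_year), years inclusive.
careers = {
    "r1": [("A", 1990, 1994), ("B", 1995, 1999)],
    "r2": [("A", 1991, 1993), ("C", 1996, 1998)],
    "r3": [("B", 1995, 1997)],
}

def estimate_binned(careers, width):
    """Estimate p_iX and y_iXjY using fixed-width time bins (a simplification)."""
    at = Counter()     # (branch, bin) -> # reps there during that bin
    joint = Counter()  # ((i, X), (j, Y)) with Y >= X -> # reps in both states
    for jobs in careers.values():
        states = {
            (branch, x)
            for branch, start, end in jobs
            for x in range(start // width, end // width + 1)
        }
        at.update(states)
        joint.update({
            (si, sj)
            for si in states
            for sj in states
            if si != sj and sj[1] >= si[1]
        })
    n = len(careers)
    p = {state: Fraction(c, n) for state, c in at.items()}
    y = {(si, sj): Fraction(c, at[si]) for (si, sj), c in joint.items()}
    return p, y

p, y = estimate_binned(careers, width=5)
```

With 5-year bins, the years 1990 to 1994 fall in bin 398 and 1995 to 1999 in bin 399, so two of three reps are at A in bin 398, and one of those two is later at B in bin 399.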

PROB-NOTIME • Ignores order of job moves • Uses original pi • zij = raw number of reps who are at both branches i and j during their careers • Transition Pr from i to j = zij / (# reps ever at i) • Transition Pr from j to i = zij / (# reps ever at j) • The two are generally not equal
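The asymmetry noted on this slide is easy to see in a small sketch. Careers are assumed to be unordered sets of branch names; the data and `estimate_notime` are hypothetical. The count zij is symmetric, but dividing by different denominators gives different transition estimates in each direction.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical data: rep -> set of branches worked at (order ignored).
careers = {
    "r1": {"A", "B"},
    "r2": {"A", "B"},
    "r3": {"A"},
    "r4": {"A", "C"},
}

def estimate_notime(careers):
    """Return symmetric counts z_ij and the asymmetric estimates z_ij / (# ever at i)."""
    ever_at = Counter()
    z = Counter()
    for branches in careers.values():
        ever_at.update(branches)
        for i, j in combinations(sorted(branches), 2):
            z[(i, j)] += 1
            z[(j, i)] += 1
    trans = {(i, j): Fraction(c, ever_at[i]) for (i, j), c in z.items()}
    return z, trans

z, trans = estimate_notime(careers)
```

Four reps were ever at A but only two at B, so the estimate from A to B is 2/4 while the estimate from B to A is 2/2.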

Tribe Size

Pairs

Commonality of Job Sequence

Disclosure Scores

Homogeneity and Mobility

Discussion • JOBS, PROB, PROB-TIME, PROB-NOTIME create tribes with higher than average disclosure scores • PROB creates more cross zip code results • PROB-TIME has higher phi-squared than all others • PROB favors large firms

Discussion • JOBS and YEARS compute larger connected components • JOBS and PROB find same number of tribes but pick different groups as tribes

Conclusions • With no explicit knowledge we can discover: • Job transitions • Geography • Career track

Conclusions • Needed: • Ongoing process • Multiple affiliations • Arbitrary times • Time is a paradox in domain

Thanks! • Time for: • Questions • Comments • Smart Remarks
