Landmark-Based User Location Inference in Social Media
Yuto Yamaguchi†, Toshiyuki Amagasa† and Hiroyuki Kitagawa†
†University of Tsukuba
COSN 2013 - Yuto Yamaguchi
Location-related information
• Profile: "Residence: Tokyo, Japan"
• "Eating seafood !!!"
• "I'm at Logan airport"
• "COSN @ northeastern"
Applications
• Various research using home locations
  • Outbreak modeling [Poul+, ICWSM’12]
  • Real-world event detection [Sakaki+, WWW’12]
  • Analyzing disasters [Mandel+, LSM’12]
• Other useful applications
  • Location-aware recommender [Levandoski+, ICDE’12]
  • Marketing, ads
  • Disaster warning
Our problem
• Location profiles are not available for …
  • 76% of Twitter users [Cheng et al., CIKM’10]
  • 94% of Facebook users [Backstrom et al., WWW’10]
• This reduces the opportunities to use location information
• Our task: user home location inference
User home location inference
• Content-based approaches
  • [Cheng et al., CIKM’10]
  • [Kinsella et al., SMUC’11]
  • [Chandra et al., SocialCom’11]
• Graph-based approaches (our focus)
  • [Backstrom et al., WWW’10]
  • [Sadilek et al., WSDM’12]
  • [Jurgens, ICWSM’13]
Graph-based approach (1/2)
• Basic idea
[Figure: a user of unknown location ("Boston?") surrounded by friends labeled Boston, New York, and Chicago]
Graph-based approach (2/2)
• Closeness assumption
  • Friends: spatially close
  • Not friends: spatially distant
• Really close?
  • 60% of friends are 100 km distant
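A check of the closeness assumption could be sketched as follows: compute the great-circle distance between the homes of each friend pair and report the share that fall within a radius. The function names, the 100 km default, and the toy coordinates are assumptions for illustration; the slides do not say how the 60% figure was measured.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def fraction_within(friend_pairs, home, radius_km=100.0):
    """Share of friend pairs whose known homes are at most radius_km apart."""
    dists = [haversine_km(home[u], home[v])
             for u, v in friend_pairs if u in home and v in home]
    if not dists:
        return 0.0
    return sum(d <= radius_km for d in dists) / len(dists)

# Hypothetical toy data: (lat, lon) of four users and their friendships.
home = {
    "a": (42.36, -71.06),  # Boston
    "b": (42.37, -71.11),  # Cambridge
    "c": (40.71, -74.01),  # New York
    "d": (41.88, -87.63),  # Chicago
}
pairs = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
share_close = fraction_within(pairs, home)
```

On real data, a low share of close pairs would be the kind of evidence that motivates questioning the closeness assumption.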
Concentration assumption
[Figure: a user of unknown location ("Boston?") and a node marked LANDMARK; other nodes are labeled Chicago, NY, and Boston]
Landmarks
Requirements
• Small dispersion
• Large centrality
Examples in Twitter
Landmarks mapping
[Map figure. Red: all users. Blue: landmarks.]
Proposed method
Overview
• Probabilistic model: the landmark mixture model
• Modeling
  • Each user has his/her location distribution
• Location inference = selecting, from the location set, the location with the largest probability density
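In symbols, the inference rule on this slide can be written as below. The notation is an assumption introduced here, not taken from the slides: $L$ is the location set, $p_u$ is the location distribution of user $u$, and $\hat{l}_u$ is the inferred home location.

```latex
\hat{l}_u = \operatorname*{arg\,max}_{l \in L} \; p_u(l)
```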
Dominance distribution
• Spatial distribution of followers’ home locations
• Modeled as a Gaussian
• Landmarks have small covariances
  • Many followers at the center
[Figure: follower density over longitude and latitude, with regions of many and few followers]
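A minimal sketch of fitting such a Gaussian, assuming follower homes are given as (longitude, latitude) pairs and treated as planar coordinates. The function names, the small ridge term, and the toy points are illustrative assumptions rather than the authors' procedure.

```python
def fit_gaussian_2d(points, ridge=1e-6):
    """Fit a 2-D Gaussian to (lon, lat) points.

    Returns (mean, cov) with cov stored as (sxx, sxy, syy).
    The ridge keeps the covariance invertible when points nearly coincide.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n + ridge
    syy = sum((p[1] - my) ** 2 for p in points) / n + ridge
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def generalized_variance(cov):
    """Determinant of the covariance: one scalar summary of spread."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy

# Hypothetical followers of a landmark-like user (all near Boston) ...
tight = [(-71.06, 42.36), (-71.05, 42.37), (-71.07, 42.35), (-71.06, 42.37)]
# ... and of a user whose followers are scattered across the U.S.
wide = [(-71.06, 42.36), (-74.01, 40.71), (-87.63, 41.88), (-118.24, 34.05)]

tight_mean, tight_cov = fit_gaussian_2d(tight)
wide_mean, wide_cov = fit_gaussian_2d(wide)
```

Under this sketch, the landmark-like user ends up with a much smaller generalized variance than the scattered one, which is the property the slide describes as a small covariance.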
Landmark Mixture Model (LMM)
• The location distribution combines dominance distributions, each with a mixture weight
• Large weight for landmarks
[Figure: the inference target user follows one landmark and two non-landmarks]
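A minimal sketch of evaluating such a mixture and picking the most likely candidate, assuming each followee's dominance distribution is a 2-D Gaussian over (longitude, latitude) and the mixture weights are already given. All names and numbers here are illustrative and do not come from the slides.

```python
import math

def gaussian_pdf(x, mean, cov):
    """Density of a 2-D Gaussian at x; cov is (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form via the closed-form inverse of a 2x2 matrix.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

def mixture_density(x, components):
    """components: list of (weight, mean, cov); weights should sum to 1."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in components)

def infer_location(candidates, components):
    """Return the candidate name with the largest mixture density."""
    return max(candidates, key=lambda name: mixture_density(candidates[name], components))

# Hypothetical location set, as (lon, lat).
candidates = {
    "Boston": (-71.06, 42.36),
    "New York": (-74.01, 40.71),
    "Chicago": (-87.63, 41.88),
}
# One landmark (tight covariance, large weight) and two non-landmarks.
components = [
    (0.8, candidates["Boston"], (0.01, 0.0, 0.01)),
    (0.1, candidates["New York"], (25.0, 0.0, 25.0)),
    (0.1, candidates["Chicago"], (25.0, 0.0, 25.0)),
]
inferred = infer_location(candidates, components)
```

In this toy setup the landmark's tight, heavily weighted component dominates the mixture, so its center is selected.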
Mixture weights
• Proportional to centrality
• Landmark: large mixture weight
• Non-landmark: small mixture weight
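One simple reading of "proportional to centrality" is to normalize the centrality scores of a user's followees so that they sum to one. The sketch below assumes the centrality scores are already computed; how centrality is defined is not given on the slides, and the names and values are hypothetical.

```python
def mixture_weights(centrality):
    """Normalize nonnegative centrality scores into weights that sum to 1."""
    total = sum(centrality.values())
    if total <= 0:
        raise ValueError("centrality scores must have a positive sum")
    return {user: score / total for user, score in centrality.items()}

# Hypothetical centrality scores for the followees of one target user.
weights = mixture_weights({"landmark": 8.0, "friend_a": 1.0, "friend_b": 1.0})
```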
Confidence constraint
• If the distribution does not have a clear peak, we should not infer the location of that user
• High precision but low recall
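A minimal sketch of such a rule: normalize the candidate scores and abstain when the best candidate's share falls below a threshold, written `p0` here. Measuring "a clear peak" as the top candidate's normalized share is an assumption for illustration; the slides do not give the exact criterion.

```python
def infer_with_confidence(scores, p0):
    """Return the best-scoring candidate, or None when its share is below p0.

    scores maps each candidate location to a nonnegative density value.
    """
    total = sum(scores.values())
    if total <= 0:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] / total >= p0 else None

# Hypothetical scores: one peaked distribution and one flat distribution.
peaked = {"Boston": 9.0, "New York": 0.5, "Chicago": 0.5}
flat = {"Boston": 1.0, "New York": 1.0, "Chicago": 1.0}
confident = infer_with_confidence(peaked, p0=0.6)
abstained = infer_with_confidence(flat, p0=0.6)
```

Raising `p0` makes the method abstain more often, which is how precision can be traded against recall.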
Centrality constraint
• We can reduce the cost by ignoring non-landmarks
• Low cost but low recall
[Figure: the inference target user follows one landmark and two non-landmarks]
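The filtering step could be sketched as below, assuming each followee has a precomputed centrality score and a threshold, written `c0` here, separates landmarks from non-landmarks. The function name and the scores are hypothetical.

```python
def apply_centrality_constraint(followees, centrality, c0):
    """Keep only followees whose centrality is at least c0."""
    return [v for v in followees if centrality.get(v, 0.0) >= c0]

# Hypothetical centrality scores.
centrality = {"landmark": 8.0, "friend_a": 1.0, "friend_b": 1.0}

# A user who follows a landmark keeps one mixture component instead of three.
kept = apply_centrality_constraint(["landmark", "friend_a", "friend_b"], centrality, c0=5.0)

# A user who follows no landmark keeps nothing, so no location can be inferred.
kept_none = apply_centrality_constraint(["friend_a", "friend_b"], centrality, c0=5.0)
```

The two cases show both sides of the trade-off: fewer components to evaluate, and some users left without any usable followee.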
Experiments
Dataset
• Twitter dataset provided by [Li et al., KDD’12]
  • 3M users in the U.S.
  • 285M follow edges
• Location profiles geocoded for ground truth
  • 465K labeled users (15%)
• Test set
  • 46K users (10% of labeled users)
Performance comparison
• Compared three methods
  • LMM: our method
  • UDI: [Li+, KDD’12]
  • Naïve: spatial median
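For reference, a spatial median can be approximated with Weiszfeld's iteration, as sketched below. This assumes "spatial median" means the geometric median of known neighbor locations in planar coordinates; the slides do not state which neighbors or which median definition the baseline used, and the points are toy values.

```python
import math

def geometric_median(points, iters=200, eps=1e-9):
    """Approximate the point minimizing the sum of Euclidean distances.

    Uses Weiszfeld's iteration, starting from the centroid.
    """
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iters):
        wsum = xs = ys = 0.0
        for px, py in points:
            # eps avoids division by zero if the estimate hits a data point.
            w = 1.0 / max(math.hypot(px - x, py - y), eps)
            wsum += w
            xs += w * px
            ys += w * py
        x, y = xs / wsum, ys / wsum
    return x, y

# Hypothetical neighbor locations: four clustered points and one far outlier.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
median_x, median_y = geometric_median(pts)
```

Unlike the mean, which the outlier drags to (20.4, 20.4), the median stays inside the cluster.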
Effect of confidence constraint
[Figure: results for varying p0]
• We can adjust the trade-off between precision and recall
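The trade-off can be made concrete with a small evaluation sketch in which a method may abstain (return `None`) for some users. Precision is taken over the users it did infer and recall over all test users; counting a prediction as correct only on an exact label match is a simplifying assumption, and the labels below are toy data.

```python
def precision_recall(predicted, truth):
    """Precision over inferred users and recall over all test users."""
    inferred = [u for u in truth if predicted.get(u) is not None]
    correct = sum(predicted[u] == truth[u] for u in inferred)
    precision = correct / len(inferred) if inferred else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical ground truth and two sets of predictions.
truth = {"a": "Boston", "b": "New York", "c": "Chicago", "d": "Boston"}
loose = {"a": "Boston", "b": "Boston", "c": "Chicago", "d": "Boston"}  # never abstains
strict = {"a": "Boston", "b": None, "c": None, "d": "Boston"}          # abstains when unsure

loose_pr = precision_recall(loose, truth)
strict_pr = precision_recall(strict, truth)
```

Here the strict setting answers for fewer users but makes no mistakes, while the loose setting answers for everyone and gets one wrong.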
Effect of centrality constraint
[Figure: results for varying c0]
• We can adjust the trade-off between cost and recall
Conclusion
• Introduced the concentration assumption instead of the widely used closeness assumption
  • There exist landmarks
• Proposed the landmark mixture model
  • Outperforms the state-of-the-art method
  • Confidence / centrality constraints
• Future work
  • Other applications of landmarks
  • Recommending landmarks or their tweets