Smart Itinerary Recommendation based on User-Generated GPS Trajectories Hyoseok Yoon (1), Y. Zheng (2), X. Xie (2) and W. Woo (1) (1) GIST U-VR Lab. (2) Microsoft Research Asia
Traveling • Popular leisure activity How to use time wisely? Trial-and-error is COSTLY!!! <Source: Flickr, Photo By Wolfgang Staudt>
Commercial Solution • Only a handful of itineraries • Major locations only • Fixed times • Not flexible <Source: Flickr, Photo By Andrew. O>
Social Solution • Ask residents of the region • Refer to travel experts • Learn from the experienced <Source: Flickr, Photo By Supermariolxpt>
Introduction • Data mining of GPS trajectories • User-generated • Travel routes • Travel experiences • Itinerary recommendation
Related Work • Itinerary Recommendation • Interactive systems for manually generating itineraries • INTRIGUE, TripTip • Travel recommendation system based on online travel info. (Huang and Bian) • Advanced Traveler Information System based on the shortest distance • GPS Data Mining Applications • Finding patterns in GPS trajectories • Finding locations of interest • GeoLife: mine user similarity, interest locations, and travel sequences
Contributions • Build a Location-Interest Graph • From multiple user-generated GPS trajectories • For modeling travel routes • Define a good itinerary • How to define and model an itinerary • How it can be evaluated • Smart itinerary recommendation framework • Recommend highly efficient and balanced itineraries • Evaluation • Using a large GPS dataset • Simulated/real user queries
Preliminaries • Trajectory: a sequence of time-stamped points • Stay Point: a geographical region s • Where a user stayed over a time threshold within a distance threshold
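The stay-point definition above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `GpsPoint`, `StayPoint`, `approx_distance_m` and `detect_stay_points` are hypothetical, the distance uses an equirectangular approximation (an assumption that is reasonable only for distances of a few hundred metres), and the default thresholds are the 20 minutes and 200 meters reported in the Experiments settings.

```python
import math
from typing import List, NamedTuple


class GpsPoint(NamedTuple):
    lat: float  # degrees
    lon: float  # degrees
    t: float    # timestamp in seconds


class StayPoint(NamedTuple):
    lat: float     # mean latitude of the grouped points
    lon: float     # mean longitude of the grouped points
    arrive: float  # timestamp of the first grouped point
    leave: float   # timestamp of the last grouped point


def approx_distance_m(a: GpsPoint, b: GpsPoint) -> float:
    """Equirectangular approximation (assumption: short distances only)."""
    earth_radius_m = 6371000.0
    x = math.radians(b.lon - a.lon) * math.cos(math.radians((a.lat + b.lat) / 2))
    y = math.radians(b.lat - a.lat)
    return earth_radius_m * math.hypot(x, y)


def detect_stay_points(points: List[GpsPoint],
                       time_thresh_s: float = 20 * 60,
                       dist_thresh_m: float = 200.0) -> List[StayPoint]:
    """Group consecutive points that stay within dist_thresh_m of an anchor
    point for at least time_thresh_s, and emit one stay point per group."""
    stays = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        # Advance j while points remain within the distance threshold of points[i].
        while j < n and approx_distance_m(points[i], points[j]) <= dist_thresh_m:
            j += 1
        if points[j - 1].t - points[i].t >= time_thresh_s:
            group = points[i:j]
            stays.append(StayPoint(
                sum(p.lat for p in group) / len(group),
                sum(p.lon for p in group) / len(group),
                points[i].t,
                points[j - 1].t,
            ))
            i = j  # continue after the detected stay
        else:
            i += 1
    return stays
```

A trajectory that lingers at one spot for 25 minutes and then moves on would yield a single stay point whose coordinates are the mean of the grouped points.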
Preliminaries • Location History: A sequence of stay points a user visited • Locations: Clusters of stay points detected from multiple users' trajectories • Substitute each stay point in a location history with the ID of the location the stay point pertains to
Preliminaries • Typical Stay Time: Defined as the median of the stay times of the stay points in li • Typical Time Interval (∆Ti,j): Traveling time from location li to lj
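As a rough sketch of these two quantities, the following Python computes a median stay time per location and a per-pair travel interval from location histories. The function names and the `(location_id, arrive, leave)` tuple layout are hypothetical. The slides define the typical stay time as a median; using the median for the typical time interval as well is an assumption made here.

```python
from collections import defaultdict
from statistics import median


def typical_stay_times(visits):
    """visits: iterable of (location_id, arrive, leave) tuples.
    Returns location_id -> median stay duration."""
    durations = defaultdict(list)
    for loc, arrive, leave in visits:
        durations[loc].append(leave - arrive)
    return {loc: median(d) for loc, d in durations.items()}


def typical_time_intervals(histories):
    """histories: list of location histories, each a time-ordered list of
    (location_id, arrive, leave) tuples.
    Returns (li, lj) -> median gap between leaving li and arriving at lj
    (median is an assumption; the slides do not state the aggregate)."""
    gaps = defaultdict(list)
    for history in histories:
        for (a, _, leave_a), (b, arrive_b, _) in zip(history, history[1:]):
            if a != b:
                gaps[(a, b)].append(arrive_b - leave_a)
    return {pair: median(g) for pair, g in gaps.items()}
```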
Preliminaries • Location Interest • The interest of a location is represented by authority scores (HITS-based inference model)* • User Experience as Hub • Locations as Authority *Zheng, Y., Zhang, L., Xie, X., Ma, W.Y.: Mining Correlation Between Locations Using Human Location History, In: GIS 2009, pp. 472-475 (2009)
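The hub/authority idea can be illustrated with a plain HITS power iteration over a user-location visit matrix. This is a simplified sketch, not the inference model of the cited GIS 2009 paper: the function name `hits_location_interest`, the visit-count input format and the fixed iteration count are all assumptions.

```python
import math


def hits_location_interest(visits, iterations=50):
    """visits: dict user -> dict location -> visit count.
    Returns (authority, hub): location interest and user experience scores,
    each normalized to unit Euclidean length."""
    users = list(visits)
    locations = sorted({loc for v in visits.values() for loc in v})
    hub = {u: 1.0 for u in users}
    auth = {loc: 1.0 for loc in locations}
    for _ in range(iterations):
        # A location is interesting if experienced users visit it.
        auth = {loc: sum(visits[u].get(loc, 0) * hub[u] for u in users)
                for loc in locations}
        norm = math.sqrt(sum(a * a for a in auth.values())) or 1.0
        auth = {loc: a / norm for loc, a in auth.items()}
        # A user is experienced if they visit interesting locations.
        hub = {u: sum(c * auth[loc] for loc, c in visits[u].items())
               for u in users}
        norm = math.sqrt(sum(h * h for h in hub.values())) or 1.0
        hub = {u: h / norm for u, h in hub.items()}
    return auth, hub
```

The mutual reinforcement is the point of the design: a raw visit count would treat every visitor equally, whereas here visits by high-hub users weigh more.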
Preliminaries • Trip: A sequence of locations with corresponding typical time intervals • Itinerary: A recommended trip based on user query Q • User Query: A user-specified input (start point, end point and duration)
Modeling Itinerary • Duration as the constraint • An itinerary whose duration exceeds the user's requirement is of no use to the user • Simplifies algorithmic complexity • Provides a stopping condition
Modeling Itinerary • First three factors to find candidate trips • (1) Elapsed Time Ratio • (2) Stay Time Ratio • (3) Interest Density Ratio • Classical travel sequence to differentiate candidates further • (4) Classical Travel Sequence Ratio
Architecture • Offline • Analyze collected GPS trajectories • Build a Location-Interest Graph (Gr) • Online • Use Gr to recommend an itinerary based on user query
Location-Interest Graph • Location-Interest Graph • (1) Detect stay points • (2) Cluster them into locations • (3) Calculate location interest • (4) Compute classical travel sequence* • We build Gr offline which contains info. on • Location itself • interest, typical staying time • Relationship between locations • Typical traveling time, classical travel sequence *Zheng, Y., Zhang, L., Xie, X., Ma, W.Y.: Mining Interesting Locations and Travel Sequences from GPS Trajectories. In: WWW 2009, pp. 791-800 (2009)
Query Verification • In the online process, user query Q needs to be verified by calculating Dist(qs,qd) • (1) Using GPS coordinates • Haversine formula or the spherical law of cosines • (2) Using a Web service such as Bing Maps • If the query is reasonable • Substitute the start point and the end point with the nearest locations in Gr • Send an updated query Q' = {ls,ld,qt} to the recommender
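Option (1) and the substitution step can be sketched as below. The haversine formula itself is standard; everything else is illustrative. In particular, the slides do not say what makes a query "reasonable", so the check used here (the straight-line distance must be coverable within the duration at an assumed average speed, `assumed_speed_kmh`) is a hypothetical criterion, as are the function names.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def verify_and_snap(qs, qd, qt_hours, locations, assumed_speed_kmh=30.0):
    """locations: dict location_id -> (lat, lon) for the locations in Gr.
    Returns the updated query (ls, ld, qt), or None if the query is rejected.
    The speed-based rejection rule is an assumption, not from the slides."""
    if haversine_km(qs, qd) > assumed_speed_kmh * qt_hours:
        return None

    def nearest(point):
        return min(locations, key=lambda lid: haversine_km(point, locations[lid]))

    return nearest(qs), nearest(qd), qt_hours
```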
Trip Candidate Selection • Select trip candidates from the starting location ls to the end location ld. • Candidate trips do not exceed the given duration qt. • (1) Start by adding ls to the trip • (2) Add the next feasible location not in the trip • (3) Update the time parameter • (4) Repeat until the end location is reached or no more locations can be added
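Steps (1) to (4) read naturally as a depth-first search, sketched below. This is an illustration under assumptions: the function name `candidate_trips` and the dictionary inputs are hypothetical, and the elapsed time is assumed to count the typical stay time at every visited location (including ls and ld) plus the typical time intervals between them.

```python
def candidate_trips(stay, travel, ls, ld, qt):
    """stay: dict location -> typical stay time.
    travel: dict (li, lj) -> typical time interval.
    Returns a list of (trip, elapsed) pairs, where trip is a tuple of
    locations from ls to ld whose total time does not exceed qt."""
    results = []

    def extend(trip, elapsed):
        last = trip[-1]
        if last == ld:
            # End location reached: record the candidate and stop extending.
            results.append((tuple(trip), elapsed))
            return
        for nxt in stay:
            if nxt in trip or (last, nxt) not in travel:
                continue
            # Update the time parameter with travel plus stay at nxt.
            t = elapsed + travel[(last, nxt)] + stay[nxt]
            if t <= qt:  # feasible only if still within the duration
                trip.append(nxt)
                extend(trip, t)
                trip.pop()

    if stay[ls] <= qt:
        extend([ls], stay[ls])
    return results
```

The duration check prunes the search, which is the stopping condition the duration constraint provides.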
Trip Candidate Ranking • Top-k trips in the order of the Euclidean Distance of (Elapsed Time Ratio, Stay Time Ratio, Interest Density Ratio)
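A possible shape for this ranking is sketched below. The slides name the three ratios but do not define them, so the definitions in the code are assumptions: elapsed time ratio as trip time over qt, stay time ratio as total stay time over qt, and interest density ratio as the trip's total interest normalized by the best candidate. Scoring by the length of the resulting vector, with longer being better, is likewise an assumed reading of "Euclidean distance". The name `rank_trips` is hypothetical.

```python
import math


def rank_trips(candidates, stay, interest, qt, k=3):
    """candidates: list of (trip, elapsed) pairs, trip being a tuple of locations.
    stay / interest: dict location -> typical stay time / interest score.
    Returns the top-k (score, trip) pairs, highest score first."""
    if not candidates:
        return []
    total_interest = {trip: sum(interest[loc] for loc in trip)
                      for trip, _ in candidates}
    max_interest = max(total_interest.values()) or 1.0
    scored = []
    for trip, elapsed in candidates:
        etr = elapsed / qt                              # assumed definition
        str_ = sum(stay[loc] for loc in trip) / qt      # assumed definition
        idr = total_interest[trip] / max_interest       # assumed definition
        scored.append((math.sqrt(etr ** 2 + str_ ** 2 + idr ** 2), trip))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```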
Re-ranking by Travel Sequence • Differentiate candidates further with classical travel sequence to consider • Authority score of going in and out and the hub scores • Re-rank with CTSR
Illustrative Example • [Figure: example itinerary annotated with durations; labels in source order: 2H 40M 1H 30M 1.5H 1H 1H 30M 1H]
Experiments • Settings • GPS trajectories collected from 125 users • 17,745 GPS trajectories (May 2007 to Aug. 2009, in Beijing) • Time threshold Tr (20 min), distance threshold Dr (200 meters) • 35,319 stay points are detected, excluding work/home spots • The density-based clustering algorithm OPTICS yields 119 locations
Experiments • Two evaluation approaches • (1) Simulated user queries • Algorithmic level comparison • Compare quality with baselines • (2) User study with local residents • How users' perceived quality of itineraries compares across methods
Experiments • Simulation • Four different levels for duration (5, 10, 15, 20 hours) • For each level, 1,000 queries are generated • User Study • 10 active residents of Beijing (avg: 3.8 years) • Participants submitted 3 queries and scored the 3 itineraries generated by our method and the two baselines (3x3).
Evaluation (Baselines) • Ranking-by-Time (RbT) • Recommend an itinerary with the highest elapsed time usage • Ranking-by-Interest (RbI) • Ranks the candidates in the order of total interest of locations included in the itinerary
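The two baselines each sort candidates on a single criterion, which a short sketch makes concrete. The function names and the `(trip, elapsed)` candidate format are hypothetical, and the scoring details are assumptions inferred from the bullet descriptions above.

```python
def rank_by_time(candidates):
    """RbT sketch: order candidates by elapsed time, highest usage first.
    candidates: list of (trip, elapsed) pairs."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)


def rank_by_interest(candidates, interest):
    """RbI sketch: order candidates by the total interest of their locations.
    interest: dict location -> interest score."""
    return sorted(candidates,
                  key=lambda c: sum(interest[loc] for loc in c[0]),
                  reverse=True)
```

With the same candidate set, the two baselines can disagree on the top itinerary, since one ignores interest and the other ignores time use.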
Results • At the 5-hour level • All three methods produce results of similar quality • There are few candidates, so the results largely overlap
Results • At the 10- to 20-hour levels • Baseline algorithms perform well in only one aspect • Our algorithm produces well-balanced itineraries and also considers the classical travel sequence
Results • At the 5-hour level • All three methods produce results of similar quality • There are few candidates, so the results largely overlap • At the 10- to 20-hour levels • Baseline algorithms perform well in only one aspect • Our algorithm produces well-balanced itineraries and also considers the classical travel sequence
Results • How does our method compare to RbT in terms of perceived time use? • How does our method compare to RbI in terms of perceived interest? • RbT shows no significant advantage in perceived time use, nor RbI in perceived interest • Our method is well balanced and competitive
Conclusion • Based on user-generated GPS trajectories • Build a Location-Interest Graph • Model and define a good itinerary • Recommend an itinerary based on the user query • Find candidates and rank them considering three factors (elapsed time, stay time and interest density) • Re-rank with the classical travel sequence • Evaluated with real and simulated user queries • Future Work • Personalized recommendation using user preferences
15th CTI Workshop, July 26, 2008 Context-Aware Mobile Augmented Reality Discussions and More information • GIST U-VR Lab, Gwangju 500-712, Korea • E-Mail: hyoon@gist.ac.kr • Web: http://wiki.uvr.gist.ac.kr/Main/HyoseokYoon