Mining Event Periodicity from Incomplete Observations

Mining Event Periodicity from Incomplete Observations. Zhenhui (Jessie) Li*, Jingjing Wang, Jiawei Han University of Illinois at Urbana-Champaign *Now at Penn State University. KDD 2012 Beijing, China. Prologue: Detect Periodicity in Movements [Li et al., KDD’10].

Presentation Transcript


  1. Mining Event Periodicity from Incomplete Observations. Zhenhui (Jessie) Li*, Jingjing Wang, Jiawei Han, University of Illinois at Urbana-Champaign (*now at Penn State University). KDD 2012, Beijing, China.

  2. Prologue: Detect Periodicity in Movements [Li et al., KDD’10]. Problem: What is the periodicity of the movement? Bee example: 8 hours in the hive, 16 hours flying nearby.

  3. Prologue: Detect Periodicity in Movements [Li et al., KDD’10]. Observe the in-and-out movements from the reference spot (i.e., the hive); the periodicity is then easy to see. [Figure: Two-Dimensional Movement; One-Dimensional Binary Sequence over time, with states "in hive" and "outside hive".]

  4. Challenge: Periodicity Detection for Incomplete Observations
     • Two factors result in incomplete observations: inconsistent + low sampling rate
     • Movement data collection in real scenarios:
       • Human movement data collected from cellphones: locations are reported only when making calls
       • Animal movement data: 2~3 locations in 3~5 days
     2009-05-02 01:03 in
     2009-05-03 11:30 out
     2009-05-05 03:12 in
     2009-05-09 12:03 in
     2009-05-10 11:14 out
     2009-05-11 02:15 in
     …
     [Figure: Complete Observations vs. Incomplete Observations, with states "in hive" and "outside hive".]

  5. A Challenging Case of Detecting Periodicity for Incomplete Observations
     Sparse raw data:
     2009-05-02 01:03 in
     2009-05-03 11:30 out
     2009-05-05 03:12 in
     2009-05-09 12:03 in
     2009-05-10 11:14 out
     2009-05-11 02:15 in
     …
     [Figure: the sparse in/out sequence plotted over time.]
     Any periodicity in the above sequence?
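
Before any periodicity test can run, sparse records like the ones on this slide have to be mapped onto a discrete timeline. The following is a minimal sketch of that step, assuming hourly granularity and counting hours from midnight of the first observed day. Both choices, and the name `to_binary_observations`, are illustrative assumptions rather than anything stated in the slides.

```python
from datetime import datetime

# Raw records copied from the slide: (timestamp, state relative to the hive).
RAW_RECORDS = [
    ("2009-05-02 01:03", "in"),
    ("2009-05-03 11:30", "out"),
    ("2009-05-05 03:12", "in"),
    ("2009-05-09 12:03", "in"),
    ("2009-05-10 11:14", "out"),
    ("2009-05-11 02:15", "in"),
]


def to_binary_observations(records):
    """Map (timestamp, state) records to (hour_index, x) pairs, with x=1 for "in"."""
    parsed = [(datetime.strptime(ts, "%Y-%m-%d %H:%M"), state) for ts, state in records]
    # Assumption: hour 0 is midnight of the first observed day.
    origin = parsed[0][0].replace(hour=0, minute=0)
    observations = []
    for when, state in parsed:
        hour_index = int((when - origin).total_seconds() // 3600)
        observations.append((hour_index, 1 if state == "in" else 0))
    return observations


observations = to_binary_observations(RAW_RECORDS)
print(observations)
```

Most hours have no entry at all: six observations cover ten days, which is exactly the incomplete-observation setting the talk addresses.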

  6. Mining Periodicity in Incomplete Data • The event has a period of 20 • Occurrences of the event happen between 20k+5 and 20k+10 (k = 0, 1, 2, …)

  7. A Probabilistic Model for Periodic Event • Example: • Human daily periodicity of visiting the office • Period of 24 (hours) • Visiting the office at 10:00-11:00 and 14:00-16:00

  8. A Probabilistic Model for Periodic Event with Random Observation. [Figure: the model generates observations at randomly chosen timestamps, e.g. x(62)=0 and x(5)=1.]
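
The generative story on slides 7 and 8 can be illustrated with a small simulation. This is only a sketch under stated assumptions: each timestamp is observed independently with probability `sample_rate`, and an observed timestamp `t` yields `x(t)=1` with a probability that depends only on `t mod period`. The function name, the 0.9 visit probability and the 10% sampling rate are made up for illustration.

```python
import random


def simulate_observations(period, prob_by_slot, length, sample_rate, seed=0):
    """Return (t, x) pairs for the timestamps that happen to be observed."""
    rng = random.Random(seed)
    observations = []
    for t in range(length):
        if rng.random() >= sample_rate:
            continue  # this timestamp is never observed
        # The event probability depends only on the position inside the period.
        x = 1 if rng.random() < prob_by_slot[t % period] else 0
        observations.append((t, x))
    return observations


# Office example from slide 7: period 24, visits during hours 10, 14 and 15.
PERIOD = 24
prob_by_slot = [0.0] * PERIOD
for hour in (10, 14, 15):
    prob_by_slot[hour] = 0.9  # assumed probability, not from the slides

observations = simulate_observations(PERIOD, prob_by_slot, length=24 * 30, sample_rate=0.1)
print(len(observations), "observed timestamps out of", 24 * 30)
```

Negative observations matter as much as positive ones here: `x(t)=0` says the event was checked and did not occur, which differs from a timestamp that was never observed.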

  9. Periodicity Detection by Overlaying Observations. [Figure: observations overlaid under the true period and under a wrong period; panel labels: Even distribution, Skewed distribution.]
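
Overlaying amounts to folding every observed timestamp onto `t mod T` and counting positives and negatives per slot. The sketch below uses deterministic synthetic data built from slide 6 (event active when `t mod 20` lies in 5..10) and observes only every 7th timestamp. The sampling pattern and the function name `overlay` are assumptions chosen for illustration.

```python
def overlay(observations, period):
    """Fold observations onto one period; return per-slot positive and negative counts."""
    pos = [0] * period
    neg = [0] * period
    for t, x in observations:
        if x:
            pos[t % period] += 1
        else:
            neg[t % period] += 1
    return pos, neg


# Synthetic event with period 20, active for t mod 20 in 5..10 (slide 6),
# observed only at every 7th timestamp (assumed sampling pattern).
observations = [(t, 1 if 5 <= t % 20 <= 10 else 0) for t in range(0, 2000, 7)]

for candidate in (20, 19):
    pos, neg = overlay(observations, candidate)
    print("T =", candidate)
    print("  positives per slot:", pos)
    print("  negatives per slot:", neg)
```

With `T = 20` the positives fall only into slots 5 to 10 and the negatives only into the remaining slots. With `T = 19` the same slot can collect both kinds of observation, so the two distributions are no longer separated.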

  10. Relationship between Observation Ratio and Probabilistic Model. [Figure labels: Pos/Neg Ratio; Periodic Distribution Vector.]

  11. Discrepancy Score to Measure Periodicity. If T (=24) is the correct period, the discrepancy score should be large for a certain set of timestamps. If T (=23) is a wrong period, the discrepancy scores are likely to be zero for any set of timestamps.

  12. Periodicity Measure
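
The transcript does not preserve the formulas for the discrepancy score or the periodicity measure, so the following is one plausible formalization rather than the authors' exact definition. It assumes the discrepancy of a slot set S under period T is the fraction of positive observations falling in S minus the fraction of negative observations falling in S, and that the periodicity measure is the maximum discrepancy over all S. All function names are hypothetical.

```python
def slot_fractions(observations, period):
    """Per-slot share of positive and of negative observations, or None if either is absent."""
    pos = [0] * period
    neg = [0] * period
    for t, x in observations:
        (pos if x else neg)[t % period] += 1
    n_pos, n_neg = sum(pos), sum(neg)
    if n_pos == 0 or n_neg == 0:
        return None
    return [c / n_pos for c in pos], [c / n_neg for c in neg]


def discrepancy(observations, period, slots):
    """Assumed score: positive share minus negative share inside the slot set."""
    fractions = slot_fractions(observations, period)
    if fractions is None:
        return 0.0
    pos_frac, neg_frac = fractions
    return sum(pos_frac[s] - neg_frac[s] for s in slots)


def periodicity_score(observations, period):
    """Maximum discrepancy over all slot sets: keep slots where positives dominate."""
    fractions = slot_fractions(observations, period)
    if fractions is None:
        return 0.0
    pos_frac, neg_frac = fractions
    return sum(max(p - n, 0.0) for p, n in zip(pos_frac, neg_frac))


# Synthetic data: period 20, event active for t mod 20 in 5..10, every 7th timestamp observed.
observations = [(t, 1 if 5 <= t % 20 <= 10 else 0) for t in range(0, 2000, 7)]
for candidate in (19, 20, 23, 24):
    print(candidate, round(periodicity_score(observations, candidate), 3))
```

On this synthetic sequence, period 20 scores 1.0 up to rounding because positives and negatives occupy disjoint slots, while period 19 scores strictly lower. Multiples of the true period score equally high under this sketch, so a real detector would need a rule such as preferring the smallest high-scoring candidate.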

  13. Performance Comparisons. [Figure axis: sampling rate (ratio of observed points in the complete sequence).]

  14. Experiment on Real Human Data. One person’s visits to a specific location. [Figure panels: sampling rate 20 min; sampling rate 1 hour.]

  15. Problems with Using Fourier Transform to Detect Periodicity. [Figure panels: T=4; T=16.]

  16. Summary: Mining Event Periodicity from Incomplete Observations
      • Motivation
        • Challenge of the real data: incomplete observations (inconsistent + low sampling rate)
      • Method
        • Overlay the segments and measure the “skewness” of the distribution
        • Theoretically prove the correctness of the method
      • Application
        • Location prediction
        • 2nd place in Nokia Mobile Data Challenge 2012
        • Periodicity-based feature + SVM
      Thanks! Questions?
