Knowledge Discovery from Mobile Phone Communication Activity Data Streams
Fergal Walsh

Data Stream
• Anonymised Customer Data Records (CDR) from Meteor, Ireland’s 3rd largest mobile phone network
• More than 1 million customers
• One record per call/SMS sent or received
• About 40 million records per day
• About 7000 cells (spatial areas)
• Cell areas range from <1 km² to ~50 km²
• Records are ordered by time and independent of each other, making this data ideally suited to stream processing

Data Exploration
[Diagram labels: Stream Processor, Indexed Database, Raw CDR Data, Exploratory Query Tool]
• Data stream processor for pre-processing each record and computing aggregates
• Spatial, temporal and user indices for efficient querying
• 1 week of data (> 200 million records)
• Web-based tool for ad hoc spatio-temporal queries
[Figure: Communication event counts per cell per hour (weekday average); panels at 00:00, 08:00, 12:00 and 18:00]
[Figure: Trajectories of 2 sample users]
[Figure: Location of caller and callee for 2 sample users]

Current Work
• Information retrieval using stream data mining and machine learning techniques
• Find users similar to some example users (classification using Support Vector Machines):
  • Users who travel from Maynooth to Dublin daily
  • Users who travel to Dublin from rural areas daily (using semantics of spatial areas)
  • Groups of users who are planning a meet-up (using communication motifs)
• Find areas with similar phone usage activity profiles (clustering):
  • Nightlife, business, residential, rural
• Find clusters of users with similar activity profiles (clustering)
• Development of (ncg.nuim.ie/i2maps/)

Future Work
• Learn activity chains (probabilistic models) of each user's communication and movement events. These will use semantic labels rather than raw spatial locations.
• Predict movement and communication events from the learned models.

Publications
Pozdnoukhov A., Walsh F., Exploratory Novelty Identification in Human Activity Data Streams, ACM SIGSPATIAL International Workshop on GeoStreaming at 18th ACM SIGSPATIAL GIS, 2010.
Pozdnoukhov A., Walsh F., Kaiser F., Statistical Machine Learning from VGI, position paper at the Role of Volunteered Geographic Information in Advancing Science Workshop at GIScience'10, 2010.
Kaiser C., Walsh F., Farmer C. and Pozdnoukhov A., User-centric time-distance representation of road networks, in Springer LNCS proc. of GIScience'10 (full paper), 2010.

Acknowledgements
The authors gratefully acknowledge the support of Meteor for providing the data used in this poster, in particular Mr. John Bathe and Mr. Adrian Whitwham.
Thanks to Ronan Farrell (IMWS) for obtaining the data from Meteor for StratAG.
Thanks to John Doyle for providing the cell tessellation used in the examples above.
Research presented in this poster was funded by a Strategic Research Cluster Grant (07/SRC/I1168) by Science Foundation Ireland under the National Development Plan. The authors gratefully acknowledge this support.
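
Illustrative sketch
The poster describes a stream processor that handles each record independently and computes aggregates such as communication event counts per cell per hour (weekday average). The Python below is a minimal sketch of that kind of single-pass aggregation, not the authors' implementation. The Record layout, the HourlyCellCounter class and the process function are hypothetical names introduced here, since the poster does not give the real CDR schema. The average is assumed to be taken over the distinct weekdays observed anywhere in the stream.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Record(NamedTuple):
    """Hypothetical record layout; the real CDR schema is not given in the poster."""
    timestamp: datetime
    user_id: str
    cell_id: int
    kind: str  # e.g. "call" or "sms"


class HourlyCellCounter:
    """Single-pass aggregate: event counts per (cell, hour), averaged over weekdays."""

    def __init__(self) -> None:
        self.counts: Dict[Tuple[int, int], int] = defaultdict(int)
        self.weekdays_seen: Set[date] = set()

    def update(self, rec: Record) -> None:
        # Records are independent of each other, so each one is handled on its own.
        if rec.timestamp.weekday() >= 5:  # 5 = Saturday, 6 = Sunday: skipped
            return
        self.weekdays_seen.add(rec.timestamp.date())
        self.counts[(rec.cell_id, rec.timestamp.hour)] += 1

    def weekday_average(self, cell_id: int, hour: int) -> float:
        # Assumption: divide by the number of distinct weekdays seen in the stream.
        if not self.weekdays_seen:
            return 0.0
        return self.counts.get((cell_id, hour), 0) / len(self.weekdays_seen)


def process(stream: Iterable[Record]) -> HourlyCellCounter:
    """Consume the stream once, updating the aggregate per record."""
    counter = HourlyCellCounter()
    for rec in stream:
        counter.update(rec)
    return counter


# Made-up sample data for illustration only.
sample = [
    Record(datetime(2010, 3, 1, 8, 5), "u1", 42, "call"),   # Monday
    Record(datetime(2010, 3, 1, 8, 40), "u2", 42, "sms"),   # Monday
    Record(datetime(2010, 3, 2, 8, 15), "u1", 42, "call"),  # Tuesday
    Record(datetime(2010, 3, 6, 8, 0), "u3", 42, "call"),   # Saturday, ignored
]
agg = process(sample)
print(agg.weekday_average(42, 8))  # 3 weekday events over 2 distinct weekdays
```

Because the counter holds only one integer per (cell, hour) pair plus the set of observed dates, its memory use depends on the number of cells and days rather than on the number of records, which is what makes this style of aggregate suitable for a stream of tens of millions of records per day.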