Explore the use of mobile phone data for tourism and transport statistics, including better quality data through signaling information. Learn about the project, data sources, and some initial results.
Session 6A - Mobile phone data for mobility statistics: using mobile phone data for statistics on tourism and transport – Better quality of mobile phone data based statistics through the use of signalling information – the case of tourism statistics – New Techniques and Technologies for Statistics NTTS 2017 – 14-16 March 2017 – Brussels (BE) – Christophe Demunter (European Commission / Eurostat), Gerdy Seynaeve (Proximus)
outline 1 - the project 2 - data sources 3 - some results 4 - wrapping up
Prehistory 1 - the project • Feasibility study on the use of mobile positioning data for tourism statistics (Eurostat) – 2012-2014 • Creation of Task Force on Big Data - 2013
Research team 1 - the project • Partnership between statistical offices and mobile network operator (MNO) • Eurostat • Proximus • Statistics Belgium
Objectives 1 - the project • Explore partnerships and business models for cooperation between MNOs and NSIs • Access • Exploring, testing, pilots • Regular data processing and production • Continuity, viability of cooperation • Cooperate on concrete, output-oriented projects • Population statistics (present/resident population) • Tourism statistics • …
NTTS 2017 papers 1 - the project Land use classification based on present population daily profiles from a big data source [5A-001] • Session 5A: Using mobile phone data for official statistics on land use, urban areas and dwellings Better quality of mobile phone data based statistics through the use of signalling information – the case of tourism statistics [6A-001] • Session 6A - Mobile phone data for mobility statistics: using mobile phone data for statistics on tourism and transport Official statistics and mobile network operators: a business model for partnerships [8C-003] • Session 8C: New organisation and collaboration approaches to foster innovation in official statistics…
Source 1: mobile phone data 2 – data sources • Data from one operator in Belgium: PROXIMUS • Signalling data (no longer call detail records, CDR) • 10x more observations than CDR [on the home network] • superior for monitoring day-to-day mobility • position of devices detected every 3 hrs (if not switched off) • at least every hour for devices with data 'on' • but through active usage for calls, messages, data, the detection frequency is much higher: • 70% of devices detected within one hour, • 35% within 15 minutes (daytime) • Better temporal (and geographical) granularity
Source 2: official tourism statistics 2 – data sources • Survey-based data on trips made by residents of Belgium • In line with EU Regulation 692/2011 concerning European statistics on tourism • Quarterly interviews, annual sample ± 10000 trips (domestic + outbound trips with overnight stays) • Relatively low response rate (unit non-response ≈ 85%)
Known weaknesses 2 – data sources
Scope (first stage of the project) 2 – data sources • Focus on outbound trips with at least 1 overnight stay • Mobile phone data: trips made April – Sept 2016 • Official tourism statistics: trips made April – Sept 2015 • Definition of an outbound trip • From leaving the home network to returning • Number of nights: number of hours divided by 24 • Overnight stay: minimum 10 hours and return after 4am • Usual environment (= key concept in tourism statistics) • Duration (min. 10 hrs + incl. 4am), border crossing (outbound) • Filtering of frequent trips to the same destination during a given reference period (250 days); threshold = 5 (arbitrary)
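The trip rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual algorithm. The `Trip` record and its field names are hypothetical, "frequent" is read as 5 or more overnight trips by the same device to the same destination, the input is assumed to be already restricted to one 250-day reference period, and the number of nights is left as the unrounded value of hours divided by 24 because the slide does not specify a rounding rule.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_HOURS = 10          # slide: overnight stay lasts at least 10 hours
FREQUENT_THRESHOLD = 5  # slide: arbitrary threshold per 250-day reference period


@dataclass(frozen=True)
class Trip:
    # All field names are hypothetical, chosen for this illustration only.
    device: str          # pseudonymous device identifier
    destination: str     # country of the visited network
    left_home: datetime  # moment the device leaves the home network
    returned: datetime   # moment the device is back on the home network


def hours_away(trip):
    """Duration of the absence from the home network, in hours."""
    return (trip.returned - trip.left_home).total_seconds() / 3600


def includes_4am(trip):
    """True if a 04:00 clock time falls between departure and return."""
    four = trip.left_home.replace(hour=4, minute=0, second=0, microsecond=0)
    if four < trip.left_home:
        four += timedelta(days=1)  # next 04:00 after departure
    return four <= trip.returned


def is_overnight(trip):
    """Slide rule: minimum 10 hours away and return after 4am."""
    return hours_away(trip) >= MIN_HOURS and includes_4am(trip)


def nights(trip):
    """Slide rule: number of hours divided by 24 (rounding not specified)."""
    return hours_away(trip) / 24


def tourism_trips(trips):
    """Keep overnight trips outside the usual environment.

    Assumption: `trips` covers a single reference period, and a destination
    visited FREQUENT_THRESHOLD times or more by the same device is treated
    as part of that device's usual environment.
    """
    overnight = [t for t in trips if is_overnight(t)]
    counts = Counter((t.device, t.destination) for t in overnight)
    return [
        t for t in overnight
        if counts[(t.device, t.destination)] < FREQUENT_THRESHOLD
    ]
```

With these rules, a device away from Friday 18:00 to Sunday 18:00 counts as one outbound trip of 2 nights, a 12-hour daytime excursion is dropped because it does not include 4am, and five weekend trips to the same country within the reference period are filtered out as usual environment.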
Ranking of tourism destinations 3 – some results Ranking of EU-28 countries as destination for Belgian outbound trips (MNO data)
Outbound trips by duration: comparison 3 – some results Comparison of the distribution of outbound trips to EU-28 and to Italy, by duration of the trips
Volume of trips and nights: comparison 3 – some results Comparison of estimated number of outbound trips, by destination
Volume of trips and nights: comparison (2) 3 – some results • Observations • Big difference between MNO and NSI estimates • Systematic nature • Understanding (and solving…) the deviations • Difference in scope (e.g. age limit) & reference year • Selectivity bias and impact on extrapolations • Intermediate results, model & algorithm optimisation ongoing (e.g. parameter setting for 'usual environment') • Dealing with recall bias in survey based data (15-20%) • Impact of non-response in surveys (structural bias?) • The project continues …
First lessons learnt 4 – wrapping up • Positive & fruitful experience with the partnership • Joining forces (statisticians, data holders, data scientists) • Search for a win-win • Promising results, but lots of homework • The data makes sense: mobile phone data clearly captures tourism concepts/definitions • Currently: satisfactory for trends, not for estimating volumes • How to make the series/sources converge to the unknown true values? • Extension to domestic tourism, to same-day visits • Further research to be encouraged (other countries?)
Thanks for your attention christophe.demunter@ec.europa.eu gerdy.seynaeve@proximus.com