The value of information: the challenge of Big Data. Instituto de Estadística y Cartografía de Andalucía, 5 Feb 2016. Big data in official statistics in the European Statistical System: the Big Data Action Plan & Roadmap. EUROSTAT – Fernando Reis – 'Task Force Big Data'.
Datafication
• Sensors
• Digital footprint
Big Data and Official Statistics
What will be the impact of ubiquitous data collection and networking
• Mobile communication
• Internet of [every] Things
• Social media
• Wearables
• Autonomous traffic
• Smart systems
• …
on official statistics?
Expected benefits of using big data?
Outward-looking
• More adequate and flexible response to user needs
• Wider range of statistical products and services (without increasing burden)
• Better understand quality aspects of new sources
Inward-looking
• Acquisition of new competences for NSIs
• Increase efficiency in producing statistics
• We remain key players for statistical information
Big data at Eurostat – key points
ESS (European Statistical System)
• Scheveningen Memorandum, Sept 2013
• Examine the potential of big data sources for official statistics
• Official statistics big data strategy as part of wider government strategy
• Address privacy and data protection
• Collaboration at European and global level
• Address need for skills
• Partnerships between different stakeholders (government, academics, private sector)
• Developments in methodology, quality assessment and IT
• Adopt action plan and roadmap for the ESS
Big data at Eurostat – key points
ESS (European Statistical System)
• Scheveningen Memorandum, Sep 2013
• Task Force Big Data
• Big Data Roadmap and Action Plan 1.0, June 2014
ESS Pilots 2016-2020
• Implementation of ESS Vision 2020: Big Data project = integral part of the portfolio
European Commission Communication
• "Towards a thriving data-driven economy"
• Private Public Partnership on big data
International cooperation (UNSD, UNECE, etc.)
• UN/ECE project "Big data in official statistics" (Sandbox)
• UNSD Global WG on Big Data
Big Data Action Plan and Roadmap at a glance
• Governance
• Policy
• Quality
• Skills
• Experience sharing
• Legislation
• IT Infrastructures
• Methods
• Ethics / Communication
• Big data sources
• Pilots
Challenges
• Cooperation, sharing of know-how
• Development of a sound methodology ("from design-based to model-based approach")
• Exploration & tentative implementation
• Looking for partners
Action (example)
• Pilot projects, carried out by the Member States (ESSnet)
• 2015–2019 (FPA / SGA construction)
• Exploring different big data sources (but also IT architecture, partnerships), developing generic guidelines and frameworks
• Establish partnerships with data providers and with research and international organisations
• Cooperation with UN (lead) on a methodological framework
Action (example) – continued
• List of pilot projects (Framework Partnership Agreement signed)
  • Web scraping [job vacancies; enterprise characteristics]
  • Smart meters [electricity consumption; temporary vacant dwellings]
  • AIS data [vessel identification systems]
  • Mobile phone data
• "The big data for official statistics competition" (2016)
Challenges
• New skills for NSI staff: statisticians vs. data scientists?
• Computing capacity, hardware?
• Analytical tools, software?
• Storage?
Action (example)
• Training programme for European statisticians (ESTP)
• In the next years: dedicated courses on big data
• Focus on big data sources and on big data tools
• Acquiring the skills needed to assess sources and their quality, the skills to use tools and to explore big data sources
ESTP courses supporting big data (2016)
[Timeline figure; legend: Big data courses, Methodology courses, Activity]
Course titles shown:
• Big data sources - Web, Social media and text analytics
• Introduction to big data and its tools
• Hands-on immersion on big data tools
• Nowcasting
• Advanced big data sources - Mobile phone and other sensors
• The use of R in official statistics: model based estimates
• Can a statistician become a data scientist?
• Time-series econometrics
Dates shown: 12–15 Sep; 29 Feb – 2 Mar; 21–24 Jun; 7–10 Nov; 5–7 Apr; 8–10 Jun; 24–26 Feb
Challenges
• Integrating official statistics in big data strategies
• Getting access to data & continuity of access
• Data security & privacy concerns
• Compensate for the burden?
Action (example)
• Project on the analysis of legislation and strategy (but also ethics and communication)
• 2015-2017 (22 months)
• Analysis for EU and for Member States at national level
• See also the Feasibility study on the use of mobile positioning data for tourism statistics (report on feasibility of access)
Challenges
• Transversal challenges to all big data activities: quality and ethics & communication
• Big data vs. statistics: "goodness of fit" (concepts, representativeness, …)
• Impact on the public opinion of privacy and security concerns?
Action (example)
• Cooperation with UN (lead) on a quality framework for big data
• Project on the analysis of ethics and communication (but also legislation and strategy)
• 2015-2017 (22 months)
• Analysis for EU and for Member States at national level
Currently a focal data source for big data
• Exists in all countries
• (≠ accessible in all countries)
• Many promising studies/experiments available
• Potential relevance to many areas of official statistics (synergies!)
• Most available studies linking big data to tourism statistics are based on mobile phone data
Mobile phone data
Eurostat:
• Feasibility study on the use of mobile positioning data for tourism statistics (2012-2014)
• Included in the forthcoming ESS Pilots on Big Data (2016-2019)
• GWG Big Data Pilot
NSIs (and tourism researchers)
• Many small or larger scale projects ongoing!
• GWG Big Data Task Team Mobile Phone Data
… slow data vs. quick data …
Article released one day after the 2015 Easter weekend about tourism on the Belgian coast: 150 000 same-day visitors on Sunday, 400 000 during the entire long weekend
• Data based on monitoring by the regional tourism board, in cooperation with the main mobile network operator Proximus and the road infrastructure administration
• In comparison: Eurostat will receive data on same-day visitors for the 2nd quarter of 2015 (not a particular weekend) on 30 June 2016 (not the day after) for the entire country (not a coastal strip within a NUTS2 region)
• Methodology not clear, but it's a nice example of how flash estimates based on big data decrease the relevance of official statistics
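The timeliness gap in this example can be made explicit with a small date calculation. This is only an illustrative sketch: it assumes the long weekend ended on Easter Monday, 6 April 2015, that the article therefore appeared on 7 April 2015, and that the official reference period (2nd quarter of 2015) ended on 30 June 2015. The helper name `lag_days` is hypothetical.

```python
from datetime import date

# Assumed dates (not stated explicitly on the slide):
weekend_end = date(2015, 4, 6)        # Easter Monday 2015, end of the long weekend
article_released = date(2015, 4, 7)   # "one day after" the weekend
quarter_end = date(2015, 6, 30)       # end of the 2nd quarter of 2015

# Date given on the slide:
official_delivery = date(2016, 6, 30)  # Eurostat receives the Q2 2015 data


def lag_days(reference_end: date, release: date) -> int:
    """Days between the end of a reference period and the data release."""
    return (release - reference_end).days


quick_lag = lag_days(weekend_end, article_released)
official_lag = lag_days(quarter_end, official_delivery)

print(f"Flash estimate lag:  {quick_lag} day(s)")
print(f"Official figure lag: {official_lag} day(s)")
```

Under these assumptions the flash estimate trails its reference period by a day, while the official figure trails the end of its quarter by about a year.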
Big data = Multiple sources & Multiple outputs
Lifecycle for the coming years?
[Diagram labels: Mobile phone data; Household & business surveys; Domain statistics; Payment cards data; Other big data]
SHORT TERM
• 'Traditional' surveys as main input for tourism statistics
• Big data sources slowly becoming auxiliary information
Lifecycle for the coming years? (2)
[Diagram labels: Mobile phone data; Household & business surveys; Domain statistics; Other big data; Payment cards data]
MID TERM
• Weight of surveys decreases in favour of big data?
• Surveys no longer 'main filter' but 'one of the sources'?
Lifecycle for the coming years? (3)
[Diagram labels: Mobile phone data; Household & business surveys; Domain statistics; Payment cards data; Other big data; NEW Web (prices); Bookings (nowcast/forecast)]
LONGER TERM
• Replacement of surveys continues (smaller samples, less frequent collection)?
• Enhanced tourism statistics via embedding of newer sources?
The statistical office of the future
• Data flows in addition to surveys and censuses
• Embedded in data flow – smart statistics
• Product designers in addition to data collection designers
• Statistical modelling will be a major activity
• From descriptive indicators to nowcasting (and forecasting)
• Trust and quality will be key
• New role in teaching digital literacy
• Accreditation and certification instead of pure production
• Address issues linked to quality & transparency, privacy & confidentiality, access to third party data sources & data sharing, scientific standards & methodology, professional ethics, skills, …
Thank you for your attention
Fernando Reis
Eurostat Task Force on Big Data
fernando.reis@ec.europa.eu
https://github.com/reisfe/
https://twitter.com/reisfe/
https://linkedin.com/in/reisfe/