The Data Deluge: What Does It Mean for Official Statistics?
Steven Vale
UNECE
steven.vale@unece.org
Contents
• What is the Data Deluge?
• Response of the official statistics community
• Industrialisation – a related challenge
• Changing roles for national statistical organisations
What is the Data Deluge?
The internet held 1,800 exabytes of data in 2011 (exa = 10^18)
We live in exponential times!
50,000 exabytes by 2020: more than 27-fold growth in the next 9 years
Are these data interesting?
• Probably 99.9% are videos, photos, audio files, text messages and other nonsense
• But that still leaves 1,800,000,000,000,000,000 bytes of potentially relevant data
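The arithmetic behind the last three slides can be checked with a short sketch. It uses only the figures quoted on the slides (1,800 exabytes in 2011, 50,000 exabytes by 2020, 99.9% assumed irrelevant); the variable names and the compound annual growth calculation are illustrative additions, not part of the original presentation.

```python
# Figures quoted on the slides; an exabyte is taken as 10**18 bytes.
EXA = 10**18
data_2011_eb = 1_800      # exabytes on the internet in 2011
data_2020_eb = 50_000     # exabytes projected for 2020
years = 9

# Overall growth factor between 2011 and 2020 (the slide rounds this down to 27).
growth_factor = data_2020_eb / data_2011_eb

# Implied compound annual growth rate (an illustrative extra, not on the slides).
annual_rate = growth_factor ** (1 / years) - 1

# If 99.9% of the 2011 data is irrelevant, 0.1% (1 part in 1,000) remains.
relevant_bytes = data_2011_eb * EXA // 1000

print(f"Growth factor 2011-2020: {growth_factor:.1f}x")
print(f"Implied annual growth: {annual_rate:.1%}")
print(f"Potentially relevant bytes in 2011: {relevant_bytes:,}")
```

On these figures the growth factor is a little under 28, which corresponds to data volumes growing by roughly 45% a year, and the 0.1% remainder is 1.8 exabytes.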
An Observation
More and more people post information about themselves on Facebook, but fewer and fewer complete statistical surveys.
Should we create STATVILLE?
Private sector competitors?
• Google:
  • Real-time price indices
  • Public Data Explorer
  • First point of reference for the “data generation”
• Facebook, store cards, credit agencies, ...
• What if they link their data?
• The private sector now understands the value of data: can they beat us at our own game?
Official Statistics Response
• High-Level Group for Strategic Directions in Business Architecture in Statistics (HLG-BAS)
• UNECE group, created by the Conference of European Statisticians in 2010
• 8 heads of national and international statistical organisations
• Develop and promote new sources, processes and products
HLG-BAS Strategic Vision
• Endorsed by the Conference of European Statisticians on 14 June
We have to re-invent our products and processes and adapt to a changed world.
The challenges are too big for statistical organisations to tackle on their own. We need to work together.
What does this mean in practice?
• Collaboration
• Coordination
• Communication
Many international groups and projects are talking about streamlining and industrialising statistics
Industrialisation is:
• Common processes
• Common tools
• Common methodologies
• Recognising that all statistics are produced in a similar way: no domain is “special”
• Increased flexibility to adapt to new sources and produce new outputs
Changing roles in NSOs?
• One source = one output
• Data integration – multiple sources
• Process quality assurance
• More focus on analysis and interpretation
• Partnerships for dissemination
• Changing staff and cost profiles
• Changing organisational culture
Wider Definition of Admin Data?
• New sources are “non-statistical”
• But they raise similar issues to “traditional” administrative data sources
Whatever we call the new sources, we can’t ignore them!
Questions?
steven.vale@unece.org
www1.unece.org/stat/platform/display/hlgbas