
Data science and economic statistics

Louisa Nolan, Senior Data Scientist; Alex Noyvirt, Ioannis Tsalamanis, Gareth Clews, Rhydian Page. Data Science Campus, Office for National Statistics. 22nd GSS Methodology Symposium, 12 July 2017.

Presentation Transcript


  1. Data science and economic statistics. Louisa Nolan, Senior Data Scientist; Alex Noyvirt, Ioannis Tsalamanis, Gareth Clews, Rhydian Page. Data Science Campus, Office for National Statistics. 22nd GSS Methodology Symposium, 12 July 2017

  2. MONIAC (Monetary National Income Analogue Computer). Diagram labels: government spending, investment funds, foreign-owned balances, national income

  3. Using data science to understand the economy
     1. Automatic classification of the financial sector using neural networks
     2. Using data from ship tracking to understand trade
     3. Can we use admin data as a superfast indicator of GDP growth?

  4. Part 1: Automated classification of the financial sector. Chart: total financial asset levels as a proportion of nominal GDP, by G7 country and sector

  5. Proposed taxonomy. Legend: currently published; to be published 2017; target classifications

  6. Project scope
     • Data sources: Financial Services Survey, Inter-Departmental Business Register, Companies House, Bureau van Dijk, industry body lists, Financial Conduct Authority, Reuters, web scraping
     • R&D:
       • feature extraction
       • supervised, unsupervised machine learning -> classification: mapping of companies to subsectors
       • unsupervised machine learning -> clustering of groups of similar companies
     • Outputs:
       • high speed data linking
       • clusters of companies with similar activity -> useful classification?
       • granular financial statistics for the enhanced financial accounts

  7. Half-time score
     • Fuzzy dataset linking
       • highly optimised algorithm (Spark, Scala)
       • ~150 million combinations in 2 hours
     • Sector classification from name alone
       • 15-18% accuracy (19 SIC groups)
     • Sector modelling using the validation dataset, the Financial Services Survey (FSS)
       • K-nearest neighbours (K-NN) clustering
       • ~60% accuracy (work in progress)
     • Next steps:
       • neural networks
       • ensemble approach: combine several weak indicators
       • more data… share price movements, annual accounts
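The fuzzy dataset linking mentioned on slide 7 can be illustrated with a small sketch. This is not the Campus's Spark/Scala algorithm, which the slides do not describe; it is a plain-Python illustration of the general idea, using the standard library's `difflib.SequenceMatcher`. The suffix list, the 0.85 threshold, the helper names `normalise` and `best_match`, and the company names are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal-form suffixes to ignore when comparing names.
SUFFIXES = {"ltd", "limited", "plc", "llp"}


def normalise(name):
    """Lower-case a company name, drop punctuation and legal-form suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(t for t in tokens if t not in SUFFIXES)


def best_match(name, candidates, threshold=0.85):
    """Return (candidate, score) for the most similar name, or (None, score)
    if no candidate reaches the (assumed) similarity threshold."""
    target = normalise(name)
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, target, normalise(cand)).ratio()
        if score > best_score:
            best, best_score = cand, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Made-up register entries, for illustration only.
register = [
    "Acme Capital Limited",
    "Northern Insurance PLC",
    "Harbour Asset Management LLP",
]

match, score = best_match("ACME Capital Ltd.", register)
no_match, _ = best_match("Zebra Foods", register)
print(match, round(score, 2))
print(no_match)
```

Comparing every name against every candidate, as here, scales with the product of the two dataset sizes, which is presumably why the production version needed a distributed, optimised implementation to handle ~150 million combinations.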

  8. Part 2: Tracking ships to understand trade. Can we use shipping as an early indicator for GDP? Can we better understand traffic at British ports?

  9. Automatic Identification System

  10. Part 3: Superfast indicators of GDP growth. How early can we identify negative GDP growth?

  11. Superfast indicators of GDP growth

  12. Superfast indicators of GDP growth VAT turnover returns

  13. Superfast GDP indicator from VAT turnover
      • start simple
      • compare the quarter with the same quarter a year ago, to minimise seasonality
      • index = (number of companies where T_{t0} - T_{t-4} > 0) / (total number of companies in sample), where T is a company's VAT turnover
      • no deflation (yet)
      • no outliering (yet)
      • no bias adjustment (yet)
      • no seasonal adjustment (yet)
      • only companies where we have a turnover (TO) value for both q0 and q-4
      • test for month 1, 2 and 3 returns
      Is this a useful indicator of the direction and broad magnitude of GDP growth?
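The index on slide 13 is a simple diffusion-style measure: the share of companies whose turnover is higher than in the same quarter a year earlier. A minimal sketch of that calculation follows, assuming turnover is held as a mapping from a company identifier to a value for each quarter. The function name `vat_diffusion_index`, the data layout, and the figures are hypothetical, and none of the adjustments the slide lists as "not yet" done are applied.

```python
def vat_diffusion_index(turnover_now, turnover_year_ago):
    """Share of companies with higher turnover than in the same quarter a
    year earlier. Returns None if no company has values for both quarters."""
    # Keep only companies with a turnover value for both q0 and q-4.
    common = [
        company
        for company in turnover_now
        if company in turnover_year_ago
        and turnover_now[company] is not None
        and turnover_year_ago[company] is not None
    ]
    if not common:
        return None
    # Count companies where T_{t0} - T_{t-4} > 0 (strictly positive change).
    rising = sum(
        1 for company in common
        if turnover_now[company] - turnover_year_ago[company] > 0
    )
    return rising / len(common)


# Made-up turnover values, for illustration only.
q0 = {"A": 120.0, "B": 80.0, "C": 50.0, "D": 10.0}
q_minus_4 = {"A": 100.0, "B": 90.0, "C": 50.0, "E": 30.0}

index = vat_diffusion_index(q0, q_minus_4)
print(index)
```

In this toy sample only companies A, B and C report in both quarters, and only A shows a strictly positive change, so companies with flat or falling turnover pull the index down equally.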

  14. Superfast GDP indicator – results

  15. Superfast GDP indicator - results. 2008 quarter 4: GDP growth = -1.3%; 2013 quarter 2: GDP growth = 4.3%

  16. Superfast GDP indicator - results

  17. What have we learned so far?
      • data science can enhance our understanding of the economy
      • an experimental approach allows rapid prototyping
      • collaboration with subject matter experts is important
      • we need to think about implementation early in the project lifecycle
      • (work can be fun)

  18. Contact us. Web: www.ons.gov.uk/datasciencecampus; email: datasciencecampus@ons.gov.uk; Twitter: @DataSciCampus
