
Torben Tranæs, ROCKWOOL FONDENS FORSKNINGSENHED

Copenhagen, August 12, 2010. The need for breadth and depth: Compatibility between Demand and Supply for research data. Torben Tranæs, ROCKWOOL FONDENS FORSKNINGSENHED. Introduction. What is the demand for data when you do research, both broadly applied and academic?


Presentation Transcript


  1. Copenhagen, August 12, 2010. The need for breadth and depth: Compatibility between Demand and Supply for research data. Torben Tranæs, ROCKWOOL FONDENS FORSKNINGSENHED

  2. Introduction • What is the demand for data when you do research, both broadly applied and academic? • And how do these demands match up with the way national statistical agencies operate? • Two main issues/problems: 1. the ways of accessing data 2. the existing variables

  3. Preconditions • The main operation run by a national statistical agency is not about research – it is about description and documentation • Also, the way data access has been considered – at least in Denmark – has mainly accommodated users other than academic researchers • My main focus will be independent academic research

  4. My talk: • The research process and data access • The need (demand) for and supply of data, and mismatch problem • Case: crime and the labor market • Conclusion

  5. A generic research process: • The researcher formulates a hypothesis • A data set with the relevant variables is acquired • The hypothesis is tested • … and, e.g., rejected! • Inspired by the data, the theories are revised, new or revised hypotheses are formulated, or the controlling environment is reconsidered • The data is arranged with (some) new variables • The new or revised hypothesis is tested, etc.

  6. This way of working would be very cumbersome under the way DSt used to think of data access • According to that model, a data set was made available for a given project only, and for a fixed period – all of it decided up front • Fortunately, the practice was different

  7. Demand and Supply for Data, and possible mismatch • The demand (from the research community): • Variables defined based on social science theory • High conceptual precision • Panel data with: 1) a long time dimension, and 2) a very large number of relevant and irrelevant variables, to get exogenous variation and instruments, e.g., by constructing discontinuities (if they don’t exist naturally) • The data supply: • Administratively defined variables • High precision in terms of low measurement error • A somewhat short but ever-increasing time dimension • An extremely rich set of information (many variables)

  8. Mismatch problems: • Long run: The set of existing variables does not coincide with the set of needed variables • Short run: The set of existing relevant data does not coincide with the set of accessible data

  9. Case: Does unemployment increase after a prison sentence? • What is the key variable? • Is it • ‘unemployment’, or • the fraction of people without a job? • In Denmark we have full-population information on the latter, not the former. We know who receives unemployment benefits, but that’s not necessarily the same as being ‘unemployed’ • Being ‘unemployed’ means that you are employable, available, and actively searching for jobs

  10. Thus, three different measures of difficulties in the labor market: 1. People without a job 2. Registered unemployment 3. (Employable) jobless individuals who search and are available for employment • Information exists on 1. and 2., but on 3. only for small sub-samples of the population – and that is not enough when studying crime and former inmates
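A minimal sketch of how the three measures on slide 10 can diverge for the same individuals. The `Person` fields, the function names, and the four toy records are all hypothetical and invented for illustration; they are not taken from the talk or from any DSt register.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # All fields are hypothetical illustrations, not actual register variables.
    has_job: bool
    receives_benefits: bool  # stands in for "registered unemployment"
    employable: bool
    available: bool
    searching: bool


def without_job(p: Person) -> bool:
    """Measure 1: people without a job."""
    return not p.has_job


def registered_unemployed(p: Person) -> bool:
    """Measure 2: registered unemployment (here: receives unemployment benefits)."""
    return p.receives_benefits


def unemployed_strict(p: Person) -> bool:
    """Measure 3: jobless, employable, available, and actively searching."""
    return (not p.has_job) and p.employable and p.available and p.searching


def share(people, measure) -> float:
    """Fraction of the given people for whom the measure holds."""
    return sum(1 for p in people if measure(p)) / len(people)


# Toy population (made up): one employed person and three jobless people
# who differ in benefit receipt, availability, and search activity.
people = [
    Person(has_job=True, receives_benefits=False, employable=True, available=True, searching=False),
    Person(has_job=False, receives_benefits=True, employable=True, available=True, searching=True),
    Person(has_job=False, receives_benefits=True, employable=True, available=False, searching=False),
    Person(has_job=False, receives_benefits=False, employable=False, available=False, searching=False),
]

rates = {
    "without_job": share(people, without_job),
    "registered": share(people, registered_unemployed),
    "strict": share(people, unemployed_strict),
}
print(rates)
```

In this toy population the three definitions give three different rates, which is the point of the slide: the choice of measure matters before any analysis begins.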

  11. As we shall see below, it makes a big difference which definition is used

  12. Unemployment before and after prison

  13. Unemployment before and after prison

  14. Unemployment before and after prison

  15. Unemployment before and after prison

  16. Unemployment before and after prison

  17. Fraction of ex-convicts on public support after prison, relative to the fraction in the population aged 15-59

  18. Wage earnings before and after having served a long prison sentence. Earnings relative to unskilled workers of the same age in the same year
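One way the normalization named on slide 18 could be computed is sketched below. It assumes that "relative to unskilled, same age, same year" means dividing each person's earnings by the mean earnings of unskilled workers in the same age-year cell; the talk does not say which statistic was used. The function name, the tuple layout, and all numbers are hypothetical.

```python
from collections import defaultdict


def relative_earnings(records, reference):
    """Divide each record's earnings by the mean reference earnings in its (age, year) cell.

    Both arguments are lists of (age, year, earnings) tuples. Using the cell mean
    is an assumption; the slide only says "relative to unskilled, same age, same year".
    Returns None where no reference observation (or a zero mean) exists for the cell.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (age, year) -> [sum of earnings, count]
    for age, year, earnings in reference:
        cell = totals[(age, year)]
        cell[0] += earnings
        cell[1] += 1

    result = []
    for age, year, earnings in records:
        total, count = totals.get((age, year), (0.0, 0))
        if count == 0 or total == 0:
            result.append(None)  # no comparable reference group
        else:
            result.append(earnings / (total / count))
    return result


# Made-up numbers, purely for illustration.
unskilled = [(30, 2005, 200.0), (30, 2005, 300.0), (31, 2006, 400.0)]
former_inmates = [(30, 2005, 125.0), (31, 2006, 100.0), (40, 2006, 50.0)]

print(relative_earnings(former_inmates, unskilled))
```

Normalizing within age-year cells removes ordinary life-cycle and business-cycle movements in earnings, so what remains is the gap to a comparable reference group.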

  19. Summary of the case: • Dramatically different conclusions depending on which measure of unemployment is used • The richness of the data reveals new stylized facts, e.g.: • the decline (deroute) begins before the crime • That raises additional research questions in order to answer the question: • What is it that triggers the decline? • School/labor market problems or family problems, substance abuse, the onset of mental illness, etc. • But these questions cannot be pursued right away given the (existing) DSt policy.

  20. Conclusion Two main problems/challenges: • A policy for data access that is not very compatible with the research process • Mismatch between the needed variables and the existing variables

  21. What has been fixed lately or is being fixed as we speak? • In the near future it will be possible to operate big multi-project data sets with practically no end date • What is not being resolved – remaining problems from the researcher’s point of view: • In some major areas, either the variables are somewhat wrong compared to the hypothesis in question, or • they are very expensive and still only available in small samples
