Data Integration at the South African Data Archive (SADA), NRF
Presented by Dr Daisy Selematsela
1st African Digital Curation Conference, 12 – 13 Feb 2008, CSIR Pretoria
Outline
• What is SADA
• Stakeholders
• Products and services
• Challenges
• Value adding of data sets
What is SADA?
• Established in 1993
• Oriented towards the Social Sciences
• Brokerage service between data providers
• Provides a central repository for quantitative data
SADA Stakeholders
• Statistics South Africa
• Human Sciences Research Council (HSRC)
• IDASA
• South African Police Service
• International Association for Social Science Information Service & Technology (IASSIST) (Europe)
• Inter-university Consortium for Political and Social Research (ICPSR) (University of Michigan)
Who & where are SADA clients?
• North America (432)
• Canada (10)
• Brazil (3)
• Europe (206)
• Australia (36)
• New Zealand (4)
• Mauritius (2)
• Asia (61)
• Africa (West & East) (26)
• South Africa (400)
Products & Services: Datasets
• Census surveys
• General Household surveys
• Demographic studies
• Health studies
• Substance abuse and crime
• Income and poverty
• Inter-group relations
• Labour (workforce survey)
• Political perceptions and attitudes
• Education & training
• Omnibus and international studies
Services: Brokerage
• ICPSR
  • Membership-based organisation
  • National subscription
  • Access to a huge collection of Social Science data
  • Benefits SA researchers, institutions, postgraduates, etc.
• Directory of Data Producers in SA
  • Produced by NRF
  • Aimed at organisations that produce scientific data
  • One entry point
  • Online entry form for interested parties
Challenges
• Data integration
  • Definition: ways in which information from a variety of sources (census, survey, transactions, administrative systems), usually held in different databases, can be combined to create a powerful new resource for addressing major research issues
  • Constraints:
    • Lack of knowledge about the scope of integration
    • Lack of skills to facilitate linking
    • Awareness that, as data become more extensive, the possibility of inadvertent disclosure of the identity of individuals/organisations increases
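To make the definition and the disclosure constraint concrete, here is a minimal sketch. It is not SADA's procedure: the records, the `person_id` key, the field names, and the helpers `link_records` and `risky_groups` are all hypothetical, invented for illustration. It links two small sources on a shared key and then flags combinations of attributes that occur fewer than `k` times, since such rare combinations are where inadvertent identification becomes more likely.

```python
from collections import Counter

# Hypothetical records from two separate sources sharing a person_id key.
# All values are invented for illustration only.
survey = [
    {"person_id": 1, "age_group": "30-39", "district": "A"},
    {"person_id": 2, "age_group": "30-39", "district": "A"},
    {"person_id": 3, "age_group": "60-69", "district": "B"},
]
admin = [
    {"person_id": 1, "income_band": "mid"},
    {"person_id": 2, "income_band": "mid"},
    {"person_id": 3, "income_band": "high"},
]


def link_records(left, right, key):
    """Inner-join two lists of dicts on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def risky_groups(rows, quasi_identifiers, k):
    """Return attribute combinations shared by fewer than k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return sorted(combo for combo, n in counts.items() if n < k)


linked = link_records(survey, admin, "person_id")
flagged = risky_groups(linked, ["age_group", "district", "income_band"], k=2)
print(flagged)
```

In this toy data the combined record for person 3 is unique, so it is flagged. Before linking, neither source on its own carried that full combination, which is the sense in which more extensive data raise the disclosure risk.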
Challenges
• Data stewardship/management
• Errors and biases in data representation
• Lack of interoperability
• Data discovery
• Loss of scientific and transformative power due to lack of knowledge about existing data opportunities
• Inadequate tools to find existing data
• Cost of documenting and storing data
Challenges
• Promoting best practice in data sharing (OECD Guidelines 2007): the assumption that publicly funded research data “are a public good, produced in the public interest”
• ‘Data re-use’ or ‘secondary data analysis’
• Legal impediments to using data for purposes other than those for which their collection was originally authorised
• Ethical considerations: the need to inform data subjects and seek permission to reuse data
• Scientific culture associated with “first and privileged use” of data collected for a specific purpose
Helsinki School of Economics study
• “Sensation Seeking, Overconfidence, and Trading Activity”, accepted by The Journal of Finance; Mark Grinblatt (Univ. of California) & Matti Keloharju (Helsinki School of Economics)
• Version available at http://www.anderson.ucla.edu/documents/areas/fac/finance/06-06.pdf
• Source: International Herald Tribune, Saturday-Sunday, February 9-10, 2008: “Speeding and trading: It’s the same heady rush” (page 15)
Speeding study
If you get speeding tickets, watch out: the chances are good that you will also engage in possibly dangerous investing behaviour.
Study data use!
• Grinblatt & Keloharju were able to find a correlation between speeding tickets and trading frequency after they received access to several data sets from the Finnish government
• The databases contained details of:
  • Speeding tickets issued to Helsinki residents between mid-1997 and 2001
  • Portfolios and trading records of all Finnish households from 1995 to 2002
  • Filing of tax returns
• Outcome: these rich data sets enabled the researchers to bracket out other possible causes of trading activity and focus on the distinct influence of speeding tickets alone!
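The kind of analysis that linked data sets make possible can be sketched as follows. This is not the authors' method: their study separates out other causes of trading activity, whereas this sketch only computes a plain Pearson correlation. The household identifiers and all counts are invented toy values, and `pearson` is a hypothetical helper written for this example.

```python
import math

# Invented toy data keyed by a hypothetical household identifier.
# These numbers are NOT taken from the study.
tickets = {"h1": 0, "h2": 1, "h3": 3, "h4": 5}          # speeding tickets
trades = {"h1": 2, "h2": 4, "h3": 8, "h4": 11, "h5": 7}  # trades per year


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Link the two sources: keep only households present in both.
shared = sorted(tickets.keys() & trades.keys())
r = pearson([tickets[h] for h in shared], [trades[h] for h in shared])
print(shared, round(r, 3))
```

Household `h5` drops out because it appears in only one source, which shows why coverage of the linking key matters as much as the analysis itself.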