Welcome to ODaF Europe 2009 International Data Service Center Institute for the Study of Labor Bonn, Germany April 2nd-3rd 2009
Agenda • Day 1 • ODaF overview / activities • Meeting theme • Presentations • IZA, ongoing projects, enhanced publications, SDMX • Day 2 • Discussions • Actions / Next Steps
The Open Data Foundation Overview
Why ODaF? • US based non-profit organization established in 2006 • Issues • Data quality, accessibility, and discoverability are highly dependent on the quality of metadata / documentation • Standards & technology are available, willingness is there • Need coordinated efforts, tools, technology expertise, best practices • ODaF umbrella to bring together stakeholders and technology experts • Complement existing initiatives • Operates at national and international levels (global standards and tools) • Partners: NSO, data archives, RDC, academic/research, international organizations, technology vendors • Overall objective (impact): • Better data to support evidence based policy and project monitoring • Data measures the health of nations, crucial given the economic climate and population growth estimates
Mission • Foster the understanding of data and metadata management standards and technologies; • Provide technological expertise to agencies in their respective area of activity • Fill the information technology gaps by supporting the development of open source tools • Play a central role in networking together individuals and organizations that often work in isolation rather than try to solve common challenges together • Promote the production of open data • Well documented, discoverable, accessible for analysis, respect statistical principles / legislation (privacy)
Activities • Support standards • DDI, SDMX, and others • Best practices • Networking • Virtual, conference participation, event organization, IASSIST sponsor • Projects • NORC Virtual DE • Tools development (DExT, DDI-FTP, SDMX Browser, etc.) • Infrastructure • Open source repository, DDI Tools, DDI Mantis • References • Papers, training, presentations
Who is ODaF • Directors • Ernie Boyko – Past President of the International Association for Social Science Information Service and Technology (IASSIST) • Rune Gloersen - Head of Information Technology, Statistics Norway • Robert Glushko, PhD - Member of the OASIS Board of Directors, and the founder and leader of Berkeley's Center for Document Engineering • Advisors • Management Team • Arofan Gregory • Rob Grim • Pascal Heus • Jostein Ryssevik (position open) • 60+ individual members (mailing list) • Membership is at individual level (free) • Sponsored by other member
Who is ODaF • Board of Advisors • Nikos Askitas - Institute for the Study of Labor, Germany • Sandra Cannon - Board of Governors of the Federal Reserve System • Gilles Collette - Visual Communications, Pan-American Health Organization (WHO) • Daniel Gillman - US Bureau of Labor Statistics • Eduardo Gutentag - Member of the OASIS Board of Directors • Paul Johanis - Statistics Canada (retired) • Graeme Oakley - Australian Bureau of Statistics • Dr. Andrew Nelson - Joint Research Centre of the European Commission • Ken Miller - UK Data Archive / Economic and Social Data Service • Duane Nickull - Chair, OASIS SOA Reference Architecture TC • Juraj Riecan - United Nations Economic Commission for Europe (UNECE) • Gerard Salou - European Central Bank • Professor Bo Sundgren, Ph.D - Statistics Sweden • Wendy Thomas - Minnesota Population Center, University of Minnesota • Wendy Watkins - Data Centre Coordinator, Maps, Data and Government Information Centre, Carleton University Library
ODaF and XML standards • Need a collection of standards • Data Documentation Initiative (DDI) – survey / administrative microdata • Statistical Data and Metadata Exchange standard (SDMX) – aggregated data / time series • ISO/IEC 11179 – concept management and semantic modeling • ISO 19115 – Geographical metadata • METS – packaging/archiving of digital objects • PREMIS – Archival lifecycle metadata • XBRL – business reporting • Dublin Core – citation metadata • Support work on mappings
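The idea of a mapping between standards can be illustrated with a minimal sketch. The target names below are genuine Dublin Core elements (title, creator, publisher, date, identifier), but the source field names and the sample record are hypothetical placeholders, not actual DDI element names, and the crosswalk itself is an assumption made for illustration only.

```python
# Hypothetical study-level field names (NOT real DDI element names),
# each mapped to a Dublin Core element name.
CROSSWALK = {
    "study_title": "title",
    "principal_investigator": "creator",
    "producer": "publisher",
    "production_date": "date",
    "study_id": "identifier",
}


def to_dublin_core(study: dict) -> dict:
    """Map known fields to Dublin Core elements; fields without a mapping are skipped."""
    return {CROSSWALK[key]: value for key, value in study.items() if key in CROSSWALK}


# Made-up record used only to exercise the mapping.
study = {
    "study_title": "Example Household Survey",
    "principal_investigator": "Example Statistics Office",
    "production_date": "2008",
    "sampling_procedure": "two-stage cluster sample",  # no Dublin Core mapping here
}

citation_metadata = to_dublin_core(study)
print(citation_metadata)
```

A real crosswalk loses detail in this direction (the sampling procedure has no citation-level equivalent here), which is one reason a collection of standards is needed rather than a single one.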
Recent and ongoing Activities • Projects • NORC Data Enclave • http://www.norc.org/DataEnclave • Virtual enclave (NIST, USDA, Kauffman, NSF, others) • Completed DDI DExT (UKDA) • Contributed to DDI-FTP • Events • ODaF EU and North America 2008 • IASSIST 08, METIS 2008, FedCASIC 08 & 09, Wiesbaden workshop (DDI/SDMX), Joint Statistical Meetings (Denver) • Canadian Research Data Center, FedStats, Dagstuhl (training for experts), NORC Metadata workshop • Papers • RatSWD Working Paper 57-9 on Metadata
Planned activities • SDMX Startup guides • SDMX Users forum • SDMX registry • ISO 11179 3rd edition registry • Research Web 2.0 technologies • DDI Europe and North America users groups • Virtual Research and Collaborative Center • IASSIST 2009 and other conferences • DDI / SDMX adoption by vendors • Papers • DDI 3, Standard mappings • Small projects • SPSS web based converter, Stata reader, DDI3 web based scheme editors (classifications, concepts, universe, etc), ODaF DeXtris 2.0, contributions to the DDI-FTP
Challenges • Grow into a group of 15-20 active individuals in the next 3-5 years • Resources • Fill open manager position • Identify new director(s) • Need more activities/contributions from members (sustain activities, mailing list) • Funding • Currently operates on about USD $50K/year + in kind contributions • Need to increase revenues from projects (15% overhead) • Need funding sources • Larger organization / events will require larger budget
ODaF Europe 2009 Metadata in research data centers
ODaF EU 2009 • Topics • Metadata: production, ingestion, archival, digital resources, project, researcher • Enhanced Publications • Standard mappings • Others… • Life cycle metadata • Data food chain: from microdata to aggregated time series / indicators • Need multiple standards • Production, Archival, Research, Resources, Packaging, Project, etc. metadata • Maintain linkages • Across domains • With unstructured knowledge
New focus on Researcher metadata • What is it? • how data is being discovered and used • the research process • the research outputs (enhanced publications) • the new knowledge regarding the data and the research topics • the research profile and behavior • Recently (past 2-3 years) started to get attention from metadata producers/archives • (before, they were busy putting together their own metadata and specifications such as DDI 3)
How to capture researcher metadata? • Automated: in the background, by metadata aware software applications • Semi-automated: using wizards, quick feedback tools, parsing statistical source code, etc. (requires a small user or administrator intervention) • Manual: where the users themselves provide all the information (e.g. documenting their data, describing their publications, etc.) • Note that for the latter two cases we need to (or at least should) provide incentives for the researcher • Researcher metadata capture is realistic in an RDC environment • it is closed and we control everything that goes in and out • we can observe the user behavior (automated capture) • we can require some information to be provided as an RDC access/use condition
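As an illustration of the semi-automated route (parsing statistical source code), here is a minimal sketch that scans a Stata-style script for the datasets it loads. The regular expression, the function name and the sample script are assumptions made for this example; a real tool would need to handle many more syntax variants (abbreviated commands, macros, `merge`, other packages).

```python
import re

# Assumed pattern: a line starting with the Stata `use` command,
# followed by an optionally quoted dataset name.
USE_PATTERN = re.compile(r'^[ \t]*use[ \t]+"?([^",\s]+)"?', re.IGNORECASE | re.MULTILINE)


def datasets_used(script: str) -> list:
    """Return dataset names loaded by `use` commands, in order of first appearance."""
    seen = []
    for name in USE_PATTERN.findall(script):
        if name not in seen:
            seen.append(name)
    return seen


# Made-up script used only to exercise the parser.
example_script = '''
* toy analysis script
use "survey2008.dta", clear
regress income age
use survey2008.dta
use panel.dta, clear
'''

found = datasets_used(example_script)
print(found)
```

The extracted list could then be shown to the researcher for confirmation, which is the "small user intervention" the semi-automated case refers to.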
Benefits • Meet the replication standard (Gary King, http://gking.harvard.edu/projects/repl.shtml) • Improve quality of outputs (automated production of documentation/citation, better data consistency) • Time saving: data discovery, script generator, document generator, code reuse, citation generator, etc. • Facilitate • preservation of their work • peer review (can be a metadata driven process, simple feedback mechanism can be implemented). • dissemination of their work (well documented enhanced publications can be distributed automatically, registered in citations database, etc.) • use of their work by others (and self) • Foster collaboration • Incentives for the RDC and data providers such as: • feedback from users • understanding of how the data is being used • copy of research outputs and secondary data r
ODaF 2009 Presentations • The IDSC of IZA: Past, Present and Future [Nikos Askitas, IZA] • Concept Hierarchy Assisted News in Economics [Georgios Tassoukis, IZA] • The European Values Study and its metadata: stock taking and the future [Ruud Luijkx, U. Tilburg] • The DRIVER initiative for networking repositories [Wolfram Horstman, DRIVER] • Enhanced publications [Thomas Place, U. Tilburg] • The DatapluS Enhanced Publications Editor [Bart van Nieuwburg, CenERData] • Data Documentation and Dissemination with Questasy [Alerk Amin, CenERData] • Open access and research data [John Doove, SURF] • Overview, importance and benefits of SDMX registries [Arofan Gregory, ODaF] • SDMX at the European Central Bank [Gerard Salou / Xavier Sosnovsky, ECB] • Virtual Research and Collaborative Center [Pascal Heus, ODaF/NORC] • Developments of the Data Infrastructure in Germany since the end of the 90s [Hilmar Schneider, IZA]