Learn how technology can help publish, search for, and manage statistical data while ensuring confidentiality and optimal access. Explore standards, tools, and challenges at the IASSIST 2007 session.
IASSIST 2007
Montreal, May 16 - 18, 2007
Session A2: Open Data and the Common Good
Walking the Wire: How Technology Helps Us Achieve the Correct Balance
Arofan Gregory (agregory@opendatafoundation.org)
Jostein Ryssevik (jryssevik@opendatafoundation.org)
Open Data Foundation
http://www.opendatafoundation.org

What if...
• …statistical data could be published as easily on the Web as documents and pictures?
• …users of statistical data could search for relevant sources across the Web in more or less the same way as they are “googling” for relevant documents?
• …users of statistical data had a “give me more data like this” function at hand, allowing them to locate data from disparate sources in order to create a time series or do a comparison?
• …published statistical data could be described in such a way that human users as well as software clients would know exactly what they meant and how they could be used?
• …statistical data could travel across the wire from one system to another without loss of information or the need for time-consuming and error-prone transformations?
• …confidentiality could be managed in such a way that an optimal balance between access and data protection could be found?
Open Data Foundation – IASSIST 2007

The basic drivers behind the story
• Tools
• Standards

No tools – no take-up
• No standard is better than the set of tools that supports it
• Without tools that make it easy to produce, publish and exchange data and metadata according to standards, no standard will ever fly
• If the cost of producing and maintaining standardized metadata outweighs the benefits arising from reduced transaction costs, it will never be done

No standards – no synergy
• Without relevant standards, any technology is parochial in nature
• Standards lower the cost of entering the technology game and promote development and competition
• Standards remove big chunks of technology from the competitive part of the game and allow vendors and developers to focus on new and additional functionality

Status in the standards arena
• DDI – well known to this community
• SDMX
• ISO/IEC 11179
• Triple-S
• ......

Status in the tools arena
• Metadata authoring/compilation
  • Nesstar Publisher, IHSN Microdata Management Toolkit, others
• Data production
  • No data production system (like Blaise, Cases, Confirmit) uses or creates DDI metadata
• Desktop data analysis
  • No standard statistical package (like SPSS, SAS, STATA) supports DDI (let alone metadata in any shape or form)
• Data publishing/on-line access and analysis
  • VDC, SDA, Nesstar, DAIS
• Metadata/data registries
  • SDMX registry (Metadata Technology/Open Data Foundation)
• Access control/statistical disclosure control
  • No standards-based tools

Still, we are gradually getting there
3.0
• Tools capturing, using and producing metadata at all crucial stages of the data production process
• Life-cycle oriented metadata standards supporting this technology landscape
• Increased interoperability with other, more specialised standards plugging in at various stages of this process

Too Much Openness?
• In our probable future, technology offers huge benefits in terms of Open Data
  • Greater access to data throughout the lifecycle and the information chain
  • An understanding of the inter-relatedness of all aspects of our data
• Although we understand the requirements for data confidentiality, there is a huge risk
  • Technology magnifies the potential for abuse
• For technologists, this is a technology challenge
  • What tools do we build?
  • How do we use them?

Dealing with Technology Challenges
• There is a little-known solution to many technology challenges:
  • “Remember: we’re dealing with chimps here.” (Eliot Kimber, 1995)
• Why chimpanzees?
  • Their DNA is 98.6% identical to that of humans
  • They are curious, social animals
  • They use tools
  • They’re even pretty good with computers!

Not Monkey Business
• In this case, chimps don’t have the answer – we have to consider how humans are different from chimps
• Humans learn from shared, communal experience
• Humans develop and use tools in a progressive way
  • We have the technology to destroy all life on the planet!
• Humans are the only higher primates that mate in private
  • We ignore our need for privacy at our peril!

Focusing our Efforts
• Because we’re chimps, we will build tools
  • We will develop the technology to support Open Data with increased access and inter-connectedness
  • We will build tools to support the common, public good
• Because we’re humans, we need to focus our efforts
  • The key to this technology challenge is metadata

Factoid
• Estimates say that a large percentage of researchers’ questions that appear to require access to microdata can in fact be answered through a combination of good metadata and summary statistics
• By emphasizing metadata and summary statistics, we can avoid the risk of violating confidentiality
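As a rough illustration of the factoid above, the sketch below answers a question from a frequency table plus variable-level metadata rather than from the microdata itself. All names (MICRODATA, METADATA, MIN_CELL_COUNT, summary_table), the sample records, and the small-cell suppression threshold are assumptions made for illustration; they do not come from the presentation or from any particular tool or standard.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (made-up values).
MICRODATA = [
    {"region": "North", "employed": True},
    {"region": "North", "employed": True},
    {"region": "North", "employed": False},
    {"region": "North", "employed": True},
    {"region": "South", "employed": True},
    {"region": "South", "employed": False},
]

# Hypothetical variable-level metadata that would be published with the table.
METADATA = {
    "region": {"label": "Region of residence"},
    "employed": {"label": "Currently employed"},
}

# Assumed disclosure rule: hide any cell with fewer respondents than this.
MIN_CELL_COUNT = 3


def summary_table(records, variable, min_count=MIN_CELL_COUNT):
    """Return category counts for one variable, with small cells set to None."""
    counts = Counter(record[variable] for record in records)
    return {
        category: (n if n >= min_count else None)
        for category, n in counts.items()
    }


if __name__ == "__main__":
    # Publish the label and the suppressed table, never the records themselves.
    for variable, info in METADATA.items():
        print(info["label"], summary_table(MICRODATA, variable))
```

Only the labels and the suppressed counts leave the system in this sketch; the individual records stay behind. A real statistical disclosure control method would involve considerably more than a single threshold.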

Summary
• We should – and will – create systems which support the ideas of Open Data
  • That’s what chimps do!
• Our technology must focus on an understanding of our data – the metadata – rather than simply providing increased access
• Technologists must exercise a kind of restraint and wisdom that is not very common, to respect the privacy of individuals
  • That’s what makes us human