W3C Incubator Group on Provenance • Yolanda Gil (chair) • Information Sciences Institute and Department of Computer Science, University of Southern California • gil@isi.edu
Some Context: W3C Incubator Activity (vs. Working Group) • Fosters rapid development of new Web-related concepts • Lightweight process with W3C support • Exploratory efforts in areas of interest • Get input and pulse from a community • Duration is one year • Open to invited experts in the broader community, not only W3C members • May result in follow-on activities, groups, or standardization efforts
What is Provenance • Provenance: Initial sources of information + entities + processes involved in producing a result • Some uses of provenance • Making trust judgments when information sources are diverse and of varying quality (the Web!) • Providing justifications for conclusions • Establishing attribution • Enabling repeatability and reproducibility of processes
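The definition above (initial sources + entities + processes behind a result) can be sketched as a small data structure. This is only an illustration under stated assumptions, not a representation proposed by the group: the `Step` record, the `initial_sources` helper, and all file and agent names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One recorded process: who ran what, on which inputs, giving which output.

    Hypothetical record type for illustration only.
    """
    process: str
    agent: str
    inputs: tuple
    output: str


def initial_sources(result, steps):
    """Walk back from a result to the items that no recorded step produced."""
    produced_by = {s.output: s for s in steps}
    sources, seen, stack = set(), set(), [result]
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        step = produced_by.get(item)
        if step is None:
            sources.add(item)          # nothing produced it: an initial source
        else:
            stack.extend(step.inputs)  # keep tracing through the step's inputs
    return sources


# Made-up example: three processes leading to one figure.
steps = [
    Step("clean", "alice", ("raw_survey.csv",), "clean.csv"),
    Step("merge", "bob", ("clean.csv", "census.csv"), "merged.csv"),
    Step("plot", "alice", ("merged.csv",), "figure.png"),
]
print(sorted(initial_sources("figure.png", steps)))
# → ['census.csv', 'raw_survey.csv']
```

The same trace supports several of the uses listed: the agents along the path give attribution, and the ordered steps are what a rerun would need for reproducibility.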
The Need for Provenance is Ubiquitous • Business practice • Manufacturing processes and providers of a given product • Cultural artifacts • Origins, owners, processes • Science applications • How new results were obtained: from assumptions to conclusions and everything in between • Licensing and attribution • For a document/software that combines permissions and rights • Web search/use • Judging which web content to trust
Immediate Need for Provenance in the Semantic Web Activity • Web of trust • Making trust judgments based on provenance • Reasoners • Attribution of assertions from diverse sources • Linked data • Use of conflicting data of varying degrees of quality • Social trust • Attribution, authority, propagation • Social web • Privacy and use policies of sensitive (personal) data • Life sciences and e-Science at large • Method capture and reproducibility of scientific results
Major Issues in Provenance • What to record • Depends on use • Granularity • Finer grained recording has a cost in performance • Integration of provenance with base assertions • Reification issues • Scale • Provenance information can be much larger than base data/assertions • Verification of provenance information • “Oh yeah?” button • Presentation to end user • What information and how to make it accessible to users
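The reification and scale issues can be made concrete with a small sketch. It uses plain Python tuples in place of an RDF library; the `rdf:Statement`, `rdf:subject`, `rdf:predicate`, and `rdf:object` terms are the standard RDF reification vocabulary, while the `ex:` names and the `reify` helper are hypothetical.

```python
def reify(triple, stmt_id, source):
    """Describe one base triple as an rdf:Statement and attach a source to it."""
    s, p, o = triple
    return [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
        (stmt_id, "ex:source", source),  # ex:source is a made-up property
    ]


# One base assertion with one piece of provenance attached.
base = [("ex:aspirin", "ex:treats", "ex:headache")]
prov = [
    t
    for i, triple in enumerate(base)
    for t in reify(triple, f"_:s{i}", "ex:trialReport")
]
print(len(base), len(prov))
# → 1 5
```

Even in this minimal case, recording a single source for a single assertion takes five triples, which illustrates why provenance can outgrow the base data.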
Some Prior Research • Databases • Aggregations of data, collections, streaming, queries • Knowledge representation and reasoning • Justification and explanation of reasoning • Workflow Systems • Computations leading to new data products • Argumentation • What is taken into account to make a judgment • Information retrieval • Question answering when documents are contradictory/complementary
Relevant Activities at W3C • Enabling technologies • SPARQL Working Group • RDB2RDF Working Group • Web Security Activity • Provide requirements and use cases • E-Government • Semantic Web Health Care and Life Sciences Interest Group • Web Security Activity • Social Web Incubator Group
Goals of the Incubator Group • Provide state-of-the-art understanding and develop a roadmap for development and possible standardization • Articulate requirements for accessing and reasoning about provenance information • Develop use cases • Identify issues in provenance that are of direct concern to the Semantic Web • Articulate relationships with other aspects of Web architecture • Report on state-of-the-art work on provenance • Report on a roadmap for provenance in the Semantic Web • Identify starting points for provenance representations • Identify elements of a provenance architecture that would benefit from standardization
Status • September 22, 2009 – September 21, 2010 • Broadening participation • Weekly telecons (started Oct 30, 2009) • Currently developing a timeline for group activities • Starting to gather use cases • Resources: • http://www.w3.org/2005/Incubator/prov/ • http://www.w3.org/2005/Incubator/prov/wiki/
Eric’s pointer in IRC • Scott’s paper • “soft facts” from discourse statements • Confidence measures/uncertainty