Graphical Models and Probabilistic Reasoning for Generating Linked Data from Tables. Varish Mulwad (@varish), University of Maryland, Baltimore County. Doctoral Consortium at ISWC 2011, October 24, 2011. Guru: Dr. Tim Finin.
Contribution
• http://dbpedia.org/class/yago/NationalBasketballAssociationTeams
• dbprop:team
• http://dbpedia.org/resource/Allen_Iverson
• Map literals as values of properties
Contribution
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix yago: <http://dbpedia.org/class/yago/> .
"Name"@en is rdfs:label of dbpedia-owl:BasketballPlayer .
"Team"@en is rdfs:label of yago:NationalBasketballAssociationTeams .
"Michael Jordan"@en is rdfs:label of dbpedia:Michael_Jordan .
dbpedia:Michael_Jordan a dbpedia-owl:BasketballPlayer .
"Chicago Bulls"@en is rdfs:label of dbpedia:Chicago_Bulls .
dbpedia:Chicago_Bulls a yago:NationalBasketballAssociationTeams .
All this in a completely automated way!
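To make the target output concrete, here is a minimal sketch of how column and cell annotations could be serialized as N-Triples lines. It is not the system described in the talk: the dictionaries `column_classes` and `cell_entities` are hand-written stand-ins for what the automated interpretation step would produce, and the function name `table_to_ntriples` is hypothetical.

```python
# Hypothetical sketch: the annotations below are written by hand and stand in
# for the output of an automated table-interpretation step.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DBPEDIA = "http://dbpedia.org/resource/"
DBPEDIA_OWL = "http://dbpedia.org/ontology/"
YAGO = "http://dbpedia.org/class/yago/"

# Column header -> class URI (assumed result of column annotation).
column_classes = {
    "Name": DBPEDIA_OWL + "BasketballPlayer",
    "Team": YAGO + "NationalBasketballAssociationTeams",
}

# Cell string -> entity URI (assumed result of entity linking).
cell_entities = {
    "Michael Jordan": DBPEDIA + "Michael_Jordan",
    "Chicago Bulls": DBPEDIA + "Chicago_Bulls",
}

# The table itself: header -> list of cell strings in that column.
table = {"Name": ["Michael Jordan"], "Team": ["Chicago Bulls"]}


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang="en"):
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def table_to_ntriples(table, column_classes, cell_entities):
    """Return one N-Triples line per statement implied by the annotations."""
    lines = []
    for header, cells in table.items():
        class_uri = column_classes[header]
        # The column header is the label of the class chosen for the column.
        lines.append(f"{uri(class_uri)} {uri(RDFS_LABEL)} {literal(header)} .")
        for cell in cells:
            entity_uri = cell_entities[cell]
            # The cell string labels the entity, and the entity is typed
            # with the class of its column.
            lines.append(f"{uri(entity_uri)} {uri(RDFS_LABEL)} {literal(cell)} .")
            lines.append(f"{uri(entity_uri)} {uri(RDF_TYPE)} {uri(class_uri)} .")
    return lines


if __name__ == "__main__":
    for line in table_to_ntriples(table, column_classes, cell_entities):
        print(line)
```

The sketch writes plain strings rather than using an RDF library so that it stays self-contained; a real pipeline would more likely build a graph with an RDF toolkit and let it handle serialization.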
Tables are everywhere! … yet …
• The web: 154 million high-quality relational tables [1]
• 389,697 raw and geospatial datasets
• 0.071% in RDF
Current Systems
Problems with systems on the Semantic Web:
• They require users to have knowledge of the Semantic Web
• They do not automatically link to existing classes and entities on the Semantic Web / Linked Data cloud
• The RDF data they produce is in some cases as useless as raw data
• The majority of the work has focused on relational data where a schema is available
A Table Interpretation Framework [diagram labels: Linked Data; Probabilistic Graphical Model / Joint Inference]
Joint Inference over evidence in a table • Probabilistic Graphical Models
A graphical model for tables [figure: class-level nodes C1, C2, C3; instance-level nodes R11, R12, R13, R21, R22, R23, R31, R32, R33]
Parameterized graphical model [figure]
• Variable node: column header (C1, C2, C3)
• Row value: R11, R12, R13, R21, R22, R23, R31, R32, R33
• Factor node: function that captures the affinity between the column headers and row values
• Factor node: captures interaction between column headers
• Factor node: captures interaction between row values
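A toy version of this model can show how the three kinds of factors combine during joint inference. The sketch below is an illustration under loudly stated assumptions, not the talk's model: the candidate labels, the factor values, and the lookup sets are all invented, only one table row is modeled, and exhaustive enumeration stands in for whatever inference procedure the actual system uses.

```python
from itertools import product

# All candidates and scores below are hypothetical, chosen only to illustrate
# how the factors interact. Exhaustive search is a stand-in for real inference.

# Candidate classes for each column header variable.
column_candidates = {
    "C1": ["BasketballPlayer", "Scientist"],
    "C2": ["NBATeam", "City"],
}

# Candidate entities for each row value variable (one row, two columns).
cell_candidates = {
    "R11": ["Michael_Jordan", "Michael_I._Jordan"],
    "R21": ["Chicago_Bulls", "Chicago"],
}

cell_column = {"R11": "C1", "R21": "C2"}  # which column each cell belongs to
rows = [("R11", "R21")]                   # cells that share a table row

# Assumed background knowledge: the type of each candidate entity.
entity_type = {
    "Michael_Jordan": "BasketballPlayer",
    "Michael_I._Jordan": "Scientist",
    "Chicago_Bulls": "NBATeam",
    "Chicago": "City",
}
compatible_classes = {("BasketballPlayer", "NBATeam")}
related_entities = {("Michael_Jordan", "Chicago_Bulls")}


def header_cell_factor(cls, entity):
    """Affinity between a column's class and a row value's entity."""
    return 2.0 if entity_type[entity] == cls else 0.5


def header_header_factor(cls1, cls2):
    """Interaction between the classes of two column headers."""
    return 2.0 if (cls1, cls2) in compatible_classes else 1.0


def row_factor(entity1, entity2):
    """Interaction between two row values in the same row."""
    return 3.0 if (entity1, entity2) in related_entities else 1.0


def score(assignment):
    """Unnormalized joint score: the product of all factor values."""
    total = 1.0
    for cell, column in cell_column.items():
        total *= header_cell_factor(assignment[column], assignment[cell])
    total *= header_header_factor(assignment["C1"], assignment["C2"])
    for left, right in rows:
        total *= row_factor(assignment[left], assignment[right])
    return total


def map_assignment():
    """Enumerate every joint assignment and return the highest-scoring one."""
    variables = {**column_candidates, **cell_candidates}
    names = list(variables)
    candidates = (
        dict(zip(names, values))
        for values in product(*(variables[name] for name in names))
    )
    best = max(candidates, key=score)
    return best, score(best)


if __name__ == "__main__":
    best, best_score = map_assignment()
    print(best, best_score)
```

Enumeration grows exponentially with the number of cells, so it only works for a toy table like this one; the point is that an ambiguous cell such as "Michael Jordan" is resolved by evidence from its column header and its row neighbour together rather than in isolation.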
Evaluation
• Dataset of more than 6,000 tables [2]
• Compare our accuracy against our baseline system and the results in [2]
• Use Mean Average Precision [3] to compare a ranked list of possible classes/entities against a ranked list obtained from human evaluators
• Experiment with datasets from www.data.gov
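For reference, Mean Average Precision can be computed as in the sketch below. It uses the standard definition with a set of relevant items per query; treating the human evaluators' judgments as such a set is an assumption made here, and the example labels are invented.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items.

    Precision is taken at each rank where a relevant item appears, and the
    sum is divided by the total number of relevant items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of the average precision over (ranked list, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical system rankings for two columns, with assumed human judgments.
    runs = [
        (["BasketballPlayer", "Athlete", "Person"], {"BasketballPlayer", "Person"}),
        (["City", "NBATeam"], {"NBATeam"}),
    ]
    print(mean_average_precision(runs))
```

In the first example the relevant classes appear at ranks 1 and 3, giving an average precision of (1/1 + 2/3) / 2; in the second the only relevant class is at rank 2, giving 1/2.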
References
[1] Cafarella, M. J., Halevy, A., Wang, D. Z., Wu, E., Zhang, Y.: WebTables: exploring the power of tables on the web. Proc. VLDB Endow. 1(1), 538-549 (2008)
[2] Limaye, G., Sarawagi, S., Chakrabarti, S.: Annotating and searching web tables using entities, types and relationships. In: Proc. 36th Int. Conf. on Very Large Databases (2010)
[3] Salton, G., McGill, M. J.: Introduction to Modern Information Retrieval. McGraw-Hill, Inc., New York (1986)
Thank you! Questions? Email: varish1@cs.umbc.edu | Twitter: @varish | Web: http://ebiq.org/h/Varish/Mulwad