A Logical Formulation of PRMs
Example • Institute(InstId,Type) • Researcher(RID,Area,Salary,InstID) • Paper(PaperId,Topic) • Author(RId,PaperID) • Cites(PaperId1,PaperId2)
Types of Uncertainty • intra-relational dependency • a researcher’s salary depends on their research area • inter-relational dependency • a researcher’s salary depends on the type of institute they work at • reference uncertainty • a paper’s author is more likely to be a researcher in the same area as the paper • existence uncertainty • a citation between two papers is more likely to exist if they are on the same topic • identity uncertainty • the authors of two distinct papers are more likely to be the same individual if the author names are similar and if the co-authors are the same
DependsOn Predicates • Examples: • DependsOn1(Salary; Area) ← Researcher(RId,Area,Salary,InstId) • DependsOn1(Salary; Area, Type) ← Researcher(RId,Area,Salary,InstId), Institute(InstId,Type)
Rules for DependsOn • DependsOn predicates occur only in the heads of clauses • The body of a DependsOn clause may contain extensional predicates and built-in predicates • Every descriptive attribute A must appear as the first argument of a DependsOn predicate • If there is more than one DependsOn predicate for a particular attribute, we require that, for each corresponding key, only one of them matches
Aggregates • A researcher’s salary depends on the number of publications they have: • Count_{RId}^{Author;PaperId}(RId,CntPapers) • this takes the Author relation, Author(RId,PaperId), groups by RId, and takes the count • Equivalent to • select RId, count(PaperId) as CntPapers from Author group by RId • More general form: • Aggr_{key}^{predicate;Aggr-Variable-List}(Key,AggrVal)
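The Count aggregate can be mirrored in a few lines of Python. This is only a sketch of the group-by-and-count reading given on the slide; the function name `count_by_key` and the toy Author tuples are assumptions for illustration.

```python
from collections import Counter

# Toy Author(RId, PaperId) tuples (illustrative data, not a real dataset).
author = [(101, 301), (102, 301), (101, 302)]

def count_by_key(rows):
    """Group (key, value) rows by key and count the rows in each group.

    Mirrors: select RId, count(PaperId) as CntPapers from Author group by RId
    (assuming PaperId is never null, so every row is counted).
    """
    return dict(Counter(key for key, _value in rows))

# Maps each RId to its CntPapers value.
cnt_papers = count_by_key(author)
```

Each entry of `cnt_papers` plays the role of one ground atom of the aggregate predicate, pairing a key with its aggregate value.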
Syntax • Predicates – ordinary predicates, aggregates, DependsOn • Clauses – Key Constraints, DependsOn Clauses • CPDs
Semantics • Attribute Uncertainty • Background theory provides instantiations for both the primary key and foreign keys through a set of partially instantiated extensional predicates
A Sample KB • Researcher-Inst(101) • Researcher-Inst(102) • Researcher-Institute-Inst(101,201) • Researcher-Institute-Inst(102,201) • Institute-Inst(201) • Paper-Inst(301) • Author-Inst(101,301) • Author-Inst(102,301) • Paper-Inst(302) • Author-Inst(101,302) • Cites-Inst(301,302)
Intensional Predicates to Introduce RVs • ∃Area, Salary Researcher(RId,Area,Salary,InstId) ← Researcher-Institute-Inst(RId,InstId) • ∃Type Institute(InstId,Type) ← Institute-Inst(InstId)
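One way to read the two clauses on this slide is as a recipe for enumerating ground random variables from the sample KB: each matching instantiation fact yields one RV per descriptive attribute. The sketch below assumes that reading; the `clauses` table and the `(relation, attribute, key)` encoding are hypothetical conventions chosen for illustration.

```python
# Instantiation facts from the sample KB, as predicate -> list of key tuples.
kb = {
    "Researcher-Institute-Inst": [(101, 201), (102, 201)],
    "Institute-Inst": [(201,)],
}

# Hypothetical encoding of the two clauses:
# instantiation predicate -> (relation, index of its key in the fact, attributes).
clauses = {
    "Researcher-Institute-Inst": ("Researcher", 0, ["Area", "Salary"]),
    "Institute-Inst": ("Institute", 0, ["Type"]),
}

def ground_random_variables(kb, clauses):
    """Return one (relation, attribute, key) triple per introduced RV."""
    rvs = []
    for inst_pred, (relation, key_index, attrs) in clauses.items():
        for fact in kb.get(inst_pred, []):
            for attr in attrs:
                rvs.append((relation, attr, fact[key_index]))
    return rvs

rvs = ground_random_variables(kb, clauses)
```

With the sample KB this gives Area and Salary variables for researchers 101 and 102 and a Type variable for institute 201.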
Dependency Graph • Convert each numbered DependsOn statement to a general binary relation • DependsOn(A_i; …, A_j, …) ← … • Let V_i and V_j be instantiations • we add V_i < V_j • We require < to be acyclic
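The acyclicity requirement can be checked mechanically once the ground pairs are collected. The sketch below uses Python's standard `graphlib` module (Python 3.9+); a pair `(a, b)` is read as a < b, and placing parents on the left is an assumption that does not affect the cycle check. The ground variables shown are illustrative.

```python
from graphlib import CycleError, TopologicalSorter

def is_acyclic(pairs):
    """Return True if the relation given by (a, b) pairs, read a < b, has no cycle."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set())
        graph.setdefault(b, set()).add(a)  # a must come before b
    try:
        # static_order() raises CycleError while iterating if a cycle exists.
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False

# Illustrative ground pairs: Salary(101) depends on Area(101) and Type(201).
ground_pairs = [
    (("Area", 101), ("Salary", 101)),
    (("Type", 201), ("Salary", 101)),
]
coherent = is_acyclic(ground_pairs)
```

If the check passes, any topological order of the ground variables gives a valid order in which to define or sample them.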
Reference Uncertainty • Paper(PaperId,Topic), Venue(VenueId,Area), PublishedIn(PaperId,VenueId) • Paper-Inst(301), Paper-Inst(302), Venue-Inst(stoc), Venue-Inst(focs), Venue-Inst(icse), Venue-Inst(pldi), Venue-Inst(isca) • ∃Venue PublishedIn(301,Venue) • ∃Venue PublishedIn(302,Venue) • VenueKeys = { VenueId | Venue-Inst(VenueId) } • VenueKeys = {stoc,focs,icse,pldi,isca}
FKDependsOn • FKDependsOn(VenueId;Area;Topic) ← PublishedIn(PaperId,VenueId), Paper(PaperId,Topic), Venue(VenueId,Area) • General form: • FKDependsOn(variable;<partition-variable-list>;<parents>) • Once a partition is chosen for a variable, the key is chosen uniformly from that partition.
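The two-stage choice (pick a partition given the parents, then pick a key uniformly inside it) can be sketched as follows. The venue areas, topic names, and CPD numbers below are hypothetical values invented for illustration; only the venue keys come from the slides.

```python
import random

# Hypothetical Area value for each venue key (not given in the slides).
venue_area = {
    "stoc": "theory", "focs": "theory",
    "icse": "software", "pldi": "software",
    "isca": "architecture",
}

# Hypothetical CPD: P(partition Area | parent Topic).
cpd = {
    "theory": {"theory": 0.8, "software": 0.1, "architecture": 0.1},
    "software": {"theory": 0.1, "software": 0.8, "architecture": 0.1},
}

def partition_keys(venue_area):
    """Group venue keys by the value of the partition variable (Area)."""
    parts = {}
    for venue, area in venue_area.items():
        parts.setdefault(area, []).append(venue)
    return parts

def venue_probability(venue, topic, cpd, venue_area, partitions):
    """P(VenueId = venue | Topic): partition probability split uniformly over its keys."""
    area = venue_area[venue]
    return cpd[topic][area] / len(partitions[area])

def sample_venue(topic, cpd, partitions, rng):
    """Sample a partition from the CPD, then a key uniformly within that partition."""
    areas = list(cpd[topic])
    weights = [cpd[topic][a] for a in areas]
    area = rng.choices(areas, weights=weights, k=1)[0]
    return rng.choice(partitions[area])

partitions = partition_keys(venue_area)
p_stoc = venue_probability("stoc", "theory", cpd, venue_area, partitions)
drawn = sample_venue("theory", cpd, partitions, random.Random(0))
```

Because the CPD ranges over partitions rather than individual keys, its size does not grow with the number of venues.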
Ensuring Coherence • We require that the parents and the variables that define the partition come before the fk variable in the dependency graph • Also, any dependencies based on the fk must occur after the fk is determined.
Existence Uncertainty • CiteExists(PaperId1,PaperId2,Exists) • Cites(PaperId1,PaperId2) ← CiteExists(PaperId1,PaperId2,True) • DependsOn(Exists;Topic1,Topic2) ← CiteExists(PaperId1,PaperId2,Exists), Paper(PaperId1,Topic1), Paper(PaperId2,Topic2)
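A minimal sketch of how the Exists variable could be grounded: one Boolean RV per ordered pair of distinct papers, with P(Exists = True) determined by the two topics. The topics, the extra paper 303, and the two probabilities are hypothetical values chosen only to illustrate "same topic makes a citation more likely".

```python
from itertools import permutations

# Hypothetical Paper(PaperId, Topic) values; paper 303 is added for illustration.
paper_topic = {301: "databases", 302: "databases", 303: "theory"}

# Hypothetical CPD entries for P(Exists = True | Topic1, Topic2).
P_SAME_TOPIC = 0.2
P_DIFFERENT_TOPIC = 0.02

def p_cite_exists(topic1, topic2):
    """P(Exists = True) for a candidate citation, given the two paper topics."""
    return P_SAME_TOPIC if topic1 == topic2 else P_DIFFERENT_TOPIC

def cite_exists_rvs(paper_topic):
    """One CiteExists RV per ordered pair of distinct papers, with P(Exists = True)."""
    return {
        (p1, p2): p_cite_exists(paper_topic[p1], paper_topic[p2])
        for p1, p2 in permutations(paper_topic, 2)
    }

probs = cite_exists_rvs(paper_topic)
```

Cites(PaperId1,PaperId2) then holds exactly for the pairs whose Exists variable takes the value True.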