Structural Induction, towards Automatic Ontology Elicitation Adrian Silvescu
Ontology Elicitation • How to get computers to figure out what an application domain (the world) is all about? • How to get computers to automatically derive the set of relevant concepts, entities, and relations pertaining to a given application domain (or the world)?
Financial News • NEW YORK (AP) -- Shares of media giant Time Warner Inc. were little changed Monday ahead of the company's third-quarter earnings report as investors wonder exactly what chairman Dick Parsons might say about its troubled America Online unit. There was heavy speculation this week the company -- which once had the "AOL" ticker symbol and called itself AOL Time Warner -- might announce job cuts at the online giant. A number of news organizations reported Tuesday that AOL is expected to cut about 700 jobs by the end of the year. There has also been continued speculation Time Warner might even spin off the unit.
Main Paradigm • Super-Structuring is the process of grouping, and subsequently naming, a set of entities that occur within "proximity" of each other into a more complex structural unit. • Abstraction, on the other hand, establishes that a set of entities belong to the same category, based on a certain notion of "similarity" among these entities, and subsequently names that category.
Structuralism vs. Functionalism • Structuralism: Meaning (and "similarity" in meaning) is defined in terms of structural features. [e.g., to define a chair we say that it has four legs, a rectangular surface on top of them, and a back.] • Functionalism: The meaning of an object resides in how it is used. [e.g., a chair is something that I can sit on; and if you intentionally kill somebody with that chair, then the chair is, for all practical purposes, a weapon.] • Means vs. Ends
Inspirations for the main paradigm • OOP / Software Engineering • Composition = Super-Structuring • Inheritance = Abstraction • Philosophy of linguistics – Words are similar in meaning if they are used in similar contexts – the "distributional semantics" hypothesis [Zellig Harris].
Abstractions and SuperStructures • SuperStructures – Proximal entities should be super-structured into a higher-level unit if they occur together in the data significantly above chance levels. • Abstractions – Entities are similar, and should be abstracted together, if they occur within similar contexts.
Example
Step 1 data: Mary loves John. Sue loves Curt. Mary hates Curt.
Abstractions 1:
• A1 -> Mary | Sue, because they have similar right contexts: loves.
• A2 -> John | Curt, because they have similar left contexts: loves.
Step 2 data: [Mary, A1] loves [John, A2]. [Sue, A1] loves [Curt, A2]. [Mary, A1] hates [Curt, A2].
Abstractions 2:
• A3 -> loves | hates, because of the high similarity between their left and right contexts (both are now preceded by A1 and followed by A2).
This illustrates how abstraction begets more abstraction (A3 is not derivable from the raw data).
Step 3 data: [Mary, A1] [loves, A3] [John, A2]. [Sue, A1] [loves, A3] [Curt, A2]. [Mary, A1] [hates, A3] [Curt, A2].
Structures 3:
• S1 -> A1 A3, because it occurs three times.
• S2 -> A3 A2, because it occurs three times.
This illustrates how abstraction begets structuring (S1 and S2 are not derivable from the raw data).
Structures 4:
• S3 -> S1 A2
• S4 -> A1 S2
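The first step of the example can be sketched in a few lines of Python. This is a minimal, illustrative sketch (the function names are our own, not from the paper): it collects left- and right-neighbor counts for every token in the toy corpus and then checks which word pairs share context words, which is what licenses the A1 and A2 abstractions above.

```python
from collections import Counter, defaultdict

# Toy corpus from the example (each sentence is a token sequence).
corpus = [
    ["Mary", "loves", "John"],
    ["Sue", "loves", "Curt"],
    ["Mary", "hates", "Curt"],
]

def contexts(sentences):
    """Collect left- and right-neighbor counts for every token."""
    left, right = defaultdict(Counter), defaultdict(Counter)
    for s in sentences:
        for i, tok in enumerate(s):
            if i > 0:
                left[tok][s[i - 1]] += 1
            if i < len(s) - 1:
                right[tok][s[i + 1]] += 1
    return left, right

def shared_context(a, b, ctx):
    """Number of context words that a and b have in common."""
    return len(set(ctx[a]) & set(ctx[b]))

left, right = contexts(corpus)

# "Mary" and "Sue" share the right context "loves" -> candidates for A1.
print(shared_context("Mary", "Sue", right))   # 1
# "John" and "Curt" share the left context "loves" -> candidates for A2.
print(shared_context("John", "Curt", left))   # 1
```

A fuller implementation would rank pairs by a distributional similarity measure rather than raw overlap, as in the algorithm slide at the end.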
Induction • We go around the world and we interact with it • In the stream of experiences we notice regularities which we call patterns • Sometimes we lump these patterns/laws into more encompassing theories • We call knowledge the set of all patterns that we are aware of at a certain point in time.
Automatic Induction • How can we make artificial machines that are capable of induction? • We need to specify: • What do we mean by knowledge? – KRP • How to derive it from data? – LP
Outline • What do we mean by Knowledge • ASNF • How to derive it from data • ASEXP
Abstraction SuperStructuring Normal Forms On: What do we mean by knowledge?
Computationalism • Computationalism: The most general way to represent theories is via Turing-equivalent formalisms
Induction by Enumeration [Solomonoff '65] • Induction = (Exp_stream, TM_equiv, E) • For all Turing Machines: • simulate their computations by dovetailing • if a TM produces the Exp_stream*, update min and argmin if needed • More generally: if the current TM strikes a good balance between size and error
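The dovetailing idea can be illustrated with a toy sketch. This is not a real Turing-machine enumerator: the "programs" here are three hand-written Python generators standing in for an enumeration of machines, and a program's "size" is just its index. The point is only the control structure: run every candidate with a growing step budget and keep the smallest one that reproduces the observed stream.

```python
from itertools import count

# Toy stand-ins for Turing machines: each "program" is a generator
# that emits symbols; its "size" is just its index in the list.
def zeros():
    while True:
        yield 0

def ones():
    while True:
        yield 1

def alternating():
    b = 0
    while True:
        yield b
        b = 1 - b

programs = [zeros, ones, alternating]
target = [0, 1, 0, 1, 0, 1]  # the observed experience stream

def dovetail(programs, target, max_steps=100):
    """Run every program with an increasing step budget (dovetailing);
    return the index (= size) of the smallest program that reproduces
    the full target stream within the budget, or None."""
    best = None
    for budget in count(1):
        if budget > max_steps:
            return best
        for idx, prog in enumerate(programs):
            gen, out = prog(), []
            for _ in range(min(budget, len(target))):
                out.append(next(gen))
            if len(out) == len(target) and out == target:
                if best is None or idx < best:
                    best = idx
        if best is not None:
            return best

print(dovetail(programs, target))  # 2 (the alternating program)
```

A real Solomonoff-style enumerator would interleave *infinitely many* machines and trade off description length against error, which this finite toy deliberately omits.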
Problems with enumeration • Does not share the computation of intermediate results that may be common among different TMs • Generate and Test – not Data Driven • We can invent arbitrary contraptions that may have nothing to do with the data
Beyond Enumeration • How can we fix these problems? • Computation & Data Structures Sharing • Data Driven • Still Computationalism (CT) but more efficiently implemented
Motivation • There are infinitely many types of rules α → β that we can invent – can we identify a small set of atomic operations? • We search for the fundamental operations out of which theories are constructed
What are the fundamental operators? • Fundamental Operators • Abstraction (grouping “similar” entities) • Super-Structuring (grouping topologically close entities) • Example • Abs: (cow, pig, monkey) -> mammal • SS: (chassis, wheels) -> car
CFG-ASNF • Any CFG G = (N, T, S, R) can be written in the form: • A → B – and there are only two rules with A on the lhs • A → BC – and this is the only rule that has A on the lhs • A → a – and this is the only rule that has A on the lhs • Additionally, if ε ∈ L(G), we have S → ε, and this is the only production that has S on the lhs
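The core of such a normal-form conversion is mechanical. Below is a minimal sketch (helper names are our own) of the binarization half: a rule with a long right-hand side is rewritten into a chain of binary A → BC super-structuring rules by introducing fresh nonterminals, much as in a Chomsky-normal-form transformation.

```python
def fresh_names():
    """Generate fresh nonterminal names N1, N2, ..."""
    i = 0
    while True:
        i += 1
        yield f"N{i}"

def binarize(lhs, rhs, fresh):
    """Turn lhs -> X1 X2 ... Xn into a chain of binary (SS) rules,
    introducing fresh intermediate nonterminals as needed."""
    rules = []
    while len(rhs) > 2:
        new = next(fresh)
        rules.append((new, rhs[:2]))   # new -> X1 X2
        rhs = [new] + rhs[2:]          # continue with new X3 ... Xn
    rules.append((lhs, rhs))
    return rules

fresh = fresh_names()
# S -> A B C D  becomes  N1 -> A B,  N2 -> N1 C,  S -> N2 D
for lhs, rhs in binarize("S", ["A", "B", "C", "D"], fresh):
    print(lhs, "->", " ".join(rhs))
```

The other half of the conversion, splitting a nonterminal with many alternatives into pairwise A → B abstraction rules, is analogous and omitted here for brevity.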
GG-ASNF • Any grammar can be written in the form: • A → B – and there are only two rules with A on the lhs • AB → C – and this is the only rule that has A on the lhs • A → BC – and this is the only rule that has A on the lhs • A → a – and this is the only rule that has A on the lhs • E → ε – and this is the only rule that has ε on the rhs
Notations • We call: • A → B1, …, A → Bn – Abstraction (ABS) • A → BC – SuperStructure (SS) • AB → C – Reverse SuperStructure (RSS) • A1 → B, …, An → B – Reverse Abstraction (RABS) • A → a, E → ε – Convenience Notations
CSG-ASNF • Any CSG can be written in the form: • αAβ → αBβ – and there are only two rules which rewrite A • A → BC – and this is the only rule that has A on the lhs • A → a – and this is the only rule that has A on the lhs • Additionally, if ε ∈ L(G), we have S → ε, and this is the only production that has S on the lhs
Radical Positivism & Hidden Vars. • Radical Positivism: every contraption is "directly" traceable to the data • Hidden variables are eliminated – they are only indirectly connected to the data • Need probabilities to rule out questions such as: "In heaven, if it rains, do the angels get wet or not?" • God's theory of the universe • Conjecture: one Grow and one Reduce operation are enough
Two types of hidden variables • Mixture models – RABS • H1 → A, H2 → A • Either cause can produce the effect • Co-occurring causes – RSS • H1H2 → A • Both causes need to be present and must also respect the topological constraints • For a complete topology it is just an AND
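The OR-versus-AND semantics of the two hidden-variable types can be checked with a toy simulation. The probabilities below (0.3 and 0.5) are illustrative choices, not from the slides: under mixture (RABS) semantics the effect fires when either hidden cause is active, so it appears with probability 1 − (1 − p1)(1 − p2); under co-occurrence (RSS) semantics both causes must be active, giving probability p1·p2.

```python
import random

random.seed(0)

def sample(p):
    """Bernoulli draw with success probability p."""
    return random.random() < p

def mixture_effect(h1, h2):
    """RABS / mixture semantics: either hidden cause suffices (OR)."""
    return h1 or h2

def superstructure_effect(h1, h2):
    """RSS semantics: both hidden causes must co-occur (AND)."""
    return h1 and h2

n = 100_000
p1, p2 = 0.3, 0.5  # illustrative cause probabilities
hits_or = hits_and = 0
for _ in range(n):
    h1, h2 = sample(p1), sample(p2)
    hits_or += mixture_effect(h1, h2)
    hits_and += superstructure_effect(h1, h2)

# Analytically: OR gives 1 - 0.7*0.5 = 0.65, AND gives 0.3*0.5 = 0.15.
print(hits_or / n, hits_and / n)
```

This is the "complete topology" case from the slide; with partial topological constraints the RSS effect would additionally depend on *where* the causes occur relative to each other.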
REG-ASNF • Any REG can be written in the form: • A → B – and there are only two rules with A on the lhs • A → aB – and this is the only rule that has A on the lhs • A → a – and this is the only rule that has A on the lhs • Additionally, if ε ∈ L(G), we have S → ε, and this is the only production that has S on the lhs
Points • Nonterminals – internal variables • ABS + SS + RABS + RSS + NOT • God's letter • Logical Positivism • Quantitative overlay • God's letter • ABS – the only source of choice/uncertainty • Transductive setup • Actions – Causes – (Hume's last)
Induction of AS models • Assume only Abstraction and SuperStructuring • No recursion
Algorithm [Silvescu and Honavar, 2003]
repeat until a stopping criterion has been reached:
  top_ka_abstractions = Abstract(sequence_data)
  top_ks_structures = SuperStructure(sequence_data)
  sequence_data = annotate sequence_data with the new abstractions and structures
SuperStructure(S -> AB) – returns the top ks structures made out of two components, ranked according to an (independence) measure of whether A and B occur together above chance (e.g., KL(P(AB) || P(A)P(B))).
Abstraction(S -> A | B) – returns the top ka abstractions (clusters) of two entities, ranked according to the similarity between the probability distributions of their left and right contexts (e.g., JensenShannon(context(A), context(B))).
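The two scoring functions can be sketched as follows. This is a simplified, illustrative implementation (function names and the restriction to right contexts only are our own simplifications, not the paper's): superstructure candidates are adjacent pairs scored by how far their joint frequency deviates from independence, and abstraction candidates are token pairs scored by the Jensen-Shannon divergence of their context distributions (lower = more similar).

```python
import math
from collections import Counter, defaultdict

def js_divergence(p, q):
    """Jensen-Shannon divergence (bits) between two count distributions."""
    sp, sq = sum(p.values()), sum(q.values())
    m = {k: 0.5 * (p.get(k, 0) / sp + q.get(k, 0) / sq)
         for k in set(p) | set(q)}
    def kl(a, sa):
        return sum((a[k] / sa) * math.log2((a[k] / sa) / m[k])
                   for k in a if a[k])
    return 0.5 * kl(p, sp) + 0.5 * kl(q, sq)

def superstructure_scores(seqs):
    """Score adjacent pairs by pointwise deviation from independence:
    log2 P(AB) / (P(A) P(B)); pairs above chance score high."""
    uni, bi, n_uni, n_bi = Counter(), Counter(), 0, 0
    for s in seqs:
        uni.update(s); n_uni += len(s)
        bi.update(zip(s, s[1:])); n_bi += max(len(s) - 1, 0)
    return {ab: math.log2((bi[ab] / n_bi) /
                          ((uni[ab[0]] / n_uni) * (uni[ab[1]] / n_uni)))
            for ab in bi}

def abstraction_scores(seqs):
    """Score token pairs by similarity (low JS) of their right contexts."""
    right = defaultdict(Counter)
    for s in seqs:
        for a, b in zip(s, s[1:]):
            right[a][b] += 1
    toks = sorted(right)
    return {(a, b): js_divergence(right[a], right[b])
            for i, a in enumerate(toks) for b in toks[i + 1:]}

data = [["Mary", "loves", "John"], ["Sue", "loves", "Curt"],
        ["Mary", "hates", "Curt"]]
scores = abstraction_scores(data)
print(min(scores, key=scores.get))  # best abstraction candidate pair
```

On the toy corpus the lowest-divergence pair is Mary/Sue (tied with loves/hates), matching the A1 and A3 abstractions from the worked example; the full algorithm would also use left contexts and iterate, re-annotating the data after each round.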