Structural Induction: towards Automatic Ontology Elicitation Adrian Silvescu
Induction • We go around the world and we interact with it • In the stream of experiences we notice regularities which we call patterns/laws • Sometimes we lump these patterns/laws into more encompassing theories • We call knowledge the set of all patterns that we are aware of at a certain point in time.
Automatic Induction • How can we make artificial machines that are capable of induction? • We need to say • What do we mean by knowledge? – KRP • How to derive it from data? – LP
Outline • Introduction • What do we mean by Knowledge? • Abstraction SuperStructuring Normal Form • How to derive it from data? • Combining Abstraction + SuperStructuring • The other chapters from the thesis • Conclusions and Contributions
Computationalism • Computationalistic Assumption (CA): The most general way to represent (finite) theories is as an entity expressed in a Turing equivalent formalism. • a.k.a. Church-Turing thesis
Induction by Enumeration [Solomonoff’64] • Induction : Exp_streams → TM • Out: “smallest” TM which reproduces the data (Exp_stream) • for all Turing Machines • simulate their computations by dovetailing • If TM produces the Exp_stream • update “smallest” TM if needed
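The enumeration loop above can be illustrated on a toy scale. The sketch below is not Solomonoff's construction over real Turing Machines: it assumes a hypothetical three-instruction language (`0` and `1` emit a symbol, `L` jumps back to the start, so programs may run forever), takes "reproduces the data" to mean "the output starts with the data", and stops after a fixed number of dovetailing stages, which the real procedure has no bound for.

```python
from itertools import count, islice, product

# Hypothetical toy instruction set (an assumption, not from the slides):
# "0"/"1" emit that symbol, "L" jumps back to the first instruction.
ALPHABET = ("0", "1", "L")

def programs():
    """Enumerate all programs in order of increasing size."""
    for size in count(1):
        yield from product(ALPHABET, repeat=size)

def run(program, steps):
    """Run `program` for at most `steps` instructions; return what it emitted."""
    out, pc = [], 0
    for _ in range(steps):
        if pc >= len(program):      # fell off the end: the program halted
            break
        op = program[pc]
        if op == "L":
            pc = 0                  # loop forever unless the budget runs out
        else:
            out.append(op)
            pc += 1
    return "".join(out)

def induce(data, max_stage=200):
    """Dovetail: at stage n, run the first n programs for n steps each.

    Keeps the smallest program found whose output starts with `data`.
    Each stage restarts the simulations from scratch for simplicity.
    """
    best = None
    for stage in range(1, max_stage + 1):
        for prog in islice(programs(), stage):
            if best is not None and len(prog) >= len(best):
                break               # enumeration is by size: nothing smaller is left
            if run(prog, stage).startswith(data):
                best = prog         # update the "smallest" program
                break
    return best

print(induce("010101"))
```

On the stream `010101` the search settles on the three-instruction loop `0 1 L` rather than the six-instruction literal program, which is the sense in which the smallest theory compresses the data.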
EE vs. DP - Compositionality • [Figure: two diagrams of Induction producing Theory1, Theory2, …, Theoryn from Data]
Generative grammars • TM equivalent • G=(N,T,S,R) – Theory • N – NonTerminals – Internal Variables • T – Terminals – Observables • R – Rules {α → β}, α, β ∈ (N ∪ T)* – Laws • w ∈ T* – Observations Stream • Derivation S →* w, w ∈ T* – Explanation
Example • S → A|B • A → CD • B → EF • F → GH • EG → J • C|H → K • J → a • K → b • [Figure: derivation diagram with nodes S, B, E, F, G, H, J, K, a, b]
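A derivation (an "explanation" of the observations) in this example grammar can be found mechanically. The following is a minimal sketch, assuming that `S → A|B` abbreviates the two rules `S → A` and `S → B`, and that `C|H → K` abbreviates `C → K` and `H → K`. The breadth-first search and the `max_len` bound are illustrative choices, not something the slides prescribe.

```python
from collections import deque

# The example grammar, with "X|Y" alternatives expanded into separate rules.
RULES = [
    (("S",), ("A",)), (("S",), ("B",)),
    (("A",), ("C", "D")),
    (("B",), ("E", "F")),
    (("F",), ("G", "H")),
    (("E", "G"), ("J",)),
    (("C",), ("K",)), (("H",), ("K",)),
    (("J",), ("a",)),
    (("K",), ("b",)),
]

def successors(form):
    """All sentential forms reachable from `form` by one rule application."""
    for lhs, rhs in RULES:
        n = len(lhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == lhs:
                yield form[:i] + rhs + form[i + n:]

def derive(target, start=("S",), max_len=4):
    """Breadth-first search for a shortest derivation start →* target."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            path = []
            while form is not None:     # walk back to the start symbol
                path.append(form)
                form = parent[form]
            return path[::-1]
        for nxt in successors(form):
            if len(nxt) <= max_len and nxt not in parent:
                parent[nxt] = form
                queue.append(nxt)
    return None                         # no derivation within the length bound

for step in derive(("a", "b")):
    print(" ".join(step))
```

The returned path starts at `S`, passes through `B`, `E F`, `E G H` and `J H`, and ends at `a b`. The branch through `A` dies out because `D` has no rule.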
Motivation • There are many (infinite) rules α → β that we can invent – how can we get a finite set of atoms? • We search for a fundamental set of operations based on which theories can be constructed
Fundamental Operations • Abstraction - grouping similar entities under one overarching category. • e.g., (cow, pig, monkey) → mammal • Super-Structuring - grouping into a unit topologically close entities - in particular spatio-temporally close • e.g., (chassis, [on top of] wheels) → car
Main Theorem: GEN-ASNF • Abstraction SuperStructuring Normal Form: • Any Grammar G=(N,T,S,R) can be rewritten using only rules of the form: • 1. A → B – Renaming (REN) • 2. A → BC – SuperStructure (SS) • 3. A → a – TERMINAL • 4. AB → C – Reverse SS (RSS) • Rules 2-4 can be made strongly unique
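The four rule shapes allowed by the normal form are easy to test for. The sketch below only checks whether a given rule already has one of these shapes; it does not perform the rewriting that the theorem asserts is possible. The function name `classify` and the tuple encoding of rules are assumptions made for illustration.

```python
def classify(lhs, rhs, terminals):
    """Return the ASNF rule type of lhs → rhs, or None if it is not in normal form."""
    is_nt = lambda s: s not in terminals
    if len(lhs) == 1 and is_nt(lhs[0]):
        if len(rhs) == 1:
            # A → a (TERMINAL) or A → B (REN)
            return "REN" if is_nt(rhs[0]) else "TERMINAL"
        if len(rhs) == 2 and all(map(is_nt, rhs)):
            return "SS"                 # A → BC
    if len(lhs) == 2 and len(rhs) == 1 and all(map(is_nt, lhs + rhs)):
        return "RSS"                    # AB → C
    return None

TERMINALS = {"a", "b"}
print(classify(("S",), ("A",), TERMINALS))          # a renaming
print(classify(("A",), ("C", "D"), TERMINALS))      # a superstructure
print(classify(("E", "G"), ("J",), TERMINALS))      # a reverse superstructure
print(classify(("A",), ("B", "C", "D"), TERMINALS)) # not in normal form
```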
Renamings and Abstractions • Renamings form a directed graph G=(N, {(A,B) ∈ REN}) • A → B1, … , A → Bn – Abstraction (ABS) • A1 → B, … , An → B – Reverse Abstraction (RABS) • S → A|B • A → CD • B → EF • F → GH • EG → J • C|H → K • J → a • K → b • [Figure: renaming graph over nodes S, A, B, C, H, K]
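Grouping renamings by their source or target recovers the ABS and RABS patterns from the renaming graph. A minimal sketch on the renamings of the example grammar follows. Reporting only groups with at least two members is a choice made here for readability, not part of the definition on the slide.

```python
from collections import defaultdict

# Renaming rules (REN) of the example grammar, as (source, target) edges.
RENAMINGS = [("S", "A"), ("S", "B"), ("C", "K"), ("H", "K")]

def group_renamings(renamings):
    """Group REN edges into ABS (shared source) and RABS (shared target)."""
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for a, b in renamings:
        out_edges[a].append(b)
        in_edges[b].append(a)
    abs_groups = {a: bs for a, bs in out_edges.items() if len(bs) > 1}
    rabs_groups = {b: srcs for b, srcs in in_edges.items() if len(srcs) > 1}
    return abs_groups, rabs_groups

abs_groups, rabs_groups = group_renamings(RENAMINGS)
print("ABS: ", abs_groups)    # S → A|B
print("RABS:", rabs_groups)   # C|H → K
```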
Fundamental Operations • A → B|C – Abstraction (ABS) • A → BC – SuperStructure (SS) • A → a – TERMINAL • A|B → C – Reverse ABS (RABS) • AB → C – Reverse SS (RSS)
Example • S → A|B • A → CD • B → EF • F → GH • EG → J • C|H → K • J → a • K → b • [Figure: derivation diagram with nodes S, B, E, F, G, H, J, K, a, b]
Two types of hidden variables • Mixture models – RABS • H1 → A, H2 → A • Either cause can produce the effect • Co-occurring causes – RSS • H1H2 → A • Both causes need to be present and also respect the topological constraints • For complete topology this is just AND
Radical Positivism • Empirical laws only • Every contraption directly traceable to Observables • Hidden Variables are eliminated – only indirect connection • No RABS or RSS • [Figure: S and w connected through ABS+SS only]
God’s e-mail • [Figure: diagram relating S and w, labeled RABS+RSS and REN+SS] • Conjecture: ABS + SS
Hume’s Claim • “I do not find that any philosopher has attempted to enumerate or class all the principles of association [of ideas]. ... To me, there appear to be only three principles of connexion among ideas, namely, Resemblance, Contiguity in time or place, and Cause and Effect” – David Hume, Enquiry concerning Human Understanding, III(19), 1748. • Resemblance – Abstraction (ABS) • Contiguity – SuperStructuring (SS) • Cause & Effect – RABS + RSS
Theory Review • Abstraction + SuperStructuring Thesis • ABS, SS, RABS, RSS are enough for TM equivalence • Rationales for Hidden Variables • RSS and RABS • Radical Positivism (ABS + SS) • Proof of Hume’s claim (under CA)
Outline • Introduction • What do we mean by Knowledge? • Abstraction SuperStructuring Normal Form • How to derive it from data? • Combining Abstraction + SuperStructuring • The other chapters from the thesis • Conclusions and Contributions
Induction of ABS+SS models • Abstraction and SuperStructuring only • No recursion • Radical Positivist Setup (w/o recursion) • Sequence classification setup • Superstructures are k-grams
Sequence Classification with feature construction • [Figure: sequence S1 S2 S3 S4 S5 S6 → Construct Features → Classifier (Naïve Bayes Multinomial) → Class]
Feature Construction – ABS + SS • [Figure: sequence S1 S2 S3 S4 S5 S6 → 2-grams S1S2, S2S3, S3S4, S4S5, S5S6 → abstractions A1:{S1S2,S3S4}, A2:{S2S3}, A3:{S4S5,S5S6} → Classifier → Class]
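The two-stage feature construction in this figure (superstructures as k-grams, then abstractions over them) can be sketched as follows. The abstraction map is copied from the slide. The helper names `kgrams` and `abstract_counts` are hypothetical, and the resulting counts are the kind of bag-of-features input a multinomial Naïve Bayes classifier would consume.

```python
from collections import Counter

def kgrams(seq, k):
    """SuperStructuring step: all contiguous k-grams of the sequence."""
    return [tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)]

def abstract_counts(seq, k, abstraction):
    """Abstraction step: map each k-gram to its abstraction and count occurrences.

    k-grams without an entry in `abstraction` are kept as their own feature.
    """
    return Counter(abstraction.get(g, g) for g in kgrams(seq, k))

# Abstractions from the slide: A1:{S1S2,S3S4}, A2:{S2S3}, A3:{S4S5,S5S6}
ABSTRACTION = {
    ("S1", "S2"): "A1", ("S3", "S4"): "A1",
    ("S2", "S3"): "A2",
    ("S4", "S5"): "A3", ("S5", "S6"): "A3",
}

sequence = ["S1", "S2", "S3", "S4", "S5", "S6"]
print(kgrams(sequence, 2))
print(abstract_counts(sequence, 2, ABSTRACTION))
```

Five distinct 2-grams collapse into three abstract features here, which is the model-size reduction that the ABS+SS setup aims for.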
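Learning Abstractions • [Figure: abstraction hierarchy over k-grams k1 … k9, from the singletons {k1} … {k9} through merged sets such as {k2,k3}, {k5,k6}, {k7,k8}, {k2,k3,k4}, {k7,k8,k9}, {k2,k3,k4,k5,k6} and {k1,k2,k3,k4,k5,k6} up to the root All; the "Most similar!" pair is highlighted]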
Similarity Distance • Distance between P(C|f1) and P(C|f2) where f1 appears n1 times and f2 appears n2 times in the dataset
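The slide's formula for the distance is not recoverable from this transcript. As an illustration only, the sketch below assumes a count-weighted Jensen-Shannon divergence between P(C|f1) and P(C|f2), with weights n1/(n1+n2) and n2/(n1+n2), and uses it in a greedy bottom-up merge of the most similar pair of features. Both the distance and the stopping rule (`m` remaining abstractions) are assumptions, not the author's stated method.

```python
import math
from itertools import combinations

def entropy(p):
    """Shannon entropy (bits) of a probability vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def weighted_js(c1, c2):
    """ASSUMED distance: Jensen-Shannon divergence of the class distributions,
    weighted by how often each feature occurs (c1, c2 are class-count vectors)."""
    n1, n2 = sum(c1), sum(c2)
    w1, w2 = n1 / (n1 + n2), n2 / (n1 + n2)
    p1 = [c / n1 for c in c1]
    p2 = [c / n2 for c in c2]
    mix = [w1 * a + w2 * b for a, b in zip(p1, p2)]
    return entropy(mix) - w1 * entropy(p1) - w2 * entropy(p2)

def learn_abstractions(class_counts, m):
    """Greedily merge the most similar pair until only m abstractions remain."""
    clusters = {frozenset([f]): list(c) for f, c in class_counts.items()}
    while len(clusters) > m:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: weighted_js(clusters[pair[0]], clusters[pair[1]]))
        merged = [x + y for x, y in zip(clusters[a], clusters[b])]
        del clusters[a], clusters[b]
        clusters[a | b] = merged        # the merged node pools the class counts
    return clusters

# Hypothetical per-class occurrence counts for four k-grams over two classes.
counts = {"k1": [9, 1], "k2": [8, 2], "k3": [1, 9], "k4": [2, 8]}
for members, c in learn_abstractions(counts, 2).items():
    print(sorted(members), c)
```

With these made-up counts, features that predict the same class end up in the same abstraction. Recording the merge order would give the hierarchy drawn on the Learning Abstractions slide.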
Data Sets • Protein sequence classification based on their sub-cellular localization • 2 datasets: • Eukaryotes (2427 sequences) – 4 classes • Prokaryotes (997 sequences) – 3 classes • Average seq. length ~300 amino acids • unigrams ~20, 2-grams ~400, 3-grams ~8000
Experimental setup • UNIGRAM – Base features (~20) • ABS_ONLY – Abstractions of unigrams • SS_ONLY – k-grams (either 2 or 3) => (either ~400 or ~8000 features) • FSEL+SS – Feature Selection applied to SS (k-grams) based on Information Gain • ABS+SS - Abstraction applied to SS (k-grams)
Experiments Review • Simplest ABS+SS combination • ABS+SS better than FSEL+SS, ABS alone or BASE features (Acc. & size). • For a 1%-2% loss in Accuracy, and sometimes even a gain, ABS+SS reduces model size by 1-3 orders of magnitude over SS alone
Outline • Introduction • What do we mean by Knowledge? • Abstraction SuperStructuring Normal Form • How to derive it from data? • Combining Abstraction + SuperStructuring • The other chapters from the thesis • Conclusions and Contributions
Temporal Boolean Networks (2) [Silvescu and Honavar 2001]
Naïve Bayes k – NB(k) (3) [Silvescu, Andorf, Dobbs and Honavar 2004], [Andorf, Silvescu, Dobbs and Honavar 2004] • [Figure: Naïve Bayes, NBk and JTT model structures over the sequence S1 S2 S3 S4 S5 S6 and its overlapping 2-grams]
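AVT-Learner (Abstractions) (4) [Kang, Silvescu, Zhang and Honavar, 2004] • [Figure: abstraction hierarchy for the attribute Odor, from the singleton values {m}, {y}, {s}, {f}, {c}, {p}, {a}, {l}, {n} through {s,y}, {c,p}, {a,l}, {s,y,f}, {a,l,n} and {s,y,f,c,p} up to {m,s,y,f,c,p}]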
Factorization Theorem: Pairwise to Holistic Decomposability (7) [Silvescu and Honavar 2006]
Outline • Introduction • What do we mean by Knowledge? • Abstraction SuperStructuring Normal Form • How to derive it from data? • Combining Abstraction + SuperStructuring • The other chapters from the thesis • Conclusions and Contributions
Conclusions • Abstraction + SuperStructuring thesis • ABS, SS, RABS, RSS are enough to produce any Turing equivalent Grammar • And everything else becomes derivative • Experiments • SuperStructuring only (spatial + temporal) • Abstraction only • Abstraction + SuperStructuring
Future Work • Explore additional setups – e.g., model based feature evaluation • Explore additional methods and search mechanisms • Use Algebraic Geometry / Algebraic Topology as a foundation
Contributions (Theory) • Abstraction SuperStructuring Normal Forms (ABS, SS, RABS, RSS) – enough to achieve Turing eq. (answer to the What? question) • Hidden Variables Characterization • Radical Positivism Position • Hume’s Claim (Computationalism) • Factorization theorem for arbitrary functions into Abelian Groups
Contributions (Experimental) • Exploration of SuperStructuring in both the temporal (TBN - 2) and spatial (NB(k) - 3) domains • Abstraction Learning in the Multivariate case (AVTL - 4) • Abstraction and SuperStructuring combination in the Multinomial case (6)