Pat Langley Computational Learning Laboratory Center for the Study of Language and Information

The Computational Discovery of Communicable Knowledge. Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, CA 94304 http://hypatia.stanford.edu/cll/ langley@csli.stanford.edu.


Presentation Transcript


  1. The Computational Discovery of Communicable Knowledge
     Pat Langley
     Computational Learning Laboratory
     Center for the Study of Language and Information
     Stanford University, Stanford, CA 94304
     http://hypatia.stanford.edu/cll/
     langley@csli.stanford.edu
     Also affiliated with the DaimlerChrysler Research & Technology Center and the Institute for the Study of Learning and Expertise.

  2. The Problem and the Potential
     Our society is collecting increasing amounts of data in commercial and scientific domains. These include complex spatial/temporal data sets like:
     • traces of traffic behavior from GPS and cell phones
     • prices of stocks and currencies from exchanges
     • measurements of climate and ecosystem variables
     Computational techniques should let us find relations in these data that are useful for business and society.

  3. Drawbacks of Current Approaches
     The fields of machine learning and data mining have developed methods to find regularities in data. Despite many successful applications, most techniques:
     • assume attribute-value representations that cannot handle time or space
     • cannot tell interesting discoveries from mundane ones
     • state the discovered knowledge in some opaque form
     This indicates the need for alternative methods that can address these issues.

  4. Paradigms for Machine Learning
     • decision-tree induction
     • induction of logical rules
     • case-based learning
     • neural networks
     • probabilistic induction

  5. Paradigms for Scientific Discovery
     • taxonomy formation
     • qualitative law discovery
     • equation discovery
     • structural model construction
     • process model formation

  6. Discovering Numeric Laws
     • Statement of the task:
       • Given: Quantitative measurements about objects or events in the world.
       • Find: Numeric relations that hold among variables that describe these items and that predict future behavior.
     • Historical examples:
       • Kepler’s three laws of planetary motion
       • Archimedes’ principle of displacement in water
       • Black’s law relating specific heat, mass, and temperature
       • Proust’s and Gay-Lussac’s laws of definite proportions

  7. BACON on Kepler’s Third Law
     moon   d       p       d/p    d^2/p   d^3/p^2
     A       5.67    1.77   3.20   18.15   58.15
     B       8.67    3.57   2.43   21.04   51.06
     C      14.00    7.16   1.96   27.40   53.61
     D      24.67   16.69   1.48   36.46   53.89
     BACON carries out heuristic search through a space of numeric terms, looking for constant values and linear relations. This example shows the system’s progression from primitive variables (distance d and period p of Jupiter’s moons) to a complex term that has a nearly constant value.
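
The progression in the table can be reproduced with a small sketch. It is not BACON itself: it keeps only two of the system's heuristics (form a ratio when two terms rise together, a product when one rises as the other falls) and stops at the first nearly constant term. The function names, the 10% constancy tolerance, and the exponent-pair encoding of terms are assumptions made for this illustration.

```python
# Distances d and periods p of four moons of Jupiter, as listed on the slide.
D = [5.67, 8.67, 14.00, 24.67]
P = [1.77, 3.57, 7.16, 16.69]

def is_nearly_constant(values, tolerance=0.10):
    """True if every value lies within `tolerance` (relative) of the mean."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tolerance * abs(mean) for v in values)

def direction(xs, ys):
    """+1 if ys rises as xs rises, -1 if ys falls as xs rises, else 0."""
    ordered = [y for _, y in sorted(zip(xs, ys))]
    steps = [b - a for a, b in zip(ordered, ordered[1:])]
    if all(s > 0 for s in steps):
        return 1
    if all(s < 0 for s in steps):
        return -1
    return 0

def bacon_sketch(d, p, max_terms=8, tolerance=0.10):
    """Search for a nearly constant term of the form d**i * p**j.

    Terms are keyed by their exponent pair (i, j), which makes it easy
    to skip a candidate that duplicates (or inverts) an existing term.
    """
    terms = {(1, 0): list(d), (0, 1): list(p)}
    while len(terms) < max_terms:
        newest = list(terms)[-1]
        if is_nearly_constant(terms[newest], tolerance):
            return newest, terms[newest]
        # Pair the newest term with older ones, most recent first.
        for older in reversed(list(terms)[:-1]):
            sign = direction(terms[older], terms[newest])
            if sign == 0:
                continue
            if sign > 0:   # rise together: try the ratio older / newest
                cand = (older[0] - newest[0], older[1] - newest[1])
            else:          # opposite trends: try the product
                cand = (older[0] + newest[0], older[1] + newest[1])
            inverse = (-cand[0], -cand[1])
            if cand == (0, 0) or cand in terms or inverse in terms:
                continue
            terms[cand] = [x ** cand[0] * y ** cand[1] for x, y in zip(d, p)]
            break
        else:
            return None    # no new term could be formed
    return None

exponents, values = bacon_sketch(D, P)
print(exponents)  # exponent pair (i, j) of the term d**i * p**j
```

On these data the sketch visits d/p, then d^2/p, then d^3/p^2, the same sequence as the table columns. The tolerance matters: the last column varies by several percent around its mean, so a much tighter threshold would not accept it as constant.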

  8. Some Laws Discovered by BACON
     • Basic numeric relations:
       • Ideal gas law             PV = aNT + bN
       • Kepler’s third law        D^3 = [(A - k) / t]^2 = j
       • Coulomb’s law             FD^2 / Q_1 Q_2 = c
       • Ohm’s law                 TD^2 / (LI - rI) = r
     • Relations with intrinsic properties:
       • Snell’s law of refraction       sin I / sin R = n_1 / n_2
       • Archimedes’ law                 C = V + i
       • Momentum conservation           m_1 V_1 = m_2 V_2
       • Black’s specific heat law       c_1 m_1 T_1 + c_2 m_2 T_2 = (c_1 m_1 + c_2 m_2) T_f

  9. Temporal Laws of Ecological Behavior (Todorovski & Dzeroski, 1997)
     Input:
       time     phyt     zoo     phosp     temp
       time_1   phyt_1   zoo_1   phosp_1   temp_1
       time_2   phyt_2   zoo_2   phosp_2   temp_2
       . . .
       time_m   phyt_m   zoo_m   phosp_m   temp_m
     Input: a context-free grammar of domain constraints
     Output:
       d(phyt)/dt = c1 · phyt · phosp / (c2 + phosp) - c3 · phyt
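
To illustrate how a context-free grammar can constrain the space of candidate equations, the sketch below enumerates every right-hand side that a tiny grammar derives. The grammar is hypothetical and written only for this example; it is not the grammar used by Todorovski & Dzeroski. The symbol `c` stands for a constant to be fitted to the data, a step that is omitted here.

```python
from itertools import product

# Hypothetical grammar: a rate is one term or a difference of two terms;
# a term is a constant times a variable, optionally times a saturating
# (Monod-style) factor Var / (c + Var).
GRAMMAR = {
    "Rate": [["Term"], ["Term", "-", "Term"]],
    "Term": [["c", "*", "Var"], ["c", "*", "Var", "*", "Monod"]],
    "Monod": [["Var", "/", "(", "c", "+", "Var", ")"]],
    "Var": [["phyt"], ["zoo"], ["phosp"], ["temp"]],
}

def expand(symbol):
    """Yield every token sequence derivable from `symbol`.

    The grammar above is non-recursive, so the enumeration terminates.
    """
    if symbol not in GRAMMAR:          # terminal token
        yield [symbol]
        return
    for production in GRAMMAR[symbol]:
        # Expand each symbol of the production, then combine the choices.
        options = [list(expand(s)) for s in production]
        for choice in product(*options):
            yield [token for part in choice for token in part]

candidates = [" ".join(tokens) for tokens in expand("Rate")]
target = "c * phyt * phosp / ( c + phosp ) - c * phyt"

print(len(candidates))
print(target in candidates)
```

The structure shown on the slide is one of the derivable candidates. A discovery system in this style would go on to fit the constants of each candidate against the measured time series and rank the candidates by error; the grammar's role is to keep that search within forms a domain expert considers plausible.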

  10. Formulating Structural Models
      • Statement of the task:
        • Given: Qualitative or numeric empirical laws that describe observed phenomena.
        • Find: Explanatory models of these phenomena in terms of component objects and their relations.
      • Historical examples:
        • Dalton’s and Avogadro’s molecular models of chemicals
        • Mendel’s genetic model of inherited traits
        • Quark models of elementary particles
        • Structural models of planets, comets, and stars

  11. DALTON on Chemical Reactions
      Initial state:
        (reacts in {hydrogen oxygen} out {water})
        (reacts in {hydrogen nitrogen} out {ammonia})
        (reacts in {oxygen nitrogen} out {nitrous oxide})
        . . .
      Final state:
        2 hydrogen + 1 oxygen → 2 water
        3 hydrogen + 1 nitrogen → 2 ammonia
        2 oxygen + 1 nitrogen → 2 nitrous oxide
        hydrogen → {h h}     water → {h h o}
        oxygen → {o o}       ammonia → {h h h n}
        nitrogen → {n n}     nitrous oxide → {n o o}
        . . .
      DALTON finds these structural models through a depth-first search process constrained by conservation assumptions.
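
A much-reduced sketch of a depth-first search constrained by conservation is given below. It is not DALTON's actual procedure: it assumes the molecule counts of each reaction are already known (taken from the final state above), assumes each element's molecule contains only atoms of its own kind, and searches only for the number of atoms per element molecule. All names and the `max_atoms` bound are assumptions for illustration.

```python
# Reactions with molecule counts: ({reactant: count}, (product, count)).
REACTIONS = [
    ({"hydrogen": 2, "oxygen": 1}, ("water", 2)),
    ({"hydrogen": 3, "nitrogen": 1}, ("ammonia", 2)),
    ({"oxygen": 2, "nitrogen": 1}, ("nitrous oxide", 2)),
]
ELEMENTS = ["hydrogen", "oxygen", "nitrogen"]

def conserves_atoms(atoms_per_molecule):
    """Check that every product molecule gets a whole number of atoms.

    Only elements that already have a value are checked, so the test
    can prune partial assignments during the search.
    """
    for reactants, (_, n_product) in REACTIONS:
        for element, n_molecules in reactants.items():
            if element in atoms_per_molecule:
                total = n_molecules * atoms_per_molecule[element]
                if total % n_product != 0:
                    return False
    return True

def search(assignment, remaining, max_atoms=4):
    """Depth-first search over atoms per element molecule, smallest first."""
    if not conserves_atoms(assignment):
        return None                      # prune: atoms would not be conserved
    if not remaining:
        return assignment
    element, rest = remaining[0], remaining[1:]
    for n in range(1, max_atoms + 1):
        result = search({**assignment, element: n}, rest, max_atoms)
        if result is not None:
            return result
    return None

def compound_models(atoms_per_molecule):
    """Derive each compound's composition from the conserved atom totals."""
    models = {}
    for reactants, (product_name, n_product) in REACTIONS:
        models[product_name] = {
            element: n * atoms_per_molecule[element] // n_product
            for element, n in reactants.items()
        }
    return models

elements = search({}, ELEMENTS)
compounds = compound_models(elements)
print(elements)
print(compounds)
```

With these inputs the search rejects one-atom molecules for each element, because a product molecule would then need half an atom, and settles on two atoms per element molecule. Water then comes out as two hydrogen atoms and one oxygen atom, and ammonia as three hydrogen atoms and one nitrogen atom.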

  12. Constructing Process Models
      • Statement of the task:
        • Given: Qualitative or numeric empirical laws that describe temporal phenomena.
        • Find: Explanatory models of these phenomena in terms of processes among component objects.
      • Historical examples:
        • Caloric and kinetic theories of heat phenomena
        • Reaction pathways in chemistry and nucleosynthesis
        • Models of continental drift and plate tectonics
        • Process models of stellar evolution and destruction

  13. ASTRA on Nucleosynthesis
      Inputs:
        - quantum properties for elements and isotopes
        - conservation relations among these properties
        - an element to be explained (e.g., O or C)
        - elements to be assumed (e.g., H or He)
      Outputs:
        - elementary reactions that obey conservation laws
        - reaction pathways that explain the element’s evolution
      ASTRA uses depth-first search to find reaction pathways for:
        - proton and neutron captures
        - neutron and deuteron production
        - generation of helium (He) from hydrogen (H)
        - generation of carbon (C) and oxygen (O)

  14. Three Pathways for Carbon Synthesis
      Standard pathway:
        4He + 4He → 8Be
        4He + 8Be → 12C
      Alternative pathways:
        4He + D → 6Li
        3He + 6Li → 9Be
        4He + 9Be → 12C + n

        4He + D → 6Li
        4He + 6Li → 10Be
        4He + 10Be → 12C + D
      ASTRA generates many pathways novel to astrophysics, some of which have viable reaction rates.
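
The pathway search can be sketched as backward chaining with depth-first search: to explain a species, pick a reaction that produces it and recursively explain each reactant until only assumed species remain. The sketch below is an illustration, not ASTRA's implementation. In particular, it takes the seven reactions listed on this slide as given rather than generating them from conservation relations, and it assumes 4He, 3He, and D are available as starting material.

```python
from itertools import product

# Elementary reactions as (reactants, products), copied from the slide.
REACTIONS = [
    (("4He", "4He"), ("8Be",)),
    (("4He", "8Be"), ("12C",)),
    (("4He", "D"), ("6Li",)),
    (("3He", "6Li"), ("9Be",)),
    (("4He", "9Be"), ("12C", "n")),
    (("4He", "6Li"), ("10Be",)),
    (("4He", "10Be"), ("12C", "D")),
]
ASSUMED = {"4He", "3He", "D"}   # assumption: species taken as given

def pathways(target, in_progress=frozenset()):
    """Return every reaction sequence that builds `target` from ASSUMED."""
    if target in ASSUMED:
        return [[]]              # nothing to explain
    if target in in_progress:
        return []                # avoid circular explanations
    found = []
    for reaction in REACTIONS:
        reactants, products = reaction
        if target not in products:
            continue
        # Explain each distinct reactant, depth first.
        sub = [pathways(r, in_progress | {target})
               for r in dict.fromkeys(reactants)]
        for combo in product(*sub):
            steps = []
            for part in combo:
                for step in part:
                    if step not in steps:
                        steps.append(step)
            found.append(steps + [reaction])
    return found

def show(reaction):
    reactants, products = reaction
    return " + ".join(reactants) + " -> " + " + ".join(products)

carbon = pathways("12C")
for path in carbon:
    print("; ".join(show(step) for step in path))
```

Run on the reactions above, the search returns three pathways for 12C, one per reaction that produces carbon. A fuller system would also have to propose the elementary reactions themselves and screen the resulting pathways by reaction rate, which this sketch leaves out.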

  15. Proposed Research
      We plan to develop and evaluate discovery methods that:
      • are designed to process temporal and structured data
      • use techniques from computational scientific discovery
      • describe new knowledge in a communicable form
      Likely notations for the discovered knowledge include:
      • structural models of relations among entities
      • process models of change over time
      • sets of simultaneous differential equations
      We will apply our methods to domains that benefit from such communicable representations.

  16. Benefits of the Approach
      Unlike most previous work on data mining and knowledge discovery, our methods will:
      • support discoveries in domains that involve complex spatial, temporal, or relational data
      • use domain knowledge to filter only discoveries that are interesting and novel to the domain user
      • present the new knowledge in some understandable notation that can be communicated among humans
      Such techniques will improve the way we manipulate and understand complex data.