
Coarse-Graining, Symbolic Description, and Complexity

Explore the concepts of coarse-graining and symbolic description in complex systems. This lecture discusses the use of symbols, languages, grammars, and automata to study and understand complex phenomena.

gbarros




Presentation Transcript


  1. Coarse-Graining, Symbolic Description, and Complexity. Bailin Hao. Institute of Theoretical Physics, Academia Sinica, Beijing; T-Life Research Center, Fudan University, Shanghai; The Santa Fe Institute, New Mexico. http://tlife.fudan.edu.cn/ http://www.itp.ac.cn/~hao/ CSSS2007 Beijing

  2. There are complex systems and complex behavior in natural and social phenomena. • Complexity goes with specificity. There is no universal measure of complexity. • One must specify the phenomenon under consideration and set a framework for study. • This lecture will describe one such framework. (Dave Feldman has provided most of the prerequisites.)

  3. Start from an observation
     u d c s b t (Quarks with charge, mass, flavor, charm, …)
     p n e (Particles with charge, mass, spin, magnetic moment, …)
     H C N O P … (Atoms with atomic number, ion radius, valence, affinity, …)
     H2O NO CO2 … (Molecules with molecular weight, polarity, color, …)
     a c g t (Nucleotides with strong or weak coupling)
     A D E F G H … W Y V (Amino acids with different physico-chemical properties)
     BRCA1 PDGF (Genes, proteins, “words” taken as single symbols appear in pathways and networks)
     …

  4. (Almost) all symbols in science are embodiments of coarse-graining. Coarse-graining may lead to rigorous results. Geoffrey West: had Galileo been equipped with our high-precision measuring instruments, he would not have been able to discover the law of free fall and would have had to write a 42-volume Treatise on Falling Bodies.

  5. Coarse-Grained Description of Nature ↓ Use of Symbols ↓ Symbolic Sequences ↓ Language, Grammar, Automaton

  6. A Reminder: Theorem 3 in Shannon’s Famous 1948 Paper • Theorem 3: Given any ε > 0 and δ > 0, we can find an N0 such that the sequences of any length N ≥ N0 fall into two classes: • A set whose total probability is less than ε. • The remainder, all of whose members have probabilities p satisfying the inequality $\left| \frac{\log p^{-1}}{N} - H \right| < \delta$, where H is the entropy of the source.

  7. Intuitive Explanation of the Theorem • There are 2^N sequences of length N over the alphabet {0, 1} • Roughly speaking, these sequences are divided into two groups when N is very large • A big group of typical sequences • A small group of atypical sequences: 0^N, 1^N, (01)^N, (10)^N, … and more complex ones which must be characterized almost individually
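
The split into typical and atypical sequences can be checked numerically. The sketch below is not from the lecture: it assumes a memoryless binary source with P(1) = p (a hypothetical choice made only for illustration) and sums the probability of all length-N sequences whose per-symbol log-probability lies within δ of the entropy H. The function names `entropy_bits` and `typical_mass` are invented for this example.

```python
import math

def entropy_bits(p):
    """Shannon entropy (bits per symbol) of a binary source with P(1) = p."""
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))

def typical_mass(p, n, delta):
    """Total probability of the length-n sequences x that satisfy
    |log2(1/P(x)) / n - H| < delta, assuming independent symbols."""
    q = 1.0 - p
    h = entropy_bits(p)
    total = 0.0
    for k in range(n + 1):  # k = number of ones; P(x) depends only on k
        log2_px = k * math.log2(p) + (n - k) * math.log2(q)
        if abs(-log2_px / n - h) < delta:
            # log of the number of sequences with k ones (log binomial coefficient)
            log_count = (math.lgamma(n + 1) - math.lgamma(k + 1)
                         - math.lgamma(n - k + 1))
            total += math.exp(log_count + log2_px * math.log(2))
    return total

for n in (10, 100, 1000):
    print(n, typical_mass(0.3, n, 0.1))
```

Under these assumptions the printed mass should grow toward 1 as N increases, which is the content of the theorem: the atypical set keeps a total probability below any chosen ε once N is large enough.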

  8. Our Starting Points: • Complexity goes with specificity. • Therefore, one has to look at real data. • These data are often noisy, incomplete, and of low signal-to-noise ratio. This is especially true for biological data. • Therefore, statistical methods are a must, but one should go beyond statistics. • Visualization with a certain degree of coarse-graining is crucial for highlighting the “regularities” immersed in huge amounts of data.

  9. Symbolic sequences naturally lead to language and grammar

  10. Language and Grammatical Complexity
      Alphabet Σ
      Example 1. Σ = {a, c, g, t}
      Example 2. Σ = {A, C, D, …, W, Y}
      Example 3. Σ = {a, …, z, A, …, Z, +, –, …}
      All possible strings made of symbols from the alphabet, plus the empty string ε → Σ*
      Any subset of Σ* is called a language over the alphabet Σ
      Grammar = {Alphabet, Initial symbols, Production Rules}

  11. Classification of Formal Languages • Chomsky Hierarchy: sequential production rules • Lindenmayer Systems: parallel production rules

  12. Generative Grammar
      S → NP VP
      VP → V NP
      NP → (Art) Adj* N
      S → if S then S
      S → either S or S
      N → boy | girl | scientist | …
      V → sees | believes | loves | eats | …
      Adj → young | good | beautiful | …
      Art → a | one | the
      S: Sentence; NP: Noun Phrase; VP: Verb Phrase; Adj: Adjective; Art: Article
      Non-Terminal and Terminal Symbols
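
The production rules of this slide can be run as a small random sentence generator. This is only an illustrative sketch: the vocabulary is limited to the words listed on the slide, and the probabilities and the recursion cap `max_depth` are arbitrary assumptions introduced here so that the recursive rules for S terminate.

```python
import random

# Terminal symbols, limited to the words that appear on the slide.
NOUNS = ["boy", "girl", "scientist"]
VERBS = ["sees", "believes", "loves", "eats"]
ADJECTIVES = ["young", "good", "beautiful"]
ARTICLES = ["a", "one", "the"]

def noun_phrase(rng):
    """NP -> (Art) Adj* N: optional article, any number of adjectives, a noun."""
    words = []
    if rng.random() < 0.5:
        words.append(rng.choice(ARTICLES))
    while rng.random() < 0.3:
        words.append(rng.choice(ADJECTIVES))
    words.append(rng.choice(NOUNS))
    return words

def verb_phrase(rng):
    """VP -> V NP"""
    return [rng.choice(VERBS)] + noun_phrase(rng)

def sentence(rng, depth=0, max_depth=2):
    """S -> NP VP | if S then S | either S or S (recursion capped at max_depth)."""
    pick = rng.random()
    if depth < max_depth and pick < 0.15:
        return (["if"] + sentence(rng, depth + 1, max_depth)
                + ["then"] + sentence(rng, depth + 1, max_depth))
    if depth < max_depth and pick < 0.30:
        return (["either"] + sentence(rng, depth + 1, max_depth)
                + ["or"] + sentence(rng, depth + 1, max_depth))
    return noun_phrase(rng) + verb_phrase(rng)

rng = random.Random(0)  # fixed seed so repeated runs give the same sentences
for _ in range(3):
    print(" ".join(sentence(rng)))
```

Every generated sentence is grammatical with respect to the rules, although most are semantically odd, which is the usual point of such toy grammars.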

  13. Chomsky Hierarchy of Formal Languages

  14. A Finite State Automaton (FA). [Figure: (i) states a and b joined by a transition; (ii) a transfer function, δ(a, R) = b]

  15. FA: Finite State Automata • Deterministic FA • Non-Deterministic FA • Equivalence of DFA and NDFA: subset construction • Minimal DFA • Myhill-Nerode theorem (1958): number of nodes in minDFA

  16. A Pushdown Automaton. Pushdown list (stack): First In, Last Out (FILO)

  17. A Turing Machine. Alan M. Turing (1912-1954). FA + R/W tape. Church-Turing Thesis (1936): Any effective (mechanical) computation can be carried out by a Turing machine

  18. Example: {a^i b^i c^i | i > 0} is a CSL
      Terminals = {a, b, c}; Non-terminals = {A, B}
      Sequential rules: B → aBAc | abc; bA → bb; cA → Ac
      B → abc
      B → aBAc → aabcAc → aabAcc → aabbcc
      B → aBAc → aaBAcAc → aaBAAcc → aaabcAAcc → aaabAcAcc → aaabAAccc → aaabbAccc → aaabbbccc
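
The derivations on this slide can be reproduced mechanically. The following sketch applies the four sequential rules (B → aBAc, B → abc, bA → bb, cA → Ac) by breadth-first search over sentential forms. The function name `derive` and the length bound `max_len` are assumptions of this example; the bound works because no rule shortens a string.

```python
from collections import deque

# Sequential rewriting rules as (left-hand side, right-hand side) pairs.
RULES = [("B", "aBAc"), ("B", "abc"), ("bA", "bb"), ("cA", "Ac")]

def derive(start="B", max_len=9):
    """Return all terminal strings of length <= max_len derivable from start.

    One rule is applied at one position per step (sequential rewriting).
    No rule shortens a string, so the length bound keeps the search finite.
    """
    seen = {start}
    queue = deque([start])
    terminal = set()
    while queue:
        form = queue.popleft()
        if form.islower():  # only the terminals a, b, c remain
            terminal.add(form)
            continue
        for lhs, rhs in RULES:
            pos = form.find(lhs)
            while pos != -1:
                new = form[:pos] + rhs + form[pos + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return sorted(terminal, key=len)

print(derive())
```

With the bound of 9 symbols the search returns abc, aabbcc, and aaabbbccc and nothing else, in line with the language {a^i b^i c^i | i > 0}.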

  19. Classification of Formal Languages • Chomsky Hierarchy: sequential production rules • Lindenmayer Systems: parallel production rules

  20. Development of Anabaena catenula (a beaded alga of the genus Anabaena). [Figure: successive stages of filament development, with cells labelled ar, al, br, bl.] Alphabet: Σ = {ar, al, br, bl}. Production rules: P = [shown as a figure]. Initial symbol (axiom): ω = ar. Grammar: G = (Σ, P, ω). Language: L(G) ⊆ Σ*
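
Parallel rewriting can be illustrated with a few lines of code. The production rules of this slide are not reproduced in the transcript, so the sketch below assumes the rules commonly quoted in textbooks for Lindenmayer's Anabaena model (ar → al br, al → bl ar, br → ar, bl → al); they may differ from what the slide showed. The names `RULES`, `step`, and `develop` are introduced here for illustration.

```python
# Assumed textbook rules for the Anabaena model; not taken from the slide itself.
RULES = {
    "ar": ["al", "br"],
    "al": ["bl", "ar"],
    "br": ["ar"],
    "bl": ["al"],
}

def step(cells):
    """Rewrite every cell simultaneously (parallel production rules)."""
    out = []
    for cell in cells:
        out.extend(RULES[cell])
    return out

def develop(axiom, generations):
    """Return the filament at generations 0..generations, starting from axiom."""
    history = [list(axiom)]
    for _ in range(generations):
        history.append(step(history[-1]))
    return history

for cells in develop(["ar"], 4):
    print(len(cells), " ".join(cells))
```

The contrast with a Chomsky grammar is in `step`: all symbols are replaced in the same generation rather than one at a time. Under the assumed rules the filament lengths follow the Fibonacci numbers.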

  21. Lindenmayer Systems: parallel production rules. Finer classification:
      D0L: deterministic, no interaction, i.e., context-free
      0L: non-deterministic, no interaction
      IL: non-deterministic, with interaction, i.e., context-sensitive
      T0L: with a table of production rules
      TIL
      E0L: extended to non-terminal symbols
      ET0L
      EIL = REL of Chomsky

  22. [Figure: diagram of the language classes FIN, RGL, CFL, CSL, and REL, together with D0L.] RGL: Regular; CFL: Context-Free; CSL: Context-Sensitive; REL: Recursively Enumerable

  23. [Figure: the Chomsky hierarchy (0: REL, 1: CSL, 2: CFL, 3: RGL) compared with the Lindenmayer classes (EIL, ET0L, IL, E0L, T0L, 0L, D0L) and the indexed languages (IND).]

  24. Example à la Lindenmayer: L = {a^i b^i c^i | i > 0}, CSL. G = (Σ, T, ω), ω = abc, Σ = {a, b, c}, T = {T1, T2}, T1 = {a → aa, b → bb, c → cc}, T2 = {a → [illegible], b → [illegible], c → [illegible]}. T0L

  25. Dyck language: A language of nested parentheses • Many but finite types of parentheses • Matched parentheses • Finite depth of nesting • Context-free language (CFL) • Tree structures, list structures, RNA secondary structures (without pseudoknots), etc.
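
Membership in a Dyck language can be decided with a single stack, which is why the language is context-free and fits a pushdown automaton. The sketch below is illustrative only: the three bracket types and the optional `max_depth` parameter are assumptions chosen for the example.

```python
# Closing bracket -> matching opening bracket (three types, chosen for illustration).
PAIRS = {")": "(", "]": "[", "}": "{"}

def is_dyck(word, max_depth=None):
    """Return True if word is a correctly nested string of brackets.

    The stack plays the role of the pushdown list: push on an opening
    bracket, pop and compare on a closing one.
    """
    stack = []
    for ch in word:
        if ch in PAIRS.values():
            stack.append(ch)
            if max_depth is not None and len(stack) > max_depth:
                return False  # nesting deeper than the allowed bound
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False  # unmatched or wrongly typed closing bracket
        else:
            return False  # symbol outside the alphabet
    return not stack  # every opening bracket must have been closed

print(is_dyck("([]{})"), is_dyck("([)]"), is_dyck("((()))", max_depth=2))
```

The same check applies to an RNA secondary structure without pseudoknots when it is written in dot-bracket form with the unpaired positions removed.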

  26. Factorizable Languages • Symbolic dynamics leads to factorizable languages • A complete genome defines a factorizable language • An amino acid sequence with unique reconstruction (at a certain K) defines a factorizable language • More on factorizable languages in the next lecture

  27. Coarse-Grained Dynamics ↓ Symbolic Dynamics

  28. Graphic iteration of a map

  29. Coarse-Graining in Dynamics • Phase space → L, R • Numerical orbit → symbolic orbit • Many to one correspondence • Possibility for classification
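
The passage from a numerical orbit to a symbolic orbit can be sketched in a few lines. The lecture does not name a specific map at this point, so the example assumes the logistic map f(x) = r x (1 - x) on [0, 1] with critical point C = 0.5; the function names are likewise hypothetical.

```python
def logistic(x, r):
    """One iteration of the logistic map, used here as an assumed unimodal map."""
    return r * x * (1.0 - x)

def symbolic_orbit(x0, r, n):
    """Coarse-grain the first n points of the orbit of x0 into L, C, R.

    Points left of the critical point 0.5 become L, points right of it
    become R, and the critical point itself becomes C.
    """
    symbols = []
    x = x0
    for _ in range(n):
        if x < 0.5:
            symbols.append("L")
        elif x > 0.5:
            symbols.append("R")
        else:
            symbols.append("C")
        x = logistic(x, r)
    return "".join(symbols)

# Different initial points can share one symbolic orbit (many-to-one).
print(symbolic_orbit(0.1, 2.0, 4), symbolic_orbit(0.2, 2.0, 4))
print(symbolic_orbit(0.5, 4.0, 4))
```

In this sketch the orbits of 0.1 and 0.2 at r = 2 give the same symbol string, a small instance of the many-to-one correspondence that makes classification possible.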

  30. Basic properties • Natural order on the interval: L < C < R • Monotonicity: on L, f_L is increasing (↑); on R, f_R is decreasing (↓) • Parity: L is +; R is – • L preserves order, R reverses order • Continuity: L ← C → R • L → L(y) ≡ f_L^{-1}(y) • R → R(y) ≡ f_R^{-1}(y)

  31. Infinitely many numerical orbits. Only two symbolic orbits: L^∞ and RL^∞. Simple dynamics, simple language: 2 word types only

  32. Languages in unimodal maps: 1991 • The Feigenbaum attractor corresponds to a CSL • Are there other CSLs and CFLs? • Periodic and eventually periodic orbits are RGLs • Are there other RGLs?

  33. Periodic orbit (RLRRC)^∞ and Finite State Automaton. x_0 = CRLRRC…, x_1 = RLRRC…, x_2 = LRRC…, x_3 = RRC…, x_4 = RC…

  34. Transformation of subintervals under (RLRRC)^∞. L: a → c + d; R: b → d; R: c → b + c; R: d → a

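  35. Transfer Matrix and Transfer Function. States: a, b, c, d. Input: R, L. Transfer matrix (rows and columns ordered a, b, c, d): $\begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$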
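
The transfer matrix follows directly from the subinterval transformations listed on the previous slide (a → c + d under L; b → d, c → b + c, d → a under R). The sketch below rebuilds it and counts state paths. Ordering rows and columns as a, b, c, d is an assumption, chosen because it reproduces the sixteen entries shown on this slide; the helper names are hypothetical.

```python
STATES = ["a", "b", "c", "d"]

# source interval -> (symbol of the branch, list of image intervals)
TRANSITIONS = {
    "a": ("L", ["c", "d"]),
    "b": ("R", ["d"]),
    "c": ("R", ["b", "c"]),
    "d": ("R", ["a"]),
}

def transfer_matrix():
    """Entry [i][j] is 1 if interval i is mapped onto interval j, else 0."""
    index = {s: i for i, s in enumerate(STATES)}
    matrix = [[0] * len(STATES) for _ in STATES]
    for src, (_symbol, images) in TRANSITIONS.items():
        for dst in images:
            matrix[index[src]][index[dst]] = 1
    return matrix

def count_paths(matrix, length):
    """Number of state paths with `length` transitions (sum of entries of M^length)."""
    size = len(matrix)
    vec = [1] * size  # vec[i] = number of paths of the current length starting at i
    for _ in range(length):
        vec = [sum(matrix[i][j] * vec[j] for j in range(size)) for i in range(size)]
    return sum(vec)

for row in transfer_matrix():
    print(row)
print([count_paths(transfer_matrix(), n) for n in range(6)])
```

The growth rate of these path counts is governed by the largest eigenvalue of the matrix, which is one reason transfer matrices are convenient for studying the associated language.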

  36. Nondeterministic Finite State Automaton for (RLRRC)∞

  37. Subset construction
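
A minimal sketch of the subset construction follows. Since the automaton of the slide is not reproduced in the transcript, the example builds a nondeterministic automaton from the subinterval transformations given earlier (a → c + d under L; b → d, c → b + c, d → a under R). Treating every state as both a start state and an accepting state, and dropping the empty (dead) subset, are assumptions of this example.

```python
from collections import deque

# (state, input symbol) -> set of possible next states
NFA = {
    ("a", "L"): {"c", "d"},
    ("b", "R"): {"d"},
    ("c", "R"): {"b", "c"},
    ("d", "R"): {"a"},
}
ALPHABET = ("L", "R")

def subset_construction(nfa, start_states, alphabet):
    """Build a DFA whose states are sets of NFA states reachable together."""
    start = frozenset(start_states)
    dfa = {}
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            target = frozenset(t for s in current for t in nfa.get((s, symbol), ()))
            if not target:
                continue  # no move possible: the dead state is left implicit
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start, dfa, seen

def accepts(start, dfa, word):
    """A word is accepted if the DFA never runs into the implicit dead state."""
    state = start
    for symbol in word:
        state = dfa.get((state, symbol))
        if state is None:
            return False
    return True

start, dfa, states = subset_construction(NFA, "abcd", ALPHABET)
for (src, symbol), dst in sorted(dfa.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print("".join(sorted(src)), symbol, "->", "".join(sorted(dst)))
```

Each deterministic state is a subset of {a, b, c, d}, so the construction can produce at most 2^4 subsets; in this sketch only a handful are reachable.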

  38. Deterministic Finite State Automaton for (RLRRC)∞

  39. Are there other RGLs in unimodal maps? Theorem (Xie, 1993): In the dynamical languages of unimodal maps the class of RGLs contains only periodic and eventually periodic sequences.

  40. Fibonacci sequences • Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, … • F_0 = 0, F_1 = 1, F_n = F_{n-1} + F_{n-2} • Periodic orbits with period F_n, n = 0, 1, 2, 3, … • How about n → ∞? • There are many different Fibonacci sequences in the unimodal map

  41. How to go beyond RGLs? Block concatenation: b_{2n} = b_{2(n-1)} b_{2n-1}, b_{2n+1} = b_{2n} b_{2n-1} • (a) b_0 = L, b_1 = RR • (b) b_0 = R, b_1 = LR • (c) b_0 = L, b_1 = RL • (d) b_0 = R, b_1 = LL
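
The block concatenation can be tried out directly. The sketch below reads the recurrence on this slide as b_{2n} = b_{2n-2} b_{2n-1} and b_{2n+1} = b_{2n} b_{2n-1}, which is an interpretation of the flattened subscripts. The dictionary `CASES` lists the four pairs of initial blocks, with the fourth labelled (d) as on the later slides.

```python
# Initial blocks (b_0, b_1) for the four cases listed on the slide.
CASES = {
    "a": ("L", "RR"),
    "b": ("R", "LR"),
    "c": ("L", "RL"),
    "d": ("R", "LL"),
}

def blocks(b0, b1, count):
    """Return [b_0, ..., b_{count-1}] built by block concatenation.

    Even index:  b_{2n}   = b_{2n-2} b_{2n-1}
    Odd index:   b_{2n+1} = b_{2n}   b_{2n-1}
    """
    seq = [b0, b1]
    for k in range(2, count):
        if k % 2 == 0:
            seq.append(seq[k - 2] + seq[k - 1])
        else:
            seq.append(seq[k - 1] + seq[k - 2])
    return seq[:count]

for name, (b0, b1) in CASES.items():
    seq = blocks(b0, b1, 6)
    print(name, [len(b) for b in seq], seq[4])
```

With this reading the block lengths obey the Fibonacci recurrence in all four cases, since each new block is the concatenation of the two preceding ones in one order or the other.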

  42. How about (b_n)^∞? • Finite n: must be RGL • Infinite n? The closure at n → ∞ may be non-RGL • Indeed, it is non-RGL • Is it CFL or CSL? • How to comprehend infinite “periodic” orbits? Transfer matrices come to our aid

  43. Transfer matrix for case (a)

  44. Xie Huimin and a PhD student proved 50 lemmas in 40 days, and the last lemma says: case (a) corresponds to a CSL.

  45. Transfer matrix for case (b)

  46. Transfer matrix for case (c)

  47. Transfer matrix for case (d)

  48. It is easier to prove that cases (b), (c), and (d) all correspond to CSLs.

  49. Conjecture: there is no CFL in unimodal maps (Xie, 1996). An open conjecture for 11 years

  50. Dynamical languages in unimodal maps: 1991, 1999
