Transformational Grammars and PROSITE Patterns

Transformational Grammars and PROSITE Patterns. Roland Miezianko CIS 595 - Bioinformatics Prof. Vucetic. Agenda. Transformational Grammars Definition The Chomsky Hierarchy Finite State Automata FMR-1 Triplet Repeat Region Regular Grammar Example PROSITE Patterns in Regular Grammar Form.


Presentation Transcript


  1. Transformational Grammars and PROSITE Patterns • Roland Miezianko • CIS 595 - Bioinformatics • Prof. Vucetic

  2. Agenda • Transformational Grammars • Definition • The Chomsky Hierarchy • Finite State Automata • FMR-1 Triplet Repeat Region • Regular Grammar Example • PROSITE • Patterns in Regular Grammar Form

  3. Assumptions • Biological sequences have so far been treated as one-dimensional strings of independent, uncorrelated symbols. • Need to address interactions among base pairs to understand secondary structure.

  4. Secondary Structures • The 3-D folding of proteins and nucleic acids involves extensive physical interactions between residues that are not adjacent in primary sequence. [1] • Requires a model of secondary structure that reflects the interactions among base pairs.

  5. Modeling Strings • General theories for modeling strings of symbols have been developed by computational linguists • Chomsky in 1956, 1959 • Interested in how a brain or computer program could algorithmically determine whether a sentence is grammatical or not

  6. Transformational Grammars • Transformational Grammars consist of: • Symbols • Abstract Nonterminal Symbols • Terminal Symbols • Rewriting Rules (Productions) • A --> B

  7. Transformational Grammars, Example • Two-letter terminal alphabet: {a, b} • Single nonterminal letter: S • Three productions: S->aS, S->bS, S->e (e = special blank terminal symbol) • Example derivation in our simple grammar: S->aS->abS->abbS->abb
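
The derivation on this slide can be reproduced mechanically. The following is a minimal sketch in Python, not part of the original slides; the names `PRODUCTIONS` and `derive` are illustrative, and the empty string stands in for the blank terminal e.

```python
# Productions of the example grammar; "" stands in for the blank terminal e.
PRODUCTIONS = {"S": ["aS", "bS", ""]}


def derive(target):
    """Return the derivation steps from S to target, or None if target
    cannot be generated by the grammar."""
    current = "S"
    steps = [current]
    for ch in target:
        rhs = ch + "S"
        if rhs not in PRODUCTIONS["S"]:
            return None  # no production emits this symbol
        # Rewrite the single nonterminal S using S -> aS or S -> bS.
        current = current.replace("S", rhs)
        steps.append(current)
    # Finish with S -> e, which erases the nonterminal.
    current = current.replace("S", "")
    steps.append(current)
    return steps


print("->".join(derive("abb")))  # prints S->aS->abS->abbS->abb
```

Because every production keeps the nonterminal at the right end of the string, the grammar can be applied left to right with no backtracking, which is the property that makes it regular.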

  8. Chomsky Hierarchy • Four types of restrictions on a grammar’s productions result in four classes of grammars. • Regular Grammars • Context-Free Grammars • Context-Sensitive Grammars • Unrestricted Grammars

  9. Chomsky Hierarchy • Nested classes, from most to least general: unrestricted ⊃ context-sensitive ⊃ context-free ⊃ regular

  10. Automata • Each grammar has a corresponding abstract computational device, called an automaton • Grammar: Parsing Automaton • Regular: Finite State • Context-Free: Push-Down • Context-Sensitive: Linear Bounded • Unrestricted: Turing Machine

  11. FMR-1 Triplet Repeat Region • The FMR-1 gene sequence contains the triplet CGG, which is repeated a number of times • The number of triplets is highly variable between individuals • Increased copy number is associated with a genetic disease

  12. FMR-1 Triplet Repeat Region • The FSA matches any string in the “language” that contains the strings: • GCG CTG • GCG CGG CTG • GCG CGG CGG CTG • GCG CGG CGG CGG CGG … CTG

  13. FMR-1 Triplet Repeat Region

  14. FMR-1 Triplet Repeat Region • The regular grammar for our finite state automaton finds any number of copies of CGG
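
The automaton diagram did not survive in this transcript, so the following Python sketch is only one possible reconstruction from the example strings on slide 12. The state names (`start`, `repeat`, `end`) and the choice to read the sequence one triplet at a time are assumptions, not the slide's own automaton.

```python
# Transition table of a small finite state automaton that reads whole triplets.
# State names are illustrative; only the accepted language follows the slides.
TRANSITIONS = {
    ("start", "GCG"): "repeat",   # leading GCG
    ("repeat", "CGG"): "repeat",  # any number of CGG copies (including zero)
    ("repeat", "CTG"): "end",     # closing CTG
}


def accepts(seq):
    """Return True if seq is GCG, then zero or more CGG, then CTG."""
    seq = seq.replace(" ", "")
    if len(seq) % 3 != 0:
        return False
    state = "start"
    for i in range(0, len(seq), 3):
        state = TRANSITIONS.get((state, seq[i:i + 3]))
        if state is None:
            return False  # no transition defined: reject
    return state == "end"


print(accepts("GCG CGG CGG CTG"))  # True
print(accepts("GCG CGG"))          # False
```

The same language is described by the regular expression `GCG(CGG)*CTG`, which reflects the correspondence between regular grammars and finite state automata.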

  15. PROSITE Patterns • PROSITE database is an example of a biological application of regular grammars • Unlike methods which assign scores to alignments, PROSITE patterns either match a sequence or do not.

  16. PROSITE Patterns • A pattern consists of a string of pattern elements separated by dashes and terminated by a period • Pattern element – a single letter • [ ] – any one of the enclosed letters • { } – any letter except the enclosed letters • x – any residue can occur • x(y) – any residue, repeated y times
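
Because each pattern element has a direct regular-expression counterpart, the elements listed on this slide can be translated mechanically. The sketch below is an illustration only: `prosite_to_regex` is a hypothetical helper, it handles just the elements named above, and it assumes a well-formed pattern. Other PROSITE syntax, such as ranges or anchors, is not covered.

```python
import re


def prosite_to_regex(pattern):
    """Translate the PROSITE elements listed on the slide into a Python
    regular expression. Assumes a well-formed pattern."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split off an optional repeat count, e.g. x(3) -> ("x", "3").
        m = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element)
        core, count = m.group(1), m.group(2)
        if core in ("x", "X"):
            regex = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            regex = "[^" + core[1:-1] + "]"   # anything but the enclosed letters
        else:
            regex = core                      # single letter or [ ] class
        if count:
            regex += "{" + count + "}"
        parts.append(regex)
    return "".join(parts)


print(prosite_to_regex("A-x(2)-{DE}-[ST]."))  # prints A.{2}[^DE][ST]
```

The input pattern in the usage line is made up for demonstration and is not a real PROSITE entry.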

  17. PROSITE Patterns RNP-1 Motif [RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYM].
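
As a rough illustration of the match-or-no-match behaviour, the RNP-1 pattern can be written by hand as a Python regular expression and searched for in a sequence. The helper name `find_rnp1` and both test peptides are made up for this sketch; they are not taken from the slides or from any database.

```python
import re

# Hand translation of [RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYM].
RNP1 = re.compile(r"[RK]G[^EDRKHPCG][AGSCI][FY][LIVA].[FYM]")


def find_rnp1(seq):
    """Return (start index, matched residues) for the first RNP-1 hit, or None."""
    m = RNP1.search(seq)
    return (m.start(), m.group()) if m else None


# Hypothetical peptides, chosen only to exercise the pattern.
print(find_rnp1("MSAKGFGFVTFEN"))  # (3, 'KGFGFVTF')
print(find_rnp1("KGEGFVTF"))       # None: E is excluded at the third position
```

There is no score here: a sequence either contains a substring accepted by the pattern or it does not.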

  18. Conclusion • Transformational grammars are useful for developing acceptors for sequences of varying length and for matching specific multi-sequence regions. • Higher-order grammars in the Chomsky hierarchy are more difficult to program and apply.

  19. References • [1] Durbin, R. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998. • [2] Gibson, G. A Primer of Genome Science. Sinauer Associates, Inc. Publishers, 2002. • [3] Mount, D. Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press, 2001. • [4] PROSITE Database, http://us.expasy.org/prosite/

  20. Transformational Grammars and PROSITE Patterns • Questions and Answers
