
Motif discovery


sidney

Presentation Transcript


  1. Tutorial 5 Motif discovery

  2. Agenda • Motif discovery • MEME: creates a motif PSSM de novo (unknown motif) • MAST: searches for a PSSM in a sequence DB • TOMTOM: searches for a PSSM in motif DBs • Cool story of the day: How NOT to be a bioinformatician

  3. Motif – definition • Motif: a widespread pattern with biological significance. • Sequence motif examples: PTB (RNA-binding protein): UCUU • CAP (DNA-binding protein): TGTGAXXXXXXTCACAXT

  4. Sequence motif – definition • Motif: a nucleotide or amino-acid sequence pattern that is widespread and has biological significance. • PSSM (position-specific scoring matrix): a per-position scoring profile built from aligned occurrences of the motif, e.g. ..YDEEGGDAEE.. ..YDEEGGDAEE.. ..YGEEGADYED.. ..YDEEGADYEE.. ..YNDEGDDYEE.. ..YHDEGAADEE..
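
The six aligned sites on this slide are enough to build a small PSSM by hand. The sketch below is one common way to do it, not necessarily what MEME does internally: count residues per column, add a pseudocount, and take log-odds against a background. The uniform background, the pseudocount of 1, and the function names `build_pssm` and `score` are assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Aligned motif occurrences copied from the slide.
SITES = [
    "YDEEGGDAEE",
    "YDEEGGDAEE",
    "YGEEGADYED",
    "YDEEGADYEE",
    "YNDEGDDYEE",
    "YHDEGAADEE",
]


def build_pssm(sites, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column.

    Assumes a uniform background distribution (an illustrative choice).
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    background = 1.0 / len(alphabet)
    total = len(sites) + pseudocount * len(alphabet)
    pssm = []
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        column = {}
        for residue in alphabet:
            freq = (counts[residue] + pseudocount) / total
            column[residue] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, window):
    """Sum the per-position scores of a window as wide as the PSSM."""
    return sum(column[residue] for column, residue in zip(pssm, window))


pssm = build_pssm(SITES)
print(round(score(pssm, "YDEEGGDAEE"), 2))
```

Residues that dominate a column (Y at the first position, G at the fifth) get positive scores, and residues never seen there get negative ones, so a window resembling the aligned sites scores higher than an unrelated one.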

  5. Can we find motifs using multiple sequence alignment (MSA)? • YES, in principle • NO, in practice: local multiple sequence alignment is a hard problem to solve

  6. Motif search: from de-novo motifs to motif annotation gapped motifs Large DNA data http://meme.sdsc.edu/

  7. MEME

  8. MEME – Multiple EM* for Motif Elicitation http://meme.sdsc.edu/ • Motif discovery from unaligned sequences (genomic or protein) • Flexible model of motif presence (a motif can be absent from some sequences or appear several times in one sequence) *EM = expectation-maximization

  9. MEME - Input • Input file (FASTA file) • How many times in each sequence? • Range of motif lengths • How many motifs? • How many sites?

  10. MEME - Output Motif e-value

  11. MEME – Sequence logo • A graphical representation of the sequence motif • Motif e-value • Motif length • Number of appearances

  12. MEME – Sequence logo • High information content = high confidence • The relative sizes of the letters indicate their frequency in the sequences • The total height of the letters at a position depicts the information content of that position, in bits.
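
The two logo rules above can be made concrete with a few lines of arithmetic. This is a minimal sketch of the usual definition (column information content = log2 of the alphabet size minus the Shannon entropy of the column), without the small-sample correction that logo tools often apply. The function names are illustrative.

```python
import math
from collections import Counter


def information_content(column, alphabet_size):
    """Information content of one alignment column, in bits.

    Computed as log2(alphabet_size) minus the column's Shannon entropy;
    no small-sample correction is applied in this sketch.
    """
    n = len(column)
    counts = Counter(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def letter_heights(column, alphabet_size):
    """Height of each letter in the logo stack: frequency times column IC."""
    n = len(column)
    ic = information_content(column, alphabet_size)
    return {letter: (c / n) * ic for letter, c in Counter(column).items()}


# A fully conserved DNA column carries the maximum of 2 bits;
# a column with all four bases equally frequent carries 0 bits.
print(information_content("AAAA", 4))
print(information_content("ACGT", 4))
print(letter_heights("AAAT", 4))
```

For protein motifs the maximum is log2(20), roughly 4.32 bits, which is why protein logos have a taller y-axis than DNA logos.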

  13. MEME – Sequence logo Multilevel Consensus

  14. Patterns can be presented as regular expressions: [AG]-x-V-x(2)-{YW} • [] - Any one of the listed residues • x - Any residue • x(2) - Any residue in each of the next 2 positions • {} - Any residue except these • Examples: AYVACM, GGVGAA
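
The pattern notation on this slide maps almost one-to-one onto the syntax of Python's `re` module. The converter below is a minimal sketch that handles only the four constructs listed on the slide (it does not cover ranges such as x(2,4) or terminus anchors); `prosite_to_regex` is a hypothetical helper name.

```python
import re

# One pattern element: [..], {..}, a single residue, or x,
# optionally followed by a repeat count such as (2).
ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)\))?")


def prosite_to_regex(pattern):
    """Translate a dash-separated pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, repeat = match.groups()
        if core == "x":
            regex = "."                       # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"   # any residue except these
        else:
            regex = core                      # [..] or a literal residue
        if repeat:
            regex += "{" + repeat + "}"
        parts.append(regex)
    return "".join(parts)


regex = prosite_to_regex("[AG]-x-V-x(2)-{YW}")
print(regex)
for candidate in ("AYVACM", "GGVGAA", "AYVACW"):
    print(candidate, bool(re.fullmatch(regex, candidate)))
```

Both examples from the slide match, while AYVACW is rejected because its last residue is one of the excluded residues in {YW}.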

  15. MEME – motif alignment • Sequence names • Position in sequence • Strength of match • Motif within sequence

  16. MEME – motif locations • Sequence names • Motif location in the input sequence • Overall strength of motif matches

  17. What can we do with motifs? • MAST - Search for them in unannotated sequence databases (protein and DNA). • TOMTOM - Find the protein that binds the DNA motif.

  18. MAST

  19. MAST http://meme.sdsc.edu/meme4_4_0/cgi-bin/mast.cgi • Searches for one or more motifs in sequence databases: • Like BLAST, but with motifs as input • Similar to iterations of PSI-BLAST • The profile defines the strength of a match • Multiple motif matches per sequence are allowed • MEME uses MAST to summarize results: • Each MEME result is accompanied by the MAST result of searching for the discovered motifs in the given sequences.
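
The core idea of searching a sequence with a profile can be sketched as a sliding-window scan. This is a simplified illustration rather than MAST's actual procedure (MAST converts match scores to p-values and combines them per sequence). The toy DNA matrix `TOY_PSSM`, the threshold, and the function name `scan` are made up for the example.

```python
# Hypothetical 3-column PSSM that favours the site "TGA".
TOY_PSSM = [
    {"A": -2, "C": -2, "G": -2, "T": 2},
    {"A": -2, "C": -2, "G": 2, "T": -2},
    {"A": 2, "C": -2, "G": -2, "T": -2},
]


def scan(sequence, pssm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        window_score = sum(col[base] for col, base in zip(pssm, window))
        if window_score >= threshold:
            hits.append((start, window, window_score))
    return hits


hits = scan("CCTGATTTGACC", TOY_PSSM, threshold=6)
for start, window, window_score in hits:
    print(start, window, window_score)
```

Because every window is scored independently, one sequence can yield several hits, which corresponds to the "multiple motif matches per sequence" point on the slide.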

  20. MAST - Input Database Input file (motifs)

  21. If you wish to use motifs discovered by MEME

  22. MAST - Output Input motifs Presence of the motifs in a given database

  23. MAST – Output (another example, global view)

  24. MAST – Output (another example, global view)

  25. TOMTOM

  26. TOMTOM http://meme.sdsc.edu/meme/doc/tomtom.html • Searches one or more query DNA motifs against one or more databases of target motifs, and reports for each query a list of target motifs, ranked by p-value. • The output contains results for each query, in the order that the queries appear in the input file.
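
Comparing a query motif against a target motif amounts to sliding one matrix along the other and scoring the overlapping columns. The sketch below uses the mean Euclidean distance between frequency columns and returns the best ungapped offset. It illustrates the idea only: TOMTOM itself turns such column scores into p-values before ranking, and the names `column_distance`, `best_alignment` and `min_overlap` are assumptions for this example.

```python
import math


def column_distance(col_a, col_b):
    """Euclidean distance between two frequency columns (A, C, G, T)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(col_a, col_b)))


def best_alignment(query, target, min_overlap=3):
    """Return (offset, mean_distance) of the best ungapped alignment.

    offset is the target column aligned with the first query column;
    it can be negative when the query overhangs the start of the target.
    """
    best = None
    first = -(len(query) - min_overlap)
    last = len(target) - min_overlap
    for offset in range(first, last + 1):
        pairs = [
            (query[i], target[i + offset])
            for i in range(len(query))
            if 0 <= i + offset < len(target)
        ]
        if len(pairs) < min_overlap:
            continue
        mean = sum(column_distance(a, b) for a, b in pairs) / len(pairs)
        if best is None or mean < best[1]:
            best = (offset, mean)
    return best


# Toy motifs with fully conserved columns, ordered (A, C, G, T).
A, C, G, T = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
QUERY = [T, G, A]
TARGET = [C, T, G, A, C]
print(best_alignment(QUERY, TARGET))
```

Here the query TGA lines up exactly with columns 1 to 3 of the target, so the best offset is 1 with a distance of zero.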

  27. TOMTOM - Input • Input motif • Background frequencies • Database

  28. TOMTOM - Output • Input motif • Matching motifs

  29. TOMTOM – Output • Wrong input (RNA sequence of the RNA-binding protein NOVA1) • “OK” results

  30. MAST vs. TOMTOM

  31. Cool Story of the day How NOT to be a bioinformatician
