WebLogo Plus. Sagar Gaikwad and Mohit Agrawal
LTMT.-RGDIGNYLGLTVETISRLLGRFQKLGVL
LTMT.-RGDIGNYLGLTVETISR-----------
LTMT.-RGDIGNYLGLTVETISR-----------
LTMT.-RGDIGNYLGLTVETISR-----------
LTMT.-RGDIGNYLGLTVETISR-----------
LTMT.-RGDIGNYLGLTVETISRLLGRFQKLGVI
LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGLI
LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGML
LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGML
LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGML
LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGML
LTMT.-RGDIGNYLGLTIETISRLLGRFQKSGMI
LTMT.-RGDIGNYLGLTIETISRLLGRFQKSGMI
LTMT.-RGDIGNYLGLTIETISRLLGRFQKSGMI
LTMT.-RGDIGNYLGLTVETISRLL
LPLT.-RADISDFLGLTNETVSRQLTRLRADGVI
LPLT.-RADIADFLGLTIETVSRQLTRLRTDGLI
LPLS.-RAEIADFLGLTIETVSRKLTKLRKSGVI
LPLS.-RAEIADFLGLTIETVSRQLTRLRKEGVI
LPLS.-RAEIADFLGLTIETVSRQMTRLRKWGVI
LPLS.-RAEIADFLGLTIETVSRQMTRLRKSGVI
LPLS.-RAEIADFLGLTIETVSRQMTRLRKIGVI
Background - WebLogo • A UC Berkeley project • What is a sequence logo? • Generates sequence logos • Accepts input entered manually or in FASTA/CLUSTAL format • Reference: http://weblogo.berkeley.edu/
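WebLogo • Different residues at the same position are scaled according to their frequency. • Sequence conservation at each position: $R_{seq} = S_{max} - S_{obs} = \log_2 N - \left(-\sum_{n=1}^{N} p_n \log_2 p_n\right)$ • Where $R_{seq}$: sequence conservation at a particular position in the alignment • $n$: symbol (such as A, G, T, C for DNA) • $p_n$: observed frequency of symbol $n$ at that position • $N$: number of distinct symbols (4 for DNA/RNA, 20 for protein sequences) • $S_{max}$: maximum possible entropy • $S_{obs}$: entropy of the observed symbol distribution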
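The conservation measure on this slide can be sketched in a few lines. The following is a minimal illustration, not the WebLogo implementation: it assumes the standard form $R_{seq} = S_{max} - S_{obs}$ with $S_{max} = \log_2 N$, treats `-` and `.` as gaps to be ignored, and omits the small-sample correction described by Schneider and Stephens. The function names `column_information` and `symbol_heights` are hypothetical, and Python is used only for brevity.

```python
import math
from collections import Counter


def column_information(column, alphabet_size, gap_chars="-."):
    """Return R_seq = S_max - S_obs (in bits) for one alignment column.

    Assumption: gap characters are dropped before counting, and no
    small-sample correction is applied.
    """
    symbols = [c for c in column if c not in gap_chars]
    if not symbols:
        return 0.0
    total = len(symbols)
    counts = Counter(symbols)
    # S_obs: Shannon entropy of the observed symbol distribution.
    s_obs = -sum((k / total) * math.log2(k / total) for k in counts.values())
    # S_max: entropy of a uniform distribution over the alphabet.
    s_max = math.log2(alphabet_size)
    return s_max - s_obs


def symbol_heights(column, alphabet_size, gap_chars="-."):
    """Return each symbol's height: its relative frequency times R_seq."""
    symbols = [c for c in column if c not in gap_chars]
    if not symbols:
        return {}
    total = len(symbols)
    r_seq = column_information(column, alphabet_size, gap_chars)
    return {s: (k / total) * r_seq for s, k in Counter(symbols).items()}
```

For a DNA column (`alphabet_size=4`), a fully conserved column such as `"AAAA"` gives 2 bits, while a column with all four bases equally represented gives 0 bits.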
Advantages • Can rapidly reveal significant features of an alignment that are otherwise difficult to perceive • Example: helps interpret the sequence-specific binding of the protein CAP to its DNA recognition site • Works for DNA, RNA, and protein logos • Can illuminate patterns of amino acid conservation that are often of structural or functional importance • Open source
Applications • Displaying transcription factor binding sites (TFBS) • Motif discovery • Sequence scanning
Drawbacks of WebLogo • Does not show correlations between different positions of the alignment • Not interactive • Infrequent characters are hard to spot
What is Nested WebLogo? • Transcription factors have positional dependencies • What is a positional dependency? • Nesting of WebLogos based on positional dependencies
Example • AGTCTACC • AGTCCACG • ATGCTACG • TAGTTTCG • ATGCTAGG • ATGTAACT • Wild card: T.* • Position set: 2, 4
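One way to read the nesting idea on the example sequences is to split the alignment by the symbol found at one position and then summarize a second position separately within each group. The sketch below follows that reading and is an assumption, not the authors' algorithm: the slide does not define how the wild card or the position set are applied, so the choice of positions 2 and 4 as "parent" and "child" is only illustrative, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# The six example sequences from the slide.
SEQUENCES = [
    "AGTCTACC",
    "AGTCCACG",
    "ATGCTACG",
    "TAGTTTCG",
    "ATGCTAGG",
    "ATGTAACT",
]


def split_by_position(sequences, position):
    """Group sequences by the symbol at a 1-based position."""
    groups = defaultdict(list)
    for seq in sequences:
        groups[seq[position - 1]].append(seq)
    return dict(groups)


def column_frequencies(sequences, position):
    """Relative frequency of each symbol at a 1-based position."""
    counts = Counter(seq[position - 1] for seq in sequences)
    total = sum(counts.values())
    return {symbol: k / total for symbol, k in counts.items()}


def nested_frequencies(sequences, parent_pos, child_pos):
    """For each symbol at parent_pos, the frequencies at child_pos
    among only the sequences carrying that symbol (a 'nested' column)."""
    groups = split_by_position(sequences, parent_pos)
    return {
        symbol: column_frequencies(group, child_pos)
        for symbol, group in groups.items()
    }


nested = nested_frequencies(SEQUENCES, parent_pos=2, child_pos=4)
```

In this data, every sequence with G at position 2 has C at position 4, whereas sequences with T at position 2 show a mix of C and T. A single logo column for position 4 cannot show that distinction, which is the kind of dependency a nested logo is meant to expose.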
Heat Map • What is a heat map? • Advantages: improves readability
UI Flow (diagram components) • FASTA File Reader • WebLogo Drawer • WebLogo Creator • Graphics Display • Position Dependency Reader
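The first stage of the flow, reading a FASTA file into alignment columns, might look like the sketch below. This is a hedged illustration only: the project itself is described as a Java implementation, Python is used here for brevity, and the names `parse_fasta` and `alignment_columns` are hypothetical. It assumes plain FASTA text in which each record starts with a `>` header line and all sequences are already aligned to the same length.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def alignment_columns(records):
    """Transpose aligned sequences into per-position column strings."""
    sequences = [seq for _, seq in records]
    if len({len(seq) for seq in sequences}) > 1:
        raise ValueError("sequences must all have the same aligned length")
    return ["".join(column) for column in zip(*sequences)]
```

A logo creator stage could then work on one column string at a time, for example by counting symbol frequencies per column.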
Our contribution • No open source Java implementation of WebLogo was available • Implementation of a graphical WebLogo display in Java • Interactive: zoom in and zoom out for clear visibility • Heat Maps • Nested Logos • 3D Heat Maps*
References • Crooks GE, Hon G, Chandonia JM, Brenner SE. WebLogo: A sequence logo generator. Genome Research 14:1188-1190 (2004) • Schneider TD, Stephens RM. Sequence Logos: A New Way to Display Consensus Sequences. Nucleic Acids Res. 18:6097-6100 (1990) • http://weblogo.berkeley.edu/ • Paulo G. S. da Fonseca, Katia S. Guimarães, and Marie-France Sagot. Efficient representation and P-value computation for high-order Markov motifs • Jun Xie and Nak-Kyeong Kim. Bayesian Models and Markov Chain Monte Carlo Methods for Protein Motifs with the Secondary Characteristics