Explorations of Multidimensional Sequence Space

xerxes

Presentation Transcript


  1. Explorations of Multidimensional Sequence Space

  2. One symbol -> 1D; coordinate of dimension = pattern length

  3. Two symbols -> dimension = length of pattern. Length 1 = 1D:

  4. Two symbols -> dimension = length of pattern. Length 2 = 2D: dimensions correspond to position. For each dimension, two possibilities. Note: here is a possible bifurcation: a larger alphabet could be represented as more choices along the axis of position!

  5. Two symbols -> dimension = length of pattern. Length 3 = 3D:

  6. Two symbols -> dimension = length of pattern. Length 4 = 4D, aka hypercube

  7. Two symbols -> dimension = length of pattern
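The two-symbol slides above can be sketched in code: every binary pattern of length n is one corner of an n-dimensional cube, and two corners share an edge when the patterns differ at exactly one position. The function name `hypercube` and the symbols `0`/`1` are illustrative assumptions, not taken from the slides.

```python
from itertools import product

def hypercube(length, alphabet="01"):
    """Return (vertices, edges) for all two-symbol patterns of a given length.

    Each position in the pattern is one dimension; each vertex is one pattern.
    Two vertices are joined when they differ at exactly one position.
    """
    vertices = ["".join(p) for p in product(alphabet, repeat=length)]
    edges = [
        (a, b)
        for i, a in enumerate(vertices)
        for b in vertices[i + 1:]
        if sum(x != y for x, y in zip(a, b)) == 1
    ]
    return vertices, edges

for n in (1, 2, 3, 4):
    v, e = hypercube(n)
    print(n, len(v), len(e))
```

For length 4 this gives the 16 corners and 32 edges of the 4D hypercube.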

  8. Three Symbols (another solution is to use more values for each dimension)

  9. Four symbols, i.e., with an alphabet of 4, we have a hypercube (4D) already with a pattern size of 2, provided we stick to a binary pattern in each dimension.

  10. Hypercubes at 2 and 4 alphabets: 2 character alphabet, pattern size 4; 4 character alphabet, pattern size 2
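One possible reading of the two hypercube slides above: if each of the four symbols is written as two binary digits, a pattern of size 2 occupies the same 16 corners as a binary pattern of size 4. The letters `ACGT` and the particular bit assignment in `BITS` are assumptions made for this sketch; the slides do not name the symbols or the encoding.

```python
from itertools import product

# Assumed two-bit code for a hypothetical four-symbol alphabet.
BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}

def to_vertex(pattern):
    """Map a four-symbol pattern to a binary hypercube corner (2 bits per symbol)."""
    return "".join(BITS[s] for s in pattern)

four_symbol = {to_vertex("".join(p)) for p in product("ACGT", repeat=2)}
two_symbol = {"".join(p) for p in product("01", repeat=4)}

print(len(four_symbol), four_symbol == two_symbol)
```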

  11. Three Symbols Alphabet suggests fractal representation

  12. 3 fractal: enlarge, fill in. Outer pattern repeats inner pattern = self-similar = fractal

  13. 3 character alphabet, 3 pattern fractal

  14. 3 character alphabet, 4 pattern fractal. Conjecture: for n -> infinity, the fractal might fill a 2D triangle. Note: check Mandelbrot

  15. Same for 4 character alphabet: 1 position, 2 positions, 3 positions
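A minimal sketch of how such a self-similar layout for a three-character alphabet could be generated. It assumes each symbol is assigned one corner of a triangle and each further position is placed at half the scale of the previous one; the halving factor, the corner coordinates, and the symbols `a`, `b`, `c` are assumptions, not the construction used for the slides.

```python
from itertools import product
from math import sqrt

# Assumed corner assignment for a hypothetical three-symbol alphabet.
CORNERS = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.5, sqrt(3) / 2)}

def place(pattern):
    """Position of a pattern: position i contributes its corner scaled by 1/2**i."""
    x = y = 0.0
    for i, symbol in enumerate(pattern, start=1):
        cx, cy = CORNERS[symbol]
        x += cx / 2 ** i
        y += cy / 2 ** i
    return (x, y)

def layout(length):
    """Map every pattern of the given length to its 2D point."""
    return {"".join(p): place(p) for p in product("abc", repeat=length)}

points = layout(3)
print(len(points), len(set(points.values())))
```

With a factor of 1/2 the points follow a Sierpinski-style arrangement; a different scale factor would change how densely the triangle is covered.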

  16. 4 character alphabet continued (with cheating: I didn’t actually add beads): 4 positions

  17. 4 character alphabet continued (with cheating: I didn’t actually add beads): 5 positions

  18. 4 character alphabet continued (with cheating: I didn’t actually add beads): 6 positions

  19. 4 character alphabet continued (with cheating: I didn’t actually add beads): 7 positions

  20. Animated GIF: 1-12 positions

  21. Protein Space in Jalview

  22. Alignment of V F A ATPase ATP binding SU (catalytic and non-catalytic SU)

  23. UPGMA tree of V F A ATPase ATP binding SU with line dropped to partition (and colour) the 4 SU types (VA cat and non cat, F cat and non cat). Note that details of the tree $%#&@.

  24. PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree

  25. Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but turned slightly. (Giardia A SU selected in grey.)

  26. Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but replacing the 1st with the 5th axis. (Eukaryotic A SU selected in grey.)

  27. Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but replacing the 1st with the 6th axis. (Eukaryotic B SU selected in grey - forgot rice.)

  28. Problems
      • Jalview’s approach requires an alignment.
      • Solution: use pattern absence / presence as coordinate
      • Which patterns?
          • GBLOCKS (new additions use PSSMs)
          • CDD PSSM profiles
      • It would be nice to stick to small words.
      • One could screen for words/motifs/PSSMs that have a good power of resolution:
          • PCA with all, choose only the ones that contribute to the main axis
          • Probably better to do a data bank search and find how often each one is present. One could generate random motifs (or all possible motifs) and check them out (criterion needs work).
      • Empirical orthogonality
      • Exhaustive vs random
      • How to judge discriminatory power (maybe 5% significance value)
      • Presence / absence - optimal discriminatory power?
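The alignment-free idea in the last slide can be sketched as follows: each short word becomes one coordinate, set to 1 if the word occurs in a sequence and 0 otherwise, and words that occur in every sequence or in none are dropped because they cannot separate anything. The three sequences are made-up toy data, and the screening rule is only a crude stand-in for the criterion the slide says still needs work.

```python
from itertools import product

# Made-up toy sequences, for illustration only.
seqs = {"s1": "GAGTGK", "s2": "GAGSGK", "s3": "PLVTGK"}

def informative_words(sequences, word_len=2):
    """All words over the observed letters that occur in some but not all sequences."""
    letters = sorted(set("".join(sequences)))
    kept = []
    for p in product(letters, repeat=word_len):
        word = "".join(p)
        hits = sum(word in s for s in sequences)
        if 0 < hits < len(sequences):
            kept.append(word)
    return kept

def presence_vector(seq, words):
    """Coordinates of a sequence: 1 where the word is present, 0 where absent."""
    return [1 if w in seq else 0 for w in words]

def hamming(u, v):
    """Number of coordinates at which two presence/absence vectors differ."""
    return sum(a != b for a, b in zip(u, v))

words = informative_words(list(seqs.values()))
vectors = {name: presence_vector(s, words) for name, s in seqs.items()}
print(words)
print(hamming(vectors["s1"], vectors["s2"]), hamming(vectors["s1"], vectors["s3"]))
```

The resulting 0/1 vectors could then be passed to a PCA in place of alignment-derived distances.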
