Phylogenetic Tree Construction



  1. Phylogenetic Tree Construction. Mark Eldridge, Andrew Larsen, Michael Lollis, Thomas Marley, Michael Smith

  2. Intro page (overview of talk):
     • Tom: Intro to the topic.
     • Andrew: Reading in objects from a FASTA file and MUSCLE compare.
     • Mike S.: Getting the Matrix
     • Mark: Determining the Matrix
     • Mike L.: Building the Tree
     • Conclusion: some examples of our program in action.
     • Q & A

  3. We set out to... Turn this:

     Label   Sequence
     A       GATTCCAG
     B       GATTCTGG
     C       GGTTCCGG
     D       GGTTTCGG
     E       GGCTCCGA
     F       GGCCCCGG

     into this: [figure: phylogenetic tree]

  4. How? UPGMA: Unweighted Pair Group Method with Arithmetic Mean
     1. Construct distance matrix (pairwise between groups)
     2. Merge two closest groups
     3. Repeat steps 1 and 2 until only two groups remain
     • Note: distances for merged groups are calculated by taking the arithmetic mean of distances for all members
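
The steps on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the presenters' program: the function name `upgma`, the `frozenset`-keyed distance dictionary, and the nested-tuple tree are all assumptions, and the loop also joins the last two groups so that a single rooted tree comes back. The distances are the ones shown in the first matrix on slide 7.

```python
from itertools import combinations


def upgma(labels, dist):
    """Cluster labels with UPGMA; dist maps frozenset({a, b}) -> distance."""
    # Each cluster is (tree, members): tree is a nested tuple of labels,
    # members is a flat tuple of the leaf labels it contains.
    clusters = [(label, (label,)) for label in labels]

    def cluster_distance(c1, c2):
        # Arithmetic mean of the distances over all cross-cluster leaf pairs.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the two closest clusters and merge them.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Pairwise distances taken from the first matrix on slide 7.
raw = {"AB": 4, "AC": 4, "BC": 3, "AD": 2, "BD": 1,
       "CD": 2, "AE": 3, "BE": 4, "CE": 5, "DE": 3}
dist = {frozenset(pair): value for pair, value in raw.items()}
tree = upgma(["A", "B", "C", "D", "E"], dist)
print(tree)
```

The first two merges (B with D, then BD with C) agree with the steps worked through on slide 7.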

  5. FASTA file and MUSCLE compare
     • Format, standards, and lots of data...
     • We figured out how to read in "SeqIO objects"
     • Now that we have the objects, what do we do with them?
     • MUSCLE power.
     • So now what do we have?
     • A pretty ideal way to access a semi-large dataset.
     • We normalized the data for later functions and computing.
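
To make the reading step concrete, here is a small stand-in written in plain Python. The talk read records as "SeqIO objects" and aligned them with MUSCLE; this sketch does neither. It only parses FASTA text into (ID, sequence) pairs, and the helper name `read_fasta` is hypothetical. Upper-casing the sequence is an assumed example of normalization, since the slide does not say what normalization was applied.

```python
import io


def read_fasta(handle):
    """Yield (record_id, sequence) pairs from an open FASTA text handle."""
    record_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, "".join(chunks)
            header = line[1:].split()
            # The ID is the first whitespace-delimited token after '>'.
            record_id, chunks = (header[0] if header else ""), []
        else:
            # Assumed normalization: store sequence letters in upper case.
            chunks.append(line.upper())
    if record_id is not None:
        yield record_id, "".join(chunks)


# Two records; the first sequence is wrapped over two lines.
example = io.StringIO(">A demo record\nGATT\nCCAG\n>B\ngattctgg\n")
records = list(read_fasta(example))
print(records)
```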

  6. Getting the Matrix
     • We have an object with an ID to identify the gene, and the sequence
     • MUSCLE has already aligned the sequences to be the same length
     • Compare function does a character-to-character compare of similarities
     • Using NumPy, we created a matrix and filled it with the first run of comparisons
     • It was then in a format for successive similarity calls
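
A character-to-character compare and the first matrix fill might look like the sketch below. Several things here are assumptions: the function names, the use of nested Python lists in place of the NumPy array the team used, and the choice to count mismatches (a distance) instead of matches. The lower-triangular layout with -1 in unused cells follows the matrices shown on slide 7, and the sequences are the six from slide 3.

```python
def mismatch_count(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def distance_matrix(sequences):
    """Lower-triangular matrix of mismatch counts; unused cells hold -1."""
    n = len(sequences)
    matrix = [[-1] * n for _ in range(n)]
    for row in range(n):
        for col in range(row):
            matrix[row][col] = mismatch_count(sequences[row], sequences[col])
    return matrix


labels = ["A", "B", "C", "D", "E", "F"]
sequences = ["GATTCCAG", "GATTCTGG", "GGTTCCGG",
             "GGTTTCGG", "GGCTCCGA", "GGCCCCGG"]
matrix = distance_matrix(sequences)
for label, row in zip(labels, matrix):
    print(label, row)
```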

  7. Recursive Function to Determine Next Matrix

     First Matrix
            A     B     C     D     E
     A     -1    -1    -1    -1    -1
     B      4    -1    -1    -1    -1
     C      4     3    -1    -1    -1
     D      2     1     2    -1    -1
     E      3     4     5     3    -1

     First List  0: 'A'  1: 'B'  2: 'C'  3: 'D'  4: 'E'
     Min = 1 at (3, 1) -> (B, D)
     For new matrix, append D onto B.
     BD to A = 3, BD to C = 2.5, BD to E = 3.5

     Second Matrix
            A     BD    C     E
     A     -1    -1    -1    -1
     BD     3    -1    -1    -1
     C      4     2.5  -1    -1
     E      3     3.5   5    -1

     Second List  0: 'A'  1: '(B, D)'  2: 'C'  3: 'E'
     Min = 2.5 at (2, 1) -> (BD, C)

     Initial Formula
            A     BDC   E
     A     -1    -1    -1
     BDC    3.5  -1    -1
     E      3     4.25 -1

     Weighted Formula
            A     BDC   E
     A     -1    -1    -1
     BDC    3.33 -1    -1
     E      3     4    -1
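
The numbers on this slide can be reproduced with a short calculation. The reading of the two labels is an assumption: here the "initial formula" averages the two rows being merged in the current matrix, while the "weighted formula" averages over every original member of the merged group. The helper names `dist` and `mean` are illustrative only.

```python
def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Pairwise distances from the first matrix on the slide.
d = {("A", "B"): 4, ("A", "C"): 4, ("B", "C"): 3, ("A", "D"): 2, ("B", "D"): 1,
     ("C", "D"): 2, ("A", "E"): 3, ("B", "E"): 4, ("C", "E"): 5, ("D", "E"): 3}


def dist(x, y):
    """Look up a distance regardless of the order of the two labels."""
    return d[(x, y)] if (x, y) in d else d[(y, x)]


# Step 1: merge B and D (minimum distance 1) and average their rows.
bd = {other: mean(dist(m, other) for m in "BD") for other in "ACE"}

# Step 2: merge BD with C (minimum distance 2.5).
# Assumed "initial formula": average the BD row and the C row as they stand.
initial = {other: (bd[other] + dist("C", other)) / 2 for other in "AE"}
# Assumed "weighted formula": average over all original members B, D and C.
weighted = {other: mean(dist(m, other) for m in "BDC") for other in "AE"}

print(bd)
print(initial)
print({k: round(v, 2) for k, v in weighted.items()})
```

The two formulas diverge as soon as a merged group has more members than the group it is joined with, which is why the BDC row differs between the last two matrices.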

  8. What is DendroPy and why did we use it?
     • DendroPy is a Python library that allows the user to create phylogenetic tree structures and display them.
     • Phylo vs. DendroPy
     • Phylo was "too powerful" and didn't allow for much "under the hood" code.
     • DendroPy provided more basic functionality.
     How did we build the tree?
     • Build upon a 'newick' formatted string each time Mark's algorithm recursed.
     • Draw an ASCII representation of the phylogenetic tree.
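
Building the Newick string one merge at a time could be sketched as follows. This is plain Python with a hypothetical helper, `merge_labels`, and it does not call DendroPy. The first two merges use the (row, col) pairs from slide 7; the last two merges are an assumed continuation, added only so the example ends with a complete tree.

```python
def merge_labels(labels, row, col):
    """Fold labels[row] into labels[col] as a Newick group and drop labels[row]."""
    merged = list(labels)
    merged[col] = "(" + labels[col] + "," + labels[row] + ")"
    del merged[row]
    return merged


labels = ["A", "B", "C", "D", "E"]

# Merges reported on slide 7: (3, 1) joins B and D, then (2, 1) joins BD and C.
step1 = merge_labels(labels, 3, 1)
step2 = merge_labels(step1, 2, 1)

# Assumed continuation (not on the slides): join A with E, then the last two groups.
step3 = merge_labels(step2, 2, 0)
step4 = merge_labels(step3, 1, 0)

# A Newick string ends with a semicolon.
newick = step4[0] + ";"
print(step1)
print(newick)
```

With DendroPy installed, a string like this can be loaded with `dendropy.Tree.get(data=newick, schema="newick")` and drawn as text with the tree's `print_plot()` method.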

  9. Conclusion/Q & A