Protein Sequence Classification Using the Neighbor-Joining Method
Bo Liu
Overview
• Given: A group of sequences that share some similarity with one another and have the same protein function.
• Input: One sequence of unknown function.
• Output: Whether this sequence belongs to the protein cluster.
Representation of the Sequence Group
• Distance Matrix
• Matrix Calculation
  • Pairwise Alignment
  • Multiple Sequence Alignment
  • Alignment-Free: Relative Lempel-Ziv Complexity
Otu et al., Bioinformatics, 2003
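The alignment-free option can be sketched in Python as follows. The slide does not say which Lempel-Ziv variant was used, so this is an assumption: it counts the components of the exhaustive (LZ76-style) parsing and uses a normalized distance of the form max{c(SQ) - c(S), c(QS) - c(Q)} / max{c(S), c(Q)}. The function names `lz_complexity`, `lz_distance`, and `distance_matrix` are hypothetical, and sequences are assumed to be non-empty strings.

```python
def lz_complexity(s: str) -> int:
    """Number of components in the exhaustive Lempel-Ziv parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        k = 1
        # Extend the component while it can still be copied from an
        # occurrence that starts before position i (overlap is allowed).
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        count += 1
        i += k
    return count


def lz_distance(s: str, q: str) -> float:
    """Normalized LZ-based distance (assumed variant, see lead-in)."""
    cs, cq = lz_complexity(s), lz_complexity(q)
    extra_s = lz_complexity(s + q) - cs  # cost of q once s is known
    extra_q = lz_complexity(q + s) - cq  # cost of s once q is known
    return max(extra_s, extra_q) / max(cs, cq)


def distance_matrix(seqs: list[str]) -> list[list[float]]:
    """Symmetric pairwise distance matrix with a zero diagonal."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = lz_distance(seqs[i], seqs[j])
    return d
```

The substring search keeps the sketch short but makes each complexity computation quadratic or worse in the sequence length; a suffix-based index would be the usual choice for long sequences.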
Correlation of Input Sequence with Group
• NJ Method
• Smallest Sum of Branch Lengths
Saitou et al., Mol. Biol. Evol., 1987
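For reference, the "smallest sum of branch lengths" criterion can be written out as below. The notation is the standard textbook one and does not come from the slide: N is the current number of nodes, d_{ij} the distance between nodes i and j, r_i the row sum of node i, and T the sum of all pairwise distances.

```latex
% Sum of branch lengths of the tree in which i and j are joined first
S_{ij} = \frac{1}{2(N-2)} \sum_{k \neq i,j} \left( d_{ik} + d_{jk} \right)
       + \frac{1}{2}\, d_{ij}
       + \frac{1}{N-2} \sum_{\substack{k < l \\ k,l \neq i,j}} d_{kl}

% With r_i = \sum_{k} d_{ik} and T = \sum_{k < l} d_{kl}:
S_{ij} = \frac{(N-2)\, d_{ij} - r_i - r_j + 2T}{2(N-2)}

% T and N are the same for every pair, so minimizing S_{ij}
% is equivalent to minimizing
Q_{ij} = (N-2)\, d_{ij} - r_i - r_j
```

The pair with the smallest Q_{ij} is the one joined at each step.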
NJ Method
• Leaf Length
• Distance to Node
• New Distance Matrix
Studier et al., Mol. Biol. Evol., 1988
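The three steps listed above can be sketched in Python using the standard neighbor-joining update rules. This is a minimal illustration rather than the presenter's implementation: the function name `neighbor_joining`, the returned `leaf_length` and `join_order` structures, and the tie-breaking rule (first minimal pair wins) are all assumptions made here. At least three sequences are assumed.

```python
def neighbor_joining(names, dist):
    """Run NJ on a symmetric distance matrix (list of lists).

    Returns (leaf_length, join_order): the branch length attached to each
    original sequence and the order in which sequences entered the tree.
    """
    leaves = set(names)
    # Dict-of-dicts copy so nodes can be added and removed by label.
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    leaf_length, join_order = {}, []
    next_id = 0

    while len(d) > 2:
        n = len(d)
        nodes = list(d)
        r = {a: sum(d[a].values()) for a in nodes}  # row sums

        # Pick the pair minimizing Q = (n - 2) * d_ab - r_a - r_b.
        best = None
        for x in range(n):
            for y in range(x + 1, n):
                a, b = nodes[x], nodes[y]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best

        # Branch lengths from a and b to the new internal node u.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        for node, length in ((a, len_a), (b, len_b)):
            if node in leaves:
                leaf_length[node] = length
                join_order.append(node)

        # Distances from u to every remaining node.
        u = ("internal", next_id)
        next_id += 1
        new_row = {k: 0.5 * (d[a][k] + d[b][k] - d[a][b])
                   for k in nodes if k not in (a, b)}

        # Shrink the matrix: drop a and b, add u.
        del d[a], d[b]
        for k in d:
            del d[k][a], d[k][b]
            d[k][u] = new_row[k]
        new_row[u] = 0.0
        d[u] = new_row

    # Two nodes remain; a leaf left here hangs on the final branch.
    a, b = list(d)
    for node in (a, b):
        if node in leaves:
            leaf_length[node] = d[a][b]
            join_order.append(node)
    return leaf_length, join_order
```

Each pass scans all pairs, which is O(n²) per join and O(n³) overall for n sequences.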
Classification Criteria
• Node with the longest leaf length
  • Evolves too fast
• Last node joined to the tree
  • Costs the most to join the tree
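A small Python sketch of how these two criteria might be checked for a query sequence. The inputs are assumptions: `leaf_length` maps each sequence to its leaf branch length in the NJ tree, and `join_order` lists sequences in the order they were joined. The slide does not say how the two criteria are combined or whether any threshold applies, so the helper `classification_flags` (a hypothetical name) reports each criterion separately and leaves the decision to the caller.

```python
def classification_flags(query, leaf_length, join_order):
    """Report which of the two slide criteria the query sequence meets.

    leaf_length: dict mapping sequence name -> leaf branch length
    join_order:  list of sequence names in the order they joined the tree
    """
    # On ties, max() returns the first name with the longest leaf.
    longest = max(leaf_length, key=leaf_length.get)
    return {
        "longest_leaf": longest == query,        # "evolves too fast"
        "last_joined": join_order[-1] == query,  # "costs the most to join"
    }


# Illustrative values only, not results from the presentation.
example_lengths = {"seq1": 0.12, "seq2": 0.10, "query": 0.45}
example_order = ["seq1", "seq2", "query"]
flags = classification_flags("query", example_lengths, example_order)
looks_like_outlier = any(flags.values())  # one possible rule, an assumption
```

Under this reading, a query that meets a criterion is a candidate for rejection from the cluster; requiring both flags instead of either one would be a stricter alternative.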
Running Time (n: number of sequences in the group; l: sequence length)
• Preprocessing:
  • Distance Matrix Calculation: O(n²l²)
• Query Sequence Classification:
  • Distance Calculation: O(nl²)
  • NJ Construction: O(n³)